Study of triplet periodicity differences inside and between genomes

Yulia M. Suvorova and Eugene V. Korotkov
Published/Copyright: February 24, 2015

Abstract

Triplet periodicity (TP) is a distinctive feature of the protein-coding sequences of both prokaryotic and eukaryotic genomes. In this work, we explored how TP differs inside and between 45 prokaryotic genomes. We formulated two hypotheses about the distribution of TP over a set of coding sequences and generated artificial datasets corresponding to each hypothesis. We found that TP is more similar inside a genome than between genomes, and that the TP distribution in a real genome dataset corresponds to the hypothesis that a common TP pattern exists for the majority of sequences inside a genome. Additionally, we performed gene classification based on TP matrices. This classification showed that TP identifies the genome to which a given gene belongs with more than 85% accuracy.
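To make the idea of classifying genes by TP matrices concrete, the following is a minimal sketch and not the authors' method. It assumes that a TP matrix is a 4x3 table of nucleotide frequencies at the three codon positions, that a genome is summarized by the element-wise mean of its genes' matrices, and that a gene is assigned to the genome with the nearest profile under Euclidean distance. The function names (`tp_matrix`, `mean_matrix`, `distance`, `classify`) and the toy sequences are hypothetical; the abstract does not specify the matrix definition, the similarity measure, or the classifier.

```python
NUCLEOTIDES = "ACGT"


def tp_matrix(seq):
    """Return a 4x3 matrix of nucleotide frequencies by codon position.

    Assumption: rows are A, C, G, T; columns are positions 0, 1, 2 of the
    reading frame that starts at the first base of seq.
    """
    counts = [[0, 0, 0] for _ in NUCLEOTIDES]
    for i, base in enumerate(seq.upper()):
        row = NUCLEOTIDES.find(base)
        if row >= 0:  # skip ambiguous symbols such as N
            counts[row][i % 3] += 1
    col_totals = [sum(counts[r][c] for r in range(4)) for c in range(3)]
    return [
        [counts[r][c] / col_totals[c] if col_totals[c] else 0.0 for c in range(3)]
        for r in range(4)
    ]


def mean_matrix(matrices):
    """Element-wise mean of several 4x3 matrices (a hypothetical genome profile)."""
    n = len(matrices)
    return [[sum(m[r][c] for m in matrices) / n for c in range(3)] for r in range(4)]


def distance(m1, m2):
    """Euclidean distance between two 4x3 matrices (an illustrative choice)."""
    return sum((a - b) ** 2 for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2)) ** 0.5


def classify(gene, genome_profiles):
    """Return the name of the genome whose profile is closest to the gene's TP matrix."""
    m = tp_matrix(gene)
    return min(genome_profiles, key=lambda name: distance(m, genome_profiles[name]))


# Toy usage with made-up sequences, not real genomic data.
profiles = {
    "genome_A": mean_matrix([tp_matrix("ACGACGACG"), tp_matrix("ACGACGACC")]),
    "genome_B": mean_matrix([tp_matrix("TGATGATGA")]),
}
predicted = classify("ACGACGACT", profiles)
```

In this toy setup the query gene shares its per-position nucleotide preferences with the first profile, so it is assigned there; with real genes, accuracy would depend on the matrix definition and distance actually used.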


Corresponding author: Yulia M. Suvorova, Bioinformatics Laboratory, Centre of Bioengineering of the Russian Academy of Sciences, 117312, Prospect 60-tya Oktyabrya, Moscow, Russian Federation, e-mail:

Acknowledgments

The work was supported by the Russian Foundation for Basic Research (RFBR) grant 2014-04-00164.

Published Online: 2015-2-24
Published in Print: 2015-4-1

©2015 by De Gruyter

Downloaded on September 30, 2025 from https://www.degruyterbrill.com/document/doi/10.1515/sagmb-2013-0063/html