Abstract
This paper attempts to determine a mathematical model for the frequencies of occurrence of letters in three settings: in the corpora as a whole, in the word types of the corpora, and in the initial positions of words, with both word tokens and word types taken into account. The study uses corpora written in American English, with texts selected from ‘The Open American National Corpus (OANC)’.
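The distinction between these counting schemes can be illustrated with a minimal sketch. It is not the authors' procedure: the tokenization rule (maximal runs of ASCII letters, case-folded), the function names `letter_counts` and `ranked`, and the sample sentence are all assumptions made for illustration, and no OANC data is loaded.

```python
import re
from collections import Counter


def letter_counts(text):
    """Count letters in four views of a text (illustrative only)."""
    # Assumption: a word is a maximal run of ASCII letters, case-folded.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Word types are the distinct word forms among the tokens.
    types = set(tokens)
    return {
        # Every letter of every running word (token-based count).
        "corpus": Counter("".join(tokens)),
        # Every letter of every distinct word (type-based count).
        "types": Counter("".join(types)),
        # First letter of every running word.
        "initial_tokens": Counter(w[0] for w in tokens),
        # First letter of every distinct word.
        "initial_types": Counter(w[0] for w in types),
    }


def ranked(counter):
    """Return (letter, count) pairs by descending count, ties alphabetical."""
    return sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample text, not taken from the OANC.
sample = "The cat saw the other cat"
counts = letter_counts(sample)
for view, counter in counts.items():
    print(view, ranked(counter)[:3])
```

A ranked list of this kind is the sort of input to which a rank-frequency model would be fitted. The fitting step is omitted here because the abstract does not state which model is used.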
© 2020 Walter de Gruyter GmbH, Berlin/Boston
Articles in this issue
- Frontmatter
- Obituary
- Gabriel Altmann (1931–2020)
- Articles
- Das Beziehungsgeflecht zwischen Sprache und Kultur: Forschungsrückblick, Zugänge und Beschreibungstendenzen
- Die Häufigkeit von Verben in fachlichen und öffentlichen Texten in deutscher Sprache
- Mathematical modeling of the frequencies of letters for their occurrence in corpora, words (types) and in the initial positions of words of corpora
- Hedging devices in applied linguistics research papers: Do gender and nativeness matter?
- Book Review
- Insubordination. Theoretical and empirical issues