Heuristic theory in corpus compilation

探试理论在语料库研制中的应用
  • John Patkin
Published/Copyright: September 14, 2016

Abstract

Building the Asian Corpus of English (ACE, 2014) involved complex interactions between researchers, participants, transcription conventions, software and hardware. A team of more than twenty researchers in nine locations across Asia gathered and transcribed naturally occurring English conversations among Asian multilinguals. Modelled on the Vienna-Oxford International Corpus of English (VOICE), ACE faced unique challenges arising from linguistic, cultural and geographical differences. These problems were solved through heuristics: procedures and tools built on prior experience and on trial and error. The process of developing and categorising these skills is presented, along with experiences shared by ACE researchers and the author and examples from the corpus.

摘要

亚洲英语语料库(ACE)的研制涉及研究者,参与人员,转写规范,软件和硬件相互之间复杂的交流。共有二十多名研究者在亚洲九个地方收集和转写亚洲多语使用者的自然英语对话。基于维也纳牛津英语语料库(VOICE)的模式,ACE因语言、文化和地域的差异面临独特的挑战。这些问题通过探试法这个工具,即通过先前的经验和反复试验得以解决。这些技能的发展和分类过程体现在ACE研究者和作者在实践中共享的经验以及语料库中的例子。

About the author

John Patkin

John Patkin is a Hong Kong-based researcher. He is Chief Transcriber of the Asian Corpus of English and the Hong Kong Archive of Language Learning. His doctoral studies focused on the use of English as a lingua franca in public service broadcasting. He is currently exploring the role of transcription in research and the Internet’s disruption to traditional audio broadcasting platforms.

Appendix 1: Summary of VoiceScribe mark-up conventions used in this paper (Figure 7)

Figure 7: VoiceScribe conventions.

Appendix 2: ACE researcher survey questions

  1. How would you describe your experience doing data collection for ACE?

  2. In your opinion, what makes a good data collector?

  3. How would you describe your experience with VoiceScribe?

  4. How has your field of study helped you with transcribing for ACE?

  5. In your opinion, what makes a good transcriber?

References

ACE. 2014. The Asian Corpus of English. Director: Andy Kirkpatrick; Researchers: Wang Lixun, John Patkin, Sophiaan Subhan. http://corpus.ied.edu.hk/ace/ (accessed 31 March 2016).

Amodei, Dario, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen et al. 2015. Deep Speech 2: End-to-end speech recognition in English and Mandarin. arXiv preprint arXiv:1512.02595.

Breiteneder, Angelika, Marie-Luise Pitzl, Stefan Majewski & Theresa Klimpfinger. 2006. VOICE recording – Methodological challenges in the compilation of a corpus of spoken ELF. Nordic Journal of English Studies 5(2). 161–188. doi:10.35360/njes.16

Breiman, L., J. Friedman, C. J. Stone & R. A. Olshen. 1984. Classification and regression trees. New York: CRC Press.

Bucholtz, Mary. 2000. The politics of transcription. Journal of Pragmatics 32(10). 1439–1465. doi:10.1016/S0378-2166(99)00094-6

Chiari, Isabella. 2006. Slips and errors in spoken data transcription. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), Genova, 1596–1599. Genova: ELDA (CD-ROM).

Edwards, Jane A. & Martin D. Lampert. 2015 [1993]. Talking data: Transcription and coding in discourse research, Kindle edn. Taylor and Francis.

Fok, May Sin-mai & Doris Yuk-shim Wong. 2005. A pilot study on enhancing positive coping behaviour in early adolescents using a school-based project. Journal of Child Health Care 9(4). 301–313. doi:10.1177/1367493505056483

Gigerenzer, Gerd, Peter M. Todd & ABC Research Group. 1999. Simple heuristics that make us smart (Evolution and Cognition), Kindle edn.

Halcomb, Elizabeth J. & Patricia M. Davidson. 2006. Is verbatim transcription of interview data always necessary? Applied Nursing Research 19(1). 38–42. doi:10.1016/j.apnr.2005.06.001

HALL. 2016. Principal investigator: Klaudia Lee; Researchers: Gao Xuesong, John Trent, John Patkin. http://www.narratives.hk (accessed 31 March 2016).

Hofstede, Geert H. 2001. Culture’s consequences: Comparing values, behaviors, institutions and organizations across nations. Thousand Oaks, Calif: Sage.

Jefferson, Gail. 2004. Glossary of transcript symbols with an introduction. Pragmatics and Beyond New Series 125. 13–34. doi:10.1075/pbns.125.02jef

Jenks, Christopher Joseph. 2011. Transcribing talk and interaction: Issues in the representation of communication data. Amsterdam & Philadelphia: John Benjamins. doi:10.1075/z.165

Kirkpatrick, Andy & Sophiaan Subhan. 2014. Non-standard or new standards or errors? The use of inflectional marking for present and past tenses in English as an Asian lingua franca. In A. Kautsch, S. Buschfeld, T. Hoffman & M. Huber (eds.), The evolution of Englishes, 386–400. Amsterdam: John Benjamins. doi:10.1075/veaw.g49.22kir

Lapadat, Judith C. 2000. Problematizing transcription: Purpose, paradigm and quality. International Journal of Social Research Methodology 3(3). 203–219. doi:10.1080/13645570050083698

Lindsay, Jean & Daniel C. O’Connell. 1995. How do transcribers deal with audio recordings of spoken discourse? Journal of Psycholinguistic Research 24(2). 101–115. doi:10.1007/BF02143958

MacWhinney, Brian. 2014. The CHILDES project: Tools for analyzing talk, volume I: Transcription format and programs, Kindle edn. Taylor and Francis. doi:10.4324/9781315805672

Mondada, Lorenza. 2013. The conversation analytic approach to data collection. In The handbook of conversation analysis, 32–56. doi:10.1002/9781118325001.ch3

Norman, Don. 2013. The design of everyday things: Revised and expanded edition, Kindle edn. Basic Books.

O’Connell, Daniel C. & Sabine Kowal. 2008. Rhetoric. In Communicating with one another, 1–10. New York: Springer. doi:10.1007/978-0-387-77632-3_5

Patkin, John Gideon. 2011. The ACE manual: Data collection and transcription for the Asian corpus of English.

Patkin, Michael. 2008. Surgical heuristics. ANZ Journal of Surgery 78(12). 1065–1069. doi:10.1111/j.1445-2197.2008.04752.x

Poland, Blake D. 1995. Transcription quality as an aspect of rigor in qualitative research. Qualitative Inquiry 1(3). 290–310. doi:10.1177/107780049500100302

Powers, Willow Roberts. 2005. Transcription techniques for the spoken word, Kindle edn. AltaMira Press.

Reason, James. 1990. Human error, Kindle edn. Cambridge University Press. doi:10.1017/CBO9781139062367

Roberts, Celia. 1997. The politics of transcription. Transcribing talk: Issues of representation. TESOL Quarterly 31(1). 167–171. doi:10.2307/3587983

Rudin, Scott, Dana Brunetti, Michael De Luca, Cean Chaffin (Producers) & David Fincher (Director). 2010. The social network [Motion Picture]. Culver City, California: Columbia Pictures.

Sapir, Edward. 2011 [1921]. Language: An introduction to the study of speech, Kindle edn.

Seale, Clive & David Silverman. 1997. Ensuring rigour in qualitative research. The European Journal of Public Health 7(4). 379–384. doi:10.1093/eurpub/7.4.379

Shek, Daniel T.L., Rachel C.F. Sun & Christina Y.P. Tang. 2009. Focus group evaluation from the perspective of program implementers: Findings based on the secondary 2 program. The Scientific World Journal 9. 992–1002. doi:10.1100/tsw.2009.117

VOICE. 2013. The Vienna-Oxford International Corpus of English (version 2.0 online). Director: Barbara Seidlhofer; Researchers: Angelika Breiteneder, Theresa Klimpfinger, Stefan Majewski, Ruth Osimk-Teasdale, Marie-Luise Pitzl, Michael Radeka. http://voice.univie.ac.at (accessed 31 March 2016).

Whitla, Paul, Peter G. Walters & Howard Davies. 2007. Global strategies in the international hotel industry. International Journal of Hospitality Management 26(4). 777–792. doi:10.1016/j.ijhm.2006.08.001

White, Elwyn Brooks. 2015 [1973]. Stuart Little (A Harper Trophy Book), Kindle edn. HarperCollins.

Published Online: 2016-9-14
Published in Print: 2016-9-1

©2016 by De Gruyter Mouton

Downloaded on 17.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/jelf-2016-0023/html