Abstract
Computational motif detection in folk narratives remains an unresolved problem, partly because motifs are formally fluid and partly because test collections for training machine learning algorithms are either not generally available or not large enough to yield predictions robust enough for expert confirmation. As a result, since standard tale typology treats texts as strings of motifs, its computational reproduction becomes an automatic classification exercise. In this brief communication, which reports work in progress, we apply the Support Vector Machine algorithm to the ten best-populated classes of the Annotated Folktales test collection to predict the membership of texts in their internationally accepted categories. The classification results were evaluated using recall, precision, and F1 scores. The F1 score was in the range 0.8–1.0 for all selected tale types except type 275 (The Race between Two Animals), which had a recall of 1.0 but suffered from low precision.
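The evaluation measures named in the abstract can be illustrated with a minimal sketch. It computes per-class precision, recall, and F1 from a list of gold tale types and a list of predicted tale types, and shows how a class can reach a recall of 1.0 while its F1 score is pulled down by low precision, the pattern reported for type 275. The function name `per_class_scores` and the toy labels are hypothetical and chosen only for illustration; they are not the authors' code, data, or results.

```python
from collections import Counter


def per_class_scores(gold, predicted):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1      # text assigned to its correct tale type
        else:
            fp[p] += 1      # type p was predicted for a text of another type
            fn[g] += 1      # a text of type g was missed
    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        n_pred = tp[label] + fp[label]   # texts predicted as this type
        n_gold = tp[label] + fn[label]   # texts truly of this type
        precision = tp[label] / n_pred if n_pred else 0.0
        recall = tp[label] / n_gold if n_gold else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical toy labels (ATU type numbers as strings), NOT the paper's data.
gold = ["275", "275", "510A", "510A", "510A", "333", "333", "333"]
predicted = ["275", "275", "275", "510A", "510A", "275", "275", "333"]

scores = per_class_scores(gold, predicted)
for label, (precision, recall, f1) in scores.items():
    print(f"ATU {label}: precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this toy example every text of type 275 is found, so recall is 1.0, but texts of other types are also assigned to 275, so its precision and therefore its F1 score drop.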
Acknowledgements
The authors are grateful to X anonymous reviewers for helpful comments on the manuscript.
Appendix: ATU tale type distribution of the AFT collection
The data can be accessed at https://doi.org/10.5281/zenodo.6575263.
| ATU category (type range) | Tale types present in the AFT collection | Number of types |
|---|---|---|
| 1. ANIMAL TALES (1–299) | | 46 |
| Wild Animals 1–99 | | 14 |
| The Clever Fox (Other Animal) 1–69 | 1, 2, 15, 20C, 47A, 47B, 50, 57, 63, 66A, 68A | 11 |
| Other Wild Animals 70–99 | 75, 91, 92 | 3 |
| Wild Animals and Domestic Animals 100–149 | 101, 103, 105, 112, 113A, 122E, 122F, 124, 130 | 9 |
| Wild Animals and Humans 150–199 | 150, 154, 155, 156, 160, 173, 175, 178A | 8 |
| Domestic Animals 200–219 | 207C, 214A | 2 |
| Other Animals and Objects 220–299 | | |
| Birds 220–249 | 225, 231, 237, 243A, 244, 247 | 6 |
| Fish 250–253 | | |
| Other Animals and Objects 275–299 | 275, 278, 278A, 280A, 285A, 295, 298 | 7 |
| 2. TALES OF MAGIC (300–749) | | 47 |
| Supernatural Adversaries 300–399 | 303, 306, 310, 311, 312, 313, 325, 327, 328, 332, 333, 335, 361, 365, 366 | 15 |
| Supernatural or Enchanted Wife (Husband) or Other Relative 400–459 | | 6 |
| Wife 400–424 | 402, 410 | 2 |
| Husband 425–449 | 425C, 440, 441 | 3 |
| Brother or Sister 450–459 | 451 | 1 |
| Supernatural Tasks 460–499 | 480 | 1 |
| Supernatural Helpers 500–559 | 500, 502, 503, 505, 510A, 510B, 545B, 555 | 8 |
| Magic Objects 560–649 | 562, 563, 565, 570, 571B, 592, 613 | 7 |
| Supernatural Power or Knowledge 650–699 | 650A, 670, 675 | 3 |
| Other Tales of the Supernatural 700–749 | 700, 704, 706, 709, 720, 726, 737 | 7 |
| 3. RELIGIOUS TALES (750–849) | | 10 |
| God Rewards and Punishes 750–779 | 750A, 756, 763, 777, 779, 779J* | 6 |
| The Truth Comes to Light 780–799 | 780, 782 | 2 |
| Heaven 800–809 | 800 | 1 |
| The Devil 810–826 | | |
| Other Religious Tales 827–849 | 845 | 1 |
| 4. REALISTIC TALES (NOVELLE) (850–999) | | 16 |
| The Man Marries the Princess 850–869 | 850 | 1 |
| The Woman Marries the Prince 870–879 | 875 | 1 |
| Proofs of Fidelity and Innocence 880–899 | 882, 888 | 2 |
| The Obstinate Wife Learns to Obey 900–909 | 900 | 1 |
| Good Precepts 910–919 | 910B | 1 |
| Clever Acts and Words 920–929 | 920E, 926 | 2 |
| Tales of Fate 930–949 | | |
| Robbers and Murderers 950–969 | 954, 955, 958E* | 3 |
| Other Realistic Tales 970–999 | 980, 980D, 981, 982, 990 | 5 |
| 5. TALES OF THE STUPID OGRE (GIANT, DEVIL) (1000–1199) | | 8 |
| Labor Contract 1000–1029 | | |
| Partnership between Man and Ogre 1030–1059 | 1030 | 1 |
| Contest between Man and Ogre 1060–1114 | | |
| Man Kills (Injures) Ogre 1115–1144 | 1137 | 1 |
| Ogre Frightened by Man 1145–1154 | | |
| Man Outwits the Devil 1155–1169 | 1157, 1161 | 2 |
| Souls Saved from the Devil 1170–1199 | 1174, 1175, 1176, 1191 | 4 |
| 6. ANECDOTES AND JOKES (1200–1999) | | 45 |
| Stories about a Fool 1200–1349 | 1215, 1287, 1288A, 1317, 1319, 1335A, 1342, 1343 | 8 |
| Stories about Married Couples 1350–1439 | 1351, 1353, 1362, 1365, 1377, 1381, 1381D, 1383, 1408, 1415, 1422, 1423, 1430 | 13 |
| The Foolish Wife and her Husband 1380–1404 | | |
| The Foolish Husband and his Wife 1405–1429 | | |
| The Foolish Couple 1430–1439 | 1451 | 1 |
| Stories about a Woman 1440–1524 | | |
| Looking for a Wife 1450–1474 | | |
| Jokes about Old Maids 1475–1499 | | |
| Other Stories about Women 1500–1524 | | |
| Stories about a Man 1525–1724 | | |
| The Clever Man 1525–1639 | 1540, 1548, 1558, 1562A, 1586, 1592, 1592B, 1620, 1626 | 9 |
| Lucky Accidents 1640–1674 | 1641, 1641C, 1645, 1645B, 1655 | 5 |
| The Stupid Man 1675–1724 | 1675, 1676, 1678, 1696 | 4 |
| Jokes about Clergymen and Religious Figures 1725–1849 | | |
| The Clergyman Is Tricked 1725–1774 | 1730, 1741 | 2 |
| Clergyman and Sexton 1775–1799 | 1791 | 1 |
| Other Jokes about Religious Figures 1800–1849 | | |
| Anecdotes about Other Groups of People 1850–1874 | | |
| Tall Tales 1875–1999 | 1889B, 1965 | 2 |
| 7. FORMULA TALES (2000–2399) | | 10 |
| Cumulative Tales 2000–2100 | 2015, 2022, 2025, 2030, 2031C, 2032, 2034F, 2035, 2043 | 9 |
| Catch Tales 2200–2299 | 2250 | 1 |
| Other Formula Tales 2300–2399 | | |
© 2023 Walter de Gruyter GmbH, Berlin/Boston
Articles in the same Issue
- Titelseiten
- I Editorial
- Computational folktale studies. A very brief history
- II Articles
- Cinderella’s Family Tree. A Phylomemetic Case Study of ATU 510/511
- Cinderella’s Body. A Quantitative Approach to Gender, Embodiment, and Folktale Plots
- Little Statisticians in the Forest of Tales: Towards a New Comparative Mythology
- Disentangling the Folklore Hairball
- Teaching Tale Types to a Computer: A First Experiment with the Annotated Folktales Collection
- The ISEBEL Project
- WossiDiA
- Polish Folk Tale Archive: From Analog to Digital
- A folklorist in Search of Cinderella’s Shoe
- III Reports, News, Announcements
- Review Essay
- IV Reviews
- Brill, Tony: Tipologia legendei populare romaneşti 1. Legenda etiologică. (Prefaţă Sabina Ispas. Ediţie îngrijită şi studiu introductiv de I. Oprişan.) Bucureşti: Editura Saeculum I.O., 2005. 655 pp.; Tipologia legendei populare romaneşti 2. Legenda mitilogică. Legenda religioasă. Legenda istorică. (Ediţie îngrijită şi prefaţă de I. Oprişan.) Bucureşti: Editura Saeculum I.O., 2006. 575 pp.
- IV Submitted Books
- Submitted Books