Abstract
This paper presents a method for estimating the chronological age of adolescents from their voice signal. For every examined child, the vowels a, e, i, o and u were recorded in extended phonation, and sixty voice parameters were extracted from each recording. The voice recordings were supplemented with a height measurement to check whether it could improve the accuracy of the proposed solution. Predictors were selected using the LASSO (least absolute shrinkage and selection operator) algorithm. Age was estimated with random forest (RF) regression, which was tested using 10-fold cross-validation. The lowest absolute error (0.37 ± 0.28 years) was obtained for boys only, when all selected features were included in the prediction. In all cases, the achieved accuracy was higher for boys than for girls, which results from the fact that the voice changes more with age in men than in women. The results suggest that the presented approach can be employed for accurate age estimation during the period of rapid development in children.
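The processing chain described above (LASSO predictor selection, random forest regression, 10-fold cross-validation, absolute error reported as mean ± standard deviation) can be sketched roughly as follows. This is only an illustrative sketch using scikit-learn, not the authors' implementation: the data are synthetic stand-ins, and the feature layout (60 voice parameters plus height in the last column), the hyperparameters and all variable names are assumptions made for the example. Refitting the selection step inside each fold is a design choice of this sketch; the abstract does not state how selection and cross-validation were combined.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the study's recordings).
# Assumed layout: 60 voice parameters followed by one height column.
rng = np.random.default_rng(0)
n_children, n_features = 200, 61
X = rng.normal(size=(n_children, n_features))

# Hypothetical ages driven by a few columns plus noise, for illustration only.
age = (14.0 + 1.2 * X[:, 0] - 0.8 * X[:, 5] + 0.5 * X[:, 60]
       + rng.normal(scale=0.3, size=n_children))

model = make_pipeline(
    StandardScaler(),                                  # put predictors on a common scale for LASSO
    SelectFromModel(LassoCV(cv=5), threshold=1e-5),    # keep predictors with non-zero LASSO weights
    RandomForestRegressor(n_estimators=100, random_state=0),
)

# 10-fold cross-validation: every child is predicted by a model
# that was fitted without that child's data.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
predicted_age = cross_val_predict(model, X, age, cv=cv)

abs_err = np.abs(predicted_age - age)
print(f"absolute error: {abs_err.mean():.2f} ± {abs_err.std():.2f} years")
```

To mirror the per-sex comparison reported above, the same pipeline would be run separately on the boys' and the girls' rows, with and without the height column.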
Acknowledgement
We would like to thank Bruce Turner for the English language corrections.
Author Statement
Research funding: Authors state no funding involved.
Conflict of interest: Authors declare no conflict of interest.
Informed consent: Informed consent is not applicable.
Ethical approval: The conducted research is not related to either human or animal use.
©2020 Walter de Gruyter GmbH, Berlin/Boston