
A concise and effectual method for neutral pitch identification in stuttered speech

Yashaswi Alva M, Nachamai M and Joy Paulose
Published/Copyright: June 22, 2016

Abstract

Research has shown that human-computer interaction (HCI) is more effective when machines understand the emotions conveyed in speech. Speech emotion recognition has therefore attracted growing research interest owing to its usefulness in a wide range of applications. Building a neutral speech model is an important and challenging task, as it can help in identifying different emotions in stuttered speech. This paper proposes two approaches for identifying neutral pitch in stuttered speech and, through an accuracy comparison of their implementations, identifies the better model for neutral speech pitch identification.
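The abstract does not detail the two proposed approaches, so as a general point of reference the sketch below shows a standard autocorrelation-based pitch (fundamental frequency) estimator in Python. It is a minimal illustration of pitch identification as such, not the authors' method; the frame length, lag bounds, and voicing threshold are illustrative assumptions.

    import numpy as np

    def estimate_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
        """Estimate the fundamental frequency of a speech frame via
        autocorrelation. Returns 0.0 for unvoiced or silent frames."""
        frame = frame - np.mean(frame)            # remove DC offset
        corr = np.correlate(frame, frame, mode="full")
        corr = corr[len(corr) // 2:]              # keep non-negative lags
        # Search only lags corresponding to plausible speech pitch.
        lag_min = int(sample_rate / fmax)
        lag_max = min(int(sample_rate / fmin), len(corr) - 1)
        segment = corr[lag_min:lag_max]
        if segment.size == 0 or corr[0] <= 0:
            return 0.0
        best_lag = lag_min + int(np.argmax(segment))
        # Crude voicing check: the peak must carry a reasonable
        # fraction of the frame's energy (threshold is an assumption).
        if corr[best_lag] / corr[0] < 0.3:
            return 0.0
        return sample_rate / best_lag

    # Example: a synthetic 150 Hz voiced frame sampled at 16 kHz.
    sr = 16000
    t = np.arange(int(0.04 * sr)) / sr            # 40 ms frame
    frame = np.sin(2 * np.pi * 150 * t)
    print(f"Estimated pitch: {estimate_pitch(frame, sr):.1f} Hz")

Run on the synthetic frame above, this prints an estimate close to 150 Hz; on real stuttered speech, per-frame estimates like these would feed whatever neutral-pitch model is built on top.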

Received: 2016-02-18
Accepted: 2016-05-01
Published Online: 2016-06-22

©2017 Walter de Gruyter GmbH, Berlin/Boston
