Speech transformation solutions

  • Dimitri Kanevsky, Sara Basson, Alexander Faisman, Leonid Rachevsky, Alex Zlatsin, and Sarah Conrod
A chapter from the book Cognition Distributed, published by John Benjamins Publishing Company

Abstract

This paper outlines the background and development of “intelligent” technologies such as speech recognition. Despite significant progress, these technologies still fall short in many areas, and progress in areas such as dictation has actually stalled. We propose semi-automatic solutions: the smart integration of human and machine effort. One such technique improves the speech recognition editing interface, thereby reducing the number of errors the viewer perceives. The paper also describes batch enrollment, which reduces the time a user must spend on enrollment, and content spotting, which suits applications with a repeated content flow, such as movies or museum tours. Finally, the paper suggests a general concept of distributive training of speech recognition systems, based on data collection across a network.

Downloaded on 1.10.2025 from https://www.degruyterbrill.com/document/doi/10.1075/bct.16.15kan/html