Automatic speech-to-text transcription in an Ecuadorian radio broadcast context

Sigcha, E; Espinoza Mejía, Jorge Mauricio; Medina, J; Saquicela Galarza, Víctor Hugo; Vega, F

dc.contributor.author	Sigcha, E
dc.contributor.author	Espinoza Mejía, Jorge Mauricio
dc.contributor.author	Medina, J
dc.contributor.author	Saquicela Galarza, Víctor Hugo
dc.contributor.author	Vega, F
dc.date.accessioned	2018-01-11T16:47:50Z
dc.date.available	2018-01-11T16:47:50Z
dc.date.issued	2017-09-19
dc.description.abstract	A key element in enabling the analysis of and access to radio broadcast content is the development of automatic speech-to-text systems. Building these systems has become possible thanks to the current availability of speech resources, models, and open source services designed mainly for the English language. Most of these tools have been migrated to other languages, such as Spanish, to avoid creating such systems from scratch. Despite these efforts, there is no clear evidence of which tools can be used to convert audio to text in other dialects of Spanish. Moreover, most of these systems are trained for a specific context; therefore, audio transcription systems tailored to both a language and a specific context are needed. This article describes the implementation of an architecture for automatic speech-to-text transcription applied to Ecuadorian radio broadcasters, using freely available tools for audio segmentation and transcription. The selected tools were evaluated by measuring their performance and how easily they could be adapted to the defined architecture. Finally, a Web application was developed and its performance was compared with the IBM Watson Speech to Text service; the results show that the proposed system improves accuracy and achieves a Word Error Rate of around 10%. These results suggest that a set of free tools can be used to train models oriented to specific speech-to-text transcription scenarios.
dc.description.city	Cali
dc.identifier.doi	10.1007/978-3-319-66562-7_49
dc.identifier.isbn	9783319665610
dc.identifier.issn	18650929
dc.identifier.uri	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85028800153&doi=10.1007%2f978-3-319-66562-7_49&partnerID=40&md5=fc942b108a228279f3e96b2b1984f4d1
dc.identifier.uri	http://dspace.ucuenca.edu.ec/handle/123456789/29245
dc.language.iso	en_US
dc.publisher	SPRINGER VERLAG
dc.source	Communications in Computer and Information Science
dc.subject	Audio Content Analysis
dc.subject	Automatic Audio Segmentation
dc.subject	Automatic Speech Recognition
dc.subject	Python
dc.subject	Speech To Text
dc.title	Automatic speech-to-text transcription in an Ecuadorian radio broadcast context
dc.type	Article
dc.ucuenca.afiliacion	sigcha, e., school of systems engineering, university of cuenca, cuenca, ecuador
dc.ucuenca.afiliacion	espinoza, m., computer science department, university of cuenca, cuenca, ecuador
dc.ucuenca.afiliacion	medina, j., department of electrical, electronic engineering and telecommunications, university of cuenca, cuenca, ecuador
dc.ucuenca.afiliacion	saquicela, v., computer science department, university of cuenca, cuenca, ecuador
dc.ucuenca.afiliacion	vega, f., computer science department, university of cuenca, cuenca, ecuador
dc.ucuenca.correspondencia	Espinoza, M.; Computer Science Department, University of Cuenca, Ecuador; email: mauricio.espinoza@ucuenca.edu.ec
dc.ucuenca.cuartil	Q3
dc.ucuenca.embargoend	2022-01-01 0:00
dc.ucuenca.factorimpacto	0.162
dc.ucuenca.idautor	0102778818
dc.ucuenca.idautor	0103599577
dc.ucuenca.indicebibliografico	SCOPUS
dc.ucuenca.nombrerevista	12th Colombian Conference on Computing CCC 2017
dc.ucuenca.volumen	735
dspace.entity.type	Publication
relation.isAuthorOfPublication	7f498bd8-8097-48e6-9d44-32c96e3abd23
relation.isAuthorOfPublication	48f3b0ef-dc7f-4a21-9cca-597c4a692117
relation.isAuthorOfPublication.latestForDiscovery	7f498bd8-8097-48e6-9d44-32c96e3abd23