Publication: A Comparative Evaluation of Preprocessing Techniques for Short Texts in Spanish
| dc.contributor.author | Orellana Cordero, Marcos Patricio | |
| dc.contributor.author | Trujillo, Andrea | |
| dc.contributor.author | Cedillo Orellana, Irene Priscila | |
| dc.date.accessioned | 2020-06-12T16:08:08Z | |
| dc.date.available | 2020-06-12T16:08:08Z | |
| dc.date.issued | 2020 | |
| dc.description.abstract | Natural Language Processing (NLP) is used to identify key information, generate predictive models, and explain global events or trends. NLP also supports the process of creating knowledge. It is therefore important to apply refinement techniques in major stages such as preprocessing, where data is frequently produced and processed with poor results. This paper analyzes and measures the impact of combinations of preprocessing techniques and libraries on short texts written in Spanish. The techniques were applied to tweets for sentiment analysis, and the evaluation considered processing time and the characteristics of the techniques in each library. The experiments give readers insight into choosing an appropriate combination of preprocessing techniques. The results show improvements ranging from 5% to 9% in classification performance. | |
| dc.description.city | San Francisco | |
| dc.identifier.doi | 10.1007/978-3-030-39442-4_10 | |
| dc.identifier.isbn | 978-303039441-7 | |
| dc.identifier.issn | 2194-5357 | |
| dc.identifier.uri | https://link.springer.com/chapter/10.1007/978-3-030-39442-4_10 | |
| dc.language.iso | es_ES | |
| dc.publisher | Springer | |
| dc.source | Advances in Intelligent Systems and Computing | |
| dc.subject | Natural language processing | |
| dc.subject | Preprocessing | |
| dc.subject | Sentiment analysis | |
| dc.subject | Text mining | |
| dc.title | A Comparative Evaluation of Preprocessing Techniques for Short Texts in Spanish | |
| dc.title.alternative | A comparative evaluation of preprocessing techniques for short texts in Spanish | |
| dc.type | Conference paper | |
| dc.ucuenca.afiliacion | Orellana, M., Universidad del Azuay, Cuenca, Ecuador | |
| dc.ucuenca.afiliacion | Trujillo, A., Universidad del Azuay, Cuenca, Ecuador | |
| dc.ucuenca.afiliacion | Cedillo, I., Universidad del Azuay, Cuenca, Ecuador; Cedillo, I., Universidad de Cuenca, Cuenca, Ecuador | |
| dc.ucuenca.areaconocimientofrascatiamplio | 5. Social Sciences | |
| dc.ucuenca.areaconocimientofrascatidetallado | 5.1.2 Special Psychology (Learning Therapy, Speech | |
| dc.ucuenca.areaconocimientofrascatiespecifico | 5.1 Psychology and Cognitive Sciences | |
| dc.ucuenca.areaconocimientounescoamplio | 03 - Social Sciences, Journalism and Information | |
| dc.ucuenca.areaconocimientounescodetallado | 0313 - Psychology | |
| dc.ucuenca.areaconocimientounescoespecifico | 031 - Social and Behavioural Sciences | |
| dc.ucuenca.comiteorganizadorconferencia | Science and Information (SAI) Organization | |
| dc.ucuenca.conferencia | Future of Information and Communication Conference (FICC) 2020 | |
| dc.ucuenca.correspondencia | Cedillo Orellana, Irene Priscila, priscila.cedillo@ucuenca.edu.ec | |
| dc.ucuenca.cuartil | Q3 | |
| dc.ucuenca.embargoend | 2050-01-12 | |
| dc.ucuenca.embargointerno | 2050-01-12 | |
| dc.ucuenca.factorimpacto | 0.184 | |
| dc.ucuenca.fechafinconferencia | 2020-03-06 | |
| dc.ucuenca.fechainicioconferencia | 2020-03-05 | |
| dc.ucuenca.idautor | 0102668209 | |
| dc.ucuenca.idautor | Sgrp-3157-2 | |
| dc.ucuenca.idautor | 0102815842 | |
| dc.ucuenca.indicebibliografico | SCOPUS | |
| dc.ucuenca.numerocitaciones | 0 | |
| dc.ucuenca.organizadorconferencia | Science and Information (SAI) Organization | |
| dc.ucuenca.pais | United States | |
| dc.ucuenca.urifuente | https://link.springer.com/book/10.1007/978-3-030-39442-4 | |
| dc.ucuenca.version | Published version | |
| dc.ucuenca.volumen | Volume 1130 | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 9ecaad85-5b06-4b92-b05c-0d89c7b10660 | |
| relation.isAuthorOfPublication.latestForDiscovery | 9ecaad85-5b06-4b92-b05c-0d89c7b10660 |
Files
Original bundle
- Name: documento.pdf
- Size: 363.43 KB
- Format: Adobe Portable Document Format
- Description: document
