Browsing by Author "Orellana Cordero, Marcos Patricio"

A Comparative evaluation of preprocessing techniques for short texts in spanish
(Springer, 2020) Orellana Cordero, Marcos Patricio; Trujillo, Andrea; Cedillo Orellana, Irene Priscila
Natural Language Processing (NLP) is used to identify key information, generate predictive models, and explain global events or trends. NLP also supports the knowledge-creation process. It is therefore important to apply refinement techniques in major stages such as preprocessing, where data is frequently produced and processed with poor results. This paper analyzes and measures the impact of combinations of preprocessing techniques and libraries on short texts written in Spanish. The techniques were applied to tweets for sentiment analysis, and the evaluation considered processing time and the characteristics of the techniques offered by each library. The experiments give readers insight for choosing an appropriate combination of preprocessing techniques. The results show an improvement of between 5% and 9% in classification performance.
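As an illustration of the kind of preprocessing chain the abstract above compares, here is a minimal sketch in plain Python. The specific steps (lowercasing, removing URLs and mentions, stripping accents, dropping stopwords), the tiny stopword list, and the function name `preprocess` are assumptions chosen for the example; the paper's actual techniques and libraries are not listed in the abstract.

```python
import re

# Tiny illustrative stopword list (an assumption); a real experiment would
# use a fuller list supplied by an NLP library.
STOPWORDS = {"de", "la", "el", "que", "y", "en", "a", "los", "un", "una", "es"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-zñ\s]")
# Map accented vowels to plain vowels while leaving "ñ" untouched.
ACCENT_MAP = str.maketrans("áéíóúü", "aeiouu")


def preprocess(tweet, remove_stopwords=True):
    """Return a list of normalized tokens for one Spanish tweet."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @user mentions
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    text = text.translate(ACCENT_MAP)   # strip accents
    text = NON_LETTER_RE.sub(" ", text) # drop punctuation, digits, emoji
    tokens = text.split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


if __name__ == "__main__":
    print(preprocess("¡Qué día tan FELIZ en Cuenca! @amigo https://t.co/abc #feliz"))
```

Toggling `remove_stopwords` is one simple way to compare two combinations of steps on the same input, which is the style of comparison the abstract describes.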
A data infrastructure for managing information obtained from ambient assisted living
(Institute of Electrical and Electronics Engineers Inc., 2019) Valdez Solis, Wilson Fernando; Cedillo Orellana, Irene Priscila; Trujillo Orellana, Andrea Alexandra; Orellana Cordero, Marcos Patricio
The Internet of Things (IoT) is a current paradigm that applies to several fields, and healthcare is one of the most important. Ambient Assisted Living (AAL) is an important subfield of healthcare. The growth in connected devices and data flow increases the service demand and network requirements of IoT systems, which overloads the Cloud. Fog Computing is a new paradigm that lightens the network load. Some applications, including healthcare applications, use Fog Computing to reduce costs and improve the performance of IoT applications, and implementing Fog Computing architectures inside such systems is a current trend. Because of the relevance of the data, a data infrastructure is clearly important in this kind of IoT system. This paper presents a Data Infrastructure for Managing Information Obtained from Ambient Assisted Living. The model describes the data flow, starting with the data read by the IoT devices and ending in the Cloud after crossing the Fog Computing layer. The model considers communication protocols, security features, and the relationships between the elements that take part in an AAL system. The model was developed using the Eclipse Modeling Tool and validated in the same tool by instantiating its main classes.
A methodology to develop an outdoor activities recommender based on air pollution variables
(Springer Science and Business Media Deutschland GmbH, 2022) Loja Arevalo, Pablo Santiago; Orellana Cordero, Marcos Patricio; Cedillo Orellana, Irene Priscila; Lima Sigua, Juan Fernando; Zambrano Martinez, Jorge Luis
The world currently faces a high level of environmental pollution. This phenomenon has become a constant challenge for society because of its negative impact on health and the increased risk of disease. In response, applications, techniques, and methodologies have been developed that relate atmospheric pollutants to one another in order to predict the state of the air. Recommendation systems, meanwhile, are used in numerous decision-making methods to find trends in various fields. This work therefore presents a methodology for a recommender system that suggests the best hours for outdoor activities according to the pollutants found in the environment. The results were verified through an evaluation and can contribute to the creation of new recommenders based on these topics.
Clasificación de artículos académicos sobre la pandemia de la COVID-19, a través de técnicas de minería de texto
(Universidad de Cuenca, 2023-01-06) Vásquez Vanegas, Bayron Fernando; Orellana Cordero, Marcos Patricio
[...] accuracy of 74%, compared with the Word2Vec and GloVe models, which reached 72% and 65% respectively, making this technique one of the best options when using semantic text representation models.
Comparando técnicas de minería de datos en un centro de emergencias
(2021) Llivisaca Largo, Brandon Iván; Gutierrez García, Bayron Stalyn; Cedillo Orellana, Irene Priscila; Orellana Cordero, Marcos Patricio; Patiño León, Andrés
Data mining techniques applied in the neuropsychology domain: a systematic review
(IEEE Xplore, 2020) Alvear Padilla, Luis Miguel; Cedillo Orellana, Irene Priscila; Patiño León, Paúl Andrés; Lima Sigua, Juan Fernando; Bueno Pacheco, Gladys Alexandra; Acosta Uriguen, María Inés; Orellana Cordero, Marcos Patricio; Cordero Machuca, David Raúl
The vast amount of data collected in the health care field allows researchers to apply different data mining techniques that may support and improve patients’ condition. The neurocognitive field represents a new area of interest in which data mining techniques could be extremely useful to diagnose and treat impairments. This systematic literature review aims to identify the most relevant data mining techniques, tools, and approaches to collect, process, and represent results used in neuropsychology for the analysis of memory and cognitive attention. The results will contribute to the development of new tools to evaluate the neuropsychological variables cited before in elderly people in the context of a normal aging process.
Detección de valores atípicos con técnicas de minería de datos y métodos estadísticos
(2020) Cedillo Orellana, Irene Priscila; Orellana Cordero, Marcos Patricio
The detection of outliers in data mining (DM) and in the process of knowledge discovery in databases (KDD) is of great interest in areas that require decision-support systems. A straightforward application is found in the financial area, where DM can potentially detect financial fraud or find errors produced by users. It is therefore essential to evaluate the veracity of the information by using methods that detect unusual behavior in the data. This paper proposes a method to detect values that are considered outliers in a database of nominal data. The method implements a global k-nearest neighbors algorithm, a clustering algorithm called k-means, and a statistical method called chi-square. These techniques were applied to a database of clients who had requested financial credit. The experiment was performed on a data set of 1180 tuples into which outliers were deliberately introduced. The results showed that the proposed method is able to detect all the introduced outliers.
Discovering patterns of time association among air pollution and meteorological variables
(Springer, 2021) Orellana Cordero, Marcos Patricio; Lima Sigua, Juan Fernando; Cedillo Orellana, Irene Priscila
Concern about air pollution has grown recently, leading environmental specialists to look for the relevant causes of this phenomenon. Several factors determine the level of pollution, but it is necessary to find behavior patterns between air pollution and meteorological variables. The relations between these variables at different hours of the day could give clues to essential patterns in their relationships. This study revealed relations among five air pollution variables and nine meteorological variables collected over one month in the city of Cuenca, Ecuador. The method evaluates the essential time associations using rolling time windows and correlations. The results were presented using visualization frames for dimensions such as time, correlation rate, and component relation, highlighting 57 strong correlations among 91 pairs of variables. The strongest positive correlation is between Ozone and UVA Radiation, and the strongest negative correlation is between Ozone and Dew Point, both throughout the day.
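The combination of rolling time windows and correlations mentioned in the abstract above can be sketched as follows. This is a minimal illustration that assumes the Pearson coefficient and a fixed-size sliding window; the abstract does not state which correlation measure or window length the study used, and the function names and sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def rolling_correlation(xs, ys, window):
    """Correlation of xs and ys over every window of `window` consecutive samples."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    if window < 2:
        raise ValueError("window must cover at least two samples")
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(len(xs) - window + 1)
    ]


if __name__ == "__main__":
    # Made-up hourly readings, for illustration only.
    ozone = [10, 20, 30, 40, 50]
    dew_point = [5, 4, 3, 2, 1]
    print(rolling_correlation(ozone, dew_point, window=3))
```

Running the same function over every pair of variables, and grouping windows by hour of day, would give the kind of time-indexed correlation table that the visualization frames summarize.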
Outlier detection with data mining techniques and statistical methods
(Institute of Electrical and Electronics Engineers Inc., 2019) Orellana Cordero, Marcos Patricio; Cedillo Orellana, Irene Priscila
Outlier detection in the field of data mining and Knowledge Discovery from Data (KDD) is attracting special interest because of its benefits. It can be applied in the financial area, where the data patterns obtained can help find possible fraud and user errors. It is therefore essential to assess the truthfulness of the information. In this context, the data auditing process uses data mining techniques that play a significant role in detecting unusual behavior. This paper proposes a method for detecting values that can be considered outliers in a nominal database. The basic idea of the method is to implement a Global k-Nearest Neighbors algorithm, a clustering algorithm named k-means, and the chi-square statistical method. The algorithms were applied to a database of loan applicants. Each test was run on a dataset of 1180 records into which outliers had been deliberately introduced. The experimental results show that the method is able to detect all the introduced values, which had been labeled beforehand so they could be distinguished. In total, 48 tuples with outliers were found across 11 nominal columns. © 2019 IEEE.
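Of the three components named in the abstract above, the global k-nearest-neighbors step is the easiest to illustrate. The sketch below scores each record by its mean distance to its k nearest neighbors. Using Hamming distance for nominal attributes is an assumption made for this example, as are the function names and the toy loan-applicant values; the abstract does not describe the paper's distance measure, and the k-means and chi-square steps are not shown.

```python
def hamming(a, b):
    """Number of attribute positions where two nominal records differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def knn_outlier_scores(rows, k):
    """Mean Hamming distance from each row to its k nearest other rows.

    Higher scores mean the row is far from its closest neighbors and is
    therefore a stronger outlier candidate.
    """
    if not 1 <= k < len(rows):
        raise ValueError("k must be between 1 and len(rows) - 1")
    scores = []
    for i, row in enumerate(rows):
        dists = sorted(
            hamming(row, other) for j, other in enumerate(rows) if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores


if __name__ == "__main__":
    # Hypothetical nominal records (marital status, housing, employment).
    applicants = [("single", "rent", "employed")] * 4 + [("married", "own", "retired")]
    print(knn_outlier_scores(applicants, k=2))
```

In practice a threshold or a top-n cut on these scores would decide which records are flagged; that choice is left open here because the abstract does not specify one.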
Reconocimiento del habla con acento español basado en un modelo acústico
(2022) Sánchez-Zhunio, Cristina; Plaza Salto, Johanna Gabriela; Zambrano Martínez, Jorge Luis; Cedillo Orellana, Irene Priscila; Orellana Cordero, Marcos Patricio; Acosta Urigüen, María Inés
The objective of the article was to generate an Automatic Speech Recognition (ASR) model, which translates human voice to text and is considered a branch of artificial intelligence. Voice analysis makes it possible to identify information about the acoustics, phonetics, syntax, and semantics of words, among other elements. It can also reveal ambiguous terms, pronunciation errors, and expressions with similar syntax but different semantics, all of which are characteristics of the language. The model focused on the acoustic analysis of words and proposed a methodology for acoustic recognition based on speech transcripts of audio recordings containing human voice. The word error rate was used to measure the accuracy of the model. The recordings were taken from the Integrated Security Service ECU911 and are emergency calls registered by that entity. The model was trained with the CMUSphinx tool for the Spanish language without an internet connection. The results showed that the word error rate varies with the number of recordings: the greater the number of recordings, the fewer the erroneous words and the greater the accuracy of the model. The investigation concluded by emphasizing the duration of each recording as a variable that affects the accuracy of the model.
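The word error rate used as the accuracy measure in the abstract above is conventionally computed as the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below follows that standard definition; the function name and the sample sentences are hypothetical, and any text normalization the authors applied before scoring is not described in the abstract.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("una" -> "la") and one deletion ("urgente").
    print(word_error_rate("necesito una ambulancia urgente", "necesito la ambulancia"))
```

Because insertions count as errors, the rate can exceed 1.0 when the hypothesis is much longer than the reference.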