Generación de un grafo de conocimiento de periódicos antiguos del Ecuador a través de procesos OCR.

Torres Cordero, Raul Sebastian; Valdez Llivisaca, Jonnathan Andrés

Please use this identifier to cite or link to this item: http://dspace.ucuenca.edu.ec/handle/123456789/42507

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Saquicela Galarza, Víctor Hugo	-
dc.contributor.author	Torres Cordero, Raul Sebastian	-
dc.contributor.author	Valdez Llivisaca, Jonnathan Andrés	-
dc.date.accessioned	2023-07-27T15:19:27Z	-
dc.date.available	2023-07-27T15:19:27Z	-
dc.date.issued	2023-07-26	-
dc.identifier.uri	http://dspace.ucuenca.edu.ec/handle/123456789/42507	-
dc.description	History reveals the existence of a multitude of events that unfold in the world every day, leaving a mark in time. In the past, this knowledge was transmitted orally and kept alive across generations. However, the advance of technology has revolutionized the way we access information and has allowed us to explore historical records on an unprecedented scale. In this context a challenge arises: much of that information lies dormant in old newspapers, which are in a deteriorated state and difficult to handle. These newspapers contain accounts of events in the history of Ecuador in the 19th and 20th centuries, but accessing that information quickly and efficiently is difficult. To address this problem, this degree thesis proposes a solution based on text digitization, text processing, and semantic web technologies. The main objective is to extract the information from the old newspapers, organize it in a structured way, and generate a knowledge graph that represents the events that took place in Ecuador during that historical period. The proposed solution involves automating each step of the process. To achieve this, several widgets have been built in Orange that allow specific tasks to be performed at each stage of the process. These widgets work together to extract the information, identify entities and relationships, obtain Word Embeddings, and generate a knowledge graph.	en_US
dc.description.abstract	History reveals the existence of a multitude of events that unfold in the world day by day, leaving a footprint in time. In the past, this knowledge was transmitted orally and kept alive through generations. However, the advancement of technology has revolutionized the way we access information and has allowed us to explore historical records on an unprecedented scale. In this context, a challenge arises: a large portion of this valuable information lies dormant in old newspapers, which are in a state of deterioration and are difficult to handle. These newspapers contain detailed accounts of events that marked Ecuador’s history in the 19th and 20th centuries, but accessing that information quickly and efficiently has become a challenge. To address this problem, this thesis proposes a solution based on text digitization, text processing, and semantic web technologies. The main objective is to extract information from old newspapers, organize it in a structured manner, and generate a knowledge graph that represents the events that occurred in Ecuador during that historical period. As part of this solution, a prototype search engine has also been developed that uses the generated knowledge graph. This search engine is one of many ways to exploit the graph and allows users to run specific queries and searches related to historical events, people, places, and topics in the context of old newspapers. The proposed solution involves the automation of each step of the process. To achieve this, several widgets have been built in Orange, a visual data analysis platform, that allow specific tasks to be performed at each stage of the process. These widgets include text digitization tools, text processing techniques, and semantic web algorithms that work together to extract relevant information, identify entities and relationships, obtain Word Embeddings, and generate a knowledge graph enriched with historical events.	en_US
dc.description.uri	0000-0002-2438-9220	en_US
dc.format	application/pdf	en_US
dc.format.extent	79 pages	en_US
dc.language.iso	spa	en_US
dc.publisher	Universidad de Cuenca	en_US
dc.relation.ispartof	TS;308	-
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Systems Engineering	en_US
dc.subject	Ontology	en_US
dc.subject	Semantic web	en_US
dc.subject.other	CIUC::Informática::Procesamiento	en_US
dc.title	Generación de un grafo de conocimiento de periódicos antiguos del Ecuador a través de procesos OCR.	en_US
dc.type	bachelorThesis	en_US
dcterms.description	Ingeniero en Ciencias de la Computación	en_US
dcterms.spatial	Cuenca, Ecuador	en_US
dc.rights.accessRights	openAccess	en_US
Appears in Collections:	Tesis de Pregrado
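
The abstract describes a final stage in which extracted entities and relationships are assembled into a knowledge graph that a search engine can query. The following is only a minimal sketch of that idea in plain Python, not the thesis implementation: the triples, the predicate names (`type`, `mentionsPerson`, `occurredIn`), and the helper functions `build_graph` and `find_subjects` are all hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical output of an entity/relation extraction stage, expressed as
# (subject, predicate, object) triples. These values are illustrative only
# and are not taken from the thesis.
TRIPLES = [
    ("event:1", "type", "Event"),
    ("event:1", "mentionsPerson", "person:A"),
    ("event:1", "occurredIn", "place:Cuenca"),
    ("event:2", "type", "Event"),
    ("event:2", "occurredIn", "place:Quito"),
]


def build_graph(triples):
    """Index triples by subject: subject -> predicate -> set of objects."""
    graph = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        graph[subject][predicate].add(obj)
    return graph


def find_subjects(graph, predicate, obj):
    """Return the sorted subjects that have the given predicate/object pair."""
    return sorted(
        subject
        for subject, props in graph.items()
        if obj in props.get(predicate, ())
    )


graph = build_graph(TRIPLES)
events_in_cuenca = find_subjects(graph, "occurredIn", "place:Cuenca")
print(events_in_cuenca)
```

A production system would more likely store the triples in an RDF store and query them with SPARQL; the dictionary index here only shows the shape of the data and of a simple lookup.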

Files in This Item:

File	Description	Size	Format
Trabajo-de-Titulación.pdf	Submitted version (full text)	1.63 MB	Adobe PDF

This item is protected by original copyright

This item is licensed under a Creative Commons License
