
Please use this identifier to cite or link to this item: http://dspace.ucuenca.edu.ec/handle/123456789/29083
Title: Modification of the random forest algorithm to avoid statistical dependence problems when classifying remote sensing imagery
Authors: Canovas Garcia, Fulgencio Jose
Correspondence: Cánovas-García, F.; Departamento de Geología y Minas e Ingeniería Civil, Universidad Técnica Particular de Loja, San Cayetano Alto s/n, Ecuador; email: fulgencio.canovas@um.es
Keywords: Bagging
Classification
Object-Based Image Analysis
Random Forest
Statistical Independence
Issue Date: 1-Jun-2017
Embargo End Date: 1-Jan-2022
Volume: 103
Source: Computers and Geosciences
DOI: 10.1016/j.cageo.2017.02.012
Publisher: ELSEVIER LTD
Type: Article
Abstract: 
Random forest is a classification technique widely used in remote sensing. One of its advantages is that it produces an estimate of classification accuracy based on the so-called out-of-bag cross-validation method. It is usually assumed that this estimate is unbiased and may be used instead of validation based on an external data set or a cross-validation external to the algorithm. In this paper we show that this is not necessarily the case when classifying remote sensing imagery using training areas with several pixels or objects. According to our results, out-of-bag cross-validation clearly overestimates accuracy, both overall and per class. The reason is that, within a training patch, pixels or objects are not statistically independent of each other; however, bootstrapping splits them into in-bag and out-of-bag sets as if they were independent. We believe that putting each whole patch, rather than individual pixels or objects, in one set or the other would produce a less biased out-of-bag cross-validation. To deal with the problem, we propose a modification of the random forest algorithm that splits training patches instead of the pixels (or objects) that compose them. This modified algorithm does not overestimate accuracy, and its predictive capability is no lower than that of the original. When its results are validated with an external data set, the accuracy does not differ from that obtained with the original algorithm. We analysed three remote sensing images with different classification approaches (pixel-based and object-based); in all three cases, the modification we propose produces a less biased accuracy estimate.
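The patch-level split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `patch_bootstrap`, the `patch_ids` list, and the toy patch labels are all hypothetical, and only the bootstrap step for a single tree is shown, assuming each training sample carries the identifier of the patch it came from.

```python
import random
from collections import defaultdict


def patch_bootstrap(patch_ids, rng):
    """Draw one bootstrap replicate at the patch level (illustrative only).

    patch_ids[i] is the hypothetical training-patch label of sample i.
    Returns (in_bag, out_of_bag) as lists of sample indices.
    """
    # Group sample indices by the patch they belong to.
    members = defaultdict(list)
    for idx, pid in enumerate(patch_ids):
        members[pid].append(idx)

    patches = sorted(members)
    # Sample patches, not pixels/objects, with replacement.
    drawn = [rng.choice(patches) for _ in patches]

    # Every sample of a drawn patch is in-bag (repeated if the patch
    # was drawn more than once).
    in_bag = [i for pid in drawn for i in members[pid]]

    # Every sample of a patch that was never drawn is out-of-bag.
    drawn_set = set(drawn)
    out_of_bag = [i for pid in patches if pid not in drawn_set
                  for i in members[pid]]
    return in_bag, out_of_bag


# Toy data: ten samples spread over four hypothetical patches.
patch_ids = ["a", "a", "a", "b", "b", "c", "c", "c", "d", "d"]
in_bag, out_of_bag = patch_bootstrap(patch_ids, random.Random(0))
in_bag_patches = {patch_ids[i] for i in in_bag}
oob_patches = {patch_ids[i] for i in out_of_bag}
```

Because whole patches are assigned to one set or the other, no patch contributes samples to both the in-bag and out-of-bag sets of the same tree, which is the property the abstract argues for.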
URI: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85014293109&doi=10.1016%2fj.cageo.2017.02.012&partnerID=40&md5=12ff83094bce004779b84e4ae9137616
http://dspace.ucuenca.edu.ec/handle/123456789/29083
ISSN: 0098-3004
Appears in Collections: Artículos

Files in This Item:
File: documento.pdf
Size: 168.92 kB
Format: Adobe PDF


This item is protected by original copyright


