Analysis of encoder representations as features using sparse autoencoders in gradient boosting and ensemble tree models

Luis Aguilar, L. Antonio Aguilar

Research output: Chapter in book/report/conference proceedings; Conference contribution; Peer-reviewed

Abstract

The performance of learning algorithms depends on factors such as the training strategy, the parameter tuning approach, and data complexity; in this setting, the extracted features play a fundamental role. Because not all features carry useful information, some add noise and thus degrade the performance of the algorithms. To address this issue, a variety of techniques such as feature extraction, feature engineering and feature selection have been developed, most of which fall into the unsupervised learning category. This study explores the generation of such features using a set of k encoder layers that produce a low-dimensional feature set F. The encoder layers were trained as part of a two-layer-deep sparse autoencoder, with PCA used to estimate a suitable number of hidden units in the first layer. Four algorithms from the gradient boosting and ensemble families were then trained on the generated features. Finally, the performance obtained with the encoder features was compared against that obtained with the original features. The results show that the reduced features make it possible to achieve equal or better results, and that the approach yields larger improvements on highly imbalanced data sets.
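
The pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation. The synthetic imbalanced data set, the 95% explained-variance rule for choosing the first hidden layer size, the halved second layer, the tanh activations, the L1 penalty used as the sparsity term, and the use of scikit-learn's `GradientBoostingClassifier` as a stand-in for the four (unnamed) algorithms are all choices made here for the example. The helper names `estimate_hidden_units` and `train_sparse_encoder` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def estimate_hidden_units(X, variance=0.95):
    """Assumed rule: number of PCA components needed to keep `variance`."""
    pca = PCA(n_components=variance, svd_solver="full").fit(X)
    return int(pca.n_components_)


def train_sparse_encoder(X, sizes, l1=1e-3, lr=0.02, epochs=300, seed=0):
    """Fit an autoencoder whose encoder has len(sizes) tanh layers and whose
    decoder mirrors it; an L1 penalty on the code encourages sparsity.
    Returns a function that maps inputs to the code (the feature set F)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    dims = [d, *sizes, *sizes[-2::-1], d]   # e.g. d -> h1 -> h2 -> h1 -> d
    n_layers, k = len(dims) - 1, len(sizes)
    Ws = [rng.normal(0.0, np.sqrt(1.0 / a), (a, b))
          for a, b in zip(dims[:-1], dims[1:])]
    bs = [np.zeros(b) for b in dims[1:]]

    for _ in range(epochs):
        # Forward pass: tanh on hidden layers, linear reconstruction layer.
        acts = [X]
        for i, (W, b) in enumerate(zip(Ws, bs)):
            z = acts[-1] @ W + b
            acts.append(z if i == n_layers - 1 else np.tanh(z))
        # Backward pass for 0.5 * mean squared error + l1 * mean |code|.
        delta = (acts[-1] - X) / n
        for i in reversed(range(n_layers)):
            grad_W, grad_b = acts[i].T @ delta, delta.sum(axis=0)
            if i > 0:
                back = delta @ Ws[i].T
                if i == k:                  # acts[k] is the code layer
                    back += l1 * np.sign(acts[i]) / n
                delta = back * (1.0 - acts[i] ** 2)
            Ws[i] -= lr * grad_W
            bs[i] -= lr * grad_b

    def encode(X_new):
        out = X_new
        for W, b in zip(Ws[:k], bs[:k]):
            out = np.tanh(out @ W + b)
        return out

    return encode


# Synthetic, imbalanced stand-in data (roughly 90% / 10% class split).
X, y = make_classification(n_samples=600, n_features=30, n_informative=8,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

h1 = estimate_hidden_units(X_tr)            # first hidden layer size via PCA
h2 = max(2, h1 // 2)                        # assumed size of the code layer
encode = train_sparse_encoder(X_tr, [h1, h2])
F_tr, F_te = encode(X_tr), encode(X_te)     # low-dimensional feature set F

# Train the same model on original and encoded features and compare.
scores = {}
for name, (A_tr, A_te) in {"original": (X_tr, X_te),
                           "encoded": (F_tr, F_te)}.items():
    clf = GradientBoostingClassifier(random_state=0).fit(A_tr, y_tr)
    scores[name] = balanced_accuracy_score(y_te, clf.predict(A_te))
    print(f"{name}: {A_tr.shape[1]} features, "
          f"balanced accuracy {scores[name]:.3f}")
```

On toy data like this, the printed scores say nothing about the results reported in the paper; the sketch only shows how the encoder output F replaces the original feature matrix as input to a tree ensemble.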

Original language: English
Title of host publication: Advances in Artificial Intelligence – IBERAMIA 2018 - 16th Ibero-American Conference on AI, Proceedings
Editors: Eduardo Fermé, Guillermo R. Simari, Flabio Gutiérrez Segura, José Antonio Rodríguez Melquiades
Publisher: Springer Verlag
Pages: 159-169
Number of pages: 11
ISBN (print): 9783030039271
DOI
Status: Published - 2018
Event: 16th Ibero-American Conference on Artificial Intelligence, IBERAMIA 2018 - Trujillo, Peru
Duration: 13 Nov 2018 to 16 Nov 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11238 LNAI
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 16th Ibero-American Conference on Artificial Intelligence, IBERAMIA 2018
Country/Territory: Peru
City: Trujillo
Period: 13/11/18 to 16/11/18
