A Pruning of Random Forests: a diversity-based heuristic measure to simplify a random forest ensemble

Souad Taleb Zouggar, Abdelkader Adla

Abstract

Random forests are among the most successful ensemble methods. They are fast, noise-resistant, and do not suffer from overfitting. Moreover, they lend themselves to explanation and visualization. In this paper, we propose to simplify a random forest using an entropy function that measures the diversity of the trees in the forest. The function is used in two search strategies: a sequential forward selection (SFS) path and a path based on genetic algorithms (GA). The proposed methods are applied to datasets from the UCI Repository. The results are encouraging: the pruned ensembles are smaller, and their performance is similar to, or in some cases even exceeds, that of the initial forest. Moreover, a comparison of the two methods shows that in most cases SFS yields smaller ensembles than GA, whereas GA gives better success rates in the majority of cases.
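
The SFS path described above can be illustrated with a minimal sketch. The abstract does not give the authors' entropy function, so the measure used here (the mean Shannon entropy of the trees' votes on each sample) is an assumption chosen for illustration, as are the names vote_entropy, accuracy and sfs_prune, the choice of the most accurate tree as the starting point, and the fixed target size used as the stopping rule. The input preds[t][i] is taken to be the class predicted by tree t for validation sample i.

```python
import math
from collections import Counter


def vote_entropy(preds, subset):
    """Mean Shannon entropy (in bits) of the votes cast by the trees in
    `subset` on each sample. Assumed diversity measure, not the paper's."""
    n_samples = len(preds[0])
    total = 0.0
    for i in range(n_samples):
        counts = Counter(preds[t][i] for t in subset)
        for c in counts.values():
            p = c / len(subset)
            total -= p * math.log2(p)
    return total / n_samples


def accuracy(preds, y, t):
    """Fraction of samples that tree `t` classifies correctly."""
    return sum(p == label for p, label in zip(preds[t], y)) / len(y)


def sfs_prune(preds, y, size):
    """Greedy forward selection: start from the most accurate tree
    (an assumption), then repeatedly add the tree that maximizes the
    vote entropy of the selected subset, until `size` trees are chosen."""
    remaining = set(range(len(preds)))
    first = max(sorted(remaining), key=lambda t: accuracy(preds, y, t))
    selected = [first]
    remaining.remove(first)
    while len(selected) < size and remaining:
        best = max(
            sorted(remaining),
            key=lambda t: vote_entropy(preds, selected + [t]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

Maximizing diversity alone can favor inaccurate trees, so a practical criterion would probably also account for ensemble accuracy; the sketch omits this to keep the forward-selection loop easy to follow. Ties are broken in favor of the lowest tree index because candidates are scanned in sorted order.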

Article Details

How to Cite
TALEB ZOUGGAR, souad; ADLA, abdelkader. A Pruning of Random Forests: a diversity-based heuristic measure to simplify a random forest ensemble. INFOCOMP Journal of Computer Science, [S.l.], v. 18, n. 1, p. 01-08, june 2019. ISSN 1982-3363. Available at: <http://www.dcc.ufla.br/infocomp/index.php/INFOCOMP/article/view/582>. Date accessed: 16 oct. 2019.
Section
Machine Learning and Computational Intelligence