Machine Learning · Ensembles and Model Selection

Random Forest — bagging trees with feature subsampling

Introduction

Random Forest (Breiman 2001) is bagging applied to decision trees with one extra decorrelation mechanism: at each split, only a random subset of mtry features is considered as candidates, rather than all p features. This small addition makes the base trees substantially less correlated than in plain bagging, so averaging them reduces variance more and the ensemble error is typically lower. Standard defaults: mtry = √p for classification or p/3 for regression, trees grown to full depth (no pruning), and 500 trees. RF requires no feature scaling, provides an out-of-bag (OOB) error estimate at no extra cost, and yields feature importance scores; missing values can be handled too, although how (and whether) depends on the implementation. It is a strong out-of-the-box choice on many tabular benchmarks: Caruana & Niculescu-Mizil 2006 rank it among the top methods, and Fernández-Delgado et al. 2014 ("Do we need hundreds of classifiers?") find RF variants to be the best-performing family on average across a large collection of mostly UCI datasets.
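The two sources of randomness described above (a bootstrap sample per tree and a fresh feature subset per split) can be illustrated with a minimal standard-library sketch. The function names `default_mtry`, `bootstrap_indices`, and `candidate_features` are hypothetical helpers chosen for this illustration and do not come from any library; rounding √p and p/3 down to an integer is an assumption that mirrors common practice, not something stated in the text.

```python
import math
import random


def default_mtry(p, task="classification"):
    """Conventional default number of candidate features per split.

    Assumption: sqrt(p) and p/3 are rounded down, with a minimum of 1.
    """
    if task == "classification":
        return max(1, math.isqrt(p))
    return max(1, p // 3)


def bootstrap_indices(n, rng):
    """Draw n row indices with replacement.

    Returns (in_bag, oob): the sampled indices and the rows never drawn,
    which are the out-of-bag rows for this tree.
    """
    in_bag = [rng.randrange(n) for _ in range(n)]
    oob = sorted(set(range(n)) - set(in_bag))
    return in_bag, oob


def candidate_features(p, mtry, rng):
    """Pick mtry distinct feature indices out of p for a single split."""
    return rng.sample(range(p), mtry)


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed only to make the demo repeatable
    n, p = 1000, 16          # illustrative sizes, not taken from the text

    mtry = default_mtry(p)
    in_bag, oob = bootstrap_indices(n, rng)

    print("mtry:", mtry)
    print("OOB fraction for this tree:", len(oob) / n)
    # A real tree would redraw the candidate set at every split.
    print("candidates at one split:", candidate_features(p, mtry, rng))
```

Each tree leaves out roughly a third of the rows (the fraction approaches 1/e, about 0.37, as n grows), which is why the OOB estimate needs no separate validation set: every row can be scored by the trees that never saw it.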