Machine Learning · Ensembles and Model Selection

Intuition behind ensembles — why many models beat one

Ensembles and Model Selection

Introduction

An ensemble is a method that combines predictions of multiple base models into a single decision. Three classical families: bagging (training models in parallel on bootstrapped samples, averaging — Breiman 1996), boosting (sequentially adding weak learners that correct previous errors — Schapire 1990, Freund & Schapire 1997), and stacking (a meta-model learns how to combine base-model predictions — Wolpert 1992). Mathematical intuition: if base-model errors are uncorrelated, their average has lower variance than a single model (central limit theorem for error). In practice ensembles reduce variance (bagging), bias (boosting), or both (stacking). Ensembles dominate Kaggle and structured tabular data (Caruana & Niculescu-Mizil 2006: empirical comparison).