Machine Learning · Data and Preparation
Feature engineering
Introduction
Feature engineering is the deliberate construction of new variables that expose structure in the data to the model. The lesson covers classical techniques (polynomial features, interactions, ratios, bucketing), domain-specific ones (date, geography, text, and time-series features such as lags and rolling windows), per-group aggregations, frequency and target encoding with K-Fold discipline, PCA as a dimensionality reducer, automated generation (Featuretools/DFS), the curse of dimensionality, and reproducibility via the sklearn Pipeline. Guiding rule: feature engineering MUST be part of the pipeline and fit on the training data only, to prevent leakage.
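The guiding rule can be illustrated with a minimal sketch. The synthetic data, the choice of steps (polynomial features, scaling, ridge regression), and all variable names below are assumptions made for illustration, not part of the lesson. The point is only that every transformer lives inside the Pipeline, so its statistics are learned from the training split alone.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data (illustrative assumption): the target depends on an
# interaction x0 * x1 that a plain linear model cannot represent.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Feature engineering is a pipeline step, so fit() only ever sees X_train.
pipe = Pipeline([
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("scale", StandardScaler()),
    ("model", Ridge(alpha=1.0)),
])
pipe.fit(X_train, y_train)

# At scoring time the test split is only transformed, never fit on.
test_r2 = pipe.score(X_test, y_test)

# Baseline without engineered features, for comparison.
baseline = Ridge(alpha=1.0).fit(X_train, y_train)
baseline_r2 = baseline.score(X_test, y_test)
```

Because the scaler's mean and variance come from the training rows only, the same `pipe` object can be passed to cross-validation utilities, which refit it per fold and keep each validation fold out of the fitted statistics.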