Machine Learning · Unsupervised Learning

PCA — dimensionality reduction

Introduction

Principal Component Analysis (PCA) is a linear dimensionality-reduction method introduced by Karl Pearson in 1901 and developed independently by Harold Hotelling in 1933. It projects data onto a lower-dimensional subspace that preserves as much of the original variance as possible. In this lesson you will see that PCA amounts to finding the eigenvectors of the data's covariance matrix (equivalently, computing the SVD of the centred data matrix). You will also learn why centring is required and why standardisation matters when features are measured on different scales, how to choose the number of components from the explained variance ratio, and when PCA fails: either because the data has non-linear structure or because low-variance directions carry the information that matters.
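
As a preview of these ideas, here is a minimal NumPy sketch rather than a definitive implementation. The synthetic data, the variable names, and the 95% variance threshold are all assumptions made for illustration. It centres the data, obtains the principal directions both from the covariance eigenvectors and from the SVD, and picks the number of components from the cumulative explained variance ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 200 samples, 3 features
# generated from 2 latent factors plus a little noise.
latent = rng.normal(size=(200, 2))
mixing = np.array([[2.0, 0.5, 1.0],
                   [0.0, 1.0, -0.5]])
X = latent @ mixing + 0.05 * rng.normal(size=(200, 3))

# Centre each feature. If the features were on very different scales,
# one would typically also divide by X.std(axis=0) here (standardisation).
Xc = X - X.mean(axis=0)

# Route 1: eigendecomposition of the sample covariance matrix.
cov = np.cov(Xc, rowvar=False)           # (3, 3), denominator n - 1
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # re-sort to descending variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: SVD of the centred data matrix. The rows of Vt are the same
# directions (up to sign), and S**2 / (n - 1) are the same variances.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vars = S**2 / (Xc.shape[0] - 1)

# Explained variance ratio and its running total.
explained_ratio = eigvals / eigvals.sum()
cumulative = np.cumsum(explained_ratio)

# Smallest number of components whose cumulative ratio reaches the
# threshold (0.95 is an arbitrary illustrative choice).
k = int(np.searchsorted(cumulative, 0.95) + 1)

# Project the centred data onto the first k principal directions.
Z = Xc @ eigvecs[:, :k]

print("explained variance ratio:", np.round(explained_ratio, 4))
print("components kept:", k, "projected shape:", Z.shape)
```

The two routes agree because the covariance matrix of centred data equals Xc.T @ Xc / (n - 1), so the right singular vectors of Xc are its eigenvectors. In practice the SVD route is usually preferred for numerical stability, since it avoids forming the covariance matrix explicitly.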