TF(t,d) = (count of term t in document d) / (total number of terms in d). IDF(t) = log(N / df(t)), where N = number of documents in the corpus, df(t) = number of documents containing t. TF-IDF(t,d) = TF(t,d) × IDF(t). The resulting document vectors are sparse, since most vocabulary terms do not occur in any given document, and can be used in retrieval and classification.
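The formulas above can be sketched directly in plain Python. This is a minimal illustration, not a production implementation: the function name tf_idf, the lowercase/whitespace tokenization, and the toy corpus are assumptions made for the example. Libraries such as scikit-learn use smoothed variants of IDF by default, so their numbers will differ.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one sparse {term: weight} dict per document (hypothetical helper)."""
    # Assumption: naive lowercase + whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # df(t): number of documents containing t (each document counts once).
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # TF-IDF(t,d) = (count / total) * log(N / df(t))
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

docs = ["the cat sat", "the dog barked", "the cat purred"]
vectors = tf_idf(docs)
```

Each vector stores only the terms that occur in its document, which is what makes the representation sparse. In this toy corpus "the" appears in every document, so its IDF is log(3/3) = 0 and its weight vanishes.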
Bag-of-Words weights every word by its raw count, so words like "the", "is", "and" dominate the vector despite carrying little information. TF-IDF gives lower weights to terms that appear in many documents and higher weights to rare, document-specific ones.
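A small comparison makes the contrast concrete. The sketch below, using an invented three-document corpus and whitespace tokenization (both assumptions), prints the raw Bag-of-Words count of a few terms next to their IDF.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration.
docs = [
    "the cat is on the mat",
    "the dog is in the yard",
    "the stock market fell",
]
tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

# Bag-of-Words: total occurrences of each term across the corpus.
bow = Counter(t for tokens in tokenized for t in tokens)
# Document frequency: number of documents containing each term.
df = Counter(t for tokens in tokenized for t in set(tokens))
idf = {t: math.log(n_docs / df[t]) for t in df}

for term in ("the", "is", "market"):
    print(f"{term:>7}  count={bow[term]}  idf={idf[term]:.3f}")
```

The term with the highest raw count ("the") ends up with the lowest IDF, while "market", which occurs once in a single document, gets the highest.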
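TF-IDF vectors cannot be updated incrementally without touching existing documents: each new document changes N, and therefore the IDF of every term, so previously computed weights become stale. Dynamic corpora require periodic index rebuilding or approximate methods.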
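The effect is easy to reproduce. In the sketch below (the helper name idf_table and the corpus are assumptions for the example), a document is added that shares no term with the existing ones, yet every existing IDF value still changes because N grows.

```python
import math
from collections import Counter

def idf_table(docs):
    """IDF(t) = log(N / df(t)) for every term in the corpus (hypothetical helper)."""
    tokenized = [d.split() for d in docs]
    n_docs = len(tokenized)
    df = Counter(t for tokens in tokenized for t in set(tokens))
    return {t: math.log(n_docs / df[t]) for t in df}

corpus = ["red apple", "green apple", "red car"]
before = idf_table(corpus)
# The new document has no term in common with the old ones.
after = idf_table(corpus + ["blue sky"])

# Terms whose IDF differs after the insertion.
changed = [t for t in before if not math.isclose(before[t], after[t])]
```

Here changed contains every term of the original corpus, which is why stored vectors need to be recomputed or the drift tolerated until the next rebuild.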
TF-IDF treats "car" and "automobile" as independent terms, so two documents that express the same idea in different words share no weight. For tasks requiring semantic matching (question answering, RAG), dense embeddings are usually a better choice.
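A short sketch of the vocabulary-mismatch problem, under the same assumptions as before (whitespace tokenization, unsmoothed IDF, invented corpus, hypothetical helper names tf_idf and cosine): a paraphrase with no overlapping words gets a cosine similarity of zero, while an unrelated document that happens to share one surface word scores higher.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Sparse {term: weight} vectors using TF * log(N / df)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    df = Counter(t for tokens in tokenized for t in set(tokens))
    return [
        {t: (c / len(tokens)) * math.log(n_docs / df[t])
         for t, c in Counter(tokens).items()}
        for tokens in tokenized
    ]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

docs = [
    "cheap car rental",            # query-like document
    "affordable automobile hire",  # same meaning, no shared words
    "cheap train ticket",          # different meaning, shares "cheap"
]
vecs = tf_idf(docs)
paraphrase_score = cosine(vecs[0], vecs[1])
surface_score = cosine(vecs[0], vecs[2])
```

With these inputs paraphrase_score is exactly zero because the two documents have no term in common, whereas surface_score is positive. Dense embeddings are intended to reverse that ordering by placing synonyms close together in vector space.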