
LSA (Latent Semantic Analysis)

wikipedia, datacamp.com

LSA assumes that words close in meaning occur in similar pieces of text (the distributional hypothesis).
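LSA seeks a low-rank approximation of the term-document matrix. In that matrix, the dot product between two term vectors (rows) gives the correlation between those terms across documents, and the dot product between two document vectors (columns) gives the correlation between those documents across terms.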
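From linear algebra, the term-document matrix can be decomposed into a product of three matrices: a diagonal matrix of singular values sitting between two matrices with orthonormal columns (in the truncated form, a tall term matrix on the left and a wide document matrix on the right). This is the singular value decomposition (SVD).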
LSA uses a BoW (bag-of-words) model, which yields a term-document matrix (rows for terms, columns for documents).
LSA learns latent topics by applying SVD to the term-document matrix (or, equivalently, to its transpose, the document-term matrix).
LSA is typically used as a dimensionality reduction or noise reduction technique.
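
The steps in these notes can be sketched end to end in a few lines. This is a minimal illustration, not a reference implementation: it assumes NumPy is available, uses a made-up four-document corpus with naive whitespace tokenization and raw counts, and the names `term_document_matrix` and `lsa` are hypothetical helpers introduced only for this example.

```python
import numpy as np

# Hypothetical toy corpus (an assumption for illustration only).
docs = [
    "cat sat on the mat",
    "dog sat on the log",
    "stocks fell on the market",
    "market stocks rose",
]


def term_document_matrix(documents):
    """Bag-of-words counts: X[i, j] = occurrences of term i in document j."""
    tokenized = [d.lower().split() for d in documents]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    X = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(tokenized):
        for t in doc:
            X[index[t], j] += 1
    return vocab, X


def lsa(X, k):
    """Truncated SVD: keep only the k largest singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


vocab, X = term_document_matrix(docs)

# Dot products between all term vectors and between all document vectors.
term_corr = X @ X.T   # shape: (terms, terms)
doc_corr = X.T @ X    # shape: (documents, documents)

# Rank-k approximation of X and the k-dimensional document representation.
k = 2
U_k, s_k, Vt_k = lsa(X, k)
X_k = U_k @ np.diag(s_k) @ Vt_k
doc_topics = (np.diag(s_k) @ Vt_k).T  # one row per document, k latent dimensions
```

Here `U_k` is the tall matrix (terms by topics), `np.diag(s_k)` is the diagonal matrix, and `Vt_k` is the wide matrix (topics by documents). Comparing rows of `doc_topics` instead of columns of `X` is what the dimensionality reduction and noise reduction use cases amount to in practice.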