How does t-SNE compare to PCA?

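PCA is a linear dimensionality reduction technique that finds the orthogonal directions of greatest variance, so it models variability based on the global structure of the data. t-SNE is a non-linear technique designed to preserve the local structure of high-dimensional data: points that are close neighbors in the original space stay close in the low-dimensional embedding.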

t-SNE is generally better suited to handling outliers. Because PCA maximizes variance, outliers are projected onto, and can strongly influence, the axes that capture the largest proportion of overall variability. t-SNE is more likely to place outliers in a neighborhood of their own, apart from regions of higher density.
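To illustrate the PCA half of this point, here is a minimal sketch using only the Python standard library. It computes the direction of the first principal axis for a small 2-D toy dataset, with and without a single extreme point. The data, the outlier position, and the helper name `first_axis_angle` are all made up for illustration, and the closed-form angle only applies to the 2-D case.

```python
import math

def first_axis_angle(points):
    """Angle in degrees between the x-axis and the first principal axis of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed form for the leading eigenvector direction of a 2x2 covariance matrix.
    return math.degrees(0.5 * math.atan2(2.0 * cov_xy, var_x - var_y))

# Hypothetical toy data: 21 points spread along the x-axis with slight vertical jitter.
inliers = [(float(x), 0.1 if x % 2 == 0 else -0.1) for x in range(-10, 11)]
outlier = (5.0, 60.0)  # one extreme point far above the rest

angle_clean = first_axis_angle(inliers)
angle_with_outlier = first_axis_angle(inliers + [outlier])

print(f"first axis without outlier:  {angle_clean:.1f} degrees")
print(f"first axis with one outlier: {angle_with_outlier:.1f} degrees")
```

In this toy case the first axis swings from roughly horizontal to nearly vertical once the outlier is added, which is the sense in which a few extreme points can dominate the leading components.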

t-SNE is the more recent technique and is often preferred over PCA for data exploration and visualization. It does require tuning hyperparameters such as perplexity and learning rate, whereas PCA needs little tuning beyond choosing the number of components, which can be done post hoc.
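To give a feel for what the perplexity setting controls, the sketch below follows the idea from the original t-SNE formulation: for each point, a Gaussian bandwidth is searched so that the perplexity of its neighbor distribution matches the value the user asked for. This is a simplified, standard-library-only illustration for a single point. The function names, the search bounds, and the toy distances are assumptions made for the example, not part of any library API.

```python
import math

def neighbor_probs(sq_dists, sigma):
    """Gaussian similarities of one point to its neighbors, normalized to sum to 1."""
    nearest = min(sq_dists)  # subtracting the minimum avoids underflow for small sigma
    weights = [math.exp(-(d - nearest) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]

def perplexity(probs):
    """Perplexity is 2 raised to the Shannon entropy (in bits) of the distribution."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy

def sigma_for_perplexity(sq_dists, target, lo=1e-3, hi=1e3, steps=60):
    """Bisection (on a log scale) for the bandwidth whose perplexity equals target."""
    for _ in range(steps):
        mid = math.sqrt(lo * hi)
        if perplexity(neighbor_probs(sq_dists, mid)) > target:
            hi = mid  # distribution too flat: shrink the bandwidth
        else:
            lo = mid  # distribution too peaked: widen the bandwidth
    return math.sqrt(lo * hi)

# Hypothetical squared distances from one point to its eight neighbors.
sq_dists = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0, 64.0]

sigma_low = sigma_for_perplexity(sq_dists, target=2.0)
sigma_high = sigma_for_perplexity(sq_dists, target=5.0)

print(f"bandwidth for perplexity 2: {sigma_low:.3f}")
print(f"bandwidth for perplexity 5: {sigma_high:.3f}")
```

A higher perplexity yields a wider bandwidth, so each point effectively takes more neighbors into account. That is why the setting shifts the embedding between emphasizing very local and somewhat broader structure.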
