The visual exploration of high-dimensional (HD) data has gained popularity through the use of dimensionality reduction (DR) techniques such as t-SNE and UMAP. However, the interpretability of the low-dimensional (LD) embeddings produced by these nonlinear methods remains a challenge. Conversely, linear methods such as PCA are natively interpretable but lag behind in DR quality. To circumvent this trade-off, post-hoc interpretability methods have been introduced, in which simpler models are used a posteriori to explain LD positions in terms of HD features. While these approaches can provide explanations for nonlinear DR methods without compromising DR quality, their downside is that they rely on approximations of the original LD embeddings, which can lead to misinterpretations. In this paper, we propose a novel solution to the trade-off between DR quality and interpretability: a natively interpretable version of t-SNE. The key idea is to express the coordinates of each LD point as its own linear combination of HD features and to use regularization to promote local coherence of these linear combination weights across the embedding. Experimental results demonstrate the effectiveness of our method in preserving HD structures while providing LD embeddings that are interpretable by design.
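The key idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact form of the regularizer, so the pairwise penalty below, the affinity matrix `A`, and the function names `embed` and `coherence_penalty` are assumptions made for illustration only. Each point i gets its own weight matrix W_i, its LD position is y_i = W_i x_i, and the penalty grows when points treated as neighbours have dissimilar weights.

```python
import numpy as np

def embed(X, W):
    """LD coordinates y_i = W_i @ x_i, with one weight matrix per point.

    X: (n, D) HD data; W: (n, d, D) per-point weights; returns (n, d).
    """
    return np.einsum("ndk,nk->nd", W, X)

def coherence_penalty(W, A):
    """Hypothetical regularizer: sum_ij A_ij * ||W_i - W_j||_F^2.

    A: (n, n) non-negative affinities (assumed; e.g. neighbourhood weights).
    """
    diff = W[:, None, :, :] - W[None, :, :, :]   # (n, n, d, D) pairwise differences
    sq_dist = (diff ** 2).sum(axis=(2, 3))       # (n, n) squared Frobenius norms
    return float((A * sq_dist).sum())

rng = np.random.default_rng(0)
n, D, d = 6, 4, 2
X = rng.normal(size=(n, D))

# Identical weights for every point: the map reduces to one global linear projection.
W_shared = np.tile(rng.normal(size=(1, d, D)), (n, 1, 1))
# Unconstrained weights: every point has an unrelated linear combination.
W_free = rng.normal(size=(n, d, D))

A = np.ones((n, n)) - np.eye(n)  # toy affinities: all distinct pairs are neighbours
Y = embed(X, W_shared)
```

With `W_shared` the penalty is zero and the embedding is a single linear map, as in PCA-like methods; with `W_free` the penalty is positive. In a full method, such a term would presumably be added to the t-SNE cost and minimized over the weights, so that nearby points share similar, readable feature weights while the embedding as a whole stays nonlinear.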
Couplet, E., Lambert, P., Verleysen, M., Mulders, D., Lee, J., & De Bodt, C. (2023). Natively Interpretable t-SNE. Proceedings of AIMLAI workshop, 1(1), 1-16. https://hdl.handle.net/2078.5/100421 (Original work published 2023)