How many components should I use if I want to show both 2D and 3D UMAP plots?
@shangguandong1996-21805

Hi,

I am using scater's runUMAP to calculate UMAP coordinates for scRNA-seq data and to draw UMAP plots.

I noticed that runUMAP uses uwot to calculate the UMAP, according to https://github.com/Alanocallaghan/scater/issues/76.

I find that the X1 and X2 coordinates obtained with n_components = 3 are not the same as the X1 and X2 coordinates obtained with n_components = 2:

> set.seed(123456)
> head(uwot::umap(iris, n_components = 3))
          [,1]      [,2]     [,3]
[1,] -8.783597 -5.717427 2.266894
[2,] -9.552045 -6.461876 3.968688
[3,] -9.005832 -6.536717 3.583495
[4,] -9.136230 -6.481047 3.739561
[5,] -8.944820 -5.986025 2.252077
[6,] -9.329567 -5.138022 1.645643
> set.seed(123456)
> head(uwot::umap(iris, n_components = 2))
           [,1]       [,2]
[1,] -10.291664 -1.2414723
[2,]  -9.610963 -3.0971170
[3,] -10.281700 -2.7837709
[4,] -10.078126 -2.9112540
[5,] -10.548049 -1.3866089
[6,] -10.114601 -0.3645576

So my question is: if I want to show both a 2D UMAP and a 3D UMAP for the same scRNA-seq data, should I calculate the UMAP separately with n_components = 2 and n_components = 3, or should I use two of the coordinates from the n_components = 3 result when plotting the 2D UMAP?

scater UMAP scRNAseq

It's to be expected that UMAP (and t-SNE) embeddings will differ greatly when created in 2D and 3D. That's because, unlike PCA, a 2D UMAP embedding is not a truncated version of a fuller UMAP: the algorithm is specifically trying to find a 2D representation that optimises its objective function (in UMAP and t-SNE, that's something like "preserving local neighbourhood structure"). When you make a 3D UMAP, you're giving it another dimension in which to find such a representation, and it's likely (nay, inevitable) that any pair of dimensions taken from the 3D embedding will be totally different from a 2D UMAP of the same data. Displaying only 2 dimensions of a 3-dimensional embedding will therefore almost inevitably omit some of the structure that is captured in the third dimension.
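A minimal sketch of that contrast, assuming the numeric columns of iris as stand-in data (as in the question) and using only `prcomp` from base R and `uwot::umap`. The variable names are illustrative and not from the original post. It checks that PCA is nested, so the 2-component result equals the first two columns of the 3-component result, and then runs the same comparison on UMAP.

```r
# Numeric columns of iris as a small stand-in dataset (an assumption;
# the original post passed the whole iris data frame to uwot).
x <- as.matrix(iris[, 1:4])

# PCA: a 2-component fit is just a truncation of the 3-component fit.
pca3 <- prcomp(x, rank. = 3)$x
pca2 <- prcomp(x, rank. = 2)$x
pca_nested <- isTRUE(all.equal(pca3[, 1:2], pca2))
print(pca_nested)  # TRUE

# UMAP: each embedding is optimised separately for its own dimensionality,
# so there is no reason for the first two columns to agree.
set.seed(123456)
umap3 <- uwot::umap(x, n_components = 3)
set.seed(123456)
umap2 <- uwot::umap(x, n_components = 2)
umap_nested <- isTRUE(all.equal(umap3[, 1:2], umap2))
print(umap_nested)
```

In the question's own output the two UMAP runs clearly disagree in their first two columns, which is the behaviour described above.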

Furthermore, I actually find the idea of creating 3+ dimensional t-SNEs and UMAPs kind of silly. The whole point of these visualisations is to make an easier-to-interpret representation of complex high-dimensional data, whether that be clusters or trajectories. If you make a 3+ dimensional UMAP then you've still got a difficult-to-interpret space.
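If both a 2D and a 3D view are wanted anyway, the reasoning above suggests computing them as two separate embeddings rather than slicing the 3D one. A rough sketch with scater follows, assuming simulated data from `mockSCE()` as a stand-in for a real dataset. The slot names `UMAP_2D` and `UMAP_3D` are arbitrary choices made for this illustration.

```r
library(scater)  # attaches SingleCellExperiment and scuttle as dependencies

# Simulated data as a placeholder for a real SingleCellExperiment.
set.seed(123456)
sce <- mockSCE()
sce <- logNormCounts(sce)

# Compute each embedding separately and store it under its own name.
set.seed(123456)
sce <- runUMAP(sce, ncomponents = 2, name = "UMAP_2D")
set.seed(123456)
sce <- runUMAP(sce, ncomponents = 3, name = "UMAP_3D")

dims_2d <- dim(reducedDim(sce, "UMAP_2D"))
dims_3d <- dim(reducedDim(sce, "UMAP_3D"))

# The 2D plot uses the embedding that was optimised for two dimensions.
p <- plotReducedDim(sce, dimred = "UMAP_2D", colour_by = "Treatment")
```

The coordinates for a 3D plot can then be taken from `reducedDim(sce, "UMAP_3D")` and passed to whichever 3D plotting tool is preferred.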


Thanks, I get it :)
