Question

DESeq2 PCA different from Prcomp PCA

1

tiago211287 ▴ 10

@tiago211287-9049

Last seen 4.3 years ago

Brazil

I made a PCA using the rlog matrix from DESeq2 and got this plot, in which one of my sample groups did not cluster together.

plotPCA(rld, intgroup=c("condition"))

http://s3.postimg.org/whlalmp6b/Rplot03.png

Using the same matrix in prcomp from R, the samples cluster more tightly.

cruzi.pca <- prcomp(rldMat2,
                    center = TRUE,
                    scale. = FALSE)

library(ggbiplot)

g <- ggbiplot(pcobj = cruzi.pca, scale = 1, obs.scale = 1, var.scale = 1,
              groups = groups, ellipse = TRUE,
              circle = TRUE, var.axes = FALSE)
g <- g + scale_color_discrete(name = '')
g <- g + theme(legend.direction = 'horizontal',
               legend.position = 'top')
print(g)

http://s4.postimg.org/vdsax2drx/Rplot04.png

How can I decide which plot to use? And why does the same matrix of transformed data cluster so differently? Thank you.

deseq2 Prcomp PCA • 6.8k views

ADD COMMENT • link updated 9.0 years ago by Michael Love 43k • written 9.0 years ago by tiago211287 ▴ 10

score 3 · Accepted Answer · 2016-04-18

3

Michael Love 43k

@mikelove

Last seen 6 days ago

United States

See ?plotPCA, in particular the arguments and the note.

ADD COMMENT • link 9.0 years ago Michael Love 43k

2

Adding on to Mike's comment, the difference is most likely due to the number of genes that the DESeq2::plotPCA function uses. This number (the ntop argument) defaults to 500, so only the 500 genes with the highest variance go into the PCA, whereas your prcomp call uses all the genes in the rldMat2 object. That assumes rld and rldMat2 contain exactly the same data.
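To make the comparison concrete, here is a rough sketch of the gene selection step followed by prcomp. It is not the DESeq2 source: it uses base R only, and `mat` is a simulated stand-in (an assumption for illustration) for the genes x samples matrix you would get from `assay(rld)`. The names `mat`, `pca_top` and `pca_all` are made up for this example.

```r
# Simulated stand-in for assay(rld): genes in rows, samples in columns.
# (Hypothetical data, only here so the example runs on its own.)
set.seed(1)
mat <- matrix(rnorm(2000 * 6), nrow = 2000, ncol = 6,
              dimnames = list(paste0("gene", 1:2000), paste0("sample", 1:6)))

ntop <- 500  # the default number of genes used by DESeq2::plotPCA

# Rank genes by their variance across samples and keep the top `ntop`
rv <- apply(mat, 1, var)
select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]

# prcomp expects observations (here: samples) in rows, hence the transpose
pca_top <- prcomp(t(mat[select, ]), center = TRUE, scale. = FALSE)  # top genes only
pca_all <- prcomp(t(mat), center = TRUE, scale. = FALSE)            # all genes

# Fraction of variance explained by each component for the top-gene PCA
percent_var <- pca_top$sdev^2 / sum(pca_top$sdev^2)
```

With your real data, running prcomp on the top-variance subset should give a sample layout much closer to the plotPCA figure than running it on every gene. Note also the transpose: if rldMat2 has genes in rows, prcomp(rldMat2) treats genes as the observations rather than samples.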

ADD REPLY • link 9.0 years ago Federico Marini ▴ 180

0

Indeed, this explains the difference.

Why would I run the PCA on only 500 genes instead of all of them?

ADD REPLY • link 9.0 years ago tiago211287 ▴ 10

3

Ranking the genes by variance and making the PCA plot from only the top ones helps to make the sample groupings clearer. Of course, you can tune this parameter (ntop), but 500 is a good number for many RNA-seq datasets.
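If you want to see the effect of this parameter for yourself, something like the following toy simulation may help. Everything in it is an assumption made for illustration: the data are random noise plus a made-up group shift in the first 200 genes, and `pc1_share` is a hypothetical helper, not a DESeq2 function. With a real object you would instead pass different `ntop` values to plotPCA.

```r
# Toy data: 5000 genes x 6 samples, two groups of three samples each
set.seed(42)
n_genes <- 5000
group <- rep(c("A", "B"), each = 3)
mat <- matrix(rnorm(n_genes * 6), nrow = n_genes,
              dimnames = list(NULL, paste0(group, 1:3)))

# Hypothetical signal: the first 200 genes are shifted by 4 units in group B
mat[1:200, group == "B"] <- mat[1:200, group == "B"] + 4

# Hypothetical helper: share of total variance captured by PC1
# when only the `ntop` most variable genes are used
pc1_share <- function(m, ntop) {
  rv <- apply(m, 1, var)
  select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]
  pca <- prcomp(t(m[select, , drop = FALSE]), center = TRUE, scale. = FALSE)
  pca$sdev[1]^2 / sum(pca$sdev^2)
}

shares <- sapply(c(500, 2000, n_genes), function(k) pc1_share(mat, k))
names(shares) <- c("ntop=500", "ntop=2000", "all genes")
print(round(shares, 2))
```

In this setup the genes that carry no group signal mostly add noise, so including all of them dilutes the share of variance on the first component. Real datasets will behave differently, so it is worth trying a few values of ntop rather than relying on this sketch.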