Question

Higher Dimensional RNASeq Clustering Significance

James • 0

@73ef4518

Last seen 2.1 years ago

United States

Looking at the principal components of our RNASeq data, there is clear separation between the disease and control samples. However, this separation appears in the 5th principal component, which accounts for only 0.45% of the variance. There is no clear separation in the first four components, which mostly show batch separation.

How can I statistically leverage the genes associated with this PC when they aren't differentially expressed in DESeq2? I've attached an image of the plot (5th and 6th PCs of the RNASeq data).

DESeq2 RNASeq PrincipalComponent pcaExplorer • 1.1k views

ADD COMMENT • link written 2.1 years ago by James • 0

You could perform GO enrichment analysis on the genes that contribute the most to the variation along PC5, but the difference between disease and control samples is indeed very small. Have you tried performing a GSEA?
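As a rough illustration of picking the genes that contribute most to a given component, here is a minimal NumPy sketch. It assumes a samples-by-genes matrix of transformed expression values (for example variance-stabilized or log counts); the function name `top_loading_genes`, the gene names, and the random matrix are hypothetical placeholders, not anything from the original data.

```python
import numpy as np

def top_loading_genes(expr, gene_names, pc=5, n_top=10):
    """Rank genes by absolute loading on one principal component.

    expr: samples x genes matrix of transformed expression values.
    pc: 1-based index of the component of interest.
    Returns ([(gene, loading), ...], fraction of variance explained by that PC).
    """
    centered = expr - expr.mean(axis=0)  # center each gene across samples
    # Rows of vt hold the gene loadings of each principal component
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[pc - 1]
    order = np.argsort(-np.abs(loadings))[:n_top]
    var_explained = s**2 / np.sum(s**2)
    top = [(gene_names[i], float(loadings[i])) for i in order]
    return top, float(var_explained[pc - 1])

# Hypothetical stand-in data: 12 samples, 100 genes of random values
rng = np.random.default_rng(0)
expr = rng.normal(size=(12, 100))
names = [f"gene{i}" for i in range(100)]

top_genes, frac = top_loading_genes(expr, names, pc=5, n_top=10)
print(f"PC5 explains {frac:.2%} of the variance")
for gene, loading in top_genes:
    print(gene, round(loading, 3))
```

The resulting gene list could then be passed to an enrichment tool. The sign of a loading is arbitrary, which is why the ranking uses absolute values.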

ADD REPLY • link 2.1 years ago Basti ▴ 780

score 0 · Answer 1 · 2022-11-23

James W. MacDonald 67k

@james-w-macdonald-5106

Last seen 2 days ago

United States

The conventional answer is to adjust for batch and other unobserved variability in the linear model, e.g., by including a batch factor and likely additional surrogate variables (estimated with svaseq from the sva package), presuming that batch is orthogonal to your variable of interest.
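To make the idea of adjusting for batch in the linear model concrete, here is a small NumPy simulation. It is only a sketch under stated assumptions: the data are simulated log-scale values, the design is balanced so batch is orthogonal to condition, and the helper `fit_condition` is hypothetical. It does not use DESeq2 or sva; in DESeq2 the equivalent step is including batch in the design formula (e.g. `~ batch + condition`), and surrogate variables would enter as further covariate columns in the same way.

```python
import numpy as np

def fit_condition(y, condition, covariates=()):
    """Per-gene least squares fit; returns the condition effect and residual SD.

    y: samples x genes matrix of log-scale expression.
    covariates: sequence of 1-D arrays (e.g. a batch indicator) to adjust for.
    """
    n = y.shape[0]
    design = np.column_stack([np.ones(n), *covariates, condition])
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = n - design.shape[1]
    resid_sd = np.sqrt((resid**2).sum(axis=0) / dof)
    return coef[-1], resid_sd  # condition is the last design column

# Simulated data (all values are assumptions for illustration)
rng = np.random.default_rng(0)
n_genes = 200
batch = np.tile([0.0, 0.0, 0.0, 1.0, 1.0, 1.0], 2)  # each batch holds both conditions
condition = np.repeat([0.0, 1.0], 6)                 # 6 control, 6 disease

batch_effect = rng.normal(0, 2.0, n_genes)           # large batch shifts
cond_effect = np.zeros(n_genes)
cond_effect[:10] = 0.5                               # small true effect in 10 genes
noise = rng.normal(0, 0.1, (12, n_genes))
y = np.outer(batch, batch_effect) + np.outer(condition, cond_effect) + noise

est_unadj, sd_unadj = fit_condition(y, condition)
est_adj, sd_adj = fit_condition(y, condition, covariates=[batch])

print("median residual SD, no batch term:  ", np.median(sd_unadj))
print("median residual SD, with batch term:", np.median(sd_adj))
print("mean estimated effect in the 10 true genes:", est_adj[:10].mean())
```

With the batch term in the model, the batch variation no longer inflates the residual variance, so a small condition effect becomes easier to detect. This only works when batch is not confounded with condition, as the answer notes.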

ADD COMMENT • link 2.1 years ago James W. MacDonald 67k