How do you know if your DESeq2 data has a normal distribution?
ecg1g15 ▴ 30
@ecg1g15-19970
Last seen 4.1 years ago

Hi,

I am working with a set of genes across 20 samples, so I have been using the DESeq2 package. When plotting a PCA (after normalising using vat etc...), I would like to draw ellipses at a high confidence level to justify the clustering. At the moment this works very well when I assume a t-distribution or a normal distribution. However, I am unsure how my data are distributed (normal, t-distribution?). How can I find out?

Another way is to use Euclidean distance and decrease the confidence level. But is this recommended?

This is more of a conceptual question than a code question, so I am not expecting answers that reproduce code, but here is what I have used.

library(ggplot2)

# pcaData and percentVar are assumed to exist already
ggplot(pcaData, aes(x = PC1, y = PC2, color = A, shape = B)) +
  geom_point(size = 3) +
  scale_color_gradientn(colours = rainbow(10)) +
  xlab(paste0("PC1: ", percentVar[1], "% variance")) +
  ylab(paste0("PC2: ", percentVar[2], "% variance")) +
  coord_fixed() +
  stat_ellipse(type = "euclid", lty = 2, col = 1)  # each "+" must end a line, not start one
R pca ellipse normaldistribution • 1.4k views
Kevin Blighe ★ 4.0k
@kevin
Last seen 6 weeks ago
Republic of Ireland

after normalising using vat etc...

I presume that you mean vst()? Just to be clear, vst() provides a variance-stabilising transformation of your normalised count data. In a typical workflow, raw counts will be normalised via DESeq(), and then a transformation for downstream analyses is performed via rlog() or vst().
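
As a rough illustration of that workflow, here is a minimal sketch. It uses simulated counts from makeExampleDESeqDataSet() in place of your real data (the 2000 genes, 20 samples and the "condition" column are assumptions of the example, not of your experiment), and it follows the pattern of calling plotPCA() with returnData = TRUE to obtain an object like your pcaData.

```r
library(DESeq2)

set.seed(1)

# Simulated stand-in for a real count matrix (assumption: 2000 genes, 20 samples)
dds <- makeExampleDESeqDataSet(n = 2000, m = 20)

# Normalisation (size factors) and model fitting
dds <- DESeq(dds, quiet = TRUE)

# Variance-stabilising transformation for downstream analyses such as PCA
vsd <- vst(dds)

# PCA coordinates as a data frame, plus percent variance per component
pcaData <- plotPCA(vsd, intgroup = "condition", returnData = TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))

head(pcaData)
```

rlog(dds) could be swapped in for vst(dds) here; with 20 samples vst() is usually the faster of the two.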

You can check the distribution of your transformed expression data via hist(). Generally, though, using the variance stabilised expression levels is fine for most downstream analyses, including PCA and clustering via Euclidean distance. Just check for outlier samples, of course, and ensure that you have removed, for example, genes that have high missingness.
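
To make those checks concrete, the following is a sketch only, again on simulated data from makeExampleDESeqDataSet() rather than your own counts. It plots a histogram of the transformed values, a normal Q-Q plot as one additional visual check (my addition, not something you need), and then clusters the samples on Euclidean distance to help spot outliers.

```r
library(DESeq2)

set.seed(1)

# Simulated stand-in for real data (assumption: 2000 genes, 20 samples)
dds <- makeExampleDESeqDataSet(n = 2000, m = 20)
vsd <- vst(dds)

# Matrix of variance-stabilised expression values (genes x samples)
mat <- assay(vsd)

# Distribution of all transformed values
hist(mat, breaks = 50, main = "VST expression values", xlab = "vst(counts)")

# Normal Q-Q plot for the first sample as an extra visual check
qqnorm(mat[, 1])
qqline(mat[, 1])

# Euclidean distance between samples, then hierarchical clustering
sampleDists <- dist(t(mat))
hc <- hclust(sampleDists)
plot(hc, main = "Samples clustered on Euclidean distance")
```

A sample that joins the dendrogram at a much greater height than the others is a candidate outlier that is worth inspecting before interpreting the PCA.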

Kevin
