Differential expression analysis of a specific list of genes
JoannaF ▴ 10
@joannaf-9881
Last seen 3.4 years ago
France

Hello,

I analysed RNA-seq data (432 samples) and obtained the differential expression statistics with DESeq2 (via results(DESeq(dds)), with dds created by DESeqDataSetFromHTSeqCount).

We are interested in a list of about 60 specific genes among the 19,000 protein-coding genes studied.

How can I run a differential expression analysis specific to the 60 genes of interest? Do I have to adjust the p-values?

Thanks a lot in advance for your answer!

DESeq2 • 1.8k views
swbarnes2 ★ 1.4k
@swbarnes2-14086
Last seen 1 hour ago
San Diego

Just do things the right way: process all the genes, then subset the results to the 60 genes you care about.
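A minimal sketch of that workflow, assuming DESeq2 is installed. The toy dataset from makeExampleDESeqDataSet() only stands in for the real dds, and genes_of_interest is a hypothetical vector of gene IDs that must match the row names of the results table:

```r
library(DESeq2)

set.seed(1)
# Toy dataset standing in for the real dds (assumption: yours comes from
# DESeqDataSetFromHTSeqCount and has ~19,000 genes).
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Fit the model on ALL genes, so that dispersion estimation and the
# multiple-testing adjustment use the full dataset.
dds <- DESeq(dds)
res <- results(dds)

# Hypothetical list of genes of interest; IDs must match rownames(res).
genes_of_interest <- paste0("gene", 1:60)

# Subset only at the very end, keeping the genome-wide padj column.
res_subset <- res[rownames(res) %in% genes_of_interest, ]
res_subset[order(res_subset$pvalue), c("log2FoldChange", "pvalue", "padj")]
```

Note that padj can be NA for genes removed by the independent filtering that results() applies by default.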


Thanks for your answer! So I don't have to adjust the p-values for the ~60 genes? I have seen in this old post, "DEseq2 with limited gene set", that it is advisable to correct the p-values, but is this always the case?


The post seems to agree with what I said: process all the genes, then subset at the very end. I would not alter the adjusted p-value calculation; recomputing it over only the 60 genes will make your p-values look better than they should.


Whether to readjust the p-values after subsetting to the 60 genes will, I think, depend on whether these 60 genes were known a priori or were found through the analysis of this data.

@JoannaF: If you assembled all these data together in order to analyze only these 60 genes, then I think it is OK to readjust the p-values after you pull out the results. If you only discovered this 60-gene subset here, then no.
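To make the two options concrete, here is a rough base-R sketch that compares the genome-wide adjustment with one recomputed over the 60 genes only. The p-values are simulated and the gene names are hypothetical; with real data you would use the pvalue column of the DESeq2 results instead. Benjamini-Hochberg is used because it is the default adjustment in results().

```r
set.seed(1)

# Simulated raw p-values for 19,000 genes (assumption: placeholder for res$pvalue).
n_genes <- 19000
pvals <- runif(n_genes)
names(pvals) <- paste0("gene", seq_len(n_genes))

# Hypothetical a priori list of 60 genes.
genes_of_interest <- paste0("gene", 1:60)

# Option 1: adjust over all genes, then pull out the 60.
padj_all <- p.adjust(pvals, method = "BH")[genes_of_interest]

# Option 2: pull out the 60 raw p-values, then adjust over those 60 only.
padj_60 <- p.adjust(pvals[genes_of_interest], method = "BH")

comparison <- data.frame(
  pvalue         = pvals[genes_of_interest],
  padj_all_genes = padj_all,
  padj_60_genes  = padj_60
)
head(comparison[order(comparison$pvalue), ])
```

Which column is appropriate to report depends on the point above: the second is only defensible if the gene list was fixed before looking at the data.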

If you are using these 60 genes as a representation of a biological pathway and you want to assess the statistical significance of its activity across your comparison, then you should instead turn to one of the "standard" modes of gene set enrichment / over-representation analysis.


Thanks a lot @swbarnes2 and @Steve Lianoglou for your answers!

These 60 genes are known a priori (before the differential gene expression analysis) but they are not the only goal of this analysis.

We have already done Gene Set Enrichment Analysis. What did you mean by "over-representation analysis"?

Thanks again!


The type of analysis that is done when you use goseq, for instance, falls under the class of analyses I'm calling "over-representation analysis".
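As a rough illustration of the idea (not of goseq itself, which additionally corrects for gene-length bias), an over-representation test asks whether the gene set overlaps the list of differentially expressed genes more than expected by chance. The sketch below uses a plain hypergeometric test in base R; the universe, the gene set, and the DE list are all simulated placeholders:

```r
set.seed(1)

# Hypothetical universe of tested genes and an a priori gene set of 60.
universe <- paste0("gene", 1:19000)
gene_set <- universe[1:60]

# Placeholder for the genes called differentially expressed (e.g. padj < 0.05).
de_genes <- sample(universe, 500)

# Number of gene-set members among the DE genes.
overlap <- length(intersect(de_genes, gene_set))

# Hypergeometric upper tail: P(X >= overlap) when drawing length(de_genes)
# genes from a universe containing length(gene_set) "successes".
p_ora <- phyper(
  overlap - 1,
  m = length(gene_set),
  n = length(universe) - length(gene_set),
  k = length(de_genes),
  lower.tail = FALSE
)
p_ora
```

A small p_ora would suggest that the gene set is enriched among the DE genes; with the random DE list used here, no enrichment is expected.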
