DESeq2 - why genes with very large variance between reps show up as significantly changed?
1
0
Entering edit mode
ninova • 0
@ninova-23327
Last seen 4.5 years ago

I was asked to run DESeq2 on RNA-seq libraries, 2 replicas x 2 conditions (I know it is not ideal but that's how the data is). I notice that there is a small subset of about 20 genes which only showed up in one replica of one condition (and I think this is due to contamination in the second replica):

raw reads:

cond1.1 cond1.2 cond2.1 cond2.2
geneA      0       0       0    1376

after deseq normalization:

counts(dds, normalized=T)["geneA",]
cond1.1  cond1.2  cond2.1  cond2.2 
0.000    0.000    0.000 1265.967

For the rest of the genes the values are pretty consistent in each pair of replicates. I thought that this gene won't be considered significantly changed, but in fact it shows up in the result as significant. Here is how I run the analysis

baseMean log2FoldChange    lfcSE     stat       pvalue         padj    
gene A 316.49167       24.38369 4.784902 5.095965 3.469687e-07 7.367278e-06

Here's how I run the analysis, just default parameters

design<-data.frame(row.names=colnames(countsTable),condition=c(rep("cond1",2), rep("cond2",2)))
dds<-DESeqDataSetFromMatrix(countData = countsTable, colData=design, design = ~condition)
dds<-DESeq(dds)
results(dds)

I am not a statistician but biologist and I don't understand well the math behind the DESeq2. I was wondering if there is some parameter I can change in the DESeq that will help to increase the stringency of the analysis and exclude these genes, or with two replicates there is nothing that can be done?

Thank you in advance, and I apologize in advance if similar question was answered before!

deseq2 • 642 views
ADD COMMENT
0
Entering edit mode
swbarnes2 ★ 1.4k
@swbarnes2-14086
Last seen 19 hours ago
San Diego

If 50% of your samples for one condition show no counts, and 100% of the samples for the other condition have counts...on the face of it, that's a difference between the groups. It might be biologically uninteresting to you, because it's an artifact, but I don't think the software wants to make that call.

ADD COMMENT

Login before adding your answer.

Traffic: 469 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6