Hi,
I have a question about an analysis of RNA seq data via DESeq2.
The RNA seq design is the following:
Sample experiment siRNA hormonal treatment
1 M0 sineg control
2 M1 sineg control
3 M2 sineg control
4 M0 sineg hormone
5 M1 sineg hormone
6 M2 sineg hormone
7 M0 siX control
8 M1 siX control
9 M2 siX control
10 M0 siX hormone
11 M1 siX hormone
12 M2 siX hormone
So there are two conditions (siRNA and hormonal treatment) and within each condition there are a control sample (sineg or control) and a treated sample (siX or hormone). And we realized 3 experiments (M0,M1,M2).
I would like to identify genes that respond differently to hormonal treatment in condition siX compared to the condition sineg. The design I used is the following: ~ hormonal treatment + si + hormonal treatment :si. Is it OK?
Finally I would like to take into account the experiment. In fact my experiments are paired. For example, a treatment for the experiment M0 must be compared with the control of the same experiment M0. So my question is: how do I set up the design for that?
Thanks in advance.
Severine
Hi I have a similar setup of experiments (but with experiments not being paired)
I have two genotypes (geno): "A","B", Two treatments (treat): "control" "damage"
Four replicates for each genotype/treatment combination, 16 samples in total
I want to know whether the effect of "damage" compared to "control" is different between genotypes.
So I used the following design: ~geno+treat+geno:treat
and then, following the example in the ?results section of the DESeq2 vignette, I used
results <- results(dds,name="genoB.treatdamage")
However, no DEGs could be identified (all adj. p values had the same value: 0.9999928)
This is in contrast to another approach:
I first determined DEGs between genoA_control and genoB_control (comparison 1), next I determined DEGs between genoA_damage and genoB_damage (comparison 2).
Then I made a Venny, and there were 8 DEGs (adj. pvalue < E-5) in comparison 2 that were not present in comparison 1 and I also validated this by checking read coverage of these 8 DEGs in IGV.
So, I'm wondering what I should do...
Regards
Wannes
The first procedure is a statistical test of the interaction and is the recommended test. The second is an ad-hoc procedure without FDR control on the resulting set. Sometimes intersecting FDR sets is the only option, and the resulting intersection set does not have FDR control, but here there is a statistical test that gives you the correct answer with FDR control.
Dear Prof. Love
Thank you for your reply. I guess I'll have to perform more downstream experiments whether the DEGs of the ad-hoc procedure are meaningful.
Just to give you an idea: this is a gene that is not significantly downregulated (based on FDR) according to the recommend test (test of interaction), but based on the intersection approach this is one of the few DEG that pops up (adj. p value < e-30) in comparison 2 and not in comparison 1.
'genoB.treatdamage"
"ID" "name" "log2FoldChange" "baseMean" "lfcSE" "stat" "pvalue" "padj"
X" "protein" -1.88972972182758 1366.46086631548 0.880800838900554 -2.14546766802176 0.0319154738120772 0.999992751003146
"genoA_damage vs genoB_damage" ad-hoc procedure
"ID" "name" "log2FoldChange" "baseMean" "lfcSE" "stat" "pvalue" "padj"
X protein -0.99663284 2805.129516 0.087926543 -11.33483477 8.82E-30 9.79E-26
It is a gene that is lowly expressed in control of geno A and genoB (reads 10-20) while being highly expressed upon treatment of geno A (reads 800-900) and geno B (reads 1700-1800).
Regards
Wannes
You are highlighting that the interaction is not significant while the main effect in the "damage" group is significant. There is nothing inconsistent here.
Hi Dr Love
Thanks, this is really helpful and I feel that I am almost there, but I still cannot be sure so wanted to see if I had the correct set up.
My data is very similar to Severine's, it goes something like this:
see image here of sample table
I'd like to do the following comparisons, for each day:
Currently, my design is ~ Day + CHOW + shRNA + CHOW:shRNA, but I'm not sure this is correct. Any suggestions/corrections you might have would be greatly appreciated!!
Thank you! Mao
This is a complex design and there are many choices to work through. I'd recommend working with a statistician. I just don't have time to provide non-software related support here.
Ok, will do. Thanks for the prompt reply and your continued support here!