I have multiple RNA-seq samples from 6 conditions and need to compare each condition against all the others, i.e. find genes that are dominant only in a specific group.
I tried two approaches.
First approach: split all samples into 2 conditions, samples from one group vs. samples from all the other groups, then repeat this 6 times, switching which group is compared against the rest.
Second approach: mark the groups within a single condition column, i.e.
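A minimal sketch of how the one-vs-rest labels for the first approach could be built in base R. The sample table (`coldata`, 6 groups with 3 replicates each) and the name `one_vs_rest` are assumptions made for illustration; the DESeq2 run itself is not shown.

```r
# Hypothetical sample table: 6 groups, 3 replicates each (illustrative names).
coldata <- data.frame(
  sample = paste0("s", 1:18),
  group  = factor(rep(paste0("g", 1:6), each = 3))
)

# For each group, build a two-level factor: that group vs. "rest".
# "rest" is placed first so that it acts as the reference level.
one_vs_rest <- lapply(levels(coldata$group), function(g) {
  factor(ifelse(coldata$group == g, g, "rest"), levels = c("rest", g))
})
names(one_vs_rest) <- levels(coldata$group)

# Each element could be used as the design variable of a separate run.
table(one_vs_rest[["g1"]])
```

Each of the six factors would feed one separate two-condition analysis, so dispersion estimates are refit every time, which may be one reason the numbers differ slightly from a single fit with a contrast.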
s1 | g1 |
s2 | g1 |
s3 | g1 |
s4 | g2 |
s5 | g2 |
s6 | g2 |
s7 | g3 |
s8 | g3 |
s9 | g3 |
... | ... |
Then run DESeq2 and call the results function with a specific contrast, i.e.
rds <- results(dds, contrast = list( c(g1),c(g2,g3,g4,g5,g6) ) )
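To make explicit what such a contrast computes, here is a sketch in base R of the weights for "g1 minus the average of the other five groups". It assumes a no-intercept design (`~ 0 + group`), so that there is one coefficient per group; the coefficient names in an actual DESeq2 fit depend on the design and version, so `resultsNames(dds)` should be checked rather than taken from this example.

```r
# Hypothetical layout: 6 groups, 3 replicates each.
group <- factor(rep(paste0("g", 1:6), each = 3))

# With no intercept, the model matrix has one column per group.
X <- model.matrix(~ 0 + group)
colnames(X)

# Weights for: g1 minus the equal-weight average of g2..g6.
w <- setNames(rep(-1 / 5, ncol(X)), colnames(X))
w["groupg1"] <- 1
w

# In DESeq2 the same idea is usually written with a list contrast plus
# listValues (assumed coefficient names, verify with resultsNames(dds)):
#   results(dds,
#           contrast   = list("groupg1", paste0("groupg", 2:6)),
#           listValues = c(1, -1 / 5))
```

Without `listValues = c(1, -1/5)` the denominator coefficients are summed rather than averaged, which changes the log fold change being tested.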
In general, the results were similar for both methods, i.e. the majority of the same genes were identified as dominant by both approaches; however, there were slight differences in lfcSE, p-values and q-values.
What is the most appropriate way to perform such an analysis? Are there special settings to apply for it? Is there a better method (e.g. ANOVA) to use in this case?
Hi Michael and Konstantin Okonechnikov,
I have a similar problem.
I have 5 conditions (let's call the variable phenotype): A, B, C, D and E. One of them is the control. As in the original post in this thread, I also have to compare each condition against all the others, but before doing that I have to control for Age, PMI, pH, and Batch, each of which affects expression when tested independently.
I used the following design:
dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~ Age + PMI + pH + batch + phenotype)
I believe this design will account for all the confounding variables (Age, PMI, pH, and Batch) and give adjusted expression estimates for the phenotype. However, I am unable to use contrast in this case, as the program takes one of the phenotypes as the reference level and reports results relative to that reference.
The resultsNames function does not list the comparisons I want to make.
I am aware of dds$group <- factor(paste0(dds$genotype, dds$condition)), but this does not work in my case.
I also want to test various combinations, for instance A+B vs C+D, A+B+C vs D+E, etc.
How should I set up the design so that the phenotype effect is adjusted for the confounding variables and I can specify contrasts for these combinations?
Can you please help?
Thanks
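For readers following the thread, here is one possible reading of "A+B vs C+D" written as a numeric contrast, sketched in base R. It assumes the comparison means the equal-weight average of the A and B effects minus that of C and D, with A as the reference level; whether that is the right quantity is a modeling question. The covariate values below are placeholders invented only so that the model matrix can be built.

```r
# Hypothetical sample table; all values are placeholders, not real data.
coldata <- data.frame(
  Age       = seq(40, 78, by = 2),
  PMI       = rep(c(10, 12, 14, 16), times = 5),
  pH        = 6 + (1:20 %% 5) / 10,
  batch     = factor(rep(c("b1", "b2"), times = 10)),
  phenotype = factor(rep(c("A", "B", "C", "D", "E"), each = 4))
)

X <- model.matrix(~ Age + PMI + pH + batch + phenotype, data = coldata)
colnames(X)

# A is the reference level, so its effect is 0 by construction:
# (A + B)/2 - (C + D)/2  =  0.5*B - 0.5*C - 0.5*D
w <- setNames(numeric(ncol(X)), colnames(X))
w["phenotypeB"] <- 0.5
w[c("phenotypeC", "phenotypeD")] <- -0.5
w

# A numeric contrast for DESeq2 needs one element per entry of
# resultsNames(dds), in the same order (names there will differ):
#   results(dds, contrast = unname(w))
```

The covariate columns get weight 0, so the comparison is between phenotype effects at equal Age, PMI, pH and batch.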
You may want to speak with a local statistician to discuss modeling. There are not individual coefficients representing these contrasts. It’s also not straightaway clear how to make a comparison like A+B vs C+D. This is why we don’t have these comparisons in the vignette. It’s important to discuss a plan with a statistician.
Thanks, Michael,
Sorry to bug you again. Can you tell me which design to use if I want to do something similar, i.e.
rds <- results(dds, contrast = list( c(g1),c(g2,g3,g4,g5,g6) ) )
Thanks