Question

DESeq2: one condition vs multiple combined

Konstantin Okonechnikov ▴ 40

@konstantin-okonechnikov-11325

I have RNA-seq samples from 6 conditions and need to compare each condition against all the others, i.e. to find genes that are dominant only in a specific group.

I tried two approaches.

First approach: split all samples into 2 conditions (samples from one group vs. samples from all the other groups), then repeat this 6 times, each time switching which group is compared against the others.

Second approach: label each sample with its group in the condition column, i.e.

s1	g1
s2	g1
s3	g1
s4	g2
s5	g2
s6	g2
s7	g3
s8	g3
s9	g3
...	...

Then run DESeq2 and call the results function with a specific contrast, i.e.

rds <- results(dds, contrast = list( c(g1),c(g2,g3,g4,g5,g6) ) )

In general, the results were similar: the majority of the same genes were identified as dominant by both approaches. However, there were slight differences in lfcSE, p-values and q-values.

What is the most appropriate way to perform such an analysis? Are there special settings that should be applied? Is there a better method (e.g. ANOVA) for this case?

rnaseq deseq2 • 15k views

ADD COMMENT • link updated 7.1 years ago by rammohanshukla ▴ 10 • written 8.7 years ago by Konstantin Okonechnikov ▴ 40

Hi Michael and Konstantin Okonechnikov

I have a similar problem

I have 5 conditions (let's call the variable phenotype): A, B, C, D and E. One of them is the control. As in the original post in this thread, I too have to compare each condition against all the others, but before doing that I have to control for Age, PMI, pH, and Batch, each of which affects expression when tested independently.

I used the following design:

dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~Age + PMI + pH + batch + phenotype)

I believe this design will take care of all the confounding variables (Age, PMI, pH, and Batch) and will give corrected expression for the phenotype.

However, I am unable to use contrast in this case, because the program takes one of the phenotypes as the reference level and gives results relative to that reference.

The function resultsNames does not list the comparisons I want to make.
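For context, a minimal base-R sketch of why only comparisons against the reference level show up. It assumes R's default treatment coding, under which the reference level is absorbed into the intercept. The sample table `col_data` is hypothetical and uses a single covariate for brevity. DESeq2 labels its coefficients differently (for example `phenotype_B_vs_A`), but the structure of the design matrix is the same.

```r
# Hypothetical sample table: 5 phenotypes (2 samples each) and one covariate.
col_data <- data.frame(
  Age = c(60, 72, 65, 58, 80, 69, 75, 63, 70, 66),
  phenotype = factor(rep(c("A", "B", "C", "D", "E"), each = 2))
)

# Default treatment coding: the first level ("A") becomes the reference,
# so there is no separate column for it.
mm <- model.matrix(~ Age + phenotype, data = col_data)
cols_ref_a <- colnames(mm)
print(cols_ref_a)

# Changing the reference level changes which comparisons get coefficients.
col_data$phenotype <- relevel(col_data$phenotype, ref = "C")
cols_ref_c <- colnames(model.matrix(~ Age + phenotype, data = col_data))
print(cols_ref_c)
```

Each phenotype column is a difference from the reference level, so any other comparison has to be expressed as a combination of these coefficients.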

I am aware of dds$group <- factor(paste0(dds$genotype, dds$condition)), but this does not work in my case.

I also want to test various combinations, for instance A+B vs C+D, A+B+C vs D+E, etc.

How should I set up the design so that the phenotype is corrected for the confounding variables and I can specify contrasts for the various combinations?

Can you please help,

Thanks

ADD REPLY • link 7.1 years ago rammohanshukla ▴ 10

You may want to speak with a local statistician to discuss modeling. There are not individual coefficients representing these contrasts. It’s also not straightaway clear how to make a comparison like A+B vs C+D. This is why we don’t have these comparisons in the vignette. It’s important to discuss a plan with a statistician.

ADD REPLY • link 7.1 years ago Michael Love 43k

Thanks, Michael,

Sorry to bug you again. Can you tell me which design to use if I want to do something similar?

i.e rds <- results(dds, contrast = list( c(g1),c(g2,g3,g4,g5,g6) ) )

Thanks

ADD REPLY • link 7.1 years ago rammohanshukla ▴ 10

score 7 · Answer 1 · 2016-08-19

Michael Love 43k

@mikelove

Your second method, performing the contrast with list, is what I would do. Except you also need to add listValues=c(1,-1/5), so that you are contrasting the first group vs the average of the other groups. You could also specify this with a numeric contrast, e.g. contrast=c(0,1,-1/5,-1/5,-1/5,-1/5,-1/5). Whichever is easiest for you.
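To make the arithmetic of the numeric contrast concrete, here is a minimal base-R sketch. It assumes the coefficient order implied by the 7-element vector above (Intercept, then g1 through g6). The coefficient values in `beta` are made up for illustration. The order and names of coefficients can differ between DESeq2 versions and designs, so it is worth checking resultsNames(dds) before building a numeric contrast.

```r
# Hypothetical log2-scale coefficients for one gene, in the assumed order
# Intercept, g1, ..., g6 (made-up values, for illustration only).
beta <- c(Intercept = 5, g1 = 2.0, g2 = 0.5, g3 = -0.5,
          g4 = 0.0, g5 = 1.0, g6 = -1.0)

# Numeric contrast: 0 for the intercept, +1 for g1, -1/5 for each of g2..g6.
contrast <- c(0, 1, rep(-1/5, 5))

# The contrast is a weighted sum of the coefficients ...
lfc_contrast <- sum(contrast * beta)

# ... which equals g1 minus the average of the other five groups.
lfc_direct <- beta[["g1"]] - mean(beta[c("g2", "g3", "g4", "g5", "g6")])

stopifnot(isTRUE(all.equal(lfc_contrast, lfc_direct)))
print(lfc_contrast)
```

The weights on the group coefficients sum to zero, so the result is a difference between groups and does not depend on the overall expression level.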

The reason it's preferable is that the estimate of within-group variance (dispersion) will be smaller, because the different groups 2-6 have their own coefficients.
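The variance argument can be illustrated with an ordinary linear model in base R. This is only an analogy under stated assumptions: DESeq2 fits a negative binomial model and estimates dispersion differently, and the group means and noise level below are made up. The sketch compares the residual standard deviation when g2 to g6 are lumped into one group against the case where each group keeps its own coefficient.

```r
set.seed(42)

# Simulated log-expression for one gene: 6 groups, 3 replicates each,
# with different true group means (made-up values).
group <- factor(rep(paste0("g", 1:6), each = 3))
true_means <- c(g1 = 8, g2 = 4, g3 = 5, g4 = 6, g5 = 3, g6 = 7)
y <- unname(true_means[as.character(group)]) + rnorm(length(group), sd = 0.3)

# Approach 1: lump g2..g6 into a single "rest" group.
lumped <- factor(ifelse(group == "g1", "g1", "rest"))
fit_lumped <- lm(y ~ lumped)

# Approach 2: every group keeps its own coefficient.
fit_full <- lm(y ~ group)

# Residual standard deviation of each fit. In the lumped model, real
# differences between g2..g6 are counted as within-group noise.
sigma_lumped <- summary(fit_lumped)$sigma
sigma_full <- summary(fit_full)$sigma
print(c(lumped = sigma_lumped, full = sigma_full))
```

In this toy setting the lumped model reports a much larger residual spread, which is the same effect that inflates the within-group variance estimate in the first approach.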

ADD COMMENT • link 8.7 years ago Michael Love 43k

Thanks a lot for the quick and detailed reply!

ADD REPLY • link 8.7 years ago Konstantin Okonechnikov ▴ 40

Hi, Dr. Love. I am wondering what happens when I do a one-vs-multiple comparison like

rds <- results(dds, 
contrast = list( c(g1),c(g2,g3,g4,g5,g6) ), 
listValues = c(1,-1/5)
)

Does DESeq2 simply treat g2-g6 as a single group, just as in approach 1, or does it take something else into account?

If they were simply treated as a single group, I would expect too many so-called outliers or too large a variance in the "g2-g6 group", which would produce many NA p-values. But there are not many NAs in my results, so I am wondering whether DESeq2 takes something else into account.