DEseq2: setting up the correct contrast.
2
0
Entering edit mode
fcamus • 0
@fcamus-15207
Last seen 3.0 years ago
United Kingdom

Hi there, I have a question regarding RNASeq analysis and the DESeq2 package.

I have a 2x2 full factorial design using Drosophila:

  • 2 mating status (virgin, mated)
  • 2 diets (A and B)

I started out with looking at an overall model and using contrasts got a list of genes that were DE for each main category

dds <- DESeqDataSetFromMatrix(countData = count.fem,
                              colData = meta,
                              design = ~ diet + status + diet:status)

For simplicity here I will just focus on effects due to changing diet.
Question: is this the correct contrasts to look at genes that respond to diet in both mated and virgin flies? (Overall diet effect)

## DIET GENES
comp.d <- results( dds, contrast = c("diet", "B", "A") )
summary(comp.d)

out of 12996 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)       : 248, 1.9%
LFC < 0 (down)     : 59, 0.45%
outliers [1]       : 31, 0.24%
low counts [2]     : 2770, 21%
(mean count < 9)

Now that I have the overall effects, I wanted to do more specific contrasts. For example:

  • changing diets in virgin and mated flies separately

To do these comparisons I created another column with both factors and ran the below model:

> meta.t
       diet status treat
VF-B      B virgin   B.v
VF-B.1    B virgin   B.v
VF-B.2    B virgin   B.v
MF-B      B  mated   B.m
MF-B.1    B  mated   B.m
MF-B.2    B  mated   B.m
VF-A      A virgin   A.v
VF-A.1    A virgin   A.v
VF-A.2    A virgin   A.v
MF-A      A  mated   A.m
MF-A.1    A  mated   A.m
MF-A.2    A  mated   A.m

dds.t <- DESeqDataSetFromMatrix(countData = count.fem,
                              colData = meta.t,
                              design = ~ treat)

So far everything seems to be going OK, but when I start checking the individual comparisons I seem to be getting a lot more genes than the overall models:

## GENES THAT RESPOND TO DIET IN MATED FLIES
comparison1 <-results(dds.t, contrast = c("treat", "B.m", "A.m"))  
summary(comparison1)

> summary(comparison1)
out of 12996 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)       : 872, 6.7%
LFC < 0 (down)     : 920, 7.1%
outliers [1]       : 31, 0.24%
low counts [2]     : 3265, 25%
(mean count < 16)

## GENES THAT RESPOND TO DIET IN VIRGIN FLIES
comparison2 <-results(dds.t, contrast = c("treat", "B.v", "A.v"))  
summary(comparison2)

> summary(comparison2)
out of 12996 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)       : 248, 1.9%
LFC < 0 (down)     : 59, 0.45%
outliers [1]       : 31, 0.24%
low counts [2]     : 2770, 21%
(mean count < 9)

Question: why do I get many more genes that respond to diet when I run these specific comparisons, rather than the overall model. Also why is the overall number of genes that respond to diet the same as the the result when I only look at virgin flies?

Thanks!

DESeq2 RNASeq • 2.0k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 7 hours ago
United States

I have to limit my time on the support site to DESeq2 software related questions. Due to limited time, I can't provide statistical analysis consultation here, e.g. is this the correct design for various hypotheses, or how to interpret various contrasts. I would recommend to collaborate with a local statistician to help work through these questions. At the least, read over the Interactions section of the DESeq2 vignette for some basic guidance, but for follow-up questions I would recommend to have a consult with a statistician.

ADD COMMENT
0
Entering edit mode
swbarnes2 ★ 1.4k
@swbarnes2-14086
Last seen 14 hours ago
San Diego

Question: is this the correct contrasts to look at genes that respond to diet in both mated and virgin flies? (Overall diet effect)

No. See how you got the exact same results as you did when you compared diets between only the virgin flies?

Your first design + that first contrast will only give you the differences caused by diet in the samples which have the reference level of status, which is virgin.

If you want to know difference caused by diet in all samples, with the software understanding the status is a contributor to the variance within each diet, the design should just be ~ diet + status

ADD COMMENT
0
Entering edit mode

Many thanks for the comment! So would you suggest that I use the model ~ diet + status to get the overall effects, and then ~ diet + status + diet:status to get the list of genes with an interaction term?

thanks again!

ADD REPLY
0
Entering edit mode

Depends what contrast you set, and what exact question you are asking.

ADD REPLY

Login before adding your answer.

Traffic: 675 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6