DESeq2, sizeFactors in different comparisons; betaPrior=F
@elenigeorgopoulou86-8300

Hello!

 

I have 2 questions. The first one is regarding the sizeFactors.

I have the following sample groups:

1. control = no treatment: 5 biol. replicates

2. after 1h of treatment:  5 biol. replicates

3. after 2 h of the same treatment:  5 biol. replicates.

(summa summarum: 1 factor, 3 levels, 15 animals)

 

My task was to find DE genes between the groups, so I did pairwise comparisons: (i) 1 vs. 2, (ii) 1 vs. 3, and (iii) 2 vs. 3.

 

The sizeFactors were computed separately for each comparison, which means that the normalized counts for the group 1 samples can differ between, e.g., comparisons i and ii.

My question is: is this wrong? Should I instead compute the sizeFactors for all samples before testing, and not just for the compared ones?

 

 

The second question is regarding the analysis without an intercept, which I did here. Three to four weeks ago, I did not get any error message when my design was without an intercept. But today, when I repeated the analysis, I got this error message:

  betaPrior=TRUE can only be used if the design has an intercept.

  if specifying + 0 in the design formula, use betaPrior=FALSE

 

So I added the argument betaPrior=FALSE to the DESeq function and got an extended/different set of genes (e.g. under 5% FDR).

 

My 2nd question is just a sanity check: did something change in the code during this period, since my input hasn't changed and I used the same code? If so, why does betaPrior have to be FALSE when the design is without an intercept?

 

Thank you in advance!

 

Eleni 

@mikelove

1. "Should I instead compute the sizeFactors for all samples before testing, and not just for the compared ones?"

My typical recommendation would be to put all the groups into one DESeqDataSet, with a design ~condition, run DESeq() on the whole dds, and then extract comparisons like:

results(dds, contrast=c("condition","treatment1hr","control"))
results(dds, contrast=c("condition","treatment2hr","treatment1hr"))
etc.

The size factors will stay the same this way and there are more degrees of freedom for estimating the dispersion parameter.
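
To illustrate why the size factors change when they are computed per comparison, here is a minimal base R sketch of the median-of-ratios idea behind the size factors. It is an illustration on simulated counts, not DESeq2's own code: the function name `size_factors`, the simulated depths, and the sample names are all made up for this example.

```r
# Median-of-ratios size factors, written in base R for illustration only.
size_factors <- function(counts) {
  log_counts <- log(counts)
  log_geo_means <- rowMeans(log_counts)   # per-gene log geometric mean
  usable <- is.finite(log_geo_means)      # drops genes with a zero in any sample
  apply(log_counts[usable, , drop = FALSE], 2, function(col) {
    exp(median(col - log_geo_means[usable]))
  })
}

set.seed(1)
n_genes <- 200
condition <- factor(rep(c("control", "treatment1hr", "treatment2hr"), each = 5))
depth <- runif(15, 0.5, 2)                      # hypothetical sequencing depths
base_mean <- rexp(n_genes, rate = 1 / 200) + 20 # hypothetical gene means
counts <- sapply(depth, function(d) rpois(n_genes, base_mean * d))
colnames(counts) <- paste0("s", 1:15)

# Size factors from all 15 samples versus from each pairwise subset
sf_all <- size_factors(counts)
sf_1v2 <- size_factors(counts[, condition %in% c("control", "treatment1hr")])
sf_1v3 <- size_factors(counts[, condition %in% c("control", "treatment2hr")])

# The same control samples get different size factors in each setting
ctrl <- colnames(counts)[condition == "control"]
print(round(rbind(all = sf_all[ctrl],
                  sub_1v2 = sf_1v2[ctrl],
                  sub_1v3 = sf_1v3[ctrl]), 3))
```

The reference (the per-gene geometric mean) depends on which samples are in the matrix, so the control samples are scaled differently in each subset. Keeping all samples in one object gives a single reference and a single set of normalized counts.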

2. In version 1.6 (released October 2014), I added an error check, because combining the beta prior with a design without an intercept is a bad idea. I would have added this error check earlier if I had thought of the situation.

The reason it's a bad idea is that we want to shrink differences between samples. These are represented by the coefficients in the model that look like "conditionTreated", or "condition_Treated_vs_Control", when there is also a term "Intercept". However, when you remove the intercept term, the coefficients no longer represent the differences between samples, but the vector from 0 to the group mean. We do not want to shrink these terms to zero. Without an intercept, shrinkage provides no statistical benefit while adding bias, whereas shrinking differences (in models with an intercept) adds a little bias while reducing error. (See our paper for more intuition on the shrinkage of differences.)
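
The change in what the coefficients mean can be seen with an ordinary linear model in base R. This is only a sketch on made-up log-scale values (the numbers and object names are hypothetical, and `lm` stands in for the GLM that DESeq2 fits), but the parameterization works the same way.

```r
# Toy log-scale expression values for one gene in three groups (made up)
condition <- factor(rep(c("control", "treatment1hr", "treatment2hr"), each = 5))
set.seed(2)
y <- c(8, 9, 9.5)[as.integer(condition)] + rnorm(15, sd = 0.1)

fit_int   <- lm(y ~ condition)      # with intercept
fit_noint <- lm(y ~ 0 + condition)  # without intercept

# With an intercept: control mean, then differences from control
print(coef(fit_int))

# Without an intercept: one coefficient per group, equal to the group mean
print(coef(fit_noint))
print(tapply(y, condition, mean))
```

With the intercept, the non-intercept coefficients are differences that are expected to be near zero for most genes, so shrinking them toward zero is sensible. Without it, each coefficient is a group mean far from zero, and shrinking it toward zero would only distort the estimate.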


Thank you for the answers.

One additional question on #1: how is the intercept calculated in this case, and what does it mean in practice?

> resultsNames(dds)
[1] "Intercept"    "control"      "treatment1hr" "treatment2hr"

Would it be better to set the design here as ~0+condition? And when does it make sense to include the intercept, and when not?

Thanks!


The intercept allows the software to shrink the differences between the groups symmetrically towards a middle value. We've written the DESeq() function to determine the most appropriate model matrix given the user's choice of design and other arguments. Using the default settings and ~condition allows you to compare the groups and have the advantage of moderated log fold changes (see our paper for the motivation). If you prefer not to have moderation of log fold changes, you can set betaPrior=FALSE and use either ~condition or ~0+condition (these give equivalent results tables).
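
The equivalence of the two designs without shrinkage can be checked on a toy example in base R. This sketch uses a Poisson GLM from the stats package as a stand-in for DESeq2's negative binomial fit, with simulated counts for a single gene; the object names and the simulated means are hypothetical.

```r
# Simulated counts for one gene in three groups (made-up means)
condition <- factor(rep(c("control", "treatment1hr", "treatment2hr"), each = 5))
set.seed(3)
y <- rpois(15, lambda = c(100, 200, 300)[as.integer(condition)])

fit_a <- glm(y ~ condition, family = poisson())      # ~condition
fit_b <- glm(y ~ 0 + condition, family = poisson())  # ~0+condition

# treatment1hr vs control under ~condition: read off one coefficient
est_a <- unname(coef(fit_a)["conditiontreatment1hr"])
se_a  <- sqrt(vcov(fit_a)["conditiontreatment1hr", "conditiontreatment1hr"])

# The same comparison under ~0+condition: a contrast of two group coefficients
cvec  <- c(-1, 1, 0)
est_b <- sum(cvec * coef(fit_b))
se_b  <- as.numeric(sqrt(t(cvec) %*% vcov(fit_b) %*% cvec))

# Log2 fold change and standard error agree between the two designs
print(c(log2FC_a = est_a / log(2), log2FC_b = est_b / log(2)))
print(c(SE_a = se_a, SE_b = se_b))
```

Both designs describe the same three group means, so an unshrunken contrast gives the same estimate and standard error either way. The choice of parameterization only matters once a prior is placed on the coefficients.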
