Question: DESeq2 - factor base level changes LFCs -- hoping to understand this better
1
0
Entering edit mode
@victor-missirian-8804
Last seen 9.3 years ago
USA/Davis/UC Davis

Hi Michael,

    I'd like to ask a question related to a previous post: "Question: DESeq2 - factor base level changes the DE genes."

    I ran the following code on DESeq2 (version 1.4.5):

## START CODE

sampleTable <- read.delim(sample.table.file, row.names=1, stringsAsFactors=FALSE)
coldata <- DataFrame(sample=sampleTable$SampleName,
            group=factor(sampleTable$group),
            condition=factor(sampleTable$condition))

countdata <- read.delim(htseq.output.file, header=TRUE, sep="\t", quote="", comment.char="", row.names=1)


## using "ref.group" as reference level for "group" factor

dds.1 <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~ group + condition + group:condition)
dds.1$condition <- relevel(dds.1$condition, ref.condition)
dds.1$group <- relevel(dds.1$group, ref.group)
dds.1 <- DESeq(dds.1)
res.other.vs.ref.condition.effect.in.ref.group <- results(dds.1, contrast=c("condition", other.condition, ref.condition))
res.ratio.of.other.vs.ref.condition.effect.in.other.vs.ref.group <- results(dds.1, name=paste("group", other.group, ".condition", other.condition, sep=""))


## using "other.group" as reference level for "group" factor
dds.2 <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~ group + condition + group:condition)
dds.2$condition <- relevel(dds.2$condition, ref.condition)
dds.2$group <- relevel(dds.2$group, other.group)
dds.2 <- DESeq(dds.2)
res.other.vs.ref.condition.effect.in.other.group <- results(dds.2, contrast=c("condition", other.condition, ref.condition))

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] hash_3.0.1                DESeq2_1.4.5              RcppArmadillo_0.5.100.1.0
 [4] Rcpp_0.11.6               GenomicRanges_1.16.4      GenomeInfoDb_1.0.2       
 [7] IRanges_1.22.10           Biobase_2.24.0            BiocGenerics_0.10.0      
[10] pasilla_0.4.0            

loaded via a namespace (and not attached):
 [1] annotate_1.42.1      AnnotationDbi_1.26.1 DBI_0.3.1            DESeq_1.16.0        
 [5] genefilter_1.46.1    geneplotter_1.42.0   grid_3.1.0           lattice_0.20-31     
 [9] locfit_1.5-9.1       RColorBrewer_1.1-2   RSQLite_1.0.0        splines_3.1.0       
[13] stats4_3.1.0         survival_2.38-1      tools_3.1.0          XML_3.98-1.1        
[17] xtable_1.7-4         XVector_0.4.0  

## END CODE

 

And I found that the LFC of the interaction effect was not the same as the difference between the LFCs of the condition effects for the two groups:

> res.other.vs.ref.condition.effect.in.ref.group["AT5G20230","log2FoldChange"]
[1] 1.632546
> res.ratio.of.other.vs.ref.condition.effect.in.other.vs.ref.group["AT5G20230","log2FoldChange"]
[1] 2.30044
> res.other.vs.ref.condition.effect.in.other.group["AT5G20230","log2FoldChange"]
[1] 4.312705


    If possible, could you please explain the reason that this is the case?

    Do the values computed for the mean normalized read counts or dispersions depend on the reference level of the "group" factor?

    If so, is there a quick explanation that could help me to better understand how this dependence works?


Thanks,
-Victor Missirian

deseq2 • 2.2k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 3 days ago
United States

The reason is that for DESeq2 versions 1.4-1.8 the main effect terms are not shrunk, while the interaction terms are shrunk. So in comparison 1 you are adding (main + interaction) and you have (unshrunk + shrunk) while in comparison 2 you have just a main effect terms which is not shrunk.

I recommend turning off the betaPrior for designs with interactions, and in the next release (v1.10 in Oct 2015), it will automatically turn off the betaPrior. This looks like:

dds <- DESeq(dds, betaPrior=FALSE)

For more background, I wrote this other post recently on the topic:

A: Interactions in DESeq2

ADD COMMENT
0
Entering edit mode

Thanks!  I will try turning off betaPrior!

-Victor

ADD REPLY
0
Entering edit mode

In R 1.8.1 I have done a small test for a design with no interactions of the type ~line+treatment, where line has 4 different levels.

I have noticed (small) differences in the estimated pvalues and adjusted pvalues if I change names of the levels for "line" factor (so that the first level in alphabetical order that appears to be used as reference is going to change).

This is true for betaPRIOR=FALSE as well.

I was curious to know whether these differences are because of how estimation of parameters is implemented? And if yes, are they expected to be always minor?

Many thanks

ADD REPLY
0
Entering edit mode

What version of DESeq2 are you using?

ADD REPLY
0
Entering edit mode

Sorry, I meant to write DESeq2 1.8.1 - I have to say I have not (yet) tried with the current 1.12.3

ADD REPLY
0
Entering edit mode

You should only ever see very small differences (not in the first significant digits) having to do with algorithm convergence.

ADD REPLY
0
Entering edit mode

Thanks for the clarification!

ADD REPLY

Login before adding your answer.

Traffic: 476 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6