Entering edit mode
Guest User
★
13k
@guest-user-4897
Last seen 10.4 years ago
Hello,
I have an RNAseq experiment from two strains of mice, that have been
exposed to two different environments (three biological replicates of
each combination). So my experimental setup is
> colData
genetics environment
black2 B6 black
black3 B6 black
black5 B6 black
black1 B6 yellow
black4 B6 yellow
black6 B6 yellow
agouti1 129 yellow
agouti4 129 yellow
agouti6 129 yellow
agouti2 129 black
agouti3 129 black
agouti5 129 black
I want to test which genes change their expression when the
environment changes, controlling for the genetic background. I am
using DESeq2 for this, with the formula ~ genetics + environment +
genetics:environment.
However, the choice of the base level for the factors in my
experimental design changes the genes that are significant. If I leave
the default reference levels (129 for genetics and black for env) I
get more DE genes that if I set as reference B6 for genetics and
yellow for environemnt.
I thought the reference level is used to decide how to calculate the
fold change and is necessary to correctly interpret the coefficients,
but shouldn't the results be the same?
Any thoughts on what am I missing?
Thank you very much in advance.
My code is below.
## 1 ##
> levels(colData$genetics)
[1] "129" "B6"
> levels(colData$environment)
[1] "black" "yellow"
> dds <- DESeqDataSetFromMatrix(countData = data[,8:19], colData =
colData, design = ~ genetics + environment + genetics:environment)
> dds <- DESeq(dds)
> resG <- results(dds, alpha=0.05, name="genetics_B6_vs_129")
> resE <- results(dds, alpha=0.05, name="environment_yellow_vs_black")
> resGE <- results(dds, alpha=0.05,
name="geneticsB6.environmentyellow")
> dim(subset(resG, resG$padj < 0.05))
[1] 3675 6
> dim(subset(resE, resE$padj < 0.05))
[1] 20 6
> dim(subset(resGE, resGE$padj < 0.05))
[1] 24 6
## 2 ##
> colData2 <- colData
> colData2$genetics <- relevel(colData2$genetics, ref="B6")
> colData2$environment <- relevel(colData2$environment, ref="yellow")
> levels(colData2$genetics)
[1] "B6" "129"
> levels(colData2$environment)
[1] "yellow" "black"
> dds2 <- DESeqDataSetFromMatrix(countData = data[,8:19], colData =
colData2, design = ~ genetics + environment + genetics:environment)
> dds2 <- DESeq(dds2)
> res2G <- results(dds2, alpha=0.05, name="genetics_129_vs_B6")
> res2E <- results(dds2, alpha=0.05,
name="environment_black_vs_yellow")
> res2GE <- results(dds2, alpha=0.05,
name="genetics129.environmentblack")
> dim(subset(res2G, resG$padj < 0.05))
[1] 3127 6
> dim(subset(res2E, resE$padj < 0.05))
[1] 2 6
> dim(subset(res2GE, resGE$padj < 0.05))
[1] 0 6
##
> g1 <- subset(resG, resG$padj < 0.05)
> g2 <- subset(res2G, res2G$padj < 0.05)
> length(intersect(rownames(g1), rownames(g2)))
[1] 2640
-- output of sessionInfo():
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods base
other attached packages:
[1] DESeq2_1.2.10 RcppArmadillo_0.4.320.0 Rcpp_0.11.2
[4] GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7
[7] BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0
DBI_0.2-7
[5] genefilter_1.44.0 grid_3.0.2 lattice_0.20-23
locfit_1.5-9.1
[9] RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.2
stats4_3.0.2
[13] survival_2.37-7 tools_3.0.2 XML_3.95-0.2
xtable_1.7-3
--
Sent via the guest posting facility at bioconductor.org.