DESeq2: how to construct the design matrix?
1
0
Entering edit mode
taf • 0
@taf-7651
Last seen 9.7 years ago
Germany

Hello,

I have some questions concerning the construction of design matrices within the DESeq2 package.\\
I constructed the following design matrix, which contains 3 different treatments (T1,T2,T3) in duplicates and 4 controls (ctrl).

 

designMat <- rbind( c(1,0,0,0), c(1,0,0,0), c(0,1,0,0), c(0,1,0,0), c(0,0,1,0), c(0,0,1,0), c(0,0,0,1), c(0,0,0,1), c(0,0,0,1), c(0,0,0,1) ) 
rownames(designMat) <- 1:10 
colnames(designMat) <- c("T1","T2","T3","ctrl") 
print(designMat)

 

I know that it is easily possible to convert it into a "colData" data frame, which DESeq2 "understands".:

colData <- mat.or.vec(nrow(designMat), 1)
colData <- as.data.frame(colData)
colnames(colData) <- "condition"
for(i in 1:nrow(designMat))
{
    colData[i,"condition"] <- colnames(designMat)[which(designMat[i,]==1)]
}
print(colData)

 

And, indeed, if I apply the DESeq work flow to some sampled data and I have a look at the model matrix of the DESeq2 object I get "my" design matrix.

 

 

lambda0 <- 1000
T1 <- matrix(rpois(2000,lambda=lambda0), ncol=2)
T2 <- matrix(rpois(2000,lambda=2*lambda0), ncol=2)
T3 <-  matrix(rpois(2000,lambda=4*lambda0), ncol=2)
ctrl <-  matrix(rpois(4000,lambda=0.5*lambda0), ncol=4)

data <- cbind(T1,T2,T3,ctrl)
colnames(data) <- c(rep("T1",2),rep("T2",2),rep("T3",2),rep("ctrl",4))

## create DESeq data sets and check the model matrices
dds1 <- DESeqDataSetFromMatrix(countData = data, colData =colData , design =  ~ condition  ) 
dds1 <- estimateSizeFactors(dds1)
dds1 <- estimateDispersions(dds1)
dds1 <- nbinomWaldTest(dds1)
print(attr(dds1, "modelMatrix"))

 

With the R package limma I could deal with that, too. And if I would apply model.matrix() function to my scenario I would come up with

print(model.matrix(~T1 +T2 + T3 +ctrl , designMat))

But in DESeq I get an error message, if I try to apply that formula.:

dds2 <- DESeqDataSetFromMatrix(countData = data, colData =as.data.frame(designMat) , design =  ~ T1 +T2 + T3 + ctrl ) 

My Problem is, that my design is more complex than that.
It also contains combinations of treatments T1, T2 and T3.
And I would also like to "use" these combinations within the linear model to improve my estimates through a better estimation of variance (like limma() does).

Is there any possibility to pass the design matrix directly to a DESeq2 object?
Is it possible in general to deal with designs like, e.g.:

print(rbind(designMat.f,c(1,1,0,0), c(0,1,1,0)))

in DESeq2?

Thanks in advance for your help!

Best regards,

Franziska

sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] DESeq2_1.6.1            RcppArmadillo_0.4.500.0 Rcpp_0.11.3            
[4] GenomicRanges_1.18.1    GenomeInfoDb_1.2.2      IRanges_2.0.0          
[7] S4Vectors_0.4.0         BiocGenerics_0.12.0    

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.1 BBmisc_1.8           BatchJobs_1.5       
 [4] Biobase_2.26.0       BiocParallel_1.0.0   DBI_0.3.1           
 [7] Formula_1.1-2        Hmisc_3.14-5         MASS_7.3-35         
[10] RColorBrewer_1.0-5   RSQLite_1.0.0        XML_3.98-1.1        
[13] XVector_0.6.0        acepack_1.3-3.3      annotate_1.44.0     
[16] base64enc_0.1-2      brew_1.0-6           checkmate_1.5.0     
[19] cluster_1.15.3       codetools_0.2-9      colorspace_1.2-4    
[22] digest_0.6.4         fail_1.2             foreach_1.4.2       
[25] foreign_0.8-61       genefilter_1.48.1    geneplotter_1.44.0  
[28] ggplot2_1.0.0        grid_3.1.2           gtable_0.1.2        
[31] iterators_1.0.7      lattice_0.20-29      latticeExtra_0.6-26 
[34] locfit_1.5-9.1       munsell_0.4.2        nnet_7.3-8          
[37] plyr_1.8.1           proto_0.3-10         reshape2_1.4        
[40] rpart_4.1-8          scales_0.2.4         sendmailR_1.2-1     
[43] splines_3.1.2        stringr_0.6.2        survival_2.37-7     
[46] tools_3.1.2          xtable_1.7-4       

 

 

 

 

deseq2 design matrix limma design matrix • 8.1k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 1 day ago
United States

Hi Franziska,

Maybe this solves your problem most easily: in DESeq2 v1.8 which was released this week, we provided support for users to use a custom model matrix / design matrix, which you can provide to the DESeq() function. See the top NEWS item:

http://bioconductor.org/packages/release/bioc/news/DESeq2/NEWS

You would provide the matrix to the 'full' argument of DESeq(). You need to specify betaPrior=FALSE, though because the routines which calculate the prior use the information in the design formula.

 

You suggested you want to use this design:

"print(model.matrix(~T1 +T2 + T3 +ctrl , designMat))"

If you wrap designMat in data.frame(), then you get a rank-deficient design matrix. The one you likely would supply to limma also has a + 0, to remove the intercept term. You can use this design with DESeq2 (but you need to turn off the betaPrior as well, because it doesn't make sense to shrink the betas zero in this case -- they do not represent group fold changes anymore.)

But maybe the point above will allow you to work with the design matrices you want to use.

ADD COMMENT

Login before adding your answer.

Traffic: 704 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6