arrayQualityMetrics with DESeq2
4
0
Entering edit mode
@rachelwright8-7146
Last seen 9.6 years ago
United States

Hi experts,

Does anyone have advice on how to use arrayQualityMetrics to identify sample outliers with DESeq data sets produced using DESeq2?  I'm having trouble generating a file that the "arrayQualityMetrics" function will accept using DESeq2 functions ("DESeq" or "varianceStabilizingTransformation" or "rlog"...).  

I've tried several things, including:

dds<-DESeqDataSetFromMatrix(countData=countData, colData=colData, design=design)

 

vsd=varianceStabilizingTransformation(dds, blind=T)

arrayQualityMetrics(vsd, intgroup=c("gen", "conditon"), force=T)

The error message tells me that arrayQualityMetrics can't "find an inherited method for function ‘platformspecific’ for signature ‘"SummarizedExperiment""... the standard output of the DESeq2 functions I'm calling.

My current work-around is to use DESeq(1) to generate the variance stabilized data, calling arrayQualityMetrics on that, and proceeding with DESeq2 omitting the sample outliers.  I worry about this strategy since DESeq2 estimates dispersions differently.

library(DESeq)
cds=newCountDataSet(countData,colData)
cds = estimateDispersions(estimateSizeFactors(cds), method="blind")
Vst = varianceStabilizingTransformation(cds)
arrayQualityMetrics(Vst, intgroup = c("gen", "condition"))

Any tips?

Thanks in advance!

Rachel

sessionInfo()

R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeq_1.18.0               lattice_0.20-29            locfit_1.5-9.1            
 [4] Biobase_2.26.0             arrayQualityMetrics_3.22.0 DESeq2_1.6.3              
 [7] RcppArmadillo_0.4.600.0    Rcpp_0.11.3                GenomicRanges_1.18.3      
[10] GenomeInfoDb_1.2.4         IRanges_2.0.1              S4Vectors_0.4.0           
[13] BiocGenerics_0.12.1       

loaded via a namespace (and not attached):
 [1] acepack_1.3-3.3       affy_1.44.0           affyio_1.34.0         affyPLM_1.42.0       
 [5] annotate_1.44.0       AnnotationDbi_1.28.1  base64_1.1            base64enc_0.1-2      
 [9] BatchJobs_1.5         BBmisc_1.8            beadarray_2.16.0      BeadDataPackR_1.18.0 
[13] BiocInstaller_1.16.1  BiocParallel_1.0.0    Biostrings_2.34.1     brew_1.0-6           
[17] Cairo_1.5-5           checkmate_1.5.1       cluster_1.15.3        codetools_0.2-10     
[21] colorspace_1.2-4      DBI_0.3.1             digest_0.6.8          fail_1.2             
[25] foreach_1.4.2         foreign_0.8-62        Formula_1.1-2         gcrma_2.38.0         
[29] genefilter_1.48.1     geneplotter_1.44.0    ggplot2_1.0.0         grid_3.1.1           
[33] gridSVG_1.4-2         gtable_0.1.2          Hmisc_3.14-6          hwriter_1.3.2        
[37] illuminaio_0.8.0      iterators_1.0.7       latticeExtra_0.6-26   limma_3.22.1         
[41] MASS_7.3-37           munsell_0.4.2         nnet_7.3-8            plyr_1.8.1           
[45] preprocessCore_1.28.0 proto_0.3-10          RColorBrewer_1.1-2    reshape2_1.4.1       
[49] RJSONIO_1.3-0         rpart_4.1-8           RSQLite_1.0.0         scales_0.2.4         
[53] sendmailR_1.2-1       setRNG_2013.9-1       splines_3.1.1         stringr_0.6.2        
[57] survival_2.37-7       SVGAnnotation_0.93-1  tools_3.1.1           vsn_3.34.0           
[61] XML_3.98-1.1          xtable_1.7-4          XVector_0.6.0         zlibbioc_1.12.0   

 

arrayqualitymetrics deseq2 sample outliers • 4.7k views
ADD COMMENT
2
Entering edit mode
@mikelove
Last seen 5 hours ago
United States

The technical details of why it's not working out of the box:

Because DESeq's CountDataSet extended eSet, the arrayQualityMetrics methods worked automatically. 

DESeq2's data object DESeqDataSet extends SummarizedExperiment, which is a different class, although it's a similar "shape" of object: a matrix of values (accessed with exprs/assay) with column info (pData/colData) and row info (fData/rowData).

There was some talk on the developer list about making a simple conversion function, which would help here: 

https://stat.ethz.ch/pipermail/bioc-devel/2014-September/006249.html

I think the line of code which will work for you is the following:

e = ExpressionSet(assay(vsd), AnnotatedDataFrame(colData(as.data.frame(vsd))), AnnotatedDataFrame(as.data.frame(rowData(vsd))))

Then use 'e' for arrayQualityMetrics.

ADD COMMENT
1
Entering edit mode
@rachelwright8-7146
Last seen 9.6 years ago
United States

Thanks, Michael!  Though I still had an object class error:

#Dr. Love's way
e=ExpressionSet(assay(vsd), AnnotatedDataFrame(colData(vsd)), AnnotatedDataFrame(rowData(vsd)))

#Error in (function (classes, fdef, mtable)  : 
#            unable to find an inherited method for function ‘AnnotatedDataFrame’ for signature "DataFrame", "missing"’

but I think I fixed it by using just the colData of my vsd object (where samples, "gen", and "condition" are listed) and specifying it as a data.frame (for AnnotatedDataFrame()):

e=ExpressionSet(assay(vsd), AnnotatedDataFrame(as.data.frame(colData(vsd))))

arrayQualityMetrics(e, intgroup=c("gen", "condition"), force=T)

 

This worked and gave me the exact same results in arrayQualityMetrics as I was getting with DESeq, so that was a nice sanity check.  I didn't anticipate them to change much/at all.

Could you confirm that my solution wasn't a naughty thing to do?

Thanks again!

R

ADD COMMENT
0
Entering edit mode

Perfect. I forgot you would need to add as.data.frame. good save.

ADD REPLY
0
Entering edit mode
@wolfgang-huber-3550
Last seen 10 weeks ago
EMBL European Molecular Biology Laborat…

Dear Rachel

this looks very reasonable. And arrayQualityMetrics does not need the featureData/rowData.

We really should provide something at the project / generic infrastructure level to perform these types of object conversions (esp. SummarizedExperiment -> ExpressionSet), as they can be confusing and arcane.

Wolfgang

 

ADD COMMENT
1
Entering edit mode

The 'multi assay' group has an implementation of this; I'll put moving this into the main stream on our tasks for the next sprint.

ADD REPLY
0
Entering edit mode
@rachelwright8-7146
Last seen 9.6 years ago
United States

Thanks for your comments.  I'll keep an eye out for updates, but I appreciate all of your help developing a solution in the meantime.  

ADD COMMENT

Login before adding your answer.

Traffic: 736 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6