Problems when working with a SingleCellExperiment object in scater
chriad ▴ 10
@chriad-10721
Last seen 7.2 years ago

Hi,

I have a SingleCellExperiment object:

class: SingleCellExperiment
dim: 27998 3265
metadata(0):
assays(1): counts
rownames(27998): ENSMUSG00000051951 ENSMUSG00000089699 ... ENSMUSG00000096730 ENSMUSG00000095742
rowData names(2): id symbol
colnames: NULL
colData names(2): dataset barcode
reducedDimNames(0):
spikeNames(0):

and I would like to use the scater package for quality control.

When I try to use e.g. the calculateCPM function according to this tutorial: https://bioconductor.org/packages/devel/bioc/vignettes/scater/inst/doc/vignette.html

I get the following error:

> exprs(sce10x) <- log2(
+   calculateCPM(sce10x, use.size.factors = FALSE) + 1)
Error in colSums(counts_mat) :
  'x' must be an array of at least two dimensions

Other errors also turn up, e.g. with runTSNE:

> runTSNE(object = sce10x, exprs_values = "counts")
Error in matrixStats::rowVars(exprs_mat) :
  Argument 'x' must be a matrix or a vector.

The count matrix is saved as a sparse matrix:

> class(counts(sce10x))
[1] "dgTMatrix"
attr(,"package")
[1] "Matrix"

My question now is: can the scater package not yet handle this data structure, or do I have outdated or incompatible packages installed? In the latter case, how can I find out which packages I have to upgrade or downgrade? I have installed some packages with devtools::install_github and some with useDevel (i.e. development versions of Bioconductor packages). I am not experienced with resolving package conflicts and would be thankful if someone could clarify.

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Workstation release 6.9 (Santiago)

Matrix products: default
BLAS/LAPACK: /usr/prog/OpenBLAS/0.2.8-gompi-1.5.14-NX-LAPACK-3.5.0/lib/libopenblas_nehalemp-r0.2.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] profvis_0.3.3               purrr_0.2.3                 stringr_1.2.0               biomaRt_2.33.4             
 [5] igraph_1.1.2                Ckmeans.1d.dp_4.2.1         topGO_2.29.0                SparseM_1.77               
 [9] GO.db_3.4.1                 AnnotationDbi_1.38.1        graph_1.55.0                statmod_1.4.30             
[13] edgeR_3.19.6                limma_3.32.7                cellrangerRkit_1.1.0        Rmisc_1.5                  
[17] plyr_1.8.4                  lattice_0.20-35             bit64_0.9-7                 bit_1.1-12                 
[21] RColorBrewer_1.1-2          Matrix_1.2-11               scater_1.5.12               ggplot2_2.2.1              
[25] SingleCellExperiment_0.99.4 SummarizedExperiment_1.7.9  DelayedArray_0.3.20         matrixStats_0.52.2         
[29] Biobase_2.36.2              GenomicRanges_1.29.14       GenomeInfoDb_1.13.4         IRanges_2.11.17            
[33] S4Vectors_0.15.8            BiocGenerics_0.22.0        

loaded via a namespace (and not attached):
 [1] viridis_0.4.0           viridisLite_0.2.0       shiny_1.0.5             assertthat_0.2.0        blob_1.1.0             
 [6] GenomeInfoDbData_0.99.0 vipor_0.4.5             yaml_2.1.14             progress_1.1.2          RSQLite_2.0            
[11] glue_1.1.1              digest_0.6.12           XVector_0.17.1          colorspace_1.3-2        htmltools_0.3.6        
[16] httpuv_1.3.5            devtools_1.13.3         XML_3.98-1.9            pkgconfig_2.0.1         pheatmap_1.0.8         
[21] zlibbioc_1.22.0         xtable_1.8-2            scales_0.5.0            Rtsne_0.13              tibble_1.3.4           
[26] withr_2.0.0             lazyeval_0.2.0          magrittr_1.5            mime_0.5                memoise_1.1.0          
[31] beeswarm_0.2.3          shinydashboard_0.6.1    tools_3.4.1             data.table_1.10.4       prettyunits_1.0.2      
[36] munsell_0.4.3           locfit_1.5-9.1          irlba_2.2.1             bindrcpp_0.2            compiler_3.4.1         
[41] rlang_0.1.2             rhdf5_2.21.4            grid_3.4.1              RCurl_1.95-4.8          tximport_1.5.0         
[46] htmlwidgets_0.9         rjson_0.2.15            bitops_1.0-6            gtable_0.2.0            DBI_0.7                
[51] reshape2_1.4.2          R6_2.2.2                gridExtra_2.3           dplyr_0.7.3             bindr_0.1              
[56] stringi_1.1.5           ggbeeswarm_0.6.0        Rcpp_0.12.13    
scater SingleCellExperiment • 4.3k views
Aaron Lun ★ 28k
@alun
Last seen 14 minutes ago
The city by the bay

There's no problem with your installation. The issue is that the low-level methods in matrixStats do not support sparse inputs. I thought I had caught and replaced most of these calls when I refactored scater earlier in the year, but apparently not. I will purge the remainders soon. The colSums case is probably just because scater hasn't imported the colSums method from the Matrix package; this is easily fixed.
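
In the meantime, one possible workaround is to compute the CPM values by hand with the Matrix package's own colSums method, which does accept sparse inputs. This is only a sketch on a made-up toy matrix; `toy`, `lib_sizes`, `cpm` and `log2_cpm` are illustrative names, and the formula is the plain counts-per-million definition rather than scater's internal code:

```r
library(Matrix)

# Toy 3-gene x 2-cell count matrix in triplet form, standing in for
# counts(sce10x). The values are invented for illustration.
toy <- as(
  sparseMatrix(i = c(1, 2, 3, 1), j = c(1, 1, 2, 2),
               x = c(10, 30, 5, 15), dims = c(3, 2)),
  "TsparseMatrix"
)

# Matrix::colSums has a method for sparse matrices, so this does not
# hit the "'x' must be an array of at least two dimensions" error.
lib_sizes <- Matrix::colSums(toy)

# Scale each column (cell) to counts per million.
cpm <- t(t(toy) / (lib_sizes / 1e6))

# log2(cpm + 1), written with log1p so the zeros stay zero.
log2_cpm <- log1p(cpm) / log(2)
```

For the rowVars error, converting the counts to an ordinary matrix with as.matrix() should also sidestep the problem, at the cost of memory (a dense 27998 x 3265 matrix of doubles is roughly 0.7 GB).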

FYI, most functions prefer to work with dgCMatrix objects, due to the more structured format of the data. I am a bit bemused about why readMM returns a dgTMatrix when all the other documentation in the Matrix package indicates a preference towards dgCMatrix objects. I guess we should also modify read10XResults to coerce the 10X input data to the dgCMatrix format automatically.
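
Until that coercion happens automatically, it can be done by hand. A minimal sketch, again on a toy matrix (`trip` and `csc` are illustrative names), assuming the Matrix package's standard coercion to the virtual class "CsparseMatrix":

```r
library(Matrix)

# Toy triplet-format matrix standing in for counts(sce10x).
trip <- as(
  sparseMatrix(i = c(1, 3), j = c(1, 2), x = c(4, 7), dims = c(3, 2)),
  "TsparseMatrix"
)

# Coerce to compressed column-oriented storage; the values are unchanged.
csc <- as(trip, "CsparseMatrix")

# The two representations hold the same data.
stopifnot(identical(as.matrix(trip), as.matrix(csc)))

# On the real object this would presumably look like:
# counts(sce10x) <- as(counts(sce10x), "CsparseMatrix")
```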

davis ▴ 90
@davis-8868
Last seen 7.1 years ago
United Kingdom

Thanks for the bug report! I've just committed (to Bioc devel) fixes for the `colSums` issues and adjusted `read10xResults` so that it coerces 10x data to a `dgCMatrix` automatically.

We're still working through all of the other possibilities and adding tests, so the `rowVars` issue you experienced with `runTSNE` should be resolved in the next few days too. 
