Hi all,
I have recently been analyzing a 10X single-cell RNA-seq dataset following the workflow posted at: https://master.bioconductor.org/packages/release/workflows/vignettes/simpleSingleCell/inst/doc/work-3-tenx.html
I got an error when using makeTechTrend(), and I traced the problem to rowMeans(), which this function calls internally and which for some reason no longer works on sparse matrices. However, when I run makeTechTrend() on the PBMC 4K data, it works without problems, even though rowMeans() still fails when I apply it directly to the sparse matrix counts(sce) outside the function. Here is some reproducible code that demonstrates the problem:
    # download the PBMC file
    download.file("http://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc4k/pbmc4k_raw_gene_bc_matrices.tar.gz", "pbmc4k_raw_gene_bc_matrices.tar.gz")
    untar("pbmc4k_raw_gene_bc_matrices.tar.gz", exdir="pbmc4k")

    # make sce object
    library(DropletUtils)
    fname <- "pbmc4k/raw_gene_bc_matrices/GRCh38"
    sce <- read10xCounts(fname, col.names=TRUE)
    sce
    # class: SingleCellExperiment
    # dim: 33694 737280
    # metadata(0):
    # assays(1): counts
    # rownames(33694): ENSG00000243485 ENSG00000237613 ... ENSG00000277475 ENSG00000268674
    # rowData names(2): ID Symbol
    # colnames(737280): AAACCTGAGAAACCAT-1 AAACCTGAGAAACCGC-1 ... TTTGTCATCTTTAGTC-1 TTTGTCATCTTTCCTC-1
    # colData names(2): Sample Barcode
    # reducedDimNames(0):
    # spikeNames(0):

    class(counts(sce))
    # [1] "dgCMatrix"
    # attr(,"package")
    # [1] "Matrix"

    methods(class=class(counts(sce)))
    # it has colMeans, colSums, rowMeans and rowSums in it.

    # gets an error in this step
    colSums(counts(sce))[1:5]
    # Error in colSums(counts(sce)) :
    #   'x' must be an array of at least two dimensions
Can anyone give me some hints on this? Thanks in advance!
Meng-Chun
sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] scran_1.8.0 EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.4.0 AnnotationFilter_1.4.0
[5] GenomicFeatures_1.32.0 AnnotationDbi_1.42.0 scater_1.8.0 ggplot2_2.2.1
[9] DropletUtils_1.0.0 SingleCellExperiment_1.2.0 SummarizedExperiment_1.10.0 DelayedArray_0.6.0
[13] matrixStats_0.53.1 Biobase_2.40.0 GenomicRanges_1.32.0 GenomeInfoDb_1.16.0
[17] IRanges_2.14.0 S4Vectors_0.18.0 BiocGenerics_0.26.0 BiocParallel_1.13.1
loaded via a namespace (and not attached):
[1] ProtGenerics_1.12.0 bitops_1.0-6 bit64_0.9-7 progress_1.1.2
[5] httr_1.3.1 dynamicTreeCut_1.63-1 tools_3.5.0 irlba_2.3.2
[9] R6_2.2.2 DT_0.4 vipor_0.4.5 DBI_1.0.0
[13] lazyeval_0.2.1 colorspace_1.3-2 gridExtra_2.3 prettyunits_1.0.2
[17] bit_1.1-12 curl_3.2 compiler_3.5.0 rtracklayer_1.40.0
[21] scales_0.5.0 stringr_1.3.0 digest_0.6.15 Rsamtools_1.32.0
[25] XVector_0.20.0 pkgconfig_2.0.1 htmltools_0.3.6 limma_3.36.0
[29] htmlwidgets_1.2 rlang_0.2.0 RSQLite_2.1.0 FNN_1.1
[33] shiny_1.0.5 DelayedMatrixStats_1.2.0 bindr_0.1.1 dplyr_0.7.4
[37] RCurl_1.95-4.10 magrittr_1.5 GenomeInfoDbData_1.1.0 Matrix_1.2-14
[41] Rcpp_0.12.16 ggbeeswarm_0.6.0 munsell_0.4.3 Rhdf5lib_1.2.0
[45] viridis_0.5.1 stringi_1.1.7 yaml_2.1.19 edgeR_3.22.0
[49] zlibbioc_1.26.0 rhdf5_2.24.0 plyr_1.8.4 grid_3.5.0
[53] blob_1.1.1 promises_1.0.1 shinydashboard_0.7.0 lattice_0.20-35
[57] Biostrings_2.48.0 locfit_1.5-9.1 pillar_1.2.2 igraph_1.2.1
[61] rjson_0.2.15 reshape2_1.4.3 biomaRt_2.36.0 XML_3.98-1.11
[65] glue_1.2.0 data.table_1.11.0 httpuv_1.4.1 gtable_0.2.0
[69] assertthat_0.2.0 mime_0.5 xtable_1.8-2 later_0.7.1
[73] viridisLite_0.3.0 tibble_1.4.2 GenomicAlignments_1.16.0 beeswarm_0.2.3
[77] memoise_1.1.0 tximport_1.8.0 bindrcpp_0.2.2 statmod_1.4.30
Good call. I ran into the same issue and, after trying `base::rowSums()` with no success, was left clueless. I wonder whether Bioconductor should be updated so as to better detect sparse matrices and call the appropriate function? The current error message is not very informative.
This is under active consideration:
https://github.com/Bioconductor/MatrixGenerics/issues/2
It is not straightforward as it requires coordination with Matrix and associated packages.
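In the meantime, a possible stopgap for anyone hitting the same error is to call the Matrix methods by namespace. This is only a minimal sketch and rests on an assumption that has not been confirmed for the session above: that the unqualified `colSums()` / `rowMeans()` call ends up at the base implementation instead of the Matrix method. The toy matrix `m` is hypothetical and merely stands in for `counts(sce)`.

```r
library(Matrix)

# Hypothetical toy sparse matrix standing in for counts(sce).
m <- sparseMatrix(
    i = c(1, 3, 2, 1),
    j = c(1, 1, 2, 3),
    x = c(5, 2, 7, 1),
    dims = c(3, 3)
)
class(m)  # a "dgCMatrix", like the 10X counts

# base::colSums() only accepts ordinary arrays, so on a dgCMatrix it
# stops with the error quoted in the original post; capture the message.
base_result <- tryCatch(
    base::colSums(m),
    error = function(e) conditionMessage(e)
)
base_result

# Calling the Matrix versions explicitly avoids relying on whichever
# colSums()/rowMeans() happens to be first on the search path.
col_totals <- Matrix::colSums(m)
row_means <- Matrix::rowMeans(m)
col_totals
row_means
```

On real data the same idea would be `Matrix::colSums(counts(sce))`. Note that this only helps for calls you make yourself; it does not change which function is called inside makeTechTrend().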