scran's quickCluster error
1
0
Entering edit mode
marc.tormo • 0
@marctormo-23189
Last seen 2.5 years ago
UPF (Barcelona)

Hello everyone,

I'm getting an error while using scran's quickCluster in R v4.0 that worked fine with previous R versions (example data from https://osca.bioconductor.org/normalization.html#normalization-by-deconvolution):

> library(scran)
> set.seed(100)
> clust.zeisel <- quickCluster(sce.zeisel) 
Error in if (any(above.noise)) { : missing value where TRUE/FALSE needed

Any ideas? Thanks!

> sessionInfo()
R version 4.0.0 (2020-04-24)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /gpfs42/robbyfs/homes/aplic/noarch/software/OpenBLAS/0.3.1-GCC-7.3.0-2.30/lib/libopenblas_nehalemp-r0.3.1.so

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 

locale:
 [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                 
 [3] LC_TIME=en_US.UTF-8           LC_COLLATE=en_US.UTF-8       
 [5] LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8      
 [7] LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8          
 [9] LC_ADDRESS=en_US.UTF-8        LC_TELEPHONE=en_US.UTF-8     
[11] LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets 
 [9] methods   base     

other attached packages:
 [1] org.Mm.eg.db_3.11.1         scRNAseq_2.2.0              DESeq2_1.28.1              
 [4] LoomExperiment_1.6.0        rtracklayer_1.48.0          rhdf5_2.32.0               
 [7] SingleR_1.2.2               pheatmap_1.0.12             xlsx_0.6.3                 
[10] AUCell_1.10.0               GSEABase_1.50.0             graph_1.66.0               
[13] annotate_1.66.0             XML_3.99-0.3                AnnotationDbi_1.50.0       
[16] gridExtra_2.3               edgeR_3.30.0                limma_3.44.1               
[19] Rtsne_0.15                  scran_1.16.0                scater_1.16.0              
[22] ggplot2_3.3.0               SingleCellExperiment_1.10.1 SummarizedExperiment_1.18.1
[25] DelayedArray_0.14.0         matrixStats_0.56.0          Biobase_2.48.0             
[28] GenomicRanges_1.40.0        GenomeInfoDb_1.24.0         IRanges_2.22.1             
[31] S4Vectors_0.26.0            BiocGenerics_0.34.0        

loaded via a namespace (and not attached):
 [1] ggbeeswarm_0.6.0              colorspace_1.4-1             
 [3] ellipsis_0.3.0                XVector_0.28.0               
 [5] BiocNeighbors_1.6.0           rstudioapi_0.11              
 [7] bit64_0.9-7                   interactiveDisplayBase_1.26.0
 [9] splines_4.0.0                 R.methodsS3_1.8.0            
[11] geneplotter_1.66.0            knitr_1.28                   
[13] Rsamtools_2.4.0               rJava_0.9-12                 
[15] dbplyr_1.4.3                  R.oo_1.23.0                  
[17] HDF5Array_1.16.0              shiny_1.4.0.2                
[19] BiocManager_1.30.10           compiler_4.0.0               
[21] httr_1.4.1                    dqrng_0.2.1                  
[23] assertthat_0.2.1              Matrix_1.2-18                
[25] fastmap_1.0.1                 later_1.0.0                  
[27] BiocSingular_1.4.0            htmltools_0.4.0              
[29] tools_4.0.0                   rsvd_1.0.3                   
[31] igraph_1.2.5                  gtable_0.3.0                 
[33] glue_1.4.1                    GenomeInfoDbData_1.2.3       
[35] dplyr_0.8.5                   rappdirs_0.3.1               
[37] Rcpp_1.0.4.6                  Biostrings_2.56.0            
[39] vctrs_0.3.0                   ExperimentHub_1.14.0         
[41] DelayedMatrixStats_1.10.0     stringr_1.4.0                
[43] xfun_0.13                     xlsxjars_0.6.1               
[45] mime_0.9                      lifecycle_0.2.0              
[47] irlba_2.3.3                   statmod_1.4.34               
[49] AnnotationHub_2.20.0          zlibbioc_1.34.0              
[51] scales_1.1.1                  promises_1.1.0               
[53] RColorBrewer_1.1-2            yaml_2.2.1                   
[55] curl_4.3                      memoise_1.1.0                
[57] stringi_1.4.6                 RSQLite_2.2.0                
[59] genefilter_1.70.0             BiocVersion_3.11.1           
[61] BiocParallel_1.22.0           rlang_0.4.6                  
[63] pkgconfig_2.0.3               bitops_1.0-6                 
[65] evaluate_0.14                 lattice_0.20-41              
[67] purrr_0.3.4                   Rhdf5lib_1.10.0              
[69] GenomicAlignments_1.24.0      bit_1.1-15.2                 
[71] tidyselect_1.1.0              magrittr_1.5                 
[73] R6_2.4.1                      DBI_1.1.0                    
[75] pillar_1.4.4                  withr_2.2.0                  
[77] survival_3.1-12               RCurl_1.98-1.2               
[79] tibble_3.0.1                  crayon_1.3.4                 
[81] BiocFileCache_1.12.0          rmarkdown_2.1                
[83] viridis_0.5.1                 locfit_1.5-9.4               
[85] data.table_1.12.8             blob_1.2.1                   
[87] digest_0.6.25                 xtable_1.8-4                 
[89] httpuv_1.5.2                  R.utils_2.9.2                
[91] munsell_0.5.0                 beeswarm_0.2.3               
[93] viridisLite_0.3.0             vipor_0.4.5                  
scran • 1.6k views
ADD COMMENT
0
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 8 hours ago
The city by the bay

I don't know. This works fine for me:

#--- loading ---#
library(scRNAseq)
sce.zeisel <- ZeiselBrainData()

library(scater)
sce.zeisel <- aggregateAcrossFeatures(sce.zeisel, 
    id=sub("_loc[0-9]+$", "", rownames(sce.zeisel)))

#--- gene-annotation ---#
library(org.Mm.eg.db)
rowData(sce.zeisel)$Ensembl <- mapIds(org.Mm.eg.db, 
    keys=rownames(sce.zeisel), keytype="SYMBOL", column="ENSEMBL")

#--- quality-control ---#
stats <- perCellQCMetrics(sce.zeisel, subsets=list(
    Mt=rowData(sce.zeisel)$featureType=="mito"))
qc <- quickPerCellQC(stats, percent_subsets=c("altexps_ERCC_percent", 
    "subsets_Mt_percent"))
sce.zeisel <- sce.zeisel[,!qc$discard]

library(scran)
set.seed(100)
clust.zeisel <- quickCluster(sce.zeisel) 
table(clust.zeisel)
##   1   2   3   4   5   6   7   8   9  10  11  12  13  14 
## 170 254 441 178 393 148 219 240 189 123 112 103 135 111 

Session information:

R version 4.0.0 Patched (2020-05-01 r78341)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /home/luna/Software/R/R-4-0-branch/lib/libRblas.so
LAPACK: /home/luna/Software/R/R-4-0-branch/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] scran_1.16.0                org.Mm.eg.db_3.11.1        
 [3] AnnotationDbi_1.50.0        scater_1.16.0              
 [5] ggplot2_3.3.0               scRNAseq_2.2.0             
 [7] SingleCellExperiment_1.10.1 SummarizedExperiment_1.18.1
 [9] DelayedArray_0.14.0         matrixStats_0.56.0         
[11] Biobase_2.48.0              GenomicRanges_1.40.0       
[13] GenomeInfoDb_1.24.0         IRanges_2.22.1             
[15] S4Vectors_0.26.0            BiocGenerics_0.34.0        

loaded via a namespace (and not attached):
 [1] viridis_0.5.1                 httr_1.4.1                   
 [3] edgeR_3.30.0                  BiocSingular_1.4.0           
 [5] bit64_0.9-7                   viridisLite_0.3.0            
 [7] AnnotationHub_2.20.0          DelayedMatrixStats_1.10.0    
 [9] shiny_1.4.0.2                 assertthat_0.2.1             
[11] statmod_1.4.34                interactiveDisplayBase_1.26.0
[13] BiocManager_1.30.10           BiocFileCache_1.12.0         
[15] dqrng_0.2.1                   blob_1.2.1                   
[17] GenomeInfoDbData_1.2.3        vipor_0.4.5                  
[19] yaml_2.2.1                    BiocVersion_3.11.1           
[21] pillar_1.4.4                  RSQLite_2.2.0                
[23] lattice_0.20-41               limma_3.44.1                 
[25] glue_1.4.1                    digest_0.6.25                
[27] promises_1.1.0                XVector_0.28.0               
[29] colorspace_1.4-1              htmltools_0.4.0              
[31] httpuv_1.5.2                  Matrix_1.2-18                
[33] pkgconfig_2.0.3               zlibbioc_1.34.0              
[35] purrr_0.3.4                   xtable_1.8-4                 
[37] scales_1.1.1                  later_1.0.0                  
[39] BiocParallel_1.22.0           tibble_3.0.1                 
[41] ellipsis_0.3.1                withr_2.2.0                  
[43] magrittr_1.5                  crayon_1.3.4                 
[45] mime_0.9                      memoise_1.1.0                
[47] beeswarm_0.2.3                tools_4.0.0                  
[49] lifecycle_0.2.0               locfit_1.5-9.4               
[51] munsell_0.5.0                 irlba_2.3.3                  
[53] compiler_4.0.0                rsvd_1.0.3                   
[55] rlang_0.4.6                   grid_4.0.0                   
[57] RCurl_1.98-1.2                BiocNeighbors_1.6.0          
[59] rappdirs_0.3.1                igraph_1.2.5                 
[61] bitops_1.0-6                  ExperimentHub_1.14.0         
[63] gtable_0.3.0                  DBI_1.1.0                    
[65] curl_4.3                      R6_2.4.1                     
[67] gridExtra_2.3                 dplyr_0.8.5                  
[69] fastmap_1.0.1                 bit_1.1-15.2                 
[71] ggbeeswarm_0.6.0              Rcpp_1.0.4.6                 
[73] vctrs_0.3.0                   dbplyr_1.4.3                 
[75] tidyselect_1.1.0             
ADD COMMENT
0
Entering edit mode

Thanks Aaron,

I'm trying to understand what's the problem here. I've noticed that when I'm running R with 1 or 2 available cores, the command ends without errors. When I'm working with 3 o 4 cores, I get a warning:

> clust.zeisel <- quickCluster(sce.zeisel) 
Warning message:
In (function (to_check, X, clust_centers, clust_info, dtype, nn,  :
  detected tied distances to neighbors, see ?'BiocNeighbors-ties'
> table(clust.zeisel)
clust.zeisel
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15 
178 268 168 193 163 220 227 196 234 191 205 119 193 172 210 

If there're more than 4 cores, I get the error:

> table(clust.zeisel)
Error in eval(quote(list(...)), env) : object 'clust.zeisel' not found

Does anyone have an explanation?

Many thanks!

ADD REPLY
0
Entering edit mode

It shouldn't matter how many cores you have, because:

  • you haven't told quickCluster to parallelize itself via BPPARAM=.
  • there shouldn't be any accidental parallelization either, I am quite sure about that.
  • and even if there was, all of my functions guarantee that you get the same result regardless of what parallelization scheme you're using; you shouldn't see differences in behavior.

Rather, I would suspect that your non-standard openBLAS is responding to the differences in the number of cores (you're probably on a cluster?) and this is causing errors during matrix multiplication in the IRLBA step. This kind of issue is not as uncommon as one might expect, see here.

You can check whether or not this is the case by running the above code on any other machine.

ADD REPLY
0
Entering edit mode

Hi Aaron,

Yes, I'm working on a cluster, and there was something wrong with the installation within some old machines. Reinstalling it and working with newer machines solved our problem (I don't have more information, sorry).

Thanks for your time!

ADD REPLY

Login before adding your answer.

Traffic: 626 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6