Segmentation Fault with development versions of R, Bioconductor, and DropletUtils when processing output of CellRanger 3.0.2
@matthew-thornton-5564
Last seen 12 weeks ago
USA, Los Angeles, USC

Hello!

I have the output of the latest CellRanger software, 3.0.2. I was unable to load the molecule_info.h5 file with the released version of DropletUtils, and I was told that I needed to use the development version of DropletUtils for newer CellRanger output. So I installed a development version of R, then the development version of Bioconductor, and then the development version of DropletUtils. The development version of DropletUtils loads the data, but now I am getting a segfault with the downsampling. It is also only using one core for the downsampling, even though it depends on the BiocParallel library. Here is my error:

> tmp <- "/home/met/data/Hong_Lab/02Apr19/S11M"
> sce <- read10xCounts(tmp)
> mol.info <- read10xMolInfo("/home/met/data/Hong_Lab/02Apr19/S11M/molecule_info.h5")
> no.sampling <- downsampleReads("/home/met/data/Hong_Lab/02Apr19/S11M/molecule_info.h5", prop=1)
Error: segfault from C stack overflow
> with.sampling <- downsampleReads("/home/met/data/Hong_Lab/02Apr19/S11M/molecule_info.h5", prop=0.5)
Error: segfault from C stack overflow
> sessionInfo()
R Under development (unstable) (2019-04-01 r76306)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: openSUSE Leap 15.0

Matrix products: default
BLAS/LAPACK: /usr/local/lib/OpenBLAS_home/lib/libopenblas_zenp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] DropletUtils_1.3.13         SingleCellExperiment_1.5.2 
 [3] SummarizedExperiment_1.13.0 DelayedArray_0.9.9         
 [5] BiocParallel_1.17.18        matrixStats_0.54.0         
 [7] Biobase_2.43.1              GenomicRanges_1.35.1       
 [9] GenomeInfoDb_1.19.2         IRanges_2.17.4             
[11] S4Vectors_0.21.21           BiocGenerics_0.29.2        

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1             edgeR_3.25.3           XVector_0.23.2        
 [4] zlibbioc_1.29.0        lattice_0.20-38        tools_3.7.0           
 [7] grid_3.7.0             rhdf5_2.27.15          dqrng_0.1.1           
[10] R.oo_1.22.0            HDF5Array_1.11.11      Matrix_1.2-17         
[13] GenomeInfoDbData_1.2.0 Rhdf5lib_1.5.4         R.utils_2.8.0         
[16] bitops_1.0-6           RCurl_1.95-4.12        limma_3.39.14         
[19] compiler_3.7.0         R.methodsS3_1.7.1      locfit_1.5-9.1

Any assistance or advice is always greatly appreciated. Thank you!

DropletUtils Bioc-devel • 1.5k views
Aaron Lun ★ 28k
@alun
Last seen 11 hours ago
The city by the bay

DropletUtils doesn't do a lot of explicit allocation on the stack, so I'm bemused by the error message. Perhaps std::stable_sort is the offender here, depending on your C++ standard library implementation. I suggest:

  • Checking that you have fewer than .Machine$integer.max molecules (i.e., rows in the output of read10xMolInfo). Having more than this causes the ordering in downsampleReads to misbehave; this has been fixed on GitHub, but I haven't pushed it yet.
  • Determining whether the current stack size limit on your machine is too small for whatever the function is trying to allocate. Check this with ulimit -s and try increasing it with ulimit -s <SOME LARGE NUMBER>, e.g., 65536 for a 64 MB stack.
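
As a rough sketch of the second suggestion, the stack limit can be inspected and raised in the shell that R will be started from. The helper name stack_at_least is hypothetical and is used here for illustration only; the -S flag is an addition to the plain ulimit -s call above, so that only the soft limit is changed and the hard limit is left alone.

```shell
# Print the current soft stack limit in KiB (or "unlimited").
ulimit -s

# Hypothetical helper: succeeds if the soft stack limit is at least $1 KiB.
stack_at_least() {
    current=$(ulimit -s)
    [ "$current" = "unlimited" ] || [ "$current" -ge "$1" ]
}

# Raise the soft limit to 65536 KiB (64 MB) if it is currently lower.
# This fails when the hard limit (ulimit -H -s) is below the requested value.
if ! stack_at_least 65536; then
    ulimit -S -s 65536 || echo "could not raise stack limit; check ulimit -H -s" >&2
fi

# The new limit applies only to this shell and the processes it starts,
# so R has to be launched from this same session afterwards.
```

Inside R, Cstack_info() should then report the larger stack size, which is one way to confirm that the session actually picked up the new limit.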

In any case, it would help if you did debug(downsampleReads) and stepped through the function to locate where the error arises. Once you get into C-level errors, the cause could literally be anything.

Regarding BiocParallel: you shouldn't have any reason to think that downsampleReads is parallelized. Certainly the documentation does not imply any parallelization options (unlike, e.g., ?emptyDrops). In fact, parallelization is not possible in the standard application because the downsampling of all later counts is dependent on that of earlier counts.

@matthew-thornton-5564
Last seen 12 weeks ago
USA, Los Angeles, USC

ulimit was the issue. I assumed that it was unlimited, but when I checked it was 8145, so I changed it to 65536 and it worked. Thank you!
