I'm using GSVA (v1.18.0) on R 3.2.2 to perform a gene set analysis on a small data set (75 genes, 6 samples). However, whenever I set the number of bootstrap rounds above 10, I get a segfault. As far as I can tell, there is no issue with the data itself (e.g., no genes with sd = 0). The details of my R session are provided below. Any pointers would be appreciated. (I see the same problem with R 3.2.3 on OS X 10.9.5.)
The invocation I'm using is given below, and the input data (gdat, with row names being the Entrez Gene IDs) can be obtained from
library(org.Hs.eg.db)
library(GSVA)
library(GSEABase)
library(GSVAdata)
data(c2BroadSets)
gse <- gsva(gdat, c2BroadSets, rnaseq=FALSE, min.sz=4, no.bootstraps=1000)
After a few gene sets it segfaults with the following error:
 *** caught segfault ***
address 0x7f01da2843a0, cause 'memory not mapped'

Traceback:
 1: .C("matrix_density_R", as.double(t(expr[, sample.idxs, drop = FALSE])), as.double(t(expr)), R = double(n.test.samples * n.genes), n.density.samples, n.test.samples, n.genes, as.integer(rnaseq))
 2: compute.gene.density(expr, sample.idxs, rnaseq, kernel)
 3: compute.geneset.es(expr, gset.idx.list, sample(n.samples, bootstrap.nsamples, replace = T), rnaseq = rnaseq, abs.ranking = abs.ranking, mx.diff = mx.diff, tau = tau, kernel = kernel, verbose = verbose)
 4: .gsva(expr, mapped.gset.idx.list, method, rnaseq, abs.ranking, no.bootstraps, bootstrap.percent, parallel.sz, parallel.type, mx.diff, tau, kernel, ssgsea.norm, verbose)
 5: .local(expr, gset.idx.list, ...)
 6: gsva(gdat, c2BroadSets, rnaseq = FALSE, min.sz = 4, no.bootstraps = 1000)
 7: gsva(gdat, c2BroadSets, rnaseq = FALSE, min.sz = 4, no.bootstraps = 1000)
The output of sessionInfo() is given below:
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux release 6.7 (Carbon)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8
 [8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
 [1] GSVAdata_1.6.0 hgu95a.db_3.2.2 GSEABase_1.32.0 graph_1.48.0 annotate_1.48.0 XML_3.98-1.3 org.Hs.eg.db_3.2.3 RSQLite_1.0.0 DBI_0.3.1
[10] AnnotationDbi_1.32.0 IRanges_1.22.10 Biobase_2.24.0 BiocGenerics_0.16.1 GSVA_1.18.0

loaded via a namespace (and not attached):
[1] xtable_1.8-2 S4Vectors_0.8.11 tools_3.2.2
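The data sanity check mentioned at the top of the post (no genes with sd = 0) could be sketched roughly as follows. This uses base R only; `toy` is a made-up stand-in for `gdat` (genes in rows, samples in columns), not the actual data, and the names `gene_sd`, `flat_genes` and `has_nonfinite` are purely illustrative.

```r
# Hypothetical toy matrix standing in for gdat: 5 genes x 6 samples.
set.seed(1)
toy <- matrix(rnorm(5 * 6), nrow = 5,
              dimnames = list(paste0("gene", 1:5), paste0("s", 1:6)))
toy["gene3", ] <- 2.5  # force one constant gene for illustration

# Per-gene standard deviation across samples.
gene_sd <- apply(toy, 1, sd)

# Genes with zero (or undefined) spread, and a check for NA/NaN/Inf values.
flat_genes <- names(gene_sd)[is.na(gene_sd) | gene_sd == 0]
has_nonfinite <- any(!is.finite(toy))

print(flat_genes)
print(has_nonfinite)
```

If `flat_genes` is empty and `has_nonfinite` is FALSE for the real matrix, that is at least consistent with the input not being the obvious culprit, though it does not rule out problems that only appear in individual bootstrap resamples.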
Thanks for taking a look. Unfortunately, even after upgrading to R 3.3 and the latest Bioconductor release, I get the same segfault. I've put up a proper data file as a GitHub gist. Would you mind running it to see if it fails for you?
Hi, at the moment I have the latest BioC 3.3 installation on a Mac OS X "El Capitan" laptop, and it runs fine with this file as well. Here's my session information.
cheers,
robert.