derfinder: analyzeChr() freezes in f-stat calculation
1
0
Entering edit mode
@jessicahekman-6877
Last seen 9.4 years ago
United States

analyzeChr seems to just freeze during f-stat calculation. "top" tells me that R is still running but using 0% of processor, so I don't think it's doing much!

My code:

 

library(derfinder)
library(GenomicFeatures)
library(BiocParallel)

cores <- 10
perms <- 1
fstat <- 10
options(species = 'canis_familiaris')

chroms <- c(1:38, 'X')
files <- rawFiles(datadir="~/transcriptome/pituitary/aligned/cf3-N3-ensembl/", samplepatt="tophat-N3-ensembl-*")
tameIds <- read.table("/home/hekman2/Dropbox/Research/transcriptome/pituitary/notes/samples-pit-tame.txt")
tameFileNames <- lapply(tameIds, function(id) { return(paste("tophat-N3-ensembl-", id, sep="")) } )

groups <- as.factor(sapply(names(files), function(filename) { ifelse(grepl(filename, tameFileNames), "tame", "aggr") }))
fullCov <- fullCoverage(files, chrs = chroms)
sampleDepths <- sampleDepth(collapseFullCoverage(fullCov), 1) # only worthwhile if we analyzed all chromosomes
models <- makeModels(sampleDepths, testvars = groups)
filteredCov <- lapply(fullCov, filterData, cutoff = 2)

dogTx <- makeTranscriptDbFromBiomart(biomart="ensembl",
                                 dataset="cfamiliaris_gene_ensembl",
                                 transcript_ids=NULL,
                                 circ_seqs=DEFAULT_CIRC_SEQS,
                                 filters="",
                                 id_prefix="ensembl_",
                                 host="www.biomart.org",
                                 port=80,
                                 miRBaseBuild=NA)
originalWd <- getwd()
setwd(file.path(originalWd, derResults))
currentChrom <- "chr1"

analyzeChr(chr = currentChrom, filteredCov[[currentChrom]], models, mc.cores=cores, groupInfo = groups, cutoffFstat = fstat, cutoffType = 'manual', nPermute = perms, seeds = 19731107 + seq_len(perms), lowMemDir = file.path(tempdir(), currentChrom), BPPARAM.custom = MulticoreParam(workers = cores), verbose = TRUE, runAnnotation = TRUE, writeOutput = TRUE, returnOutput = FALSE, txdb=dogTx)

 

Relevant output:

2015-03-23 16:51:51 analyzeChr: Pre-processing the coverage data
2015-03-23 16:52:03 preprocessCoverage: splitting the data
2015-03-23 16:53:24 analyzeChr: Calculating statistics
2015-03-23 16:53:24 calculateStats: calculating the F-statistics

 

R version:

 

R Under development (unstable) (2015-03-19 r68032) -- "Unsuffered Consequences"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

Session info:

> devtools::session_info()
Session info -------------------------------------------------------------------
 setting  value                                             
 version  R Under development (unstable) (2015-03-19 r68032)
 system   x86_64, linux-gnu                                 
 ui       X11                                               
 language (EN)                                              
 collate  en_US.UTF-8                                       
 tz       America/Chicago                                   

Packages -----------------------------------------------------------------------
 package           * version  date       source        
 acepack           * 1.3-3.3  2014-11-24 CRAN (R 3.3.0)
 AnnotationDbi     * 1.29.20  2015-03-23 Bioconductor  
 Biobase           * 2.27.2   2015-03-23 Bioconductor  
 BiocGenerics      * 0.13.8   2015-03-23 Bioconductor  
 BiocParallel      * 1.1.19   2015-03-23 Bioconductor  
 biomaRt           * 2.23.5   2015-03-23 Bioconductor  
 Biostrings        * 2.35.11  2015-03-23 Bioconductor  
 bitops            * 1.0-6    2013-08-17 CRAN (R 3.3.0)
 bumphunter        * 1.7.6    2015-03-23 Bioconductor  
 cluster           * 2.0.1    2015-01-31 CRAN (R 3.3.0)
 codetools         * 0.2-11   2015-03-10 CRAN (R 3.3.0)
 colorspace        * 1.2-6    2015-03-11 CRAN (R 3.3.0)
 DBI               * 0.3.1    2014-09-24 CRAN (R 3.3.0)
 derfinder           1.1.17   2015-03-23 Bioconductor  
 derfinderHelper   * 1.1.6    2015-03-23 Bioconductor  
 devtools            1.7.0    2015-01-17 CRAN (R 3.3.0)
 digest            * 0.6.8    2014-12-31 CRAN (R 3.3.0)
 doRNG             * 1.6      2014-03-07 CRAN (R 3.3.0)
 foreach           * 1.4.2    2014-04-11 CRAN (R 3.3.0)
 foreign           * 0.8-63   2015-02-20 CRAN (R 3.3.0)
 Formula           * 1.2-0    2015-01-20 CRAN (R 3.3.0)
 futile.logger     * 1.4      2015-03-21 CRAN (R 3.3.0)
 futile.options    * 1.0.0    2010-04-06 CRAN (R 3.3.0)
 GenomeInfoDb      * 1.3.15   2015-03-23 Bioconductor  
 GenomicAlignments * 1.3.32   2015-03-23 Bioconductor  
 GenomicFeatures   * 1.19.33  2015-03-23 Bioconductor  
 GenomicFiles      * 1.3.14   2015-03-23 Bioconductor  
 GenomicRanges     * 1.19.47  2015-03-23 Bioconductor  
 ggplot2           * 1.0.1    2015-03-17 CRAN (R 3.3.0)
 gtable            * 0.1.2    2012-12-05 CRAN (R 3.3.0)
 Hmisc             * 3.15-0   2015-02-16 CRAN (R 3.3.0)
 IRanges           * 2.1.43   2015-03-23 Bioconductor  
 iterators         * 1.0.7    2014-04-11 CRAN (R 3.3.0)
 lambda.r          * 1.1.7    2015-03-20 CRAN (R 3.3.0)
 lattice           * 0.20-30  2015-02-22 CRAN (R 3.3.0)
 latticeExtra      * 0.6-26   2013-08-15 CRAN (R 3.3.0)
 locfit            * 1.5-9.1  2013-04-20 CRAN (R 3.3.0)
 MASS              * 7.3-40   2015-03-21 CRAN (R 3.3.0)
 Matrix            * 1.1-5    2015-01-18 CRAN (R 3.3.0)
 matrixStats       * 0.14.0   2015-02-14 CRAN (R 3.3.0)
 munsell           * 0.4.2    2013-07-11 CRAN (R 3.3.0)
 nnet              * 7.3-9    2015-02-11 CRAN (R 3.3.0)
 pkgmaker          * 0.22     2014-05-14 CRAN (R 3.3.0)
 plyr              * 1.8.1    2014-02-26 CRAN (R 3.3.0)
 proto             * 0.3-10   2012-12-22 CRAN (R 3.3.0)
 qvalue            * 1.43.0   2015-03-23 Bioconductor  
 RColorBrewer      * 1.1-2    2014-12-07 CRAN (R 3.3.0)
 Rcpp              * 0.11.5   2015-03-06 CRAN (R 3.3.0)
 RCurl             * 1.95-4.5 2014-12-28 CRAN (R 3.3.0)
 registry          * 0.2      2012-01-24 CRAN (R 3.3.0)
 reshape2          * 1.4.1    2014-12-06 CRAN (R 3.3.0)
 rngtools          * 1.2.4    2014-03-06 CRAN (R 3.3.0)
 rpart             * 4.1-9    2015-02-24 CRAN (R 3.3.0)
 Rsamtools         * 1.19.47  2015-03-23 Bioconductor  
 RSQLite           * 1.0.0    2014-10-25 CRAN (R 3.3.0)
 rstudioapi        * 0.2      2014-12-31 CRAN (R 3.3.0)
 rtracklayer       * 1.27.9   2015-03-23 Bioconductor  
 S4Vectors         * 0.5.22   2015-03-23 Bioconductor  
 scales            * 0.2.4    2014-04-22 CRAN (R 3.3.0)
 stringr           * 0.6.2    2012-12-06 CRAN (R 3.3.0)
 survival          * 2.38-1   2015-02-24 CRAN (R 3.3.0)
 XML               * 3.98-1.1 2013-06-20 CRAN (R 3.3.0)
 xtable            * 1.7-4    2014-09-12 CRAN (R 3.3.0)
 XVector           * 0.7.4    2015-03-23 Bioconductor  
 zlibbioc          * 1.13.2   2015-03-23 Bioconductor  


Thanks for all your help in the past!

Jessica

derfinder • 3.2k views
ADD COMMENT
0
Entering edit mode

Hi Jessica,

I haven't been able to reproduce your problem. Can you try running the following code  ? It will check that the cluster type is correctly being selected and that you can get results using the example data in the package.

This is what the output should look like:

> library('derfinder')
Find out what's changed in derfinder with
news(Version == "1.1.17", package = "derfinder")
Warning message:
multiple methods tables found for ‘score’
> library('BiocParallel')
>
> ## Check that the cluster is correctly defined
> ## Aka, that it's a MulticoreParam class instead of the default SnowParam
> x <- derfinder:::.define_cluster(BPPARAM.custom = MulticoreParam(workers = 2))
> x
class: MulticoreParam
bpworkers: 2
bpisup: FALSE
bplog: FALSE
bpthreshold:
bplogdir:
bpresultdir:
bpstopOnError: FALSE
cluster type:  FORK
>
> ## Collapse the coverage information
> collapsedFull <- collapseFullCoverage(list(genomeData$coverage),
+     verbose = TRUE)
2015-03-25 14:44:58 collapseFullCoverage: Sorting fullCov
2015-03-25 14:44:58 collapseFullCoverage: Collapsing chromosomes information by sample
>
> ## Calculate library size adjustments
> sampleDepths <- sampleDepth(collapsedFull, probs = c(0.5), nonzero=TRUE,
+     verbose=TRUE)
2015-03-25 14:44:58 sampleDepth: Calculating sample quantiles
2015-03-25 14:44:58 sampleDepth: Calculating sample adjustments
>
> ## Build the models
> groupInfo <- genomeInfo$pop
> adjustvars <- data.frame(genomeInfo$gender)
> models <- makeModels(sampleDepths, testvars=groupInfo, adjustvars=adjustvars)
>
> ## Analyze the chromosome
> results <- analyzeChr(chr='21', coverageInfo=genomeData, models=models,
+     cutoffFstat=1, cutoffType='manual', groupInfo=groupInfo, mc.cores = 2,
+     writeOutput=FALSE, returnOutput=TRUE, method='regular',
+     runAnnotation = FALSE, BPPARAM.custom = MulticoreParam(workers = 2))
extendedMapSeqlevels: sequence names mapped from NCBI to UCSC for species homo_sapiens
2015-03-25 14:44:58 analyzeChr: Pre-processing the coverage data
2015-03-25 14:44:58 analyzeChr: Calculating statistics
2015-03-25 14:44:58 calculateStats: calculating the F-statistics
2015-03-25 14:44:59 analyzeChr: Calculating pvalues
2015-03-25 14:44:59 analyzeChr: Using the following manual cutoff for the F-statistics 1
2015-03-25 14:44:59 calculatePvalues: identifying data segments
2015-03-25 14:44:59 findRegions: segmenting F-stats information
2015-03-25 14:44:59 findRegions: identifying candidate regions
2015-03-25 14:44:59 findRegions: identifying region clusters
2015-03-25 14:44:59 calculatePvalues: calculating F-statistics for permutation 1 and seed 20150326
2015-03-25 14:44:59 findRegions: segmenting F-stats information
2015-03-25 14:44:59 findRegions: identifying candidate regions
2015-03-25 14:44:59 calculatePvalues: calculating the p-values
2015-03-25 14:44:59 analyzeChr: Annotating regions
> names(results)
[1] "timeinfo"     "optionsStats" "coveragePrep" "fstats"       "regions"
[6] "annotation"
>
> ## Session info
> options(width = 120)
> devtools::session_info()
Session info-----------------------------------------------------------------------------------------------------------
 setting  value
 version  R Under development (unstable) (2014-11-01 r66923)
 system   x86_64, darwin10.8.0
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 tz       America/New_York

Packages---------------------------------------------------------------------------------------------------------------
 package           * version  date       source
 acepack             1.3.3.3  2013-05-03 CRAN (R 3.2.0)
 AnnotationDbi       1.29.20  2015-03-19 Bioconductor
 Biobase             2.27.2   2015-02-28 Bioconductor
 BiocGenerics        0.13.8   2015-03-19 Bioconductor
 BiocParallel      * 1.1.21   2015-03-24 Bioconductor
 biomaRt             2.23.5   2014-11-22 Bioconductor
 Biostrings          2.35.11  2015-02-22 Bioconductor
 bitops              1.0.6    2013-08-17 CRAN (R 3.2.0)
 bumphunter          1.7.6    2015-03-13 Github (lcolladotor/bumphunter@37d10e7)
 cluster             2.0.1    2015-01-31 CRAN (R 3.2.0)
 codetools           0.2.11   2015-03-10 CRAN (R 3.2.0)
 colorout          * 1.0.2    2014-11-03 local
 DBI                 0.3.1    2014-09-24 CRAN (R 3.2.0)
 derfinder         * 1.1.17   2015-03-14 Github (lcolladotor/derfinder@3532e0c)
 derfinderHelper     1.1.6    2015-03-15 Bioconductor
 devtools            1.6.1    2014-10-07 CRAN (R 3.2.0)
 digest              0.6.8    2014-12-31 CRAN (R 3.2.0)
 doRNG               1.6      2014-03-07 CRAN (R 3.2.0)
 foreach             1.4.2    2014-04-11 CRAN (R 3.2.0)
 foreign             0.8.63   2015-02-20 CRAN (R 3.2.0)
 Formula             1.2.0    2015-01-20 CRAN (R 3.2.0)
 futile.logger       1.4      2015-03-21 CRAN (R 3.2.0)
 futile.options      1.0.0    2010-04-06 CRAN (R 3.2.0)
 GenomeInfoDb        1.3.15   2015-03-19 Bioconductor
 GenomicAlignments   1.3.32   2015-03-18 Bioconductor
 GenomicFeatures     1.19.33  2015-03-21 Bioconductor
 GenomicFiles        1.3.14   2015-03-07 Bioconductor
 GenomicRanges       1.19.47  2015-03-21 Bioconductor
 Hmisc               3.14.5   2014-09-12 CRAN (R 3.2.0)
 IRanges             2.1.43   2015-03-07 Bioconductor
 iterators           1.0.7    2014-04-11 CRAN (R 3.2.0)
 lambda.r            1.1.7    2015-03-20 CRAN (R 3.2.0)
 lattice             0.20.30  2015-02-22 CRAN (R 3.2.0)
 latticeExtra        0.6.26   2013-08-15 CRAN (R 3.2.0)
 locfit              1.5.9.1  2013-04-20 CRAN (R 3.2.0)
 Matrix              1.1.5.1  2015-03-23 CRAN (R 3.2.0)
 matrixStats         0.14.0   2015-02-14 CRAN (R 3.2.0)
 nnet                7.3.9    2015-02-11 CRAN (R 3.2.0)
 pkgmaker            0.22     2014-05-14 CRAN (R 3.2.0)
 qvalue              1.43.0   2015-03-04 Bioconductor
 RColorBrewer        1.1.2    2014-12-07 CRAN (R 3.2.0)
 RCurl               1.95.4.5 2014-12-28 CRAN (R 3.2.0)
 registry            0.2      2012-01-24 CRAN (R 3.2.0)
 rngtools            1.2.4    2014-03-06 CRAN (R 3.2.0)
 rpart               4.1.9    2015-02-24 CRAN (R 3.2.0)
 Rsamtools           1.19.47  2015-03-21 Bioconductor
 RSQLite             1.0.0    2014-10-25 CRAN (R 3.2.0)
 rstudioapi          0.2      2014-12-31 CRAN (R 3.2.0)
 rtracklayer         1.27.9   2015-03-19 Bioconductor
 S4Vectors           0.5.22   2015-03-06 Bioconductor
 snow                0.3.13   2013-09-27 CRAN (R 3.2.0)
 stringr             0.6.2    2012-12-06 CRAN (R 3.2.0)
 survival            2.38.1   2015-02-24 CRAN (R 3.2.0)
 XML                 3.98.1.1 2013-06-20 CRAN (R 3.2.0)
 xtable              1.7.4    2014-09-12 CRAN (R 3.2.0)
 XVector             0.7.4    2015-02-08 Bioconductor
 zlibbioc            1.13.3   2015-03-23 Bioconductor
>

Cheers,

Leo

 

 

ADD REPLY
0
Entering edit mode
> library('BiocParallel')
> x <- derfinder:::.define_cluster(BPPARAM.custom = MulticoreParam(workers = 2))
> x
class: MulticoreParam
bpworkers: 2
bpisup: FALSE
bplog: FALSE
bpthreshold:
bplogdir:  
bpresultdir:  
bpstopOnError: FALSE
cluster type:  FORK
> collapsedFull <- collapseFullCoverage(list(genomeData$coverage),verbose=TRUE)
2015-03-25 13:48:11 collapseFullCoverage: Sorting fullCov
2015-03-25 13:48:11 collapseFullCoverage: Collapsing chromosomes information by sample
> sampleDepths <- sampleDepth(collapsedFull, probs = c(0.5), nonzero=TRUE,verbose=TRUE)
2015-03-25 13:48:27 sampleDepth: Calculating sample quantiles
2015-03-25 13:48:27 sampleDepth: Calculating sample adjustments
> groupInfo <- genomeInfo$pop
> adjustvars <- data.frame(genomeInfo$gender)
> models <- makeModels(sampleDepths, testvars=groupInfo, adjustvars=adjustvars)
> results <- analyzeChr(chr='21', coverageInfo=genomeData, models=models,
+      cutoffFstat=1, cutoffType='manual', groupInfo=groupInfo, mc.cores = 2,
+      writeOutput=FALSE, returnOutput=TRUE, method='regular',
+      runAnnotation = FALSE, BPPARAM.custom = MulticoreParam(workers = 2))
extendedMapSeqlevels: sequence names mapped from NCBI to UCSC for species homo_sapiens
2015-03-25 13:50:18 analyzeChr: Pre-processing the coverage data
2015-03-25 13:50:18 analyzeChr: Calculating statistics
2015-03-25 13:50:18 calculateStats: calculating the F-statistics

 

From your timestamps, it should proceed within a second, but instead it has frozen.

Do you remember when I had trouble with analyzeChr() before and we ended up figuring out I had to set BPPARAM.custom to get around it? This feels like that. Some weird thing with processor allocation or something.

Jessica

ADD REPLY
1
Entering edit mode

Hi Jessica,

Thanks for the reminder about the old problem. I found the old threads: derfinder: analyzeChr() breaks on chromosomes > 22 and bplapply() fails when BPPARAM = SnowParam() is specified

Can you try running the code from  ? Thanks.

It will generate output like this:

> library('BiocParallel')
>
> ## Run bplapply example code with MulticoreParam
> system.time(result <- bplapply(1:10, function(v) {
+     message("working") ## 10 tasks
+     sqrt(v)
+ }, BPPARAM = MulticoreParam(workers = 2)))
   user  system elapsed
  0.037   0.018   0.057
> result[[1]]
[1] 1
>
> ## Run bplapply example code with SnowParam
> system.time(result <- bplapply(1:10, function(v) {
+     message("working") ## 10 tasks
+     sqrt(v)
+ }, BPPARAM = SnowParam(workers = 2)))
Loading required package: colorout
Loading required package: colorout
   user  system elapsed
  0.013   0.003   0.573
> result[[1]]
[1] 1
>
> ## Run bplapply example code with SerialParam
> system.time(result <- bplapply(1:10, function(v) {
+     message("working") ## 10 tasks
+     sqrt(v)
+ }, BPPARAM = SerialParam()))
working
working
working
working
working
working
working
working
working
working
   user  system elapsed
  0.003   0.000   0.003
> result[[1]]
[1] 1
>
>
> ## Check clusterargs from https://support.bioconductor.org/p/62845/
> MulticoreParam(workers = 2)$.clusterargs
$spec
[1] 2

$type
[1] "FORK"

> SnowParam(workers = 2)$.clusterargs
$spec
[1] 2

$type
[1] "SOCK"

>
> ## Print all the info, as requested by Laurent Gatto in https://support.bioconductor.org/p/62845/
> MulticoreParam(workers = 2)
class: MulticoreParam
bpworkers: 2
bpisup: FALSE
bplog: FALSE
bpthreshold:
bplogdir:
bpresultdir:
bpstopOnError: FALSE
cluster type:  FORK
> SnowParam(workers = 2)
class: SnowParam
bpworkers: 2
bpisup: FALSE
bplog: FALSE
bpthreshold: INFO
bplogdir:
bpresultdir:
bpstopOnError: FALSE
cluster type:  SOCK
> SerialParam()
class: SerialParam
bpworkers: 1
>
> ## Use parallel::makeCluster()
> cl <- parallel::makeCluster(spec = 2, type = 'SOCK')
Loading required package: colorout
Loading required package: colorout
> cl
socket cluster with 2 nodes on host ‘localhost’
>
> ## Session info
> devtools::session_info()
Session info-------------------------------------------------------------------
 setting  value
 version  R Under development (unstable) (2014-11-01 r66923)
 system   x86_64, darwin10.8.0
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 tz       America/New_York

Packages-----------------------------------------------------------------------
 package        * version date       source
 BiocParallel   * 1.1.21  2015-03-24 Bioconductor
 colorout       * 1.0.2   2014-11-03 local
 devtools         1.6.1   2014-10-07 CRAN (R 3.2.0)
 futile.logger    1.4     2015-03-21 CRAN (R 3.2.0)
 futile.options   1.0.0   2010-04-06 CRAN (R 3.2.0)
 lambda.r         1.1.7   2015-03-20 CRAN (R 3.2.0)
 rstudioapi       0.2     2014-12-31 CRAN (R 3.2.0)
 snow             0.3.13  2013-09-27 CRAN (R 3.2.0)
>

If it doesn't work, then please create a new issue with the BiocParallel tag. You can refer them back to this thread and we'll both learn from whatever solution they find.

Cheers,

Leo 

 

ADD REPLY
0
Entering edit mode

SerialParam() worked; the other two did not.

I then re-tested your derfinder code with SerialParam():

results <- analyzeChr(chr='21', coverageInfo=genomeData, models=models,
cutoffFstat=1, cutoffType='manual', groupInfo=groupInfo, mc.cores = 2,
writeOutput=FALSE, returnOutput=TRUE, method='regular',
runAnnotation = FALSE, BPPARAM.custom = SerialParam())

 

It worked beautifully. I have my workaround and will close this issue. I'll consider whether or not to open one with the BiocParallel folks, but as I recall last time they just told me to use what worked, and since I have something that works it may not be worth anyone's time for me to do that.

 

Thank you! I'll let you know how it goes running derfinder with my data. Fingers crossed.

 

Jessica

ADD REPLY
1
Entering edit mode

Well, if you use SerialParam() your analysis is going to take much longer because you won't be running anything in parallel. The upper bound limit is 10 times the time it took before (because you were using 10 cores). It's actually a bit better than 10x, but the point remains.

You could alternatively manually run the functions analyzeChr() runs. That would allow you to use preprocessCoverage() to generate all the chunks you need. Then, you could submit a cluster job for each chunk to run calculateStats(), then merge the results. Then do something similar with calculatePvalues().

However, the above approach is complicated and it might be much easier to (a) figure out why MulticoreParam() and SnowParam() don't work in your cluster, or (b) try to use another *Param() such as BatchJobsParam() or DoparParam().

Cheers,

Leo

ADD REPLY
1
Entering edit mode

In the SnowParam report from earlier one possibility was blocked ports, with the OP saying they'd check on this... Another possibility with SnowParam is that a different R is being started; you might try writing code in the worker that sends output to a file (cat("hello world\n", file="my_test.R") -- nothing fancy, and then work up to providing useful debugging information, e.g., about .libPaths(). Also it helps alot to simplify things, so doing your testing in a new R session with no unneeded packages loaded, and removing BiocParallel from the picture (at least until it becomes apparent that that is what the problem is) by calling functions from the parallel and snow packages directly. It's always helpful to have complete output, rather than 'it doesn't work'.

snow and multicore are quite different beasts, so it's interesting that they have both apparently stopped working?

ADD REPLY
0
Entering edit mode

Sorry for the delayed response; I hit the six posts in six hours limit and had to wait.

Here are the results. I had to control-C out of the first two bplapply() calls because they just hung.

> library('BiocParallel')
> system.time(result <- bplapply(1:10, function(v) {
+      message("working") ## 10 tasks
+      sqrt(v)
+  }, BPPARAM = MulticoreParam(workers = 2)))
^C
Timing stopped at: 0.018 0.002 48.012
> system.time(result <- bplapply(1:10, function(v) {
+      message("working") ## 10 tasks
+      sqrt(v)
+  }, BPPARAM = SnowParam(workers = 2)))
^C
Timing stopped at: 0.009 0.005 26.412
> system.time(result <- bplapply(1:10, function(v) {
+      message("working") ## 10 tasks
+      sqrt(v)
+  }, BPPARAM = SerialParam()))
working
working
working
working
working
working
working
working
working
working
   user  system elapsed
  0.003   0.000   0.003
> MulticoreParam(workers = 2)$.clusterargs
$spec
[1] 2

$type
[1] "FORK"

> SnowParam(workers = 2)$.clusterargs
$spec
[1] 2

$type
[1] "SOCK"

> MulticoreParam(workers = 2)
class: MulticoreParam
bpworkers: 2
bpisup: FALSE
bplog: FALSE
bpthreshold:
bplogdir:  
bpresultdir:  
bpstopOnError: FALSE
cluster type:  FORK
> SnowParam(workers = 2)
class: SnowParam
bpworkers: 2
bpisup: FALSE
bplog: FALSE
bpthreshold: INFO
bplogdir:  
bpresultdir:  
bpstopOnError: FALSE
cluster type:  SOCK
> SerialParam()
class: SerialParam
bpworkers: 1
> devtools::session_info()
Session info -------------------------------------------------------------------
 setting  value                                             
 version  R Under development (unstable) (2015-03-19 r68032)
 system   x86_64, linux-gnu                                 
 ui       X11                                               
 language (EN)                                              
 collate  en_US.UTF-8                                       
 tz       America/Chicago                                   

Packages -----------------------------------------------------------------------
 package        * version date       source        
 BiocParallel     1.1.19  2015-03-23 Bioconductor  
 devtools       * 1.7.0   2015-01-17 CRAN (R 3.3.0)
 futile.logger  * 1.4     2015-03-21 CRAN (R 3.3.0)
 futile.options * 1.0.0   2010-04-06 CRAN (R 3.3.0)
 lambda.r       * 1.1.7   2015-03-20 CRAN (R 3.3.0)
 rstudioapi     * 0.2     2014-12-31 CRAN (R 3.3.0)
 snow           * 0.3-13  2013-09-27 CRAN (R 3.3.0)


I very much believe the problem is with a blocked port. My admin has a history of blocking ports quite aggressively. Additionally, when I run analyzeChr() with SerialParam(), I get this error:

Error in sqliteSendQuery(con, statement, bind.data) :
  expired SQLiteConnection
In addition: Warning message:


...which also sounds like a blocked-port problem to me.

I can talk to my admin about opening up some ports -- but which ones?

Thanks for your help!

Jessica

 

ADD REPLY
1
Entering edit mode

Thanks for that so far; what about in a new session with no packages loaded

x <- parallel::mclapply(1:10, sqrt)
cl <- parallel::makeCluster(10, "PSOCK")
y <- parallel::clusterApply(cl, 1:10, sqrt)
identical(x, y)

FUN <- function(v) {
    message("working") ## 10 tasks
    sqrt(v)
}
x <- parallel::mclapply(1:10, FUN, mc.cores=10)
cl <- parallel::makeCluster(10, "PSOCK")
y <- parallel::clusterApply(cl, 1:10, FUN)
identical(x, y)

Also, if these do not work, can you try outside R-studio, just on the command line? For me these all work, and identical(x, y) returns TRUE.

ADD REPLY
1
Entering edit mode

Code works for me, lets see what happens in Jessica's case. My output for local and cluster environments is shown below:

 

 

ADD REPLY
1
Entering edit mode

I'm curious if MulticoreParam() failing could be an issue related to Jessica using R 3.3 (devel) instead of 3.2 (alpha, matches current BioC-devel). A recent BioC-devel thread https://stat.ethz.ch/pipermail/bioc-devel/2015-March/007228.html was solved by using R 3.2 I haven't tested myself if the code in this thread works with R 3.3 since http://r.research.att.com/ doesn't have a snowleopard 3.3 version out yet (aye, I'm too lazy to compile R myself).

ADD REPLY
0
Entering edit mode

I just installed 3.2: R version 3.2.0 alpha (2015-03-26 r68103). In a new session with nothing loaded:

 

> x <- parallel::mclapply(1:10, sqrt)
> cl <- parallel::makeCluster(2, "PSOCK")


Again parallel:makeCluster() hung and I had to exit with ^C. I conclude that 3.2 vs 3.3 isn't the problem.

 

Jessica

ADD REPLY
0
Entering edit mode
> x <- parallel::mclapply(1:10, sqrt)
> cl <- parallel::makeCluster(10, "PSOCK")
^C


(I had to control-C out of the second command, and then obviously could not continue with that set.)

> FUN <- function(v) {
+     message("working") ## 10 tasks
+     sqrt(v)
+ }
> x <- parallel::mclapply(1:10, FUN, mc.cores=10)
working
working
working
working
working
working
working
working
working
working
> cl <- parallel::makeCluster(10, "PSOCK")
^C


I had to control-C again; it does not like parallel::makeCluster()!

devtools::session_info()
Session info -------------------------------------------------------------------
 setting  value                                             
 version  R Under development (unstable) (2015-03-19 r68032)
 system   x86_64, linux-gnu                                 
 ui       X11                                               
 language (EN)                                              
 collate  en_US.UTF-8                                       
 tz       America/Chicago                                   

Packages -----------------------------------------------------------------------
 package    * version date       source        
 devtools   * 1.7.0   2015-01-17 CRAN (R 3.3.0)
 rstudioapi * 0.2     2014-12-31 CRAN (R 3.3.0)

 

None of this is in R studio (I hate graphical interfaces); it is all on the command line, where I always work. It was fresh, nothing loaded, nothing done before the code I copied above.

 

Jessica

 

ADD REPLY
1
Entering edit mode

Sorry for my delayed reply. I think at the end of the day Jessica you're right, the sockets R is trying to use are blocked. The initial mystery (to me) was why MulticoreParam stopped working, but it turns out that the implementation has changed and now also relies on sockets for some aspect of communication. I understand that the environment variable R_PARALLEL_PORT can be used to determine the port on which communication occurs, so if you can determine the ports that remain open, starting R with something like R_PARALLEL_PORT=12345 R --vanilla might do the trick. I'd asked about rstudio because it seems from your session_info that a package rstudioapi is always present...

ADD REPLY
1
Entering edit mode

I believe that's because `devtools` imports `rstudioapi`.See https://github.com/hadley/devtools/blob/master/DESCRIPTION#L26

On a clean session:

> devtools::session_info()
Session info------------------------------------------------------------------------------------------------------------------------
 setting  value                                             
 version  R Under development (unstable) (2014-11-01 r66923)
 system   x86_64, darwin10.8.0                              
 ui       AQUA                                              
 language (EN)                                              
 collate  en_US.UTF-8                                       
 tz       America/New_York                                  

Packages----------------------------------------------------------------------------------------------------------------------------
 package    * version date       source        
 devtools     1.6.1   2014-10-07 CRAN (R 3.2.0)
 rstudioapi   0.2     2014-12-31 CRAN (R 3.2.0)
> 

I didn't know about R_PARALLEL_PORT. Sounds like the best lead!

ADD REPLY
0
Entering edit mode

That's so helpful, thanks! I'm going to be out of the country for a week starting today, but when I get back I'll sit down and try to figure this out. I'll post whatever I find here (hopefully an answer about what ports to open, for the benefit of future answer seekers).

ADD REPLY
0
Entering edit mode

OK! I contacted my sysadmin for help in finding an appropriate port to use with R_PARALLEL_PORT, and he said "oh, looks like loopback is blocked -- I opened it back up, see if things work now." And now they do.

So the answer is: check loopback and make sure it's open.

Thanks to both of you for all your help. Leo, derfinder completed when I did a test run with one permutation. I'm doing a real run now with 200 perms, and if/when it completes I'll let you know.

Jessica

ADD REPLY
0
Entering edit mode

Awesome! Good luck with the rest of the analysis.

ADD REPLY
1
Entering edit mode
@lcolladotor
Last seen 4 weeks ago
United States

As stated in C: derfinder: analyzeChr() freezes in f-stat calculation the answer to this problem is to check with your system administrator's help that the port used by R_PARALLEL_PORT is open in your system as this is not a derfinder-specific problem but rather a problem when using SnowParam() and/or MulticoreParam() from BiocParallel. In this specific case, "loopback" was blocked and once it was opened, SnowParam() and/or MulticoreParam() worked again.

Feel free to look at the full comments history in this question to see how this problem was solved.

ADD COMMENT
0
Entering edit mode

Loopback, not lookback.

ADD REPLY
0
Entering edit mode

Thanks, I fixed the typo.

ADD REPLY

Login before adding your answer.

Traffic: 576 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6