Problems running oligo functions in parallel on a Linux computer
   I am running a Linux (Scientific Linux 7.4) computer with 8 cores.  I would like to run some of the functions from oligo, such as rma, in parallel, but have not had any success.

   I have tried the suggestions in the oligo manual, using ff and using foreach with doMC, but neither sped up the computations or ran anything in parallel at all.

  I have tried the parallel package with mclapply, using "read.celfiles" as my test.  It runs in parallel, but it does not give me one expression set containing all of the CEL files.  It gives a separate expression set for each CEL file, which is useless.
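
  To illustrate the behavior described above, here is a minimal sketch that does not use oligo at all. The file names and the read_all function are hypothetical stand-ins (read_all only mimics a reader that returns one object for whatever vector of paths it is given); the point is only that mclapply calls the function once per element, so it returns one result per file.

```r
library(parallel)

# Hypothetical file names, stand-ins for the real CEL files.
files <- c("a.CEL", "b.CEL", "c.CEL")

# Hypothetical stand-in for a reader such as read.celfiles():
# it returns ONE object describing all the paths it was given.
read_all <- function(paths) {
  list(n_samples = length(paths), samples = paths)
}

# mclapply applies the function to each element separately, so the
# result is a list holding one single-sample object per file.
# (mc.cores > 1 relies on forking, which is available on Linux.)
per_file <- mclapply(files, read_all, mc.cores = 2)

# Passing the whole vector in a single call yields one combined object.
combined <- read_all(files)

length(per_file)          # 3: one result per file
per_file[[1]]$n_samples   # 1
combined$n_samples        # 3
```

  In this sketch the only way to get a single combined object is to hand the full vector of paths to one call, which is exactly the step that mclapply splits apart.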

  I have also tried BiocParallel with the same "read.celfiles" test, also without luck.  bpiterate fails with "no method for coercing this S4 class to a vector".  bpvec produces an error because the length of the output does not match the length of the input vector.  bpmapply does parallelize and read in the CEL files but, like parallel, creates a separate expression set for each one.

    I have checked with taskset that R is allowed to use any or all eight cores, so R is not limited by a system setting.  Does anyone know of a way to run things like read.celfiles, oligo::rma, etc. in parallel in R?  Most of the examples I have seen read data from a list and produce some output for each line, each file, etc.  Is this the only way R will run in parallel?


BiocParallel Code:

> library(BiocParallel)
> param<-SnowParam(workers=6, type="SOCK")
> library(affy)

> library(oligo)

> a<-list.celfiles()

> FUN<-function(x){
+ library(oligo)
+ library(affy)
+ rawData<-read.celfiles(x)}
> bpvec(x, FUN, AGGREGATE=c, BPREDO=list(), BPPARAM=param)

Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

Platform design info loaded.

Reading in : NF25-AL_2_(HTA-2_0)_3.CEL
Reading in : NF25-SA_2_(HTA-2_0).CEL
Reading in : NF27-AL_(HTA-2_0).CEL
Reading in : NF27-SA_(HTA-2_0).CEL
Reading in : NF32-AL_(HTA-2_0).CEL
Reading in : NF32-SA_(HTA-2_0).CEL
Reading in : NG01-AL_(HTA-2_0).CEL
Reading in : NG01-SA_(HTA-2_0).CEL
Reading in : NG07-AL_(HTA-2_0).CEL
Reading in : NG07-SA_(HTA-2_0).CEL
Reading in : NG09-AL_(HTA-2_0).CEL
Reading in : NG09-SA_(HTA-2_0).CEL
Reading in : NG16-AL_(HTA-2_0).CEL
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

Platform design info loaded.

Reading in : NF01-AL_(HTA-2_0).CEL
Reading in : NF01-SA_(HTA-2_0).CEL
Reading in : NF07-AL_(HTA-2_0).CEL
Reading in : NF07-SA_(HTA-2_0).CEL
Reading in : NF09-AL_(HTA-2_0).CEL
Reading in : NF09-SA_(HTA-2_0).CEL
Reading in : NF16-AL_(HTA-2_0).CEL
Reading in : NF16-SA_(HTA-2_0)_3.CEL
Reading in : NF19-AL_(HTA-2_0).CEL
Reading in : NF19-SA_(HTA-2_0).CEL
Reading in : NF21-AL_(HTA-2_0).CEL
Reading in : NF21-SA_(HTA-2_0).CEL
Reading in : NF24-AL_(HTA-2_0).CEL
Reading in : NF24-SA_(HTA-2_0).CEL

.....(this repeats, in chunks, to read in all 80 files)

Error: length(FUN(X)) not equal to length(X)
> bpiterate(x, FUN, BPPARAM=param, REDUCE=merge)
Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted
Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted
Error: 'bpiterate' receive data failed:
  error reading from connection
> Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted
Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted
Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted
Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted

> bpmapply(FUN, a, BPPARAM=param)

Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

Platform design info loaded.

Reading in : SF04-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF04-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF10-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF10-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF15-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF15-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF18-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF18-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF20-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF20-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF23-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF23-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF26-AL_(HTA-2_0).CEL
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

Loading required package: pd.hta.2.0

Loading required package: RSQLite
Loading required package: DBI
Platform design info loaded.
Reading in : NF01-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF01-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF07-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF07-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF09-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF09-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF16-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF16-SA_(HTA-2_0)_3.CEL
Platform design info loaded.
Reading in : NF19-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF19-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF21-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF21-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF24-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF24-SA_(HTA-2_0).CEL

..... (it did this for all 80 files)

HTAFeatureSet (storageMode: lockedEnvironment)
assayData: 6892960 features, 1 samples 
  element names: exprs 
  rowNames: NF01-AL_(HTA-2_0).CEL
  varLabels: exprs dates
  varMetadata: labelDescription channel
  rowNames: NF01-AL_(HTA-2_0).CEL
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.hta.2.0 

HTAFeatureSet (storageMode: lockedEnvironment)
assayData: 6892960 features, 1 samples 
  element names: exprs 
  rowNames: NF01-SA_(HTA-2_0).CEL
  varLabels: exprs dates
  varMetadata: labelDescription channel
  rowNames: NF01-SA_(HTA-2_0).CEL
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.hta.2.0 

..... (there were 80 of these....)

Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: unused arguments (SIMPLIFLY = dots[[2]][[1]], USE.NAME = dots[[3]][[1]])
> bpmapply(FUN, a, BPPARAM=param, SIMPLIFLY=TRUE)
Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: unused argument (SIMPLIFLY = dots[[2]][[1]])
> ?bpaggregate
> bpaggregate(a, FUN, BPPARAM=param)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘bpaggregate’ for signature ‘"character", "SnowParam"’







