Hello,
I'm currently using WGCNA v1.68 to do network analysis on 50k probes. I have a few questions regarding parallelization in WGCNA, particularly when running the blockwiseModules and TOMsimilarity functions. I came across another Bioconductor question on this topic (https://support.bioconductor.org/p/86147/), but it is from 3 years ago, so I was wondering if there are any updates that I should be aware of.
In the previous question, Peter said that blockwiseModules was not parallelized. Has this changed? He kindly suggested using a faster BLAS to speed up matrix multiplication in TOM calculations. Currently my output when running TOMsimilarity() shows "..matrix multiplication (system BLAS)..", so I'm guessing the system BLAS is not the fast BLAS Peter was referring to? Does anyone know which fast BLAS I should try installing? (I'm currently using R on a CentOS server with up to 50 cores.)
I already tried setting "enableWGCNAThreads(nThreads = 50)", but I don't think it had any effect.
Many thanks for any help anyone can provide,
Ming
Hi Peter, thank you so much for your fast response, I understand more now. When I run sessionInfo() it indeed shows the generic Rblas. I'm working with a computing cluster, and getting R to switch to openBLAS seems to be complicated, as you foresaw.
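For anyone else checking this, here is a minimal sketch of how the BLAS in use can be inspected from within R and how a dense matrix product can be roughly timed. It uses only base R (extSoftVersion() and La_library() report the BLAS path in R 3.4.0 or later); the matrix size is an arbitrary value chosen for illustration, and the timing is only a rough indicator, not a proper benchmark.

```r
# Report which BLAS/LAPACK libraries this R session is linked against.
# A path ending in libRblas.so typically indicates the reference BLAS shipped with R.
blas_path   <- extSoftVersion()[["BLAS"]]
lapack_path <- La_library()
cat("BLAS:  ", blas_path, "\n")
cat("LAPACK:", lapack_path, "\n")

# Rough timing of a dense matrix product, which R delegates to BLAS.
# n = 1000 is an arbitrary size for illustration only.
set.seed(1)
n <- 1000
x <- matrix(rnorm(n * n), nrow = n, ncol = n)
elapsed <- system.time(y <- x %*% x)[["elapsed"]]
cat("Elapsed seconds for a", n, "x", n, "product:", elapsed, "\n")
```

Running the same snippet before and after switching BLAS libraries should give a rough idea of whether the switch took effect and how much it helps.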
On another note regarding blockwiseModules, I just wanted to confirm the parallelization inside this function. I tried testing on some BRCA data (590 subjects x 8640 genes), and ran with:
Am I correct in assuming that the 18 blocks, when given enough threads, will execute in parallel? And that the TOM calculation inside blockwiseModules is the part that has not been parallelized yet and will benefit from openBLAS? So if I wanted to run a large dataset, would running blockwiseModules together with openBLAS be the best way to go?
Again, thank you for your time in answering my questions.
Ming
Hi Peter! A question regarding the use of BLAS: can different versions affect the speed of execution? I have been trying to run my script on two different servers. For the part involving pickSoftThreshold, it takes about 3 minutes on one server, but it has not completed on the other even after 12 hours. I was wondering why this might be. The input matrix contains about 12500 genes and 100 samples. I have 60 threads enabled. The following is some information from sessionInfo.
Server 1 (fast execution):
Server 2 (slow execution):
I suspect that the slow execution is actually stuck. I am not familiar with what flexiBLAS OPENBLAS-OPENMP actually uses as BLAS, but it is possible that the BLAS implementation in use is not re-entrant, i.e., if you run the same BLAS routine in two different threads, they will clash and potentially never finish the calculation. This used to happen with GotoBLAS. If that's not the case, I would double-check that the other server is actually running the code rather than waiting in the queue, and, if it is running, that it has enough physical RAM. You can add the argument verbose = 3 to the call of pickSoftThreshold and watch the output file to see if the function is making any progress. Lastly, 60 threads is way too many. Let it run single-threaded or with at most 8 threads; in my experience, more than that will only slow down the system.
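As a rough illustration of the suggestions above, here is a minimal sketch of a pickSoftThreshold call with verbose = 3 and a modest thread count. The expression matrix datExpr is random placeholder data (samples in rows, genes in columns), and the thread count of 4 and the powers vector are illustrative assumptions, not values taken from the original script.

```r
library(WGCNA)

# Placeholder expression matrix for illustration only:
# 100 samples (rows) x 500 genes (columns) of random values.
set.seed(1)
datExpr <- matrix(rnorm(100 * 500), nrow = 100, ncol = 500)

# Keep the thread count small; 4 is an arbitrary illustrative choice.
# Omitting this call leaves WGCNA single-threaded.
enableWGCNAThreads(nThreads = 4)

# Candidate soft-thresholding powers (an assumed, commonly used range).
powers <- c(1:10, seq(12, 20, by = 2))

# verbose = 3 prints progress messages, so a stuck run can be told
# apart from one that is merely slow.
sft <- pickSoftThreshold(datExpr, powerVector = powers, verbose = 3)
print(sft$fitIndices)
```

If the progress messages stop appearing on the slow server while the same call finishes elsewhere, that would point toward the run being stuck rather than slow.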