How to parallelize this R script on an HPC cluster using the Slurm Workload Manager?
parra.sev ▴ 20
@parrasev-23808
Last seen 4.4 years ago

I'm following a tutorial on using the MethylMix package to analyze DNA methylation data. The problem is that processing the methylation data is far too slow to run on my PC, so I need to run this script on an HPC cluster that uses the Slurm Workload Manager (this HPC uses MPI), but I have no idea how to do it. Any suggestions?

library(MethylMix)
library(doParallel)

cancerSite <- "OV"
targetDirectory <- paste0(getwd(), "/")

cl <- makeCluster(4)
registerDoParallel(cl)

# Downloading methylation data
METdirectories <- Download_DNAmethylation(cancerSite, targetDirectory)
# Processing methylation data
METProcessedData <- Preprocess_DNAmethylation(cancerSite, METdirectories)
# Saving methylation processed data
saveRDS(METProcessedData, file = paste0(targetDirectory, "MET_", cancerSite, "_Processed.rds"))

# Downloading gene expression data
GEdirectories <- Download_GeneExpression(cancerSite, targetDirectory)
# Processing gene expression data
GEProcessedData <- Preprocess_GeneExpression(cancerSite, GEdirectories)
# Saving gene expression processed data
saveRDS(GEProcessedData, file = paste0(targetDirectory, "GE_", cancerSite, "_Processed.rds"))

# Clustering probes to genes methylation data
METProcessedData <- readRDS(paste0(targetDirectory, "MET_", cancerSite, "_Processed.rds"))
res <- ClusterProbes(METProcessedData[[1]], METProcessedData[[2]])

# Putting everything together in one file
toSave <- list(METcancer = res[[1]], METnormal = res[[2]], GEcancer = GEProcessedData[[1]], 
               GEnormal = GEProcessedData[[2]], ProbeMapping = res$ProbeMapping)
saveRDS(toSave, file = paste0(targetDirectory, "data_", cancerSite, ".rds"))

stopCluster(cl)
methylation epigenomics slurm mpi methylmix • 1.0k views
Kevin Blighe ★ 4.0k
@kevin
Last seen 17 days ago
Republic of Ireland

Exact information for setting up MethylMix for parallel processing can be found here: https://www.bioconductor.org/packages/release/bioc/vignettes/MethylMix/inst/doc/vignettes.html

If you want help writing the submission script for your HPC, please contact your system administrator or service desk. Cluster configurations differ, and we cannot know which one applies in your case.
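Whatever the submission script ends up looking like, the R side can at least avoid hard-coding the worker count. The following is a minimal sketch, not a site-specific recipe: it assumes the job is submitted with Slurm's `--cpus-per-task` option, in which case Slurm sets the `SLURM_CPUS_PER_TASK` environment variable for the job. The helper name `slurm_workers` is hypothetical and is not part of MethylMix or doParallel.

```r
# Hypothetical helper (not part of MethylMix or doParallel):
# return the number of CPUs Slurm allocated to this task, or a
# fallback value when the variable is missing or not a positive integer
# (for example, when the script runs outside Slurm).
slurm_workers <- function(default = 1L) {
  n <- suppressWarnings(
    as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = NA))
  )
  if (is.na(n) || n < 1L) default else n
}

# Number of workers to pass to makeCluster() instead of a fixed 4.
n_workers <- slurm_workers()
```

In the script from the question, `makeCluster(4)` would then become `makeCluster(n_workers)`. Note that `makeCluster()` called with a plain number starts its workers on the local machine, so this approach uses the cores of a single node; spreading the work across several nodes over MPI needs a different cluster setup, which depends on how the HPC is configured.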

Kevin
