Hello,
I am using the HTSFilter library to filter out low count samples for some RNA data. It works but takes a while to run. Is there a way to run it using multiple cores/CPUs?
Regards,
Richard
Hi Richard,
If HTSFilter is taking a while to run, I'm guessing that it's because you have a fairly large number of samples -- right? The method in HTSFilter is extremely parallelizable since for a given filtering threshold, the Jaccard similarity index is calculated in a loop for all possible pairs of replicates and then averaged (which means calculations could be done in parallel both for different pairs of samples and for different filtering thresholds).
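To illustrate why this parallelizes so easily, here is a minimal Python sketch of the idea. It is not the HTSFilter implementation: it skips normalization, ignores experimental conditions, and does not fit any curve to choose a threshold. The names `jaccard` and `similarity_by_threshold` are hypothetical, made up for this example. Each (threshold, pair of replicates) combination is an independent task, so the tasks can simply be handed to a pool of workers.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def jaccard(a, b, threshold):
    """Jaccard index of the gene sets whose counts exceed `threshold` in two replicates."""
    above_a = {i for i, count in enumerate(a) if count > threshold}
    above_b = {i for i, count in enumerate(b) if count > threshold}
    union = above_a | above_b
    if not union:
        return 0.0  # convention chosen for this sketch when no gene passes in either replicate
    return len(above_a & above_b) / len(union)


def similarity_by_threshold(replicates, thresholds, workers=4):
    """Average the pairwise Jaccard index over all replicate pairs, per threshold.

    `replicates` is a list of equal-length count vectors, one per sample.
    """
    pairs = list(combinations(range(len(replicates)), 2))
    if not pairs:
        raise ValueError("need at least two replicates")

    # One independent task per (threshold, pair): no task depends on another.
    tasks = [(s, i, j) for s in thresholds for i, j in pairs]

    def run(task):
        s, i, j = task
        return s, jaccard(replicates[i], replicates[j], s)

    totals = {s: 0.0 for s in thresholds}
    # Threads keep the sketch self-contained; for CPU-bound work in CPython
    # a process pool would be needed to see a real speed-up.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for s, score in pool.map(run, tasks):
            totals[s] += score
    return {s: totals[s] / len(pairs) for s in totals}
```

With three toy replicates, `similarity_by_threshold([[0, 5, 10, 20], [0, 6, 1, 25], [3, 0, 12, 30]], [2, 8])` returns one averaged index per threshold. The number of tasks grows with the square of the number of samples, which is why run time climbs quickly for large experiments.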
That being said, unfortunately I haven't yet included the ability to run HTSFilter over multiple cores/CPUs, since most of my use cases to date have had a limited number of replicate samples (say, fewer than 10 or so). However, if this is an option you're interested in, I could take a look at including it (although it may take me a bit of time since I need to familiarize myself with the necessary packages). Let me know!
Regards,
Andrea
Ok, I will work on adding the possibility of parallel calculations to HTSFilter. It may take me a couple of weeks to get around to it, but I will let you know when it is ready for testing in the development version. Thanks again for the feedback!
Best,
Andrea
After a longer delay than expected (my apologies!), HTSFilter now implements (as of Bioconductor 3.4, version 1.14.0) the option for parallel calculations through the BiocParallel package. There are now two additional optional arguments in calls to HTSFilter: parallel (TRUE/FALSE) and BPPARAM to specify the backend for parallel execution. I hope this helps the execution time for your use case! Any feedback is welcome.
Best,
Andrea
Hi Andrea,
Thanks for the response!
Yes, we do have a large number of samples and hope to set up a pipeline using HTSFilter. We have used a shorter s.len to speed it up but need to be sure the results are robust. If there were a multicore option it would be a great help.
Thanks,
Richard