[DiffBind] Memory issues with dba.count()
enricoferrero ▴ 660
@enricoferrero-6037
Switzerland

Hi Rory et al., 

I'm hitting the memory limits of my server (96GB RAM) when using DiffBind::dba.count(), which results in my job getting killed.

I'm trying to generate a count matrix from many samples (>30), which translates to many sites/peaks. I suspect the massive matrix cannot be allocated by R into memory.

I've seen the argument bLowMem mentioned in some previous discussions, but it doesn't seem to be recognised by dba.count() any longer, is that right?

Is there any way to use dba.count() in this scenario? Would something like the bigmemory package be helpful here?

Thank you,

 

diffbind bigmemory matrix dba.count • 2.7k views
Rory Stark ★ 5.2k
@rory-stark-5741
Cambridge, UK

Hello-

The bLowMem parameter was replaced by bUseSummarizeOverlaps. You can try setting this to TRUE when calling dba.count(). You can also set the configuration value $config$yieldSize in your DBA object to a lower value (like 50000).
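
As a rough sketch of the two settings above, assuming a sample sheet on disk: the wrapper name `count_low_memory` and the file name "samples.csv" are hypothetical, and 50000 is just the example value mentioned, not a tuned recommendation.

```r
# Sketch only: `count_low_memory` and "samples.csv" are made-up names.
count_low_memory <- function(sample_sheet, yield_size = 50000) {
  if (!requireNamespace("DiffBind", quietly = TRUE)) {
    stop("DiffBind is not installed")
  }
  # Build the DBA object from a sample sheet (peak files and BAM paths).
  peaks <- DiffBind::dba(sampleSheet = sample_sheet)
  # Lower the number of reads processed per chunk.
  peaks$config$yieldSize <- yield_size
  # Count reads via the summarizeOverlaps-based code path.
  DiffBind::dba.count(peaks, bUseSummarizeOverlaps = TRUE)
}

# Usage (not run here):
# counts <- count_low_memory("samples.csv")
```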

Another approach is to use a consensus peakset with fewer peaks. If you are relying on the minOverlap parameter (default value 2), you can set it higher. Calling dba.overlap() with mode=DBA_OLAP_RATE will return a vector with the number of consensus peaks for successively greater values of minOverlap so you can choose an appropriate one.
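
One way to turn that overlap-rate vector into a choice of minOverlap is sketched below. The helper `pick_min_overlap`, the `max_peaks` budget, and the example numbers are all hypothetical; they only illustrate picking the smallest minOverlap whose consensus peakset fits a size you are comfortable with.

```r
# Sketch only: `pick_min_overlap` and `max_peaks` are illustrative names.
# olap_rate[k] is the number of consensus peaks when minOverlap = k.
pick_min_overlap <- function(olap_rate, max_peaks) {
  ok <- which(olap_rate <= max_peaks)
  if (length(ok) == 0) {
    stop("no minOverlap value meets the peak budget")
  }
  # Smallest minOverlap that stays within the budget.
  min(ok)
}

# Made-up numbers, purely for illustration:
example_rate <- c(250000, 120000, 80000, 55000, 40000)
pick_min_overlap(example_rate, max_peaks = 100000)  # 3

# With DiffBind loaded and a DBA object `peaks` (not run here):
# olap_rate <- dba.overlap(peaks, mode = DBA_OLAP_RATE)
# counts <- dba.count(peaks, minOverlap = pick_min_overlap(olap_rate, 100000))
```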

I am currently looking at memory usage in DiffBind, as it does seem to occasionally balloon very high, and I hope to have a fix in the next version.

Regards-

Rory


Thanks Rory,

I'll try using the summarizeOverlaps option with a lower yieldSize.

It's great to hear that you're looking at the memory consumption - it's the one thing that is keeping me from using DiffBind more extensively across projects.

Best,


FYI, in the development version of DiffBind (1.17.6 and later), we have made significant improvements in peak memory usage, reducing it by an order of magnitude, especially in the case where a binding matrix is being constructed (e.g. dba.count). I have an analysis that was taking >70GB to run and now takes 5GB. Give it a try!
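
To check whether an installed copy is new enough, something like the following could work. The helper name `has_memory_fix` is hypothetical; the only fact it encodes is the 1.17.6 threshold mentioned above.

```r
# Sketch only: `has_memory_fix` is an illustrative helper name.
has_memory_fix <- function(version) {
  package_version(version) >= package_version("1.17.6")
}

has_memory_fix("1.17.6")  # TRUE
has_memory_fix("1.16.3")  # FALSE

# With DiffBind installed (not run here):
# has_memory_fix(as.character(packageVersion("DiffBind")))
```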

Cheers-

Rory
