Question

DiffBind: dba.overlap error

romicakerketta • 0

@romicakerketta-14630

Last seen 2.1 years ago

United States

I am using DiffBind for differential occupancy analysis of my ChIP-seq data set. Until last week, running the dba.overlap function on my 4 samples (2 samples per condition) gave 4 overlap rates (82695, 51554, 34786, 24150). When I ran the same code today, after updating DiffBind to version 2.8.0, I got the numbers below: the first two values returned by dba.overlap are now identical.

> olaprate <- dba.overlap(DBdata, mode = DBA_OLAP_RATE)
> olaprate
[1] 51554 51554 34786 24150
> plot(olaprate, type='b', ylab='# peaks', xlab='overlap at least this many peaksets')

I also ran the Tamoxifen vignette to see whether the problem is reproducible. This is what DiffBind returns for the Tamoxifen data:

> olap.rate <- dba.overlap(tamoxifen,mode=DBA_OLAP_RATE)
> olap.rate
[1] 2845 2845 1773 1388 1074 817 653 484 384 202 129

So even in the vignette the first two numbers are identical (the result should have been 3795 2845 1773 1388 1074 817 653 484 384 202 129).

So I am wondering whether something changed in the package during the update?

Please help!

Thanks

DiffBind dba.overlap

score 0 · Answer 1 · 2018-08-06

romicakerketta • 0

I think I figured out the solution to my problem. I had to store the object that DiffBind creates when it reads in the occupancy (peak) data under one name, and the object returned after counting under a different name, so that the second does not overwrite the first. The following code worked:

DBdata_peak <- dba(sampleSheet=ikras)
DBdata_peak

# create counts
DBdata <- dba.count(DBdata_peak)
DBdata

Using the DBdata_peak object in dba.overlap then gave me the correct numbers:

> olaprate <- dba.overlap(DBdata_peak, mode = DBA_OLAP_RATE)
> olaprate
[1] 82695 51554 34786 24150

Yep, you got it. The DBA object that results after the consensus peakset is formed (by calling dba.count()) loses information about peaks that are not in the consensus. Because the default consensus includes only peaks found in at least two peaksets, the number of peaks that are in at least one sample and the number that overlap at least two samples are the same.
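
To make the effect concrete, here is a minimal base-R sketch on made-up data. It does not call DiffBind; the matrix `called` and the helper `overlap_rate` are hypothetical and only mimic the idea of "number of peaks found in at least k peaksets" before and after dropping peaks that are present in a single peakset.

```r
# Toy membership matrix (made-up data): rows are merged peaks,
# columns are 4 peaksets; TRUE means the peak was called in that peakset.
called <- rbind(
  c(TRUE,  FALSE, FALSE, FALSE),  # called in 1 peakset only
  c(FALSE, TRUE,  FALSE, FALSE),  # called in 1 peakset only
  c(TRUE,  TRUE,  FALSE, FALSE),  # called in 2 peaksets
  c(TRUE,  TRUE,  TRUE,  FALSE),  # called in 3 peaksets
  c(TRUE,  TRUE,  TRUE,  TRUE)    # called in all 4 peaksets
)

# Hypothetical helper (not a DiffBind function): for k = 1..n,
# count the peaks that were called in at least k peaksets.
overlap_rate <- function(m) {
  n_called <- rowSums(m)
  sapply(seq_len(ncol(m)), function(k) sum(n_called >= k))
}

# Overlap rate on the full set of peaks.
rate_all <- overlap_rate(called)

# Mimic the consensus step: keep only peaks present in >= 2 peaksets,
# then recompute the rate on what is left.
consensus <- called[rowSums(called) >= 2, , drop = FALSE]
rate_consensus <- overlap_rate(consensus)

print(rate_all)        # 5 3 2 1
print(rate_consensus)  # 3 3 2 1
```

Once the peaks seen in only one peakset are gone, "at least one" and "at least two" count the same rows, which is the same pattern as the repeated first value in the output above.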