Question

got 0 significant differential bound (DB)

0

Entering edit mode

mforoo1 • 0

@mforoo1-10771

Last seen 7.7 years ago

Hello,

I run diffbind to find DB between my two samples( control and phosphate deficiency) with two replicates. i choose deseq2 method for dba.analyzes and I got this:

1 Contrast:
Group1 Members1 Group2 Members2 DB.DESeq2
1 C 3 P 3 0
Error in pv.DBAreport(pv, contrast = contrast, method = method, th = th, :
edgeR analysis has not been run for this contrast
Calls: plot ... plot.DBA -> dba.plotHeatmap -> pv.getPlotData -> pv.DBAreport
Execution halted

I used EDGEr as well, for that one I got 7 DB. I run others peakcalling software to find DB and I got some DB , but I could not understand why I did not get any DB from diffbind.

my samplesheet :

SampleID Tissue Factor Condition Treatment Replicate
1 WTC1 Shoot H3K4me3 WT C 1
2 WTC2 Shoot H3K4me3 WT C 2
3 WTC3 Shoot H3K4me3 WT C 3
4 WTP1 Shoot H3K4me3 WT P 1
5 WTP2 Shoot H3K4me3 WT P 2
6 WTP3 Shoot H3K4me3 WT P 3
bamReads ControlID bamControl
1 rep1_WT_C_H3K4me3_bowtie.bam InPutC1 rep1_WT_C_input_bowtie.bam
2 rep2_WT_C_H3K4me3_trimmed_bowtie.bam InPutC2 rep2_WT_C_input_bowtie.bam
3 rep3_WT_C_H3K4me3_bowtie.bam InPutC3 rep2_WT_C_input_bowtie.bam
4 rep1_WT_P_H3K4me3_bowtie.bam InPutP1 rep1_WT_P_input_bowtie.bam
5 rep2_WT_P_H3K4me3_bowtie.bam InPutP2 rep2_WT_P_input_bowtie.bam
6 rep3_WT_P_H3K4me3_bowtie.bam InPutP3 rep2_WT_P_input_bowtie.bam
Peaks
1 sorted_rep1_WT_C_H3K4me3_bowtie-W200-G200-FDR0.001-island.bed
2 sorted_rep2_WT_C_H3K4me3_trimmed_bowtie-W200-G200-FDR0.001-island.bed
3 sorted_rep3_WT_C_H3K4me3_bowtie-W200-G200-FDR0.001-island.bed
4 sorted_rep1_WT_P_H3K4me3_bowtie-W200-G200-FDR0.001-island.bed
5 sorted_rep2_WT_P_H3K4me3_bowtie-W200-G200-FDR0.001-island.bed
6 sorted_rep3_WT_P_H3K4me3_bowtie-W200-G200-FDR0.001-island.bed
PeakCaller
1 bed
2 bed
3 bed
4 bed
5 bed
6 bed

diffbind • 3.2k views

ADD COMMENT • link 7.8 years ago mforoo1 • 0

score 0 · Answer 1 · 2017-06-06

0

Entering edit mode

Rory Stark ★ 5.2k

@rory-stark-5741

Last seen 3 months ago

Cambridge, UK

Could you let me know what version of DiffBind you are using [output of sessionInfo()]?

-Rory

ADD COMMENT • link 7.8 years ago Rory Stark ★ 5.2k

score 0 · Answer 2 · 2017-06-06

0

Entering edit mode

mforoo1 • 0

@mforoo1-10771

Last seen 7.7 years ago

hello,

thanks for responding to my question. It is DiffBind_1.16.3.

ADD COMMENT • link 7.8 years ago mforoo1 • 0

score 0 · Answer 3 · 2017-06-07

0

Entering edit mode

mforoo1 • 0

@mforoo1-10771

Last seen 7.7 years ago

Do you think I should update it to new version to get the DB?

ADD COMMENT • link 7.8 years ago mforoo1 • 0

score 0 · Answer 4 · 2017-06-08

0

Entering edit mode

Rory Stark ★ 5.2k

@rory-stark-5741

Last seen 3 months ago

Cambridge, UK

Yes, this is a very old version and is no longer unsupported. The issues you are experiencing have been addressed in subsequent versions. You should try this again with a more recent release.

Cheers-

Rory

ADD COMMENT • link 7.8 years ago Rory Stark ★ 5.2k

score 0 · Answer 5 · 2017-06-08

0

Entering edit mode

mforoo1 • 0

@mforoo1-10771

Last seen 7.7 years ago

thanks, I am going to install the latest version for finding DB. I have another question:

I have 3 replicates in each condition nad I tried to find the overlap between them. I did the dba.overlap

dba.overlap(H3K4me3, H3K4me3$masks$C, mode=DBA_OLAP_RATE)

and I got the result: 1] 41820 24549 16934

I need to get the peaks and write them in a table, how can I get the value [2]? the one that shows Uniques set appearing in at least two peak sets?

I read your answer here: DiffBind - overlap between different peak callers about writing a table but the mode there was DBA_OLAP_PEAKS, but I want to have RATE mode to get the peaks that are present at least in my two replicates.

Appreciate your time

Maryam

ADD COMMENT • link 7.8 years ago mforoo1 • 0

0

Entering edit mode

You can retrieve the merged binding sites you want using dba.peakset():

consensusC <- dba(H3K4me3, mask=H3K4me3$masks$C, minOvlerap=2)
overlapC2 <- dba.peakset(consensusC, bRetrieve=TRUE, 
                         writeFile="consensus_C_overlap2.txt")

Alternatively:

consensus2 <- dba.peakset(H3K4me3, consensus=DBA_TREATMENT, minOverlap=2)
overlapC2 <- dba.peakset(consensus2, 7, bRetrieve=TRUE, 
                         writeFile="consensus_C_overlap2.txt")

Cheers-

Rory

ADD REPLY • link 7.8 years ago Rory Stark ★ 5.2k

0

Entering edit mode

Hello Rory,

Many thanks for your help. I tried the first option but I got the error :

Error in dba(H3K4me3, mask = H3K4me3$masks$C, minOvlerap = 2) :
unused argument (minOvlerap = 2)

the second option gave me the peakset, I was wondering how C was specified? How can I do it for P samples as well? what is 7? I checked the manual but I could not find it.

ADD REPLY • link 7.8 years ago mforoo1 • 0

0

Entering edit mode

This is a typo. It should be minOverlap=2.

ADD REPLY • link 7.8 years ago Rory Stark ★ 5.2k

0

Entering edit mode

I got it, I should put 8. sorry for asking a lot of questions.

ADD REPLY • link 7.8 years ago mforoo1 • 0

score 0 · Answer 6 · 2017-06-09

I installed version 2.4.3. But I still could not get the DB.

1 Contrast:
Group1 Members1 Group2 Members2 DB.DESeq2
1 C 3 P 3 0
Error in pv.getPlotData(DBA, attributes = attributes, contrast = contrast, :
Unable to plot -- no sites within threshold
Calls: plot ... plot -> plot.DBA -> dba.plotHeatmap -> pv.getPlotData
Execution halted
Fri Jun 9 09:19:10 CDT 2017

score 0 · Answer 7 · 2017-06-10

0

Entering edit mode

Rory Stark ★ 5.2k

@rory-stark-5741

Last seen 3 months ago

Cambridge, UK

As the error states, there are no sites identified as being significantly differentially bound using the default threshold (FDR < 0).

It may be informative to do an MA plot (dba.plotMA). You can also retrieve the statistics for all the sites in your experiment by calling dba.report() with th=1 and bCounts=TRUE. You should see either that there are few sites with difference (low absolute fold changes), and/or that the sites with higher fold changes have high variance in the read counts.

I'm not sure how you did the read counts, but for H3K4me3 we generally set the summits parameter in the call to dba.count(), for example summits=75 which will focus the analysis on 150bp intervals. This can sometimes lower variance in the read counts (as less background reads are included).

Regards-

Rory

ADD COMMENT • link 7.8 years ago Rory Stark ★ 5.2k

0

Entering edit mode

Thanks a lot. I did the dba.count with summits=75 and also get 0 again. I retrieved all sites by calling dba.report() with th=1 and bCounts=TRUEcalling dba.report() with th=1 and bCounts=TRUE. it is wrierd that i got the same FDR for all sites.

Chr Start End Conc Conc_C Conc_P Fold p-value FDR WTC1
20821 Chr6 25297017 25297167 4.34 5.29 0.41 4.88 7.31e-05 0.453 13.35
20640 Chr6 22656555 22656705 4.29 5.23 0.65 4.59 2.31e-04 0.453 19.73
10372 Chr2 26467057 26467207 2.89 3.83 -0.68 4.51 3.39e-04 0.453 15.09
2495 Chr1 31428021 31428171 4.14 5.05 1.18 3.87 3.67e-04 0.453 20.89
5822 Chr11 8140152 8140302 3.81 4.72 0.70 4.02 4.70e-04 0.453 9.29

.

ADD REPLY • link 7.8 years ago mforoo1 • 0

score 0 · Answer 8 · 2017-06-14

0

Entering edit mode

mforoo1 • 0

@mforoo1-10771

Last seen 7.7 years ago

Thanks a lot. I did the dba.count with summits=75 and also get 0 again. I retrieved all sites by calling dba.report() with th=1 and bCounts=TRUEcalling dba.report() with th=1 and bCounts=TRUE. it is wrierd that i got the same FDR for all sites.

Chr Start End Conc Conc_C Conc_P Fold p-value FDR WTC1
20821 Chr6 25297017 25297167 4.34 5.29 0.41 4.88 7.31e-05 0.453 13.35
20640 Chr6 22656555 22656705 4.29 5.23 0.65 4.59 2.31e-04 0.453 19.73
10372 Chr2 26467057 26467207 2.89 3.83 -0.68 4.51 3.39e-04 0.453 15.09
2495 Chr1 31428021 31428171 4.14 5.05 1.18 3.87 3.67e-04 0.453 20.89
5822 Chr11 8140152 8140302 3.81 4.72 0.70 4.02 4.70e-04 0.453 9.29

.

ADD COMMENT • link 7.8 years ago mforoo1 • 0

0

Entering edit mode

You may want to plot your p-value distribution:

> DBreport <- dba.report(myDBA, th=1)
> hist(DBreport$"p-value")

If the distribution is relatively flat, the FDR correction will be very harsh as that looks like an experiment where everything fit the null hypothesis. Here is a link to a good explanation of this:

How to interpret a p-value histogram

For example, if you plot the p-value distribution for the vignette example:

> data(tamoxifen_analysis)
> DBreport <- dba.report(tamoxifen,th=1)
> hist(DBreport$"p-value")

you see a clear enrichment of low p-values, indicating that there were peaks that did not fit the null hypothesis (that there are no significant changes).

Getting the report with the counts (bCounts=TRUE) may be helpful. Your data seems to have some large fold changes, but if the variance of the read counts is high, the experiment may be underpowered (not enough replicates), hence the high FDR values.

-Rory

ADD REPLY • link 7.8 years ago Rory Stark ★ 5.2k