got 0 significant differential bound (DB)
8
0
Entering edit mode
mforoo1 • 0
@mforoo1-10771
Last seen 7.3 years ago

Hello,

I run diffbind to find DB between my two samples( control and phosphate deficiency) with two replicates. i choose deseq2 method for dba.analyzes and I got this:  

1 Contrast:
  Group1 Members1 Group2 Members2 DB.DESeq2
1      C        3      P        3         0
Error in pv.DBAreport(pv, contrast = contrast, method = method, th = th,  : 
  edgeR analysis has not been run for this contrast
Calls: plot ... plot.DBA -> dba.plotHeatmap -> pv.getPlotData -> pv.DBAreport
Execution halted

I used EDGEr as well, for that one I got 7 DB. I run others peakcalling software to find DB and I got some DB , but I could not understand why I did not get any DB from diffbind. 

my samplesheet :

SampleID Tissue  Factor Condition Treatment Replicate
1     WTC1  Shoot H3K4me3        WT         C         1
2     WTC2  Shoot H3K4me3        WT         C         2
3     WTC3  Shoot H3K4me3        WT         C         3
4     WTP1  Shoot H3K4me3        WT         P         1
5     WTP2  Shoot H3K4me3        WT         P         2
6     WTP3  Shoot H3K4me3        WT         P         3
                              bamReads ControlID                 bamControl
1         rep1_WT_C_H3K4me3_bowtie.bam   InPutC1 rep1_WT_C_input_bowtie.bam
2 rep2_WT_C_H3K4me3_trimmed_bowtie.bam   InPutC2 rep2_WT_C_input_bowtie.bam
3         rep3_WT_C_H3K4me3_bowtie.bam   InPutC3 rep2_WT_C_input_bowtie.bam
4         rep1_WT_P_H3K4me3_bowtie.bam   InPutP1 rep1_WT_P_input_bowtie.bam
5         rep2_WT_P_H3K4me3_bowtie.bam   InPutP2 rep2_WT_P_input_bowtie.bam
6         rep3_WT_P_H3K4me3_bowtie.bam   InPutP3 rep2_WT_P_input_bowtie.bam
                                                                  Peaks
1         sorted_rep1_WT_C_H3K4me3_bowtie-W200-G200-FDR0.001-island.bed
2 sorted_rep2_WT_C_H3K4me3_trimmed_bowtie-W200-G200-FDR0.001-island.bed
3         sorted_rep3_WT_C_H3K4me3_bowtie-W200-G200-FDR0.001-island.bed
4         sorted_rep1_WT_P_H3K4me3_bowtie-W200-G200-FDR0.001-island.bed
5         sorted_rep2_WT_P_H3K4me3_bowtie-W200-G200-FDR0.001-island.bed
6         sorted_rep3_WT_P_H3K4me3_bowtie-W200-G200-FDR0.001-island.bed
  PeakCaller
1        bed
2        bed
3        bed
4        bed
5        bed
6        bed

diffbind • 2.9k views
ADD COMMENT
0
Entering edit mode
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 4 weeks ago
Cambridge, UK

Could you let me know what version of DiffBind you are using [output of sessionInfo()]?

-Rory

 

ADD COMMENT
0
Entering edit mode
mforoo1 • 0
@mforoo1-10771
Last seen 7.3 years ago

hello, 

thanks for responding to my question. It is  DiffBind_1.16.3.

ADD COMMENT
0
Entering edit mode
mforoo1 • 0
@mforoo1-10771
Last seen 7.3 years ago

Do you think I should update it to new version to get the DB?
 

ADD COMMENT
0
Entering edit mode
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 4 weeks ago
Cambridge, UK

Yes, this is a very old version and is no longer unsupported. The issues you are experiencing have been addressed in subsequent versions. You should try this again with a more recent release.

Cheers-

Rory

ADD COMMENT
0
Entering edit mode
mforoo1 • 0
@mforoo1-10771
Last seen 7.3 years ago

thanks, I am going to install the latest version for finding DB. I have another question: 

I have 3 replicates in each condition nad I tried to find the overlap between them. I did the dba.overlap 

dba.overlap(H3K4me3, H3K4me3$masks$C, mode=DBA_OLAP_RATE)

and I got the result: 1] 41820 24549 16934

I need to get the peaks and write them in a table, how can I get the value [2]? the one that shows Uniques set appearing in at least two peak sets? 

I read your answer here:  DiffBind - overlap between different peak callers about writing a table but the mode there was DBA_OLAP_PEAKS, but I want to have RATE mode to get the peaks that are present at least in my two replicates.

Appreciate your time

Maryam 

 

 

 

ADD COMMENT
0
Entering edit mode

You can retrieve the merged binding sites you want using dba.peakset():

consensusC <- dba(H3K4me3, mask=H3K4me3$masks$C, minOvlerap=2)
overlapC2 <- dba.peakset(consensusC, bRetrieve=TRUE, 
                         writeFile="consensus_C_overlap2.txt")

Alternatively:

consensus2 <- dba.peakset(H3K4me3, consensus=DBA_TREATMENT, minOverlap=2)
overlapC2 <- dba.peakset(consensus2, 7, bRetrieve=TRUE, 
                         writeFile="consensus_C_overlap2.txt")

Cheers-

Rory

ADD REPLY
0
Entering edit mode

Hello Rory, 

Many thanks for your help. I tried the first option but I got the error : 

Error in dba(H3K4me3, mask = H3K4me3$masks$C, minOvlerap = 2) : 
  unused argument (minOvlerap = 2)

the second option gave me the peakset, I was wondering how C was specified?  How can I do it for P samples as well? what is 7? I checked the manual but I could not find it. 

ADD REPLY
0
Entering edit mode

This is a typo. It should be minOverlap=2.

ADD REPLY
0
Entering edit mode

I got it, I should put 8. sorry for asking a lot of questions. 

ADD REPLY
0
Entering edit mode
mforoo1 • 0
@mforoo1-10771
Last seen 7.3 years ago

I installed version 2.4.3. But I still could not get the DB. 

1 Contrast:
  Group1 Members1 Group2 Members2 DB.DESeq2
1      C        3      P        3         0
Error in pv.getPlotData(DBA, attributes = attributes, contrast = contrast,  : 
  Unable to plot -- no sites within threshold
Calls: plot ... plot -> plot.DBA -> dba.plotHeatmap -> pv.getPlotData
Execution halted
Fri Jun  9 09:19:10 CDT 2017

 

 

ADD COMMENT
0
Entering edit mode
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 4 weeks ago
Cambridge, UK

As the error states, there are no sites identified as being significantly differentially bound using the default threshold (FDR < 0).

It may be informative to do an MA plot (dba.plotMA). You can also retrieve the statistics for all the sites in your experiment by calling dba.report() with th=1 and bCounts=TRUE. You should see either that there are few sites with difference (low absolute fold changes), and/or that the sites with higher fold changes have high variance in the read counts.

I'm not sure how you did the read counts, but for H3K4me3 we generally set the summits parameter in the call to dba.count(), for example summits=75 which will focus the analysis on 150bp intervals. This can sometimes lower variance in the read counts (as less background reads are included).

Regards-

Rory

ADD COMMENT
0
Entering edit mode

Thanks a lot. I did the dba.count with summits=75 and also get 0 again. I retrieved all sites by calling dba.report() with th=1 and bCounts=TRUEcalling dba.report() with th=1 and bCounts=TRUE. it is wrierd that i got the same FDR for all sites. 

       Chr    Start      End  Conc Conc_C Conc_P  Fold  p-value   FDR  WTC1
20821  Chr6 25297017 25297167  4.34   5.29   0.41  4.88 7.31e-05 0.453 13.35
20640  Chr6 22656555 22656705  4.29   5.23   0.65  4.59 2.31e-04 0.453 19.73
10372  Chr2 26467057 26467207  2.89   3.83  -0.68  4.51 3.39e-04 0.453 15.09
2495   Chr1 31428021 31428171  4.14   5.05   1.18  3.87 3.67e-04 0.453 20.89
5822  Chr11  8140152  8140302  3.81   4.72   0.70  4.02 4.70e-04 0.453  9.29

.

.

.

 

 

 

ADD REPLY
0
Entering edit mode
mforoo1 • 0
@mforoo1-10771
Last seen 7.3 years ago

Thanks a lot. I did the dba.count with summits=75 and also get 0 again. I retrieved all sites by calling dba.report() with th=1 and bCounts=TRUEcalling dba.report() with th=1 and bCounts=TRUE. it is wrierd that i got the same FDR for all sites. 

       Chr    Start      End  Conc Conc_C Conc_P  Fold  p-value   FDR  WTC1
20821  Chr6 25297017 25297167  4.34   5.29   0.41  4.88 7.31e-05 0.453 13.35
20640  Chr6 22656555 22656705  4.29   5.23   0.65  4.59 2.31e-04 0.453 19.73
10372  Chr2 26467057 26467207  2.89   3.83  -0.68  4.51 3.39e-04 0.453 15.09
2495   Chr1 31428021 31428171  4.14   5.05   1.18  3.87 3.67e-04 0.453 20.89
5822  Chr11  8140152  8140302  3.81   4.72   0.70  4.02 4.70e-04 0.453  9.29

.

ADD COMMENT
0
Entering edit mode

You may want to plot your p-value distribution:

> DBreport <- dba.report(myDBA, th=1)
> hist(DBreport$"p-value")

If the distribution is relatively flat, the FDR correction will be very harsh as that looks like an experiment where everything fit the null hypothesis. Here is a link to a good explanation of this:

How to interpret a p-value histogram

For example, if you plot the p-value distribution for the vignette example:

> data(tamoxifen_analysis)
> DBreport <- dba.report(tamoxifen,th=1)
> hist(DBreport$"p-value")

you see a clear enrichment of low p-values, indicating that there were peaks that did not fit the null hypothesis (that there are no significant changes).

Getting the report with the counts (bCounts=TRUE) may be helpful. Your data seems to have some large fold changes, but if the variance of the read counts is high, the experiment may be underpowered (not enough replicates), hence the high FDR values.

-Rory

ADD REPLY

Login before adding your answer.

Traffic: 591 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6