Curious behaviour of ChIPpeakAnno
4
0
Entering edit mode
Brad Davis ▴ 20
@brad-davis-5385
Last seen 9.8 years ago
Canada

Hello everyone,

I am getting some surprising behaviour out of the ChIPpeakAnno R package.  The question I’m trying to ask is ‘Which features intersect between samples ‘WT312’, ‘WT340’, and ‘WT354’, that also intersect with the intersection between samples ‘WT313’, ‘WT341’, and ‘WT355’. To visualize this, I am using makeVennDiagram to create two three-way venn diagrams: one for samples ‘WT312’, ‘WT340’, and ‘WT354’, and one for samples ‘WT313’, ‘WT341’, and ‘WT355’.  Then I use findOverlappingPeaks to make an intersection of WT312, WT340, and WT354, which I call WT312_WT340_WT354.  Separately, I also use findOverlappingPeaks to make an intersection of WT313, WT341, and WT355, which I call WT313_WT341_WT355.  Then I take these two three-way intersected lists and pass those back through to makeVennDiagrams.  The issue is that the total number of elements contained within the three-way venn diagrams for WT312-WT340-WT354 and WT313-WT341-WT355 do not sum up to the total number of elements from the two-way venn diagram.  I’m attaching three PNG files to show what I’m talking about.  Specifically, the number of elements that are found in WT312-WT340-WT354 in the two-way venn diagram exceeds the number of elements found to intersect between WT312-WT340-WT354 in the three-way venn diagram.  The same basic story exists for WT313-WT341-WT355.

I can’t understand what would cause this behaviour, but it is a bit concerning for me.  The intersection between the sets WT312-WT340-WT354 and WT313_WT341_WT355 should exactly sum to the intersection of the set WT312-WT340-WT354 and the the intersection of set WT313-WT341-WT355.

 

I'm pasting image links so you can see what I'm referring to.

 

487+395 = 882, not 852 

and

292+395 = 687, not 674 

 

WT312-WT340-WT354 venn diagram

 

WT313_WT341_WT355WT312-WT340-WT354_WT313-WT341-WT355

Thanks,
Brad Davis

chippeakanno • 2.9k views
ADD COMMENT
2
Entering edit mode
Ou, Jianhong ★ 1.3k
@ou-jianhong-4539
Last seen 1 day ago
United States

Hi Brad,

Thank you for your message. This question is very typical one for calculating intersection of peaks. As you know a peak is not a point, it is a range of continuous points. If we consider intersection of set A (1-2, 4-5, 7-9) with set B (2-8), how many intersect peaks should we got? This is a question. If I say 1, but setA has three peaks. If I say 3 but setB has only one peak. In ChIPpeakAnno (release version), we are using the minimal number for intersect peaks. That is to say we are using 1 in this case. This is way you will get smaller number than it looks like. 

I know this issue make people uncomfortable. I am thinking how to resolve this problem. I am thinking to build a new method to show the two numbers in the intersection part of venn diagram. I will let you know once it is done.

 

ADD COMMENT
1
Entering edit mode
Ou, Jianhong ★ 1.3k
@ou-jianhong-4539
Last seen 1 day ago
United States

Hi Brad,

Please have a try the newest devel version at http://pgfe.umassmed.edu/ChIPpeakAnno/. Try makeVennDiagram with connectedPeaks="keepAll"

ADD COMMENT
0
Entering edit mode

Hi Jianhong,

Thanks for pointing me to the latest development version of ChIPpeakAnno.  I am trying to use it but I have encountered what appears to be a bug.  While trying to use the function findOverlappingPeaks I get an error indicating that it cannot find the function .Deprecated.  I've pasted the snippet of code and error message below.

> WT312_WT340c= findOverlappingPeaks(WT312c, WT340c, select="all", connectedPeaks="keepAll")
Error in findOverlappingPeaks(WT312c, WT340c, select = "all", connectedPeaks = "keepAll") : 
  could not find function ".Deprected"

Do you have any suggestions on how to deal with this?

 

Thanks,

Brad

 

Edit: A partial solution to this seems to be to use findOverlapsOfPeaks instead of findOverlappingPeaks; however, it doesn't allow me to do what I was hoping to do.  I was using findOverlappingPeaks in a two step process to intersect three sets of intervals (and I did that twice on two separate sets of three sets of intervals).  Then I used the resulting two sets of intervals to make a two-way venn diagram.  Below is a snippet of code illustrating what I was doing:

 

#intersect normal peaks
WT312_WT340c= findOverlappingPeaks(WT312c, WT340c, select="all")
W
T312_WT340_WT354c = findOverlappingPeaks(WT312_WT340c$MergedPeaks, WT354c, select="all")

#intersect disease peaks
WT313_WT341c= findOverlappingPeaks(WT313c, WT341c, select="all")
WT313_WT341_WT355c = findOverlappingPeaks(WT313_WT341c$MergedPeaks, WT355c, select="all")

#make two way venn diagram 
makeVennDiagram(RangedDataList(WT312_WT340_WT354c$MergedPeaks, WT313_WT341_WT355c$MergedPeaks), NameOfPeaks=c("Normal", "Diseased"), fill=twocolors, col=twocolors, totalTest=100, main="allNormalIntersect intersected with allDiseasedIntersect\nthreshold=1.0", cex=3.0)  

 

There doesn't seem to be any way to do this with findOverlapOfPeaks, because it doesn't return a list of peaks in a format that can be used with makeVennDiagrams, nor can I use the output of findOverlapsOfPeaks as input for a second round of findOverlapOfPeaks.

Do you see what I'm trying to accomplish?

ADD REPLY
1
Entering edit mode

Thanks for reporting the bug. It is fixed. You could download the new devel version from http://pgfe.umassmed.edu/ChIPpeakAnno/ now. 

For the new version, you can try:

WT312_WT340_WT354c.ol <- findOverlapsOfPeaks(WT312c, WT340c, WT354c, maxgap=0)

WT313_WT341_WT355c.ol <- findOverlapsOfPeaks(WT313c, WT341c, WT355c, maxgap=0)

WT312_WT340_WT354c <- WT312_WT340_WT354c.ol$peaklist$`WT312c///WT340c///WT354c`

WT313_WT341_WT355c <- WT313_WT341_WT355c.ol$peaklist$`WT313c///WT341c///WT355c`

ol <- findOverlapsOfPeaks(WT312_WT340_WT354c, WT313_WT341_WT355c, maxgap=0, connectedPeaks="keepAll")

makeVennDiagram(ol, NameOfPeaks=c("Normal", "Diseased"), fill=twocolors, col=twocolors, main="allNormalIntersect intersected with allDiseasedIntersect\nthreshold=1.0", cex=3.0)

Let me known it work or not.

ADD REPLY
0
Entering edit mode

Hi Jianhong,

Thank you for your quick response. I have installed the latest version of ChIPpeakAnno and I am trying out the example code you have provided me.  

What is the best way for me to make the two three-way venn diagrams so that they are comparable with this two-way venn diagram?  My gut reaction is to use 

makeVennDiagram(RangedDataList(WT312c, WT340c, WT354c), NameOfPeaks=c("WT312", "WT340", "WT354"),fill=brewer.pal(n=3, "Set1"), col=brewer.pal(n=3, "Set1"),scaled=FALSE, totalTest=100, euler.d=F, main=sprintf("%s: enriched regions, E.T. = %2.1f", mark, threshold), cex=3.0, connectedPeaks="keepAll");

Would it be easier if I sent you the bed files I'm working with?

ADD REPLY
0
Entering edit mode

It will be great if you can send me the bed files you are working at. I really want some real data to test my code.

Another thing, you need to determine the total test number by yourself, for example some reasonable number of possible binding sites.

ADD REPLY
0
Entering edit mode

Hi Jianhong,

I've tried the latest version of the code and worked through your example, but the numbers still do not add up correctly.  The total number of peaks in the two way venn diagram (i.e. both for WT313_WT313_WT355 alone and the intersection between WT313_WT341_WT355 and WT312_WT340_WT354) does not equal the number of peaks identified in the intersection between WT313, WT341, and WT355 in the three way venn diagram.  The numbers are off by a few hundred.  For example, for the diseased samples (WT313, WT341. WT355), the number of features in the two-way venn diagram are 2149 and 1163 (1217.1245) for the diseased exclusive and the intersection, respectively), while the number of features for the three way venn diagram for the diseased samples equals 3502 (4030/3740/3927).  I can send you the bed files that I am working with if that would help.

Brad

ADD REPLY
0
Entering edit mode

could you show me the output of

ol$venn_cnt

WT312_WT340_WT354c.ol$venn_cnt

WT313_WT341_WT355c.ol$venn_cnt

ADD REPLY
0
Entering edit mode
Brad Davis ▴ 20
@brad-davis-5385
Last seen 9.8 years ago
Canada

Hello Jianhong,

Thank you for your reply.  I understand your example and your point.

Cheers,

Brad

ADD COMMENT
0
Entering edit mode
Brad Davis ▴ 20
@brad-davis-5385
Last seen 9.8 years ago
Canada

Hi Jianhong,

It seems like this behaviour should be related to the 'select' parameter option in findOverlappingPeaks command?  If so, shouldn't there be a way to generate identical venn diagrams by setting the 'select' command to a defined argument, such as 'left'?

 

Also, would it be possible to report the percentage (as opposed to the raw number) for each of the elements in the venn diagram? That would make it easier to compare between comparisons.

ADD COMMENT
0
Entering edit mode

good suggestion. We will consider about that.

ADD REPLY

Login before adding your answer.

Traffic: 517 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6