ChIPpeakAnno - VennDiagram P-value - NaN
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 13 months ago
United States
Hi Khademul,

I just found a block of code added in the new version that caused it to run significantly slower. I will update the code this weekend, which should help with the speed issue. Thanks!

Kind regards,

Julie

On 8/9/10 10:30 AM, "Julie Zhu" <julie.zhu at umassmed.edu> wrote:

> Hi Khademul,
>
> Regarding the speed concern with very big bed files, splitting the files
> chromosome by chromosome would help if you can run R on a cluster.
>
> Regarding the p-value for overlap, setting totalTest appropriately is
> critical. Please refer to these posts:
>
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/29476
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/30115
>
> Kind regards,
>
> Julie
>
> On 8/9/10 6:02 AM, "Khademul Islam" <khademul.islam at gmail.com> wrote:
>
> Hi Julie,
>
> Nice to see the ChIPpeakAnno paper published.
>
> I was trying to overlap my ChIPseq peak bed file (peak1: ~3100 peaks)
> with an Exon_Intron Boundary bed file (peak2: ~400000).
>
> There are two concerns:
>
> 1. It takes too long (overnight) to calculate all of this, even on a
> powerful machine, especially when one bed file is very big.
>
> 2. It produced "NaN" for the p-value:
>
> $p.value
> [1] NaN
>
> $vennCounts
>      peak1 peak2  Counts
> [1,]     0     0 -398821
> [2,]     0     1  395828
> [3,]     1     0    2253
> [4,]     1     1     840
> attr(,"class")
> [1] "VennCounts"
>
> So it counts a negative value, -398821 ???
>
> There were 22 warnings but no other error. The last warning says that NaNs
> were produced.
>
> I pasted the warnings below. I wanted to attach the bed files in case you
> need to check them, but they are too large. If you need them, let me know
> and I will upload them to an FTP site. I got the Venn diagram picture
> without any problem (attached).
>
> The command line was:
>
> > ol <- findOverlappingPeaks(peaks1, peaks2, maxgap=1, multiple=F,
>     NameOfPeaks1="Peak1", NameOfPeaks2="Peak2")
> > vdg <- makeVennDiagram(RangedDataList(peaks1, peaks2),
>     NameOfPeaks=c("peak1", "peak2"), maxgap=0, totalTest=100, cex = 1,
>     counts.col = "red")
> > dev.copy2eps()
> > vdg
>
> So, my question is: how can I get a proper p-value for the overlap, and
> what do I have to do to fix it?
>
> Thank you,
>
> Khademul
>
> > sessionInfo()
> R version 2.11.1 (2010-05-31)
> x86_64-unknown-linux-gnu
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] ChIPpeakAnno_1.4.1
> .....
>
> > warnings()
> Warning messages:
> 1: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 2: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 3: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 4: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 5: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 6: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 7: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 8: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 9: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 10: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 11: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 12: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 13: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 14: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 15: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 16: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 17: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 18: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 19: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 20: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 21: In .local(query, subject, maxgap, minoverlap, type, select, ... :
>   argument 'multiple' is deprecated; use 'select'.
> 22: In phyper(q, m, n, k, lower.tail, log.p) : NaNs produced
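A note on the negative count and the NaN: the [0,0] entry is consistent with totalTest minus the three other counts, since 100 - (395828 + 2253 + 840) = -398821, and a hypergeometric test cannot be evaluated on a negative population size. The sketch below reproduces this with base R's phyper. How makeVennDiagram maps the counts onto phyper's arguments is an assumption here, the helper names are made up, and 500000 is a purely hypothetical totalTest used only to show that a value larger than the union of both peak sets gives a defined result.

```r
# Counts copied from the $vennCounts output in the post above.
only_peak2 <- 395828
only_peak1 <- 2253
both       <- 840

# Hypothetical helper: number of tested sites that fall in neither peak set.
neither <- function(total_test) total_test - (only_peak1 + only_peak2 + both)

# Assumed mapping onto phyper(): P(overlap >= both) when n_peak2 sites are
# drawn from total_test sites, n_peak1 of which belong to the first set.
overlap_p <- function(total_test) {
  n_peak1 <- only_peak1 + both
  n_peak2 <- only_peak2 + both
  suppressWarnings(
    phyper(both - 1, n_peak1, total_test - n_peak1, n_peak2,
           lower.tail = FALSE)
  )
}

neither_bad <- neither(100)     # the negative count seen in the post
p_bad <- overlap_p(100)         # NaN: population smaller than the peak sets
p_ok  <- overlap_p(500000)      # hypothetical totalTest above the union size
```

Under these assumptions, any totalTest below 398921 (the size of the union) would leave a negative [0,0] count, so the value has to reflect the real number of candidate sites.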
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 13 months ago
United States
Hi Khademul,

The speed of the findOverlappingPeaks function should be more reasonable now (version 1.4.2 or 1.5.5). Please let me know if you still have speed issues. Thanks!

Kind regards,

Julie

On 9/3/10 3:59 PM, "Zhu, Julie" <julie.zhu@umassmed.edu> wrote:

> Hi Khademul,
>
> I just found a block of code added in the new version that caused it to
> run significantly slower. I will update the code this weekend, which
> should help with the speed issue. Thanks!
>
> Kind regards,
>
> Julie
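If a file is still too large to handle in one go, the chromosome-by-chromosome split suggested earlier in this thread can be sketched in base R. The toy data frame and its column names (chr, start, end) are assumptions for illustration only; each piece could then be passed to findOverlappingPeaks on its own, for example as separate cluster jobs.

```r
# Toy BED-like table; the column names are assumptions for this sketch.
bed <- data.frame(
  chr   = c("chr1", "chr2", "chr1", "chrX"),
  start = c(100L, 200L, 5000L, 42L),
  end   = c(150L, 260L, 5100L, 90L),
  stringsAsFactors = FALSE
)

# One data frame per chromosome, named by chromosome.
by_chr <- split(bed, bed$chr)

# Each piece can now be processed independently; here we only count rows.
peaks_per_chr <- vapply(by_chr, nrow, integer(1))
```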
Hi Julie,

"ChIPpeakAnno": overlapping bed files and significance test

I have run my files with the stable release, version 1.4.2, for overlapping features, and now it is amazingly fast! Thanks.

However, I have found that if I am only interested in the overlapping function (findOverlappingPeaks) and skip the earlier steps for finding the nearest feature, such as annotating to the nearest TSS (annotatedPeak), then the function looks for a "strand" column, and if there is no strand column in the file, it complains:

    length of 'dimnames' [2] not equal to array extent

Interestingly, it looks for the strand column in the second file only:

    r2 = cbind(rownames(Peaks2), start(Peaks2), end(Peaks2), Peaks2$strand)
    colnames(r2) = c(NameOfPeaks2, paste(NameOfPeaks2, "start", sep = "_"),
        paste(NameOfPeaks2, "end", sep = "_"), "strand")

So I just added a pseudo strand column (positive strand) to my file and it went well. Maybe you could change this to default to the positive strand when strand information is not provided, for users who skip the earlier steps.

Anyway, thanks again for this wonderful package.

Best regards,

Khademul

On Mon, Sep 6, 2010 at 4:26 AM, Zhu, Julie <julie.zhu@umassmed.edu> wrote:

> Hi Khademul,
>
> The speed of findOverlappingPeaks function should be more reasonable now
> (version 1.4.2 or 1.5.5). Please let me know if you still have speed issue.
> Thanks!
>
> Kind regards,
>
> Julie
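For what it is worth, this is the message base R gives when cbind() silently drops a NULL argument and four column names are then assigned to a three-column matrix, which is presumably what happens when Peaks2$strand does not exist. The sketch below reproduces it with a plain data frame standing in for the peak object (an assumption; the real function works on RangedData, and build_r2 is a made-up name) and then applies the pseudo-strand workaround.

```r
# Plain data frame standing in for the second peak set (no strand column).
peaks2 <- data.frame(start = c(100L, 500L), end = c(200L, 650L),
                     row.names = c("site1", "site2"))

# Same shape as the snippet quoted in the post; build_r2 is hypothetical.
build_r2 <- function(p, name = "Peak2") {
  r2 <- cbind(rownames(p), p$start, p$end, p$strand)
  colnames(r2) <- c(name, paste(name, "start", sep = "_"),
                    paste(name, "end", sep = "_"), "strand")
  r2
}

# p$strand is NULL here, so cbind() yields 3 columns and colnames<- fails.
msg <- tryCatch({ build_r2(peaks2); "no error" },
                error = function(e) conditionMessage(e))

# Workaround from the post: add a pseudo strand column when none is present.
if (is.null(peaks2$strand)) peaks2$strand <- "+"
r2 <- build_r2(peaks2)
```

Defaulting to "+" only makes sense when strand is irrelevant to the overlap, as it is in the use case described above.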