Hi Khademul,
Thanks for the positive feedback and great suggestion! I will update
the package accordingly.
Best regards,
Julie
On 9/22/10 5:31 AM, "Khademul Islam" <khademul.islam@gmail.com> wrote:
Hi Julie,
"ChIPpeakAnno" -- overlapping bed files and significant test:
I have run my files with the stable release (version 1.4.2) for
overlapping features, and it is now amazingly fast. Thanks!
However, I have found that if I only want the overlap function
(findOverlappingPeaks) and skip the earlier step of finding the
nearest feature, such as annotating to the nearest TSS
(annotatedPeak), then the function looks for a "strand" column.
If there is no strand column in the file, it complains:
length of 'dimnames' [2] not equal to array extent
Interestingly, it looks for the strand column in the second file only:
r2 = cbind(rownames(Peaks2), start(Peaks2), end(Peaks2),
Peaks2$strand)
colnames(r2) = c(NameOfPeaks2, paste(NameOfPeaks2, "start",
sep = "_"), paste(NameOfPeaks2, "end", sep = "_"), "strand")
So I just added a pseudo strand column (positive strand) to my file and
it went well. Maybe you could change this to default to the positive
strand when strand information is not provided, for example when a user
skips the earlier steps and the file never gets a strand column.
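
The error can be reproduced in base R without the package. The sketch below is only an assumption about the mechanism, based on the two lines of code quoted above: when the strand column is missing, `peaks2$strand` is NULL, cbind() silently drops it, and assigning four column names to a three-column matrix fails. The data frame `peaks2` and its values are made up for illustration.

```r
# Hypothetical stand-in for a peak table that has no strand column.
peaks2 <- data.frame(start = c(100, 500), end = c(200, 650),
                     row.names = c("p1", "p2"))

# peaks2$strand is NULL, so cbind() returns only three columns.
r2 <- cbind(rownames(peaks2), peaks2$start, peaks2$end, peaks2$strand)
ncol(r2)

# Assigning four names to three columns raises the reported error;
# tryCatch() captures its message instead of stopping the script.
msg <- tryCatch({
  colnames(r2) <- c("Peak2", "Peak2_start", "Peak2_end", "strand")
  "no error"
}, error = function(e) conditionMessage(e))
msg

# Workaround described above: add a pseudo strand column first.
if (is.null(peaks2$strand)) peaks2$strand <- "+"
r2 <- cbind(rownames(peaks2), peaks2$start, peaks2$end, peaks2$strand)
colnames(r2) <- c("Peak2", "Peak2_start", "Peak2_end", "strand")
r2
```

Checking `is.null(peaks2$strand)` before adding the column keeps any real strand information intact when it is present.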
Anyway, thanks again for this wonderful package
best regards
Khademul
On Mon, Sep 6, 2010 at 4:26 AM, Zhu, Julie <julie.zhu@umassmed.edu> wrote:
Hi Khademul,
The speed of the findOverlappingPeaks function should be more reasonable
now (version 1.4.2 or 1.5.5). Please let me know if you still have
speed issues. Thanks!
Kind regards,
Julie
On 9/3/10 3:59 PM, "Zhu, Julie" <julie.zhu@umassmed.edu> wrote:
Hi Khademul,
I just found a block of code that was added in the new version and that
caused it to run significantly slower. I will update the code this
weekend, which should help with the speed issue. Thanks!
Kind regards,
Julie
On 8/9/10 10:30 AM, "Julie Zhu" <julie.zhu@umassmed.edu> wrote:
> Hi Khademul,
>
> Regarding the speed concern with very big BED files, splitting the files
> chromosome by chromosome would help if you can run R on a cluster.
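
A minimal base-R sketch of the per-chromosome split, assuming the BED files have been read into data frames with columns named `chr`, `start`, and `end`. These column names, the toy data, and the helper `count_overlaps` are hypothetical and are not part of ChIPpeakAnno. Each chromosome becomes an independent job that could be sent to a separate cluster node; here they simply run one after another.

```r
# Toy peak tables standing in for two BED files (made-up values).
peaks1 <- data.frame(chr = c("chr1", "chr1", "chr2"),
                     start = c(100, 900, 50), end = c(200, 950, 80))
peaks2 <- data.frame(chr = c("chr1", "chr2", "chr2"),
                     start = c(150, 10, 70), end = c(300, 20, 90))

# Hypothetical helper: number of intervals in `a` that overlap at least
# one interval in `b` (closed intervals, same chromosome assumed).
count_overlaps <- function(a, b) {
  if (is.null(b) || nrow(b) == 0) return(0L)
  hits <- vapply(seq_len(nrow(a)), function(i) {
    any(a$start[i] <= b$end & a$end[i] >= b$start)
  }, logical(1))
  sum(hits)
}

# One data frame per chromosome for each file.
by_chr1 <- split(peaks1, peaks1$chr)
by_chr2 <- split(peaks2, peaks2$chr)

# Process each chromosome independently, then combine the results.
per_chr <- vapply(names(by_chr1), function(chr) {
  count_overlaps(by_chr1[[chr]], by_chr2[[chr]])
}, integer(1))
per_chr
sum(per_chr)
```

Because overlaps never span chromosomes, the per-chromosome results can be summed or concatenated without any further reconciliation.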
>
> Regarding the p-value for the overlap, setting totalTest appropriately is
> critical. Please refer to these posts:
>
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/29476
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/30115
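
The negative count and the NaN in the report quoted further down both follow from totalTest=100 being far smaller than the number of peaks. The sketch below assumes that the "neither" cell is computed as totalTest minus the other three cells, and that the p-value comes from a hypergeometric test along the lines of phyper(). The exact parameterization used inside makeVennDiagram is an assumption here, and `total_ok` is a made-up value used purely for illustration, not a recommended setting.

```r
# Counts taken from the vennCounts output in the original report.
only2 <- 395828   # peaks found only in peak2
only1 <- 2253     # peaks found only in peak1
both  <- 840      # overlapping peaks
total_test <- 100 # the value that was passed as totalTest

# Assumed formula for the "neither" cell: it goes negative when
# totalTest is smaller than the number of observed peaks.
neither <- total_test - (only2 + only1 + both)
neither

m <- only1 + both   # size of peak set 1
k <- only2 + both   # size of peak set 2

# A negative "not in set 1" population makes phyper() return NaN
# (the warning is suppressed here only to keep the output clean).
p_bad <- suppressWarnings(
  phyper(both - 1, m, total_test - m, k, lower.tail = FALSE)
)
p_bad

# With a hypothetical total larger than both peak sets, the same call
# returns a valid probability instead of NaN.
total_ok <- 1e6
p_ok <- phyper(both - 1, m, total_ok - m, k, lower.tail = FALSE)
is.nan(p_ok)
```

The choice of totalTest is a modelling decision about how many candidate regions could have been tested, so it has to be at least as large as the union of the two peak sets.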
>
> Kind regards,
>
> Julie
>
>
> On 8/9/10 6:02 AM, "Khademul Islam" <khademul.islam@gmail.com> wrote:
>
> Hi Julie,
>
> Nice to see the ChIPpeakAnno paper published.
>
> I was trying to find the overlap between my ChIP-seq peak BED file
> (peak1: ~3100 peaks) and an exon-intron boundary BED file (peak2: ~400000).
>
> There are two concerns:
>
> 1. It takes too long (overnight) to calculate all this, even on a
> powerful machine, especially when one BED file is very large.
>
> 2. It produced "NaN" for the p-value:
>
>
> $p.value
> [1] NaN
>
> $vennCounts
>      peak1 peak2  Counts
> [1,]     0     0 -398821
> [2,]     0     1  395828
> [3,]     1     0    2253
> [4,]     1     1     840
> attr(,"class")
> [1] "VennCounts"
>
>
> So it reports a negative count (-398821)?
>
> There were 22 warnings but no other errors. The last warning says that
> NaNs were produced.
>
> I have pasted the warnings below. I wanted to attach the BED files in case
> you need to check them, but they are too large. If you need them, let me
> know and I will upload them to an FTP site. I got the Venn diagram picture
> without any problem (attached).
>
> Command line was:
>
>
> ol <- findOverlappingPeaks(peaks1, peaks2, maxgap=1, multiple=F,
>     NameOfPeaks1="Peak1", NameOfPeaks2="Peak2")
>
> vdg <- makeVennDiagram(RangedDataList(peaks1, peaks2),
>     NameOfPeaks=c("peak1", "peak2"), maxgap=0, totalTest=100,
>     cex = 1, counts.col = "red")
>
>> dev.copy2eps()
>
>> vdg
>
>
> So my question is: how can I get a proper p-value for the overlap? What
> do I have to do to fix it?
>
> Thank you,
>
> Khademul
>
>
>
>> sessionInfo()
> R version 2.11.1 (2010-05-31)
> x86_64-unknown-linux-gnu
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] ChIPpeakAnno_1.4.1
> .....
>
>> warnings()
> Warning messages:
> 1: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 2: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 3: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 4: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 5: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 6: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 7: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 8: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 9: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 10: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 11: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 12: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 13: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 14: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 15: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 16: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 17: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 18: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 19: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 20: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 21: In .local(query, subject, maxgap, minoverlap, type, select, ... :
> argument 'multiple' is deprecated; use 'select'.
> 22: In phyper(q, m, n, k, lower.tail, log.p) : NaNs produced
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>