Hypergeometric test in ChIPpeakAnno
Julie Zhu (@julie-zhu-3596)
Abhishek,

This is a very good question, and Noah has addressed it very nicely. Please see the following threads in the Bioconductor mailing list archives:

https://stat.ethz.ch/pipermail/bioconductor/2010-November/036540.html
http://permalink.gmane.org/gmane.science.biology.informatics.conductor/29476
http://permalink.gmane.org/gmane.science.biology.informatics.conductor/30115

Please cc bioconductor <bioconductor at stat.math.ethz.ch> so that others can benefit and/or contribute. Thanks!

Best regards,
Julie

On 6/28/11 4:29 AM, "Abhishek Singh" <abhisheksinghnl at gmail.com> wrote:

> Dear Prof. Julie,
>
> I have one more question regarding the hypergeometric test implemented in
> the ChIPpeakAnno package for construction of the Venn diagram.
>
> The command you gave for the sample data in the article is:
>
>> makeVennDiagram(RangedDataList(Peaks.Ste12.Replicate1,
>> Peaks.Ste12.Replicate2, Peaks.Ste12.Replicate3), NameOfPeaks =
>> c("Replicate1","Replicate2","Replicate3"), maxgap = 0, totalTest = 1580)
>
> where totalTest indicates how many peaks in total are used in the
> hypergeometric test (as indicated in the article).
>
> Imagine I have three data sets:
> (a) Dataset A has 100 peaks
> (b) Dataset B has 150 peaks
> (c) Dataset C has 75 peaks
>
> How can I compute the value of totalTest for these three data sets?
>
> Thank you for your time,
> Looking forward to your reply,
>
> Regards
> Abhishek
>
> On Mon, Jun 27, 2011 at 7:45 PM, Abhishek Singh <abhisheksinghnl at gmail.com>
> wrote:
>> Dear Prof. Julie,
>>
>> Thank you for providing the code.
>>
>> Best Regards,
>> Abhishek
>>
>> On Mon, Jun 27, 2011 at 5:47 PM, Zhu, Lihua (Julie) <julie.zhu at umassmed.edu>
>> wrote:
>>> Abhishek,
>>>
>>> Please try the following code snippet, assuming your BED file is test1.bed
>>> without a header.
>>>
>>> library(ChIPpeakAnno)
>>> test1.bed = read.table("~/Document/test1.bed", sep="\t", skip=0, header=FALSE)
>>> myPeakList = BED2RangedData(test1.bed, header=FALSE)
>>>
>>> Now you can use annotatePeakInBatch to annotate myPeakList.
>>>
>>> For detailed information on how to use the ChIPpeakAnno package, please refer to
>>> http://www.bioconductor.org/packages/2.8/bioc/vignettes/ChIPpeakAnno/inst/doc/ChIPpeakAnno.pdf,
>>> http://www.bioconductor.org/help/course-materials/2010/BioC2010/BioC2010_ChIPpeakAnno.pdf
>>> and Zhu L.J. et al. (2010) ChIPpeakAnno: a Bioconductor package to annotate
>>> ChIP-seq and ChIP-chip data. BMC Bioinformatics 2010, 11:237.
>>> doi:10.1186/1471-2105-11-237.
>>>
>>> Best regards,
>>>
>>> Julie
>>>
>>> On 6/27/11 7:40 AM, "Abhishek Singh" <abhisheksinghnl at gmail.com> wrote:
>>>
>>>> Hi!
>>>>
>>>> I was trying to use your R package ChIPpeakAnno to annotate my peak files,
>>>> which are in .bed format.
>>>>
>>>> Somehow I am unable to get your package to load my input files and perform
>>>> the analysis.
>>>>
>>>> To explain briefly what I intend to do: I have a peak file (from MACS, in
>>>> .bed format) and I want to give this file as input to your package in R
>>>> (which is already installed).
>>>>
>>>> Could you roughly tell me what I should do so that the package
>>>> reads my files as input?
>>>>
>>>> Thank you for your time.
>>>>
>>>> Looking forward to your reply.
>>>>
>>>> Regards
>>>> Abhishek A. Singh
annotate ChIPpeakAnno
@abhishek-singh-4725
Dear Prof. Julie,

From what I could understand from the examples and the threads, the best way to compute the totalTest value would be to merge all the peak files under study (those for which a Venn diagram is desired) and then count the number of genomic regions in the merged peak file. That count can be used as the value for totalTest.

Please correct me if I am wrong.

Regards,
Abhishek
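The counting step proposed in this message can be sketched in base R as follows. This is only an illustration under stated assumptions: the toy coordinates and the helper name count_merged_regions are hypothetical, overlapping or touching peaks on the same chromosome are treated as one region, and in practice an interval class from a Bioconductor package would normally do the merging. Note that the reply in this thread advises that a count obtained this way most likely underestimates totalTest.

```r
# Hypothetical toy peaks pooled from several peak files (chromosome, start, end).
peaks <- data.frame(
  chr   = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  start = c(100, 150, 400, 10, 500),
  end   = c(200, 250, 450, 60, 550)
)

# Count the regions that remain after merging overlapping peaks per chromosome.
count_merged_regions <- function(df) {
  total <- 0
  for (chr_peaks in split(df, df$chr)) {
    chr_peaks <- chr_peaks[order(chr_peaks$start), ]
    # Largest end coordinate among all earlier peaks on this chromosome.
    prev_max_end <- c(-Inf, head(cummax(chr_peaks$end), -1))
    # A new merged region begins wherever a peak starts after all earlier peaks have ended.
    total <- total + sum(chr_peaks$start > prev_max_end)
  }
  total
}

n_merged <- count_merged_regions(peaks)
print(n_merged)  # 4: two merged regions on chr1 and two on chr2
```

The sweep over sorted start coordinates avoids pairwise comparisons, so it stays cheap even for large peak lists.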
Abhishek,

totalTest is the total number of potential genomic regions that were sampled to obtain the peaks. It should be much larger than the number of peaks in any of your peak files. Using the merged peak file would most likely underestimate totalTest. Noah has given excellent suggestions on estimating totalTest for ChIP-seq experiments under different scenarios at https://stat.ethz.ch/pipermail/bioconductor/2010-November/036540.html.

FYI, I will be giving a practical tutorial at the Bioconductor meeting in Seattle (https://secure.bioconductor.org/BioC2011/labs.php). We could discuss this further face to face if you happen to attend the meeting as well. Otherwise, we could schedule a time to talk if needed.

Best regards,
Julie
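To see why the choice of totalTest matters, here is a minimal base-R sketch of a hypergeometric overlap test for two of the peak sets from the question (100 and 150 peaks). It is an illustration only and not necessarily the computation ChIPpeakAnno performs internally. The overlap of 30 peaks, the two candidate totalTest values, and the helper name overlap_pvalue are all assumptions made for the example.

```r
# P(X >= n_overlap) when n_b regions are drawn from total_test candidate
# regions, of which n_a are peaks of the first data set.
overlap_pvalue <- function(n_a, n_b, n_overlap, total_test) {
  # phyper(q, m, n, k, lower.tail = FALSE) gives P(X > q), hence n_overlap - 1.
  phyper(n_overlap - 1, n_a, total_test - n_a, n_b, lower.tail = FALSE)
}

n_a <- 100        # peaks in data set A (from the question)
n_b <- 150        # peaks in data set B (from the question)
n_overlap <- 30   # hypothetical number of overlapping peaks

# Candidate 1: the merged-peak count, assuming each overlap merges one A peak
# with one B peak, so 100 + 150 - 30 = 220 regions.
p_merged <- overlap_pvalue(n_a, n_b, n_overlap, total_test = 220)

# Candidate 2: a much larger, hypothetical number of potential binding regions.
p_large <- overlap_pvalue(n_a, n_b, n_overlap, total_test = 10000)

print(c(merged = p_merged, large = p_large))
```

With the merged count as the universe, 30 is the smallest overlap that is possible at all, so the p-value is 1 and the test says nothing. With the larger universe the same overlap becomes extremely unlikely by chance, which is consistent with the advice above that totalTest should be much larger than any single peak list.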