ChIPpeakAnno::getEnrichedGo crashes but I don't know why

0

Entering edit mode

Eric Cabot ▴ 40

@eric-cabot-4419

Last seen 10.1 years ago

I am a relatively new Bioconductor user and I am trying to analyze some ChIP-seq results that came from QuEST using the ChIPpeakAnno package. After importing the regions of interest into RangedData objects and doing the following: ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapiens_gene_ ensembl") ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART, featureType="ExonPlusUtr") I had no problem annotating and generating a Venn diagram to show the overlaps between my three sets of peaks. To annotate, I used: annotated_regions=annotatePeakInBatch(myranged, AnnotationData=ENSEMBL_ExonPlus_Annotation) But I cannot seem to get the getEnrichedGo method to work on this (or my other two annotated regions). Here is a typical command line: my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db",m axP=0.01,multiAdj=TRUE,minGOterm=1, multiAdjMethod="BH",feature_id_type="ensembl_gene_id") and here is a typical error message: enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db",maxP =0.01,multiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id") Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : argument is of length zero Which leads me to ask: 1) Is this error message supposed to be meaningful to me-i.e. a user- or is it something that I should be sending to the developer of the package? 2) Is there anything obvious from this that suggests what corrective action I should be taking? Eric Cabot University of Wisconsin

annotate ChIPpeakAnno annotate ChIPpeakAnno • 2.0k views

ADD COMMENT • link updated 13.8 years ago by Julie Zhu ★ 4.3k • written 13.8 years ago by Eric Cabot ▴ 40

0

Entering edit mode

Julie Zhu ★ 4.3k

@julie-zhu-3596

Last seen 11 months ago

United States

Hi Eric, Could you please post the session information with sessionInfo() command? Could you please also send a few ensembl IDs in your annotated dataset? Thanks! Best regards, Julie On 1/4/11 6:51 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: > I am a relatively new Bioconductor user and I am trying to analyze some > ChIP-seq results that came from QuEST using the ChIPpeakAnno package. > > After importing the regions of interest into RangedData objects and doing > the following: > > ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapiens_gene_ ensembl"> ) > ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART, > featureType="ExonPlusUtr") > > > I had no problem annotating and generating a Venn diagram to show the > overlaps between my three sets of peaks. To annotate, I used: > > annotated_regions=annotatePeakInBatch(myranged, > AnnotationData=ENSEMBL_ExonPlus_Annotation) > > > But I cannot seem to get the getEnrichedGo method to work on this (or my > other two annotated regions). Here is a typical command line: > > > my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db" ,maxP=0.01 > ,multiAdj=TRUE,minGOterm=1, > multiAdjMethod="BH",feature_id_type="ensembl_gene_id") > > and here is a typical error message: > > enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db",ma xP=0.01,mu > ltiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id") > Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : > argument is of length zero > > > Which leads me to ask: > > 1) Is this error message supposed to be meaningful to me-i.e. a user-or is > it something that I should be sending to the developer of the package? > > 2) Is there anything obvious from this that suggests what corrective > action I should be taking? > > > Eric Cabot > University of Wisconsin > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor >

ADD COMMENT • link 13.8 years ago Julie Zhu ★ 4.3k

0

Entering edit mode

Hi Julie, Thank you for your response. Here is the sessionInfo and traceback output and also a few lines of "my_annotated_regions". Regards, Eric Cabot > sessionInfo() R version 2.12.1 (2010-12-16) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] ChIPpeakAnno_1.6.0 limma_3.6.9 [3] org.Hs.eg.db_2.4.6 GO.db_2.4.5 [5] RSQLite_0.9-4 DBI_0.2-5 [7] AnnotationDbi_1.12.0 BSgenome.Ecoli.NCBI.20080805_1.3.16 [9] BSgenome_1.18.2 GenomicRanges_1.2.2 [11] Biostrings_2.18.2 IRanges_1.8.8 [13] multtest_2.6.0 Biobase_2.10.0 [15] biomaRt_2.6.0 loaded via a namespace (and not attached): [1] MASS_7.3-9 RCurl_1.5-0 splines_2.12.1 survival_2.36-2 [5] tools_2.12.1 XML_3.2-0 > my_enrichedGO<-getEnrichedGO(my_annotated_regions,orgAnn="org.Hs.eg.db ",maxP=0.01,multiAdj=FALSE,minGOterm=1,feature_id_type="ensembl_gene_i d") Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : argument is of length zero > traceback() 2: addAncestors(this.GO[this.GO[, 3] == "BP", ], "bp") 1: getEnrichedGO(FC2_annotated_regions, orgAnn = "org.Hs.eg.db", maxP = 0.01, multiAdj = FALSE, minGOterm = 1, feature_id_type = "ensembl_gene_id") > as.data.frame(my_annotated_regions[1:15,]) space start end width names peak strand 1 1 241997936 241998205 270 R-10060 ENSE00001749374 R-10060 + 2 1 237109743 237110002 260 R-10082 ENSE00001643382 R-10082 + 3 1 236080267 236080415 149 R-10086 ENSE00001807176 R-10086 + 4 1 233853245 233853514 270 R-10096 ENSE00001776382 R-10096 + 5 1 233727956 233728104 149 R-10097 ENSE00001442190 R-10097 + 6 1 230728554 230728823 270 R-10108 ENSE00001731401 R-10108 + 7 1 229687129 229687277 149 R-10113 ENSE00001439385 R-10113 + 8 1 228943263 228943412 150 R-10121 ENSE00001903546 R-10121 + 9 1 218358885 218359176 292 R-10157 ENSE00001439386 R-10157 + 10 1 212254259 212254408 150 R-10179 ENSE00001624346 R-10179 + 11 1 210086264 210086513 250 R-10184 ENSE00001903225 R-10184 + 12 1 209863549 209863698 150 R-10185 ENSE00001336255 R-10185 + 13 1 207437117 207437264 148 R-10190 ENSE00001742112 R-10190 + 14 1 190352400 190352548 149 R-10246 ENSE00001782518 R-10246 + 15 1 184432607 184432755 149 R-10260 ENSE00001283926 R-10260 + feature start_position end_position insideFeature distancetoFeature 1 ENSE00001749374 241995237 241996089 downstream 2699 2 ENSE00001643382 237144639 237145008 upstream -34896 3 ENSE00001807176 236078715 236078821 downstream 1552 4 ENSE00001776382 233807017 233807237 downstream 46228 5 ENSE00001442190 233749750 233750272 upstream -21794 6 ENSE00001731401 230728406 230728586 overlapEnd 148 7 ENSE00001439385 229685652 229685769 downstream 1477 8 ENSE00001903546 228882063 228882416 downstream 61200 9 ENSE00001439386 218303137 218303294 downstream 55748 10 ENSE00001624346 212253973 212254092 downstream 286 11 ENSE00001903225 210111538 210111622 upstream -25274 12 ENSE00001336255 209859550 209859630 downstream 3999 13 ENSE00001742112 207438342 207438381 upstream -1225 14 ENSE00001782518 190331193 190331400 downstream 21207 15 ENSE00001283926 184446520 184446737 upstream -13913 shortestDistance fromOverlappingOrNearest 1 1847 NearestStart 2 34637 NearestStart 3 1446 NearestStart 4 46008 NearestStart 5 21646 NearestStart 6 32 NearestStart 7 1360 NearestStart 8 60847 NearestStart 9 55591 NearestStart 10 167 NearestStart 11 25025 NearestStart 12 3919 NearestStart 13 1078 NearestStart 14 21000 NearestStart 15 13765 NearestStart Zhu, Lihua (Julie) wrote: > Hi Eric, > > Could you please post the session information with sessionInfo() command? > Could you please also send a few ensembl IDs in your annotated dataset? > Thanks! > > Best regards, > > Julie > > > On 1/4/11 6:51 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: > >> I am a relatively new Bioconductor user and I am trying to analyze some >> ChIP-seq results that came from QuEST using the ChIPpeakAnno package. >> >> After importing the regions of interest into RangedData objects and doing >> the following: >> >> > ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapiens_gen e_ensembl"> > ) >> ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART, >> featureType="ExonPlusUtr") >> >> >> I had no problem annotating and generating a Venn diagram to show the >> overlaps between my three sets of peaks. To annotate, I used: >> >> annotated_regions=annotatePeakInBatch(myranged, >> AnnotationData=ENSEMBL_ExonPlus_Annotation) >> >> >> But I cannot seem to get the getEnrichedGo method to work on this (or my >> other two annotated regions). Here is a typical command line: >> >> >> my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db ",maxP=0.01 >> ,multiAdj=TRUE,minGOterm=1, >> multiAdjMethod="BH",feature_id_type="ensembl_gene_id") >> >> and here is a typical error message: >> >> enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db",m axP=0.01,mu >> ltiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id") >> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : >> argument is of length zero >> >> >> Which leads me to ask: >> >> 1) Is this error message supposed to be meaningful to me-i.e. a user-or is >> it something that I should be sending to the developer of the package? >> >> 2) Is there anything obvious from this that suggests what corrective >> action I should be taking? >> >> >> Eric Cabot >> University of Wisconsin >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > >

ADD REPLY • link 13.8 years ago Eric Cabot ▴ 40

0

Entering edit mode

Eric, The annotated dataset has exon ID instead of gene ID while the getEnrichedGO is expecting feature_id_type="ensembl_gene_id". For a list of supported feature_id_type, please type ?getEnrichedGO. To use getEnrichedGO function, first get the TSS annotation. TSS.human.NCBI36 = getAnnotation(ENSEMBLE_GENES_MART, featureType="TSS") or use the build in TSS as data(TSS.human.NCBI36) Then annotate your peaks with TSS.human.NCBI36 followed by getEnrichedGO call. Please let me know if this works for you. Best regards, Julie On 1/5/11 12:29 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: > Hi Julie, > > Thank you for your response. > > Here is the sessionInfo and traceback output and also a few lines of > "my_annotated_regions". > > Regards, > > Eric Cabot > >> sessionInfo() > R version 2.12.1 (2010-12-16) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] ChIPpeakAnno_1.6.0 limma_3.6.9 > [3] org.Hs.eg.db_2.4.6 GO.db_2.4.5 > [5] RSQLite_0.9-4 DBI_0.2-5 > [7] AnnotationDbi_1.12.0 BSgenome.Ecoli.NCBI.20080805_1.3.16 > [9] BSgenome_1.18.2 GenomicRanges_1.2.2 > [11] Biostrings_2.18.2 IRanges_1.8.8 > [13] multtest_2.6.0 Biobase_2.10.0 > [15] biomaRt_2.6.0 > > loaded via a namespace (and not attached): > [1] MASS_7.3-9 RCurl_1.5-0 splines_2.12.1 survival_2.36-2 > [5] tools_2.12.1 XML_3.2-0 >> > my_enrichedGO<-getEnrichedGO(my_annotated_regions,orgAnn="org.Hs.eg. db",maxP=0 > .01,multiAdj=FALSE,minGOterm=1,feature_id_type="ensembl_gene_id") > Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : > argument is of length zero >> traceback() > 2: addAncestors(this.GO[this.GO[, 3] == "BP", ], "bp") > 1: getEnrichedGO(FC2_annotated_regions, orgAnn = "org.Hs.eg.db", > maxP = 0.01, multiAdj = FALSE, minGOterm = 1, feature_id_type = > "ensembl_gene_id") > > > >> as.data.frame(my_annotated_regions[1:15,]) > space start end width names peak strand > 1 1 241997936 241998205 270 R-10060 ENSE00001749374 R-10060 + > 2 1 237109743 237110002 260 R-10082 ENSE00001643382 R-10082 + > 3 1 236080267 236080415 149 R-10086 ENSE00001807176 R-10086 + > 4 1 233853245 233853514 270 R-10096 ENSE00001776382 R-10096 + > 5 1 233727956 233728104 149 R-10097 ENSE00001442190 R-10097 + > 6 1 230728554 230728823 270 R-10108 ENSE00001731401 R-10108 + > 7 1 229687129 229687277 149 R-10113 ENSE00001439385 R-10113 + > 8 1 228943263 228943412 150 R-10121 ENSE00001903546 R-10121 + > 9 1 218358885 218359176 292 R-10157 ENSE00001439386 R-10157 + > 10 1 212254259 212254408 150 R-10179 ENSE00001624346 R-10179 + > 11 1 210086264 210086513 250 R-10184 ENSE00001903225 R-10184 + > 12 1 209863549 209863698 150 R-10185 ENSE00001336255 R-10185 + > 13 1 207437117 207437264 148 R-10190 ENSE00001742112 R-10190 + > 14 1 190352400 190352548 149 R-10246 ENSE00001782518 R-10246 + > 15 1 184432607 184432755 149 R-10260 ENSE00001283926 R-10260 + > feature start_position end_position insideFeature > distancetoFeature > 1 ENSE00001749374 241995237 241996089 downstream 2699 > 2 ENSE00001643382 237144639 237145008 upstream -34896 > 3 ENSE00001807176 236078715 236078821 downstream 1552 > 4 ENSE00001776382 233807017 233807237 downstream 46228 > 5 ENSE00001442190 233749750 233750272 upstream -21794 > 6 ENSE00001731401 230728406 230728586 overlapEnd 148 > 7 ENSE00001439385 229685652 229685769 downstream 1477 > 8 ENSE00001903546 228882063 228882416 downstream 61200 > 9 ENSE00001439386 218303137 218303294 downstream 55748 > 10 ENSE00001624346 212253973 212254092 downstream 286 > 11 ENSE00001903225 210111538 210111622 upstream -25274 > 12 ENSE00001336255 209859550 209859630 downstream 3999 > 13 ENSE00001742112 207438342 207438381 upstream -1225 > 14 ENSE00001782518 190331193 190331400 downstream 21207 > 15 ENSE00001283926 184446520 184446737 upstream -13913 > shortestDistance fromOverlappingOrNearest > 1 1847 NearestStart > 2 34637 NearestStart > 3 1446 NearestStart > 4 46008 NearestStart > 5 21646 NearestStart > 6 32 NearestStart > 7 1360 NearestStart > 8 60847 NearestStart > 9 55591 NearestStart > 10 167 NearestStart > 11 25025 NearestStart > 12 3919 NearestStart > 13 1078 NearestStart > 14 21000 NearestStart > 15 13765 NearestStart > > > Zhu, Lihua (Julie) wrote: >> Hi Eric, >> >> Could you please post the session information with sessionInfo() command? >> Could you please also send a few ensembl IDs in your annotated dataset? >> Thanks! >> >> Best regards, >> >> Julie >> >> >> On 1/4/11 6:51 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >> >>> I am a relatively new Bioconductor user and I am trying to analyze some >>> ChIP-seq results that came from QuEST using the ChIPpeakAnno package. >>> >>> After importing the regions of interest into RangedData objects and doing >>> the following: >>> >>> >> ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapiens_ge ne_ensembl >> "> >> ) >>> ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART, >>> featureType="ExonPlusUtr") >>> >>> >>> I had no problem annotating and generating a Venn diagram to show the >>> overlaps between my three sets of peaks. To annotate, I used: >>> >>> annotated_regions=annotatePeakInBatch(myranged, >>> AnnotationData=ENSEMBL_ExonPlus_Annotation) >>> >>> >>> But I cannot seem to get the getEnrichedGo method to work on this (or my >>> other two annotated regions). Here is a typical command line: >>> >>> >>> my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.d b",maxP=0. >>> 01 >>> ,multiAdj=TRUE,minGOterm=1, >>> multiAdjMethod="BH",feature_id_type="ensembl_gene_id") >>> >>> and here is a typical error message: >>> >>> enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db", maxP=0.01, >>> mu >>> ltiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id") >>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : >>> argument is of length zero >>> >>> >>> Which leads me to ask: >>> >>> 1) Is this error message supposed to be meaningful to me-i.e. a user-or is >>> it something that I should be sending to the developer of the package? >>> >>> 2) Is there anything obvious from this that suggests what corrective >>> action I should be taking? >>> >>> >>> Eric Cabot >>> University of Wisconsin >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >> >> >

ADD REPLY • link 13.8 years ago Julie Zhu ★ 4.3k

0

Entering edit mode

Julie Zhu ★ 4.3k

@julie-zhu-3596

Last seen 11 months ago

United States

Eric, You could convert the exon IDs associated with the peaks to ensemble gene IDs and input these ensemble gene IDs to the getEnrichedGO function. Best regards, Julie On 1/5/11 4:09 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: > Hi Julie, > > It may be a while before I get back to you on this, because I did my > mapping and ChIP-Seq analysis with Hg19 (NCBI 37), not Hg18 (NCBI 36). > I'm also a little concerned about using transcription start site > annotations rather than exons, because the the binding domains are not > thought to be restricted to only promoters. Any suggestions? > > Eric > > > > Zhu, Lihua (Julie) wrote: >> Eric, >> >> The annotated dataset has exon ID instead of gene ID while the getEnrichedGO >> is expecting feature_id_type="ensembl_gene_id". For a list of supported >> feature_id_type, please type ?getEnrichedGO. >> >> To use getEnrichedGO function, first get the TSS annotation. >> >> TSS.human.NCBI36 = getAnnotation(ENSEMBLE_GENES_MART, featureType="TSS") >> >> or use the build in TSS as >> >> data(TSS.human.NCBI36) >> >> Then annotate your peaks with TSS.human.NCBI36 followed by getEnrichedGO >> call. >> >> Please let me know if this works for you. >> >> Best regards, >> >> Julie >> >> >> >> >> On 1/5/11 12:29 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >> >>> Hi Julie, >>> >>> Thank you for your response. >>> >>> Here is the sessionInfo and traceback output and also a few lines of >>> "my_annotated_regions". >>> >>> Regards, >>> >>> Eric Cabot >>> >>>> sessionInfo() >>> R version 2.12.1 (2010-12-16) >>> Platform: x86_64-unknown-linux-gnu (64-bit) >>> >>> locale: >>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 >>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>> >>> attached base packages: >>> [1] stats graphics grDevices utils datasets methods base >>> >>> other attached packages: >>> [1] ChIPpeakAnno_1.6.0 limma_3.6.9 >>> [3] org.Hs.eg.db_2.4.6 GO.db_2.4.5 >>> [5] RSQLite_0.9-4 DBI_0.2-5 >>> [7] AnnotationDbi_1.12.0 >>> BSgenome.Ecoli.NCBI.20080805_1.3.16 >>> [9] BSgenome_1.18.2 GenomicRanges_1.2.2 >>> [11] Biostrings_2.18.2 IRanges_1.8.8 >>> [13] multtest_2.6.0 Biobase_2.10.0 >>> [15] biomaRt_2.6.0 >>> >>> loaded via a namespace (and not attached): >>> [1] MASS_7.3-9 RCurl_1.5-0 splines_2.12.1 survival_2.36-2 >>> [5] tools_2.12.1 XML_3.2-0 >>> my_enrichedGO<-getEnrichedGO(my_annotated_regions,orgAnn="org.Hs.e g.db",maxP >>> =0 >>> .01,multiAdj=FALSE,minGOterm=1,feature_id_type="ensembl_gene_id") >>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : >>> argument is of length zero >>>> traceback() >>> 2: addAncestors(this.GO[this.GO[, 3] == "BP", ], "bp") >>> 1: getEnrichedGO(FC2_annotated_regions, orgAnn = "org.Hs.eg.db", >>> maxP = 0.01, multiAdj = FALSE, minGOterm = 1, feature_id_type = >>> "ensembl_gene_id") >>> >>> >>> >>>> as.data.frame(my_annotated_regions[1:15,]) >>> space start end width names peak strand >>> 1 1 241997936 241998205 270 R-10060 ENSE00001749374 R-10060 + >>> 2 1 237109743 237110002 260 R-10082 ENSE00001643382 R-10082 + >>> 3 1 236080267 236080415 149 R-10086 ENSE00001807176 R-10086 + >>> 4 1 233853245 233853514 270 R-10096 ENSE00001776382 R-10096 + >>> 5 1 233727956 233728104 149 R-10097 ENSE00001442190 R-10097 + >>> 6 1 230728554 230728823 270 R-10108 ENSE00001731401 R-10108 + >>> 7 1 229687129 229687277 149 R-10113 ENSE00001439385 R-10113 + >>> 8 1 228943263 228943412 150 R-10121 ENSE00001903546 R-10121 + >>> 9 1 218358885 218359176 292 R-10157 ENSE00001439386 R-10157 + >>> 10 1 212254259 212254408 150 R-10179 ENSE00001624346 R-10179 + >>> 11 1 210086264 210086513 250 R-10184 ENSE00001903225 R-10184 + >>> 12 1 209863549 209863698 150 R-10185 ENSE00001336255 R-10185 + >>> 13 1 207437117 207437264 148 R-10190 ENSE00001742112 R-10190 + >>> 14 1 190352400 190352548 149 R-10246 ENSE00001782518 R-10246 + >>> 15 1 184432607 184432755 149 R-10260 ENSE00001283926 R-10260 + >>> feature start_position end_position insideFeature >>> distancetoFeature >>> 1 ENSE00001749374 241995237 241996089 downstream >>> 2699 >>> 2 ENSE00001643382 237144639 237145008 upstream >>> -34896 >>> 3 ENSE00001807176 236078715 236078821 downstream >>> 1552 >>> 4 ENSE00001776382 233807017 233807237 downstream >>> 46228 >>> 5 ENSE00001442190 233749750 233750272 upstream >>> -21794 >>> 6 ENSE00001731401 230728406 230728586 overlapEnd >>> 148 >>> 7 ENSE00001439385 229685652 229685769 downstream >>> 1477 >>> 8 ENSE00001903546 228882063 228882416 downstream >>> 61200 >>> 9 ENSE00001439386 218303137 218303294 downstream >>> 55748 >>> 10 ENSE00001624346 212253973 212254092 downstream >>> 286 >>> 11 ENSE00001903225 210111538 210111622 upstream >>> -25274 >>> 12 ENSE00001336255 209859550 209859630 downstream >>> 3999 >>> 13 ENSE00001742112 207438342 207438381 upstream >>> -1225 >>> 14 ENSE00001782518 190331193 190331400 downstream >>> 21207 >>> 15 ENSE00001283926 184446520 184446737 upstream >>> -13913 >>> shortestDistance fromOverlappingOrNearest >>> 1 1847 NearestStart >>> 2 34637 NearestStart >>> 3 1446 NearestStart >>> 4 46008 NearestStart >>> 5 21646 NearestStart >>> 6 32 NearestStart >>> 7 1360 NearestStart >>> 8 60847 NearestStart >>> 9 55591 NearestStart >>> 10 167 NearestStart >>> 11 25025 NearestStart >>> 12 3919 NearestStart >>> 13 1078 NearestStart >>> 14 21000 NearestStart >>> 15 13765 NearestStart >>> >>> >>> Zhu, Lihua (Julie) wrote: >>>> Hi Eric, >>>> >>>> Could you please post the session information with sessionInfo() command? >>>> Could you please also send a few ensembl IDs in your annotated dataset? >>>> Thanks! >>>> >>>> Best regards, >>>> >>>> Julie >>>> >>>> >>>> On 1/4/11 6:51 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >>>> >>>>> I am a relatively new Bioconductor user and I am trying to analyze some >>>>> ChIP-seq results that came from QuEST using the ChIPpeakAnno package. >>>>> >>>>> After importing the regions of interest into RangedData objects and doing >>>>> the following: >>>>> >>>>> >>>> ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapiens_ gene_ensem >>>> bl >>>> "> >>>> ) >>>>> ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART, >>>>> featureType="ExonPlusUtr") >>>>> >>>>> >>>>> I had no problem annotating and generating a Venn diagram to show the >>>>> overlaps between my three sets of peaks. To annotate, I used: >>>>> >>>>> annotated_regions=annotatePeakInBatch(myranged, >>>>> AnnotationData=ENSEMBL_ExonPlus_Annotation) >>>>> >>>>> >>>>> But I cannot seem to get the getEnrichedGo method to work on this (or my >>>>> other two annotated regions). Here is a typical command line: >>>>> >>>>> >>>>> my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg .db",maxP= >>>>> 0. >>>>> 01 >>>>> ,multiAdj=TRUE,minGOterm=1, >>>>> multiAdjMethod="BH",feature_id_type="ensembl_gene_id") >>>>> >>>>> and here is a typical error message: >>>>> >>>>> enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db ",maxP=0.0 >>>>> 1, >>>>> mu >>>>> ltiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id") >>>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : >>>>> argument is of length zero >>>>> >>>>> >>>>> Which leads me to ask: >>>>> >>>>> 1) Is this error message supposed to be meaningful to me-i.e. a user-or is >>>>> it something that I should be sending to the developer of the package? >>>>> >>>>> 2) Is there anything obvious from this that suggests what corrective >>>>> action I should be taking? >>>>> >>>>> >>>>> Eric Cabot >>>>> University of Wisconsin >>>>> >>>>> _______________________________________________ >>>>> Bioconductor mailing list >>>>> Bioconductor at r-project.org >>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>> Search the archives: >>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>> >>>> >> >> >

ADD COMMENT • link 13.8 years ago Julie Zhu ★ 4.3k

0

Entering edit mode

Julie, I don't know how to do that in R. I guess I can always export the annotations, do the association in Perl and re-import them as a vector to use with getEnrichedGo. Eric Zhu, Lihua (Julie) wrote: > Eric, > > You could convert the exon IDs associated with the peaks to ensemble gene > IDs and input these ensemble gene IDs to the getEnrichedGO function. > > Best regards, > > Julie > > > On 1/5/11 4:09 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: > >> Hi Julie, >> >> It may be a while before I get back to you on this, because I did my >> mapping and ChIP-Seq analysis with Hg19 (NCBI 37), not Hg18 (NCBI 36). >> I'm also a little concerned about using transcription start site >> annotations rather than exons, because the the binding domains are not >> thought to be restricted to only promoters. Any suggestions? >> >> Eric >> >> >> >> Zhu, Lihua (Julie) wrote: >>> Eric, >>> >>> The annotated dataset has exon ID instead of gene ID while the getEnrichedGO >>> is expecting feature_id_type="ensembl_gene_id". For a list of supported >>> feature_id_type, please type ?getEnrichedGO. >>> >>> To use getEnrichedGO function, first get the TSS annotation. >>> >>> TSS.human.NCBI36 = getAnnotation(ENSEMBLE_GENES_MART, featureType="TSS") >>> >>> or use the build in TSS as >>> >>> data(TSS.human.NCBI36) >>> >>> Then annotate your peaks with TSS.human.NCBI36 followed by getEnrichedGO >>> call. >>> >>> Please let me know if this works for you. >>> >>> Best regards, >>> >>> Julie >>> >>> >>> >>> >>> On 1/5/11 12:29 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >>> >>>> Hi Julie, >>>> >>>> Thank you for your response. >>>> >>>> Here is the sessionInfo and traceback output and also a few lines of >>>> "my_annotated_regions". >>>> >>>> Regards, >>>> >>>> Eric Cabot >>>> >>>>> sessionInfo() >>>> R version 2.12.1 (2010-12-16) >>>> Platform: x86_64-unknown-linux-gnu (64-bit) >>>> >>>> locale: >>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 >>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >>>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>>> >>>> attached base packages: >>>> [1] stats graphics grDevices utils datasets methods base >>>> >>>> other attached packages: >>>> [1] ChIPpeakAnno_1.6.0 limma_3.6.9 >>>> [3] org.Hs.eg.db_2.4.6 GO.db_2.4.5 >>>> [5] RSQLite_0.9-4 DBI_0.2-5 >>>> [7] AnnotationDbi_1.12.0 >>>> BSgenome.Ecoli.NCBI.20080805_1.3.16 >>>> [9] BSgenome_1.18.2 GenomicRanges_1.2.2 >>>> [11] Biostrings_2.18.2 IRanges_1.8.8 >>>> [13] multtest_2.6.0 Biobase_2.10.0 >>>> [15] biomaRt_2.6.0 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] MASS_7.3-9 RCurl_1.5-0 splines_2.12.1 survival_2.36-2 >>>> [5] tools_2.12.1 XML_3.2-0 >>>> my_enrichedGO<-getEnrichedGO(my_annotated_regions,orgAnn="org.Hs. eg.db",maxP >>>> =0 >>>> .01,multiAdj=FALSE,minGOterm=1,feature_id_type="ensembl_gene_id") >>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : >>>> argument is of length zero >>>>> traceback() >>>> 2: addAncestors(this.GO[this.GO[, 3] == "BP", ], "bp") >>>> 1: getEnrichedGO(FC2_annotated_regions, orgAnn = "org.Hs.eg.db", >>>> maxP = 0.01, multiAdj = FALSE, minGOterm = 1, feature_id_type = >>>> "ensembl_gene_id") >>>> >>>> >>>> >>>>> as.data.frame(my_annotated_regions[1:15,]) >>>> space start end width names peak strand >>>> 1 1 241997936 241998205 270 R-10060 ENSE00001749374 R-10060 + >>>> 2 1 237109743 237110002 260 R-10082 ENSE00001643382 R-10082 + >>>> 3 1 236080267 236080415 149 R-10086 ENSE00001807176 R-10086 + >>>> 4 1 233853245 233853514 270 R-10096 ENSE00001776382 R-10096 + >>>> 5 1 233727956 233728104 149 R-10097 ENSE00001442190 R-10097 + >>>> 6 1 230728554 230728823 270 R-10108 ENSE00001731401 R-10108 + >>>> 7 1 229687129 229687277 149 R-10113 ENSE00001439385 R-10113 + >>>> 8 1 228943263 228943412 150 R-10121 ENSE00001903546 R-10121 + >>>> 9 1 218358885 218359176 292 R-10157 ENSE00001439386 R-10157 + >>>> 10 1 212254259 212254408 150 R-10179 ENSE00001624346 R-10179 + >>>> 11 1 210086264 210086513 250 R-10184 ENSE00001903225 R-10184 + >>>> 12 1 209863549 209863698 150 R-10185 ENSE00001336255 R-10185 + >>>> 13 1 207437117 207437264 148 R-10190 ENSE00001742112 R-10190 + >>>> 14 1 190352400 190352548 149 R-10246 ENSE00001782518 R-10246 + >>>> 15 1 184432607 184432755 149 R-10260 ENSE00001283926 R-10260 + >>>> feature start_position end_position insideFeature >>>> distancetoFeature >>>> 1 ENSE00001749374 241995237 241996089 downstream >>>> 2699 >>>> 2 ENSE00001643382 237144639 237145008 upstream >>>> -34896 >>>> 3 ENSE00001807176 236078715 236078821 downstream >>>> 1552 >>>> 4 ENSE00001776382 233807017 233807237 downstream >>>> 46228 >>>> 5 ENSE00001442190 233749750 233750272 upstream >>>> -21794 >>>> 6 ENSE00001731401 230728406 230728586 overlapEnd >>>> 148 >>>> 7 ENSE00001439385 229685652 229685769 downstream >>>> 1477 >>>> 8 ENSE00001903546 228882063 228882416 downstream >>>> 61200 >>>> 9 ENSE00001439386 218303137 218303294 downstream >>>> 55748 >>>> 10 ENSE00001624346 212253973 212254092 downstream >>>> 286 >>>> 11 ENSE00001903225 210111538 210111622 upstream >>>> -25274 >>>> 12 ENSE00001336255 209859550 209859630 downstream >>>> 3999 >>>> 13 ENSE00001742112 207438342 207438381 upstream >>>> -1225 >>>> 14 ENSE00001782518 190331193 190331400 downstream >>>> 21207 >>>> 15 ENSE00001283926 184446520 184446737 upstream >>>> -13913 >>>> shortestDistance fromOverlappingOrNearest >>>> 1 1847 NearestStart >>>> 2 34637 NearestStart >>>> 3 1446 NearestStart >>>> 4 46008 NearestStart >>>> 5 21646 NearestStart >>>> 6 32 NearestStart >>>> 7 1360 NearestStart >>>> 8 60847 NearestStart >>>> 9 55591 NearestStart >>>> 10 167 NearestStart >>>> 11 25025 NearestStart >>>> 12 3919 NearestStart >>>> 13 1078 NearestStart >>>> 14 21000 NearestStart >>>> 15 13765 NearestStart >>>> >>>> >>>> Zhu, Lihua (Julie) wrote: >>>>> Hi Eric, >>>>> >>>>> Could you please post the session information with sessionInfo() command? >>>>> Could you please also send a few ensembl IDs in your annotated dataset? >>>>> Thanks! >>>>> >>>>> Best regards, >>>>> >>>>> Julie >>>>> >>>>> >>>>> On 1/4/11 6:51 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >>>>> >>>>>> I am a relatively new Bioconductor user and I am trying to analyze some >>>>>> ChIP-seq results that came from QuEST using the ChIPpeakAnno package. >>>>>> >>>>>> After importing the regions of interest into RangedData objects and doing >>>>>> the following: >>>>>> >>>>>> >>>>> ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapiens _gene_ensem >>>>> bl >>>>> "> >>>>> ) >>>>>> ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART, >>>>>> featureType="ExonPlusUtr") >>>>>> >>>>>> >>>>>> I had no problem annotating and generating a Venn diagram to show the >>>>>> overlaps between my three sets of peaks. To annotate, I used: >>>>>> >>>>>> annotated_regions=annotatePeakInBatch(myranged, >>>>>> AnnotationData=ENSEMBL_ExonPlus_Annotation) >>>>>> >>>>>> >>>>>> But I cannot seem to get the getEnrichedGo method to work on this (or my >>>>>> other two annotated regions). Here is a typical command line: >>>>>> >>>>>> >>>>>> my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.e g.db",maxP= >>>>>> 0. >>>>>> 01 >>>>>> ,multiAdj=TRUE,minGOterm=1, >>>>>> multiAdjMethod="BH",feature_id_type="ensembl_gene_id") >>>>>> >>>>>> and here is a typical error message: >>>>>> >>>>>> enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.d b",maxP=0.0 >>>>>> 1, >>>>>> mu >>>>>> ltiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id") >>>>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : >>>>>> argument is of length zero >>>>>> >>>>>> >>>>>> Which leads me to ask: >>>>>> >>>>>> 1) Is this error message supposed to be meaningful to me-i.e. a user-or is >>>>>> it something that I should be sending to the developer of the package? >>>>>> >>>>>> 2) Is there anything obvious from this that suggests what corrective >>>>>> action I should be taking? >>>>>> >>>>>> >>>>>> Eric Cabot >>>>>> University of Wisconsin >>>>>> >>>>>> _______________________________________________ >>>>>> Bioconductor mailing list >>>>>> Bioconductor at r-project.org >>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>> Search the archives: >>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>>> >>> > >

ADD REPLY • link 13.8 years ago Eric Cabot ▴ 40

0

Entering edit mode

Julie, In the end, I did a quick detour through Perl and getEnrichedGo worked as expected. Thanks for the help. Eric Eric Cabot wrote: > Julie, > > I don't know how to do that in R. I guess I can always export the > annotations, do the association in Perl and re-import them as a vector > to use with getEnrichedGo. > > Eric > > Zhu, Lihua (Julie) wrote: >> Eric, >> >> You could convert the exon IDs associated with the peaks to ensemble gene >> IDs and input these ensemble gene IDs to the getEnrichedGO function. >> >> Best regards, >> >> Julie >> >> >> On 1/5/11 4:09 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >> >>> Hi Julie, >>> >>> It may be a while before I get back to you on this, because I did my >>> mapping and ChIP-Seq analysis with Hg19 (NCBI 37), not Hg18 (NCBI 36). >>> I'm also a little concerned about using transcription start site >>> annotations rather than exons, because the the binding domains are not >>> thought to be restricted to only promoters. Any suggestions? >>> >>> Eric >>> >>> >>> >>> Zhu, Lihua (Julie) wrote: >>>> Eric, >>>> >>>> The annotated dataset has exon ID instead of gene ID while the >>>> getEnrichedGO >>>> is expecting feature_id_type="ensembl_gene_id". For a list of supported >>>> feature_id_type, please type ?getEnrichedGO. >>>> >>>> To use getEnrichedGO function, first get the TSS annotation. >>>> >>>> TSS.human.NCBI36 = getAnnotation(ENSEMBLE_GENES_MART, >>>> featureType="TSS") >>>> >>>> or use the build in TSS as >>>> >>>> data(TSS.human.NCBI36) >>>> >>>> Then annotate your peaks with TSS.human.NCBI36 followed by >>>> getEnrichedGO >>>> call. >>>> >>>> Please let me know if this works for you. >>>> >>>> Best regards, >>>> >>>> Julie >>>> >>>> >>>> >>>> >>>> On 1/5/11 12:29 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >>>> >>>>> Hi Julie, >>>>> >>>>> Thank you for your response. >>>>> >>>>> Here is the sessionInfo and traceback output and also a few lines of >>>>> "my_annotated_regions". >>>>> >>>>> Regards, >>>>> >>>>> Eric Cabot >>>>> >>>>>> sessionInfo() >>>>> R version 2.12.1 (2010-12-16) >>>>> Platform: x86_64-unknown-linux-gnu (64-bit) >>>>> >>>>> locale: >>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >>>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 >>>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>>>> >>>>> attached base packages: >>>>> [1] stats graphics grDevices utils datasets methods base >>>>> >>>>> other attached packages: >>>>> [1] ChIPpeakAnno_1.6.0 limma_3.6.9 >>>>> [3] org.Hs.eg.db_2.4.6 GO.db_2.4.5 >>>>> [5] RSQLite_0.9-4 DBI_0.2-5 >>>>> [7] AnnotationDbi_1.12.0 >>>>> BSgenome.Ecoli.NCBI.20080805_1.3.16 >>>>> [9] BSgenome_1.18.2 GenomicRanges_1.2.2 >>>>> [11] Biostrings_2.18.2 IRanges_1.8.8 >>>>> [13] multtest_2.6.0 Biobase_2.10.0 >>>>> [15] biomaRt_2.6.0 >>>>> >>>>> loaded via a namespace (and not attached): >>>>> [1] MASS_7.3-9 RCurl_1.5-0 splines_2.12.1 survival_2.36-2 >>>>> [5] tools_2.12.1 XML_3.2-0 >>>>> my_enrichedGO<-getEnrichedGO(my_annotated_regions,orgAnn="org.Hs .eg.db",maxP >>>>> >>>>> =0 >>>>> .01,multiAdj=FALSE,minGOterm=1,feature_id_type="ensembl_gene_id") >>>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : >>>>> argument is of length zero >>>>>> traceback() >>>>> 2: addAncestors(this.GO[this.GO[, 3] == "BP", ], "bp") >>>>> 1: getEnrichedGO(FC2_annotated_regions, orgAnn = "org.Hs.eg.db", >>>>> maxP = 0.01, multiAdj = FALSE, minGOterm = 1, feature_id_type = >>>>> "ensembl_gene_id") >>>>> >>>>> >>>>> >>>>>> as.data.frame(my_annotated_regions[1:15,]) >>>>> space start end width names peak strand >>>>> 1 1 241997936 241998205 270 R-10060 ENSE00001749374 R-10060 + >>>>> 2 1 237109743 237110002 260 R-10082 ENSE00001643382 R-10082 + >>>>> 3 1 236080267 236080415 149 R-10086 ENSE00001807176 R-10086 + >>>>> 4 1 233853245 233853514 270 R-10096 ENSE00001776382 R-10096 + >>>>> 5 1 233727956 233728104 149 R-10097 ENSE00001442190 R-10097 + >>>>> 6 1 230728554 230728823 270 R-10108 ENSE00001731401 R-10108 + >>>>> 7 1 229687129 229687277 149 R-10113 ENSE00001439385 R-10113 + >>>>> 8 1 228943263 228943412 150 R-10121 ENSE00001903546 R-10121 + >>>>> 9 1 218358885 218359176 292 R-10157 ENSE00001439386 R-10157 + >>>>> 10 1 212254259 212254408 150 R-10179 ENSE00001624346 R-10179 + >>>>> 11 1 210086264 210086513 250 R-10184 ENSE00001903225 R-10184 + >>>>> 12 1 209863549 209863698 150 R-10185 ENSE00001336255 R-10185 + >>>>> 13 1 207437117 207437264 148 R-10190 ENSE00001742112 R-10190 + >>>>> 14 1 190352400 190352548 149 R-10246 ENSE00001782518 R-10246 + >>>>> 15 1 184432607 184432755 149 R-10260 ENSE00001283926 R-10260 + >>>>> feature start_position end_position insideFeature >>>>> distancetoFeature >>>>> 1 ENSE00001749374 241995237 241996089 downstream >>>>> 2699 >>>>> 2 ENSE00001643382 237144639 237145008 upstream >>>>> -34896 >>>>> 3 ENSE00001807176 236078715 236078821 downstream >>>>> 1552 >>>>> 4 ENSE00001776382 233807017 233807237 downstream >>>>> 46228 >>>>> 5 ENSE00001442190 233749750 233750272 upstream >>>>> -21794 >>>>> 6 ENSE00001731401 230728406 230728586 overlapEnd >>>>> 148 >>>>> 7 ENSE00001439385 229685652 229685769 downstream >>>>> 1477 >>>>> 8 ENSE00001903546 228882063 228882416 downstream >>>>> 61200 >>>>> 9 ENSE00001439386 218303137 218303294 downstream >>>>> 55748 >>>>> 10 ENSE00001624346 212253973 212254092 downstream >>>>> 286 >>>>> 11 ENSE00001903225 210111538 210111622 upstream >>>>> -25274 >>>>> 12 ENSE00001336255 209859550 209859630 downstream >>>>> 3999 >>>>> 13 ENSE00001742112 207438342 207438381 upstream >>>>> -1225 >>>>> 14 ENSE00001782518 190331193 190331400 downstream >>>>> 21207 >>>>> 15 ENSE00001283926 184446520 184446737 upstream >>>>> -13913 >>>>> shortestDistance fromOverlappingOrNearest >>>>> 1 1847 NearestStart >>>>> 2 34637 NearestStart >>>>> 3 1446 NearestStart >>>>> 4 46008 NearestStart >>>>> 5 21646 NearestStart >>>>> 6 32 NearestStart >>>>> 7 1360 NearestStart >>>>> 8 60847 NearestStart >>>>> 9 55591 NearestStart >>>>> 10 167 NearestStart >>>>> 11 25025 NearestStart >>>>> 12 3919 NearestStart >>>>> 13 1078 NearestStart >>>>> 14 21000 NearestStart >>>>> 15 13765 NearestStart >>>>> >>>>> >>>>> Zhu, Lihua (Julie) wrote: >>>>>> Hi Eric, >>>>>> >>>>>> Could you please post the session information with sessionInfo() >>>>>> command? >>>>>> Could you please also send a few ensembl IDs in your annotated >>>>>> dataset? >>>>>> Thanks! >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Julie >>>>>> >>>>>> >>>>>> On 1/4/11 6:51 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >>>>>> >>>>>>> I am a relatively new Bioconductor user and I am trying to >>>>>>> analyze some >>>>>>> ChIP-seq results that came from QuEST using the ChIPpeakAnno >>>>>>> package. >>>>>>> >>>>>>> After importing the regions of interest into RangedData objects >>>>>>> and doing >>>>>>> the following: >>>>>>> >>>>>>> >>>>>> ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapien s_gene_ensem >>>>>> >>>>>> bl >>>>>> "> >>>>>> ) >>>>>>> ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART, >>>>>>> featureType="ExonPlusUtr") >>>>>>> >>>>>>> >>>>>>> I had no problem annotating and generating a Venn diagram to show >>>>>>> the >>>>>>> overlaps between my three sets of peaks. To annotate, I used: >>>>>>> >>>>>>> annotated_regions=annotatePeakInBatch(myranged, >>>>>>> AnnotationData=ENSEMBL_ExonPlus_Annotation) >>>>>>> >>>>>>> >>>>>>> But I cannot seem to get the getEnrichedGo method to work on this >>>>>>> (or my >>>>>>> other two annotated regions). Here is a typical command line: >>>>>>> >>>>>>> >>>>>>> my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs. eg.db",maxP= >>>>>>> >>>>>>> 0. >>>>>>> 01 >>>>>>> ,multiAdj=TRUE,minGOterm=1, >>>>>>> multiAdjMethod="BH",feature_id_type="ensembl_gene_id") >>>>>>> >>>>>>> and here is a typical error message: >>>>>>> >>>>>>> enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg. db",maxP=0.0 >>>>>>> >>>>>>> 1, >>>>>>> mu >>>>>>> ltiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id") >>>>>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : >>>>>>> argument is of length zero >>>>>>> >>>>>>> >>>>>>> Which leads me to ask: >>>>>>> >>>>>>> 1) Is this error message supposed to be meaningful to me-i.e. a >>>>>>> user-or is >>>>>>> it something that I should be sending to the developer of the >>>>>>> package? >>>>>>> >>>>>>> 2) Is there anything obvious from this that suggests what corrective >>>>>>> action I should be taking? >>>>>>> >>>>>>> >>>>>>> Eric Cabot >>>>>>> University of Wisconsin >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Bioconductor mailing list >>>>>>> Bioconductor at r-project.org >>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>>> Search the archives: >>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>>>> >>>> >> >>

ADD REPLY • link 13.8 years ago Eric Cabot ▴ 40

0

Entering edit mode

Eric, You might want to take a look at the getBM function in biomaRt package to see if you can do the conversion. Thanks! Best regards, Julie On 1/5/11 5:31 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: > Julie, > > I don't know how to do that in R. I guess I can always export the > annotations, do the association in Perl and re-import them as a vector to > use with getEnrichedGo. > > Eric > > Zhu, Lihua (Julie) wrote: >> Eric, >> >> You could convert the exon IDs associated with the peaks to ensemble gene >> IDs and input these ensemble gene IDs to the getEnrichedGO function. >> >> Best regards, >> >> Julie >> >> >> On 1/5/11 4:09 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >> >>> Hi Julie, >>> >>> It may be a while before I get back to you on this, because I did my >>> mapping and ChIP-Seq analysis with Hg19 (NCBI 37), not Hg18 (NCBI 36). >>> I'm also a little concerned about using transcription start site >>> annotations rather than exons, because the the binding domains are not >>> thought to be restricted to only promoters. Any suggestions? >>> >>> Eric >>> >>> >>> >>> Zhu, Lihua (Julie) wrote: >>>> Eric, >>>> >>>> The annotated dataset has exon ID instead of gene ID while the >>>> getEnrichedGO >>>> is expecting feature_id_type="ensembl_gene_id". For a list of supported >>>> feature_id_type, please type ?getEnrichedGO. >>>> >>>> To use getEnrichedGO function, first get the TSS annotation. >>>> >>>> TSS.human.NCBI36 = getAnnotation(ENSEMBLE_GENES_MART, featureType="TSS") >>>> >>>> or use the build in TSS as >>>> >>>> data(TSS.human.NCBI36) >>>> >>>> Then annotate your peaks with TSS.human.NCBI36 followed by getEnrichedGO >>>> call. >>>> >>>> Please let me know if this works for you. >>>> >>>> Best regards, >>>> >>>> Julie >>>> >>>> >>>> >>>> >>>> On 1/5/11 12:29 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >>>> >>>>> Hi Julie, >>>>> >>>>> Thank you for your response. >>>>> >>>>> Here is the sessionInfo and traceback output and also a few lines of >>>>> "my_annotated_regions". >>>>> >>>>> Regards, >>>>> >>>>> Eric Cabot >>>>> >>>>>> sessionInfo() >>>>> R version 2.12.1 (2010-12-16) >>>>> Platform: x86_64-unknown-linux-gnu (64-bit) >>>>> >>>>> locale: >>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >>>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 >>>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>>>> >>>>> attached base packages: >>>>> [1] stats graphics grDevices utils datasets methods base >>>>> >>>>> other attached packages: >>>>> [1] ChIPpeakAnno_1.6.0 limma_3.6.9 >>>>> [3] org.Hs.eg.db_2.4.6 GO.db_2.4.5 >>>>> [5] RSQLite_0.9-4 DBI_0.2-5 >>>>> [7] AnnotationDbi_1.12.0 >>>>> BSgenome.Ecoli.NCBI.20080805_1.3.16 >>>>> [9] BSgenome_1.18.2 GenomicRanges_1.2.2 >>>>> [11] Biostrings_2.18.2 IRanges_1.8.8 >>>>> [13] multtest_2.6.0 Biobase_2.10.0 >>>>> [15] biomaRt_2.6.0 >>>>> >>>>> loaded via a namespace (and not attached): >>>>> [1] MASS_7.3-9 RCurl_1.5-0 splines_2.12.1 survival_2.36-2 >>>>> [5] tools_2.12.1 XML_3.2-0 >>>>> my_enrichedGO<-getEnrichedGO(my_annotated_regions,orgAnn="org.Hs .eg.db",ma >>>>> xP >>>>> =0 >>>>> .01,multiAdj=FALSE,minGOterm=1,feature_id_type="ensembl_gene_id") >>>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : >>>>> argument is of length zero >>>>>> traceback() >>>>> 2: addAncestors(this.GO[this.GO[, 3] == "BP", ], "bp") >>>>> 1: getEnrichedGO(FC2_annotated_regions, orgAnn = "org.Hs.eg.db", >>>>> maxP = 0.01, multiAdj = FALSE, minGOterm = 1, feature_id_type = >>>>> "ensembl_gene_id") >>>>> >>>>> >>>>> >>>>>> as.data.frame(my_annotated_regions[1:15,]) >>>>> space start end width names peak strand >>>>> 1 1 241997936 241998205 270 R-10060 ENSE00001749374 R-10060 + >>>>> 2 1 237109743 237110002 260 R-10082 ENSE00001643382 R-10082 + >>>>> 3 1 236080267 236080415 149 R-10086 ENSE00001807176 R-10086 + >>>>> 4 1 233853245 233853514 270 R-10096 ENSE00001776382 R-10096 + >>>>> 5 1 233727956 233728104 149 R-10097 ENSE00001442190 R-10097 + >>>>> 6 1 230728554 230728823 270 R-10108 ENSE00001731401 R-10108 + >>>>> 7 1 229687129 229687277 149 R-10113 ENSE00001439385 R-10113 + >>>>> 8 1 228943263 228943412 150 R-10121 ENSE00001903546 R-10121 + >>>>> 9 1 218358885 218359176 292 R-10157 ENSE00001439386 R-10157 + >>>>> 10 1 212254259 212254408 150 R-10179 ENSE00001624346 R-10179 + >>>>> 11 1 210086264 210086513 250 R-10184 ENSE00001903225 R-10184 + >>>>> 12 1 209863549 209863698 150 R-10185 ENSE00001336255 R-10185 + >>>>> 13 1 207437117 207437264 148 R-10190 ENSE00001742112 R-10190 + >>>>> 14 1 190352400 190352548 149 R-10246 ENSE00001782518 R-10246 + >>>>> 15 1 184432607 184432755 149 R-10260 ENSE00001283926 R-10260 + >>>>> feature start_position end_position insideFeature >>>>> distancetoFeature >>>>> 1 ENSE00001749374 241995237 241996089 downstream >>>>> 2699 >>>>> 2 ENSE00001643382 237144639 237145008 upstream >>>>> -34896 >>>>> 3 ENSE00001807176 236078715 236078821 downstream >>>>> 1552 >>>>> 4 ENSE00001776382 233807017 233807237 downstream >>>>> 46228 >>>>> 5 ENSE00001442190 233749750 233750272 upstream >>>>> -21794 >>>>> 6 ENSE00001731401 230728406 230728586 overlapEnd >>>>> 148 >>>>> 7 ENSE00001439385 229685652 229685769 downstream >>>>> 1477 >>>>> 8 ENSE00001903546 228882063 228882416 downstream >>>>> 61200 >>>>> 9 ENSE00001439386 218303137 218303294 downstream >>>>> 55748 >>>>> 10 ENSE00001624346 212253973 212254092 downstream >>>>> 286 >>>>> 11 ENSE00001903225 210111538 210111622 upstream >>>>> -25274 >>>>> 12 ENSE00001336255 209859550 209859630 downstream >>>>> 3999 >>>>> 13 ENSE00001742112 207438342 207438381 upstream >>>>> -1225 >>>>> 14 ENSE00001782518 190331193 190331400 downstream >>>>> 21207 >>>>> 15 ENSE00001283926 184446520 184446737 upstream >>>>> -13913 >>>>> shortestDistance fromOverlappingOrNearest >>>>> 1 1847 NearestStart >>>>> 2 34637 NearestStart >>>>> 3 1446 NearestStart >>>>> 4 46008 NearestStart >>>>> 5 21646 NearestStart >>>>> 6 32 NearestStart >>>>> 7 1360 NearestStart >>>>> 8 60847 NearestStart >>>>> 9 55591 NearestStart >>>>> 10 167 NearestStart >>>>> 11 25025 NearestStart >>>>> 12 3919 NearestStart >>>>> 13 1078 NearestStart >>>>> 14 21000 NearestStart >>>>> 15 13765 NearestStart >>>>> >>>>> >>>>> Zhu, Lihua (Julie) wrote: >>>>>> Hi Eric, >>>>>> >>>>>> Could you please post the session information with sessionInfo() command? >>>>>> Could you please also send a few ensembl IDs in your annotated dataset? >>>>>> Thanks! >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Julie >>>>>> >>>>>> >>>>>> On 1/4/11 6:51 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote: >>>>>> >>>>>>> I am a relatively new Bioconductor user and I am trying to analyze some >>>>>>> ChIP-seq results that came from QuEST using the ChIPpeakAnno package. >>>>>>> >>>>>>> After importing the regions of interest into RangedData objects and >>>>>>> doing >>>>>>> the following: >>>>>>> >>>>>>> >>>>>> ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapien s_gene_ens >>>>>> em >>>>>> bl >>>>>> "> >>>>>> ) >>>>>>> ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART, >>>>>>> featureType="ExonPlusUtr") >>>>>>> >>>>>>> >>>>>>> I had no problem annotating and generating a Venn diagram to show the >>>>>>> overlaps between my three sets of peaks. To annotate, I used: >>>>>>> >>>>>>> annotated_regions=annotatePeakInBatch(myranged, >>>>>>> AnnotationData=ENSEMBL_ExonPlus_Annotation) >>>>>>> >>>>>>> >>>>>>> But I cannot seem to get the getEnrichedGo method to work on this (or my >>>>>>> other two annotated regions). Here is a typical command line: >>>>>>> >>>>>>> >>>>>>> my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs. eg.db",max >>>>>>> P= >>>>>>> 0. >>>>>>> 01 >>>>>>> ,multiAdj=TRUE,minGOterm=1, >>>>>>> multiAdjMethod="BH",feature_id_type="ensembl_gene_id") >>>>>>> >>>>>>> and here is a typical error message: >>>>>>> >>>>>>> enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg. db",maxP=0 >>>>>>> .0 >>>>>>> 1, >>>>>>> mu >>>>>>> ltiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id") >>>>>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { : >>>>>>> argument is of length zero >>>>>>> >>>>>>> >>>>>>> Which leads me to ask: >>>>>>> >>>>>>> 1) Is this error message supposed to be meaningful to me-i.e. a user-or >>>>>>> is >>>>>>> it something that I should be sending to the developer of the package? >>>>>>> >>>>>>> 2) Is there anything obvious from this that suggests what corrective >>>>>>> action I should be taking? >>>>>>> >>>>>>> >>>>>>> Eric Cabot >>>>>>> University of Wisconsin >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Bioconductor mailing list >>>>>>> Bioconductor at r-project.org >>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>>>> Search the archives: >>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>>>> >>>> >> >> >

ADD REPLY • link 13.8 years ago Julie Zhu ★ 4.3k

Login before adding your answer.