Eric,
You could convert the exon IDs associated with the peaks to ensemble
gene
IDs and input these ensemble gene IDs to the getEnrichedGO function.
Best regards,
Julie
On 1/5/11 4:09 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote:
> Hi Julie,
>
> It may be a while before I get back to you on this, because I did
my
> mapping and ChIP-Seq analysis with Hg19 (NCBI 37), not Hg18 (NCBI
36).
> I'm also a little concerned about using transcription start site
> annotations rather than exons, because the the binding domains are
not
> thought to be restricted to only promoters. Any suggestions?
>
> Eric
>
>
>
> Zhu, Lihua (Julie) wrote:
>> Eric,
>>
>> The annotated dataset has exon ID instead of gene ID while the
getEnrichedGO
>> is expecting feature_id_type="ensembl_gene_id". For a list of
supported
>> feature_id_type, please type ?getEnrichedGO.
>>
>> To use getEnrichedGO function, first get the TSS annotation.
>>
>> TSS.human.NCBI36 = getAnnotation(ENSEMBLE_GENES_MART,
featureType="TSS")
>>
>> or use the build in TSS as
>>
>> data(TSS.human.NCBI36)
>>
>> Then annotate your peaks with TSS.human.NCBI36 followed by
getEnrichedGO
>> call.
>>
>> Please let me know if this works for you.
>>
>> Best regards,
>>
>> Julie
>>
>>
>>
>>
>> On 1/5/11 12:29 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote:
>>
>>> Hi Julie,
>>>
>>> Thank you for your response.
>>>
>>> Here is the sessionInfo and traceback output and also a few lines
of
>>> "my_annotated_regions".
>>>
>>> Regards,
>>>
>>> Eric Cabot
>>>
>>>> sessionInfo()
>>> R version 2.12.1 (2010-12-16)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods
base
>>>
>>> other attached packages:
>>> [1] ChIPpeakAnno_1.6.0 limma_3.6.9
>>> [3] org.Hs.eg.db_2.4.6 GO.db_2.4.5
>>> [5] RSQLite_0.9-4 DBI_0.2-5
>>> [7] AnnotationDbi_1.12.0
>>> BSgenome.Ecoli.NCBI.20080805_1.3.16
>>> [9] BSgenome_1.18.2 GenomicRanges_1.2.2
>>> [11] Biostrings_2.18.2 IRanges_1.8.8
>>> [13] multtest_2.6.0 Biobase_2.10.0
>>> [15] biomaRt_2.6.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] MASS_7.3-9 RCurl_1.5-0 splines_2.12.1
survival_2.36-2
>>> [5] tools_2.12.1 XML_3.2-0
>>> my_enrichedGO<-getEnrichedGO(my_annotated_regions,orgAnn="org.Hs.e
g.db",maxP
>>> =0
>>> .01,multiAdj=FALSE,minGOterm=1,feature_id_type="ensembl_gene_id")
>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { :
>>> argument is of length zero
>>>> traceback()
>>> 2: addAncestors(this.GO[this.GO[, 3] == "BP", ], "bp")
>>> 1: getEnrichedGO(FC2_annotated_regions, orgAnn = "org.Hs.eg.db",
>>> maxP = 0.01, multiAdj = FALSE, minGOterm = 1,
feature_id_type =
>>> "ensembl_gene_id")
>>>
>>>
>>>
>>>> as.data.frame(my_annotated_regions[1:15,])
>>> space start end width names
peak strand
>>> 1 1 241997936 241998205 270 R-10060 ENSE00001749374 R-10060
+
>>> 2 1 237109743 237110002 260 R-10082 ENSE00001643382 R-10082
+
>>> 3 1 236080267 236080415 149 R-10086 ENSE00001807176 R-10086
+
>>> 4 1 233853245 233853514 270 R-10096 ENSE00001776382 R-10096
+
>>> 5 1 233727956 233728104 149 R-10097 ENSE00001442190 R-10097
+
>>> 6 1 230728554 230728823 270 R-10108 ENSE00001731401 R-10108
+
>>> 7 1 229687129 229687277 149 R-10113 ENSE00001439385 R-10113
+
>>> 8 1 228943263 228943412 150 R-10121 ENSE00001903546 R-10121
+
>>> 9 1 218358885 218359176 292 R-10157 ENSE00001439386 R-10157
+
>>> 10 1 212254259 212254408 150 R-10179 ENSE00001624346 R-10179
+
>>> 11 1 210086264 210086513 250 R-10184 ENSE00001903225 R-10184
+
>>> 12 1 209863549 209863698 150 R-10185 ENSE00001336255 R-10185
+
>>> 13 1 207437117 207437264 148 R-10190 ENSE00001742112 R-10190
+
>>> 14 1 190352400 190352548 149 R-10246 ENSE00001782518 R-10246
+
>>> 15 1 184432607 184432755 149 R-10260 ENSE00001283926 R-10260
+
>>> feature start_position end_position insideFeature
>>> distancetoFeature
>>> 1 ENSE00001749374 241995237 241996089 downstream
>>> 2699
>>> 2 ENSE00001643382 237144639 237145008 upstream
>>> -34896
>>> 3 ENSE00001807176 236078715 236078821 downstream
>>> 1552
>>> 4 ENSE00001776382 233807017 233807237 downstream
>>> 46228
>>> 5 ENSE00001442190 233749750 233750272 upstream
>>> -21794
>>> 6 ENSE00001731401 230728406 230728586 overlapEnd
>>> 148
>>> 7 ENSE00001439385 229685652 229685769 downstream
>>> 1477
>>> 8 ENSE00001903546 228882063 228882416 downstream
>>> 61200
>>> 9 ENSE00001439386 218303137 218303294 downstream
>>> 55748
>>> 10 ENSE00001624346 212253973 212254092 downstream
>>> 286
>>> 11 ENSE00001903225 210111538 210111622 upstream
>>> -25274
>>> 12 ENSE00001336255 209859550 209859630 downstream
>>> 3999
>>> 13 ENSE00001742112 207438342 207438381 upstream
>>> -1225
>>> 14 ENSE00001782518 190331193 190331400 downstream
>>> 21207
>>> 15 ENSE00001283926 184446520 184446737 upstream
>>> -13913
>>> shortestDistance fromOverlappingOrNearest
>>> 1 1847 NearestStart
>>> 2 34637 NearestStart
>>> 3 1446 NearestStart
>>> 4 46008 NearestStart
>>> 5 21646 NearestStart
>>> 6 32 NearestStart
>>> 7 1360 NearestStart
>>> 8 60847 NearestStart
>>> 9 55591 NearestStart
>>> 10 167 NearestStart
>>> 11 25025 NearestStart
>>> 12 3919 NearestStart
>>> 13 1078 NearestStart
>>> 14 21000 NearestStart
>>> 15 13765 NearestStart
>>>
>>>
>>> Zhu, Lihua (Julie) wrote:
>>>> Hi Eric,
>>>>
>>>> Could you please post the session information with sessionInfo()
command?
>>>> Could you please also send a few ensembl IDs in your annotated
dataset?
>>>> Thanks!
>>>>
>>>> Best regards,
>>>>
>>>> Julie
>>>>
>>>>
>>>> On 1/4/11 6:51 PM, "Eric Cabot" <elcabot at="" gmail.com=""> wrote:
>>>>
>>>>> I am a relatively new Bioconductor user and I am trying to
analyze some
>>>>> ChIP-seq results that came from QuEST using the ChIPpeakAnno
package.
>>>>>
>>>>> After importing the regions of interest into RangedData objects
and doing
>>>>> the following:
>>>>>
>>>>>
>>>> ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapiens_
gene_ensem
>>>> bl
>>>> ">
>>>> )
>>>>> ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART,
>>>>> featureType="ExonPlusUtr")
>>>>>
>>>>>
>>>>> I had no problem annotating and generating a Venn diagram to
show the
>>>>> overlaps between my three sets of peaks. To annotate, I used:
>>>>>
>>>>> annotated_regions=annotatePeakInBatch(myranged,
>>>>> AnnotationData=ENSEMBL_ExonPlus_Annotation)
>>>>>
>>>>>
>>>>> But I cannot seem to get the getEnrichedGo method to work on
this (or my
>>>>> other two annotated regions). Here is a typical command line:
>>>>>
>>>>>
>>>>> my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg
.db",maxP=
>>>>> 0.
>>>>> 01
>>>>> ,multiAdj=TRUE,minGOterm=1,
>>>>> multiAdjMethod="BH",feature_id_type="ensembl_gene_id")
>>>>>
>>>>> and here is a typical error message:
>>>>>
>>>>> enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db
",maxP=0.0
>>>>> 1,
>>>>> mu
>>>>> ltiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id")
>>>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { :
>>>>> argument is of length zero
>>>>>
>>>>>
>>>>> Which leads me to ask:
>>>>>
>>>>> 1) Is this error message supposed to be meaningful to me-i.e. a
user-or is
>>>>> it something that I should be sending to the developer of the
package?
>>>>>
>>>>> 2) Is there anything obvious from this that suggests what
corrective
>>>>> action I should be taking?
>>>>>
>>>>>
>>>>> Eric Cabot
>>>>> University of Wisconsin
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>>
https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>>
http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>
>>
>>
>