Entering edit mode
Dear Carlos,
The code looks fine except the following line. You would want to use
MyPeakList instead of the first 6 records as MyPeakList[1:6,].
AnnotatedPeak = annotatePeakInBatch(MyPeakList[1:6,],
AnnotationData=annotation.cds)
Please let me know if this resolves the issue. Thanks!
Best regards,
Julie
On 3/1/12 12:13 PM, "Carlos Araya" <claraya@stanford.edu> wrote:
Dear Julie,
I'm ready to perform GO enrichment analysis on ~292 ChIP-seq datasets
and would like to use your ChIPpeakAnno package. However, I've
encountered problems that I believe are due to differences in the
gene/feature names used. Here is an excerpt of my code:
library(ChIPpeakAnno)
library(org.Ce.eg.db)
# load peak calls and genomic annotations, filter for gene (CDS)
annotations:
peaks.bed = read.table("YL474_HPL-2_L1_yale_stn_1x2_peaks.bed",
header=FALSE)
annotation = read.table("in2shape_wbComp.bed", header=TRUE)
annotation.cds = annotation[annotation$feature.type == "cds",]
summary(annotation.pct)
chrm start end feature
score strand name gene
chrI :4593 Min. : 58 Min. : 779 WBGene00000001:
0 .:30163 -:14976 2L52.1 : 0 WBGene00000001: 0
chrII :5195 1st Qu.: 4687094 1st Qu.: 4688288 WBGene00000002:
0 .: 0 2RSSE.1: 0 WBGene00000002: 0
chrIII:4345 Median : 8229719 Median : 8235458 WBGene00000003:
0 +:15187 2RSSE.2: 0 WBGene00000003: 0
chrIV :5126 Mean : 8415951 Mean : 8419438 WBGene00000004:
0 4R79.2a: 0 WBGene00000004: 0
chrM : 12 3rd Qu.:11924503 3rd Qu.:11929206 WBGene00000005:
0 4R79.2b: 0 WBGene00000005: 0
chrV :6763 Max. :20915455 Max. :20922707 (Other) :
0 (Other): 0 (Other) : 0
chrX :4129 NA's
:30163 NA's :30163 NA's :30163
# convert bed tables into ranged data:
MyPeakList = BED2RangedData(peaks.bed)
annotation.cds = BED2RangedData(annotation.cds)
# annotate peaks:
AnnotatedPeak = annotatePeakInBatch(MyPeakList[1:6,],
AnnotationData=annotation.cds)
as.data.frame(AnnotatedPeak)
space start end width names peak strand
feature start_position end_position insideFeature distancetoFeature
shortestDistance
1 I 16921 17165 245 MACS_peak_2 00006 MACS_peak_2 +
00006 11618 16804 downstream 5303
117
2 I 110632 110850 219 MACS_peak_4 00018 MACS_peak_4 +
00018 111036 112316 upstream -404
186
3 I 536767 536967 201 MACS_peak_6 00070 MACS_peak_6 +
00070 537125 541634 upstream -358
158
4 I 3680 4064 385 MACS_peak_1 02345 MACS_peak_1 -
02345 4221 10230 downstream 6550
157
5 I 108389 108741 353 MACS_peak_3 02354 MACS_peak_3 -
02354 101213 108161 upstream -228
228
6 I 315156 315412 257 MACS_peak_5 02372 MACS_peak_5 -
02372 310981 314992 upstream -164
164
fromOverlappingOrNearest
1 NearestStart
2 NearestStart
3 NearestStart
4 NearestStart
5 NearestStart
6 NearestStart
# execute enrichment analysis:
enrichedGO = getEnrichedGO(AnnotatedPeak, orgAnn = "org.Ce.eg.db",
maxP = 0.01, multiAdj = TRUE, minGOterm = 10, multiAdjMethod = "BH" )
Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { :
argument is of length zero
I've highlighted in blue the outputs of distinct commands and in red
the error that I obtain upon attempting GO analysis. Any ideas as to
how I should annotate my features (which codes I should use) or how to
prepare the objects in R for the enrichment analysis? Please note that
in my annotation BED file the WormBase identifiers for each gene are
also available (under the "gene" column). I noticed there is a
"org.Ce.egWORMBASE" object in the "org.Ce.eg.db" library which may be
useful, but I do not understand how to use it convert between gene
IDs.
Kind regards,
Carlos L. Araya, PhD
Stanford University | Department of Genetics
300 Pasteur Dr. #M348 | Stanford, CA 94305
m: 206.661.3648 | e: claraya@stanford.edu
[[alternative HTML version deleted]]