Hi,
I have just installed the latest ChIPpeakAnno and tried the example code and data from the vignette, but I got an error. I get the same error with my own data as well. How can I solve this?
A separate question: when a peak is annotated to the nearest TSS, does it use the summit or the start position from the MACS file?
https://bioconductor.org/packages/devel/bioc/vignettes/ChIPpeakAnno/inst/doc/ChIPpeakAnno.html
macs <- system.file("extdata", "MACS_peaks.xls", package="ChIPpeakAnno")
macsOutput <- toGRanges(macs, format="MACS")
duplicated or NA names found. Rename all the names by numbers.
Many thanks,
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 24 (Workstation Edition)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel grid stats graphics grDevices utils
[8] datasets methods base
other attached packages:
[1] EnsDb.Hsapiens.v75_2.1.0 ensembldb_1.6.2 GenomicFeatures_1.26.0
[4] AnnotationDbi_1.36.0 Biobase_2.34.0 ChIPpeakAnno_3.8.9
[7] VennDiagram_1.6.17 futile.logger_1.4.3 GenomicRanges_1.26.1
[10] GenomeInfoDb_1.10.1 Biostrings_2.42.1 XVector_0.14.0
[13] IRanges_2.8.1 S4Vectors_0.12.1 BiocGenerics_0.20.0
Hi,
I am trying to make a custom annotation file to use with ChIPpeakAnno. I am starting with an Ensembl GTF file. The following command gives the error: duplicated or NA names found. Rename all the names by numbers.
annoData <- toGRanges(gff, format="GFF")
Which part of the GTF file does it not like?
If I run annotatePeakInBatch using this file:
I get the error:
Error in `rownames<-`(`*tmp*`, value = c("(-73.9,5e+03]", "(5e+03,9.99e+03]", : invalid rownames length
In addition: Warning message:
In annotatePeakInBatch(myPeakList = peaks, AnnotationData = annoData, : not all the seqnames of myPeakList is in the AnnotationData.
Could someone please explain what this means and what I need to change?
Thank you!
Hi,
You mentioned that you downloaded the annotation file from Ensembl in GTF format. If so, calling toGRanges with format = "GFF" is not correct, because the GTF format differs from the GFF format. To keep your code unchanged, could you please download the annotation file in GFF format instead? Alternatively, assuming that you are interested in human gene annotation, you can use the following code to get the annotation.
library(EnsDb.Hsapiens.v86)
annoData <- toGRanges(EnsDb.Hsapiens.v86, feature="gene")
Best regards, Julie
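A possible follow-up on the warning "not all the seqnames of myPeakList is in the AnnotationData": Ensembl annotation names chromosomes "1", "2", "X", whereas peak files often use UCSC-style names such as "chr1". The sketch below is only an illustration under that assumption, not a confirmed diagnosis of the error above. It reuses the example MACS file from the first post and the EnsDb annotation from the reply, and uses seqlevels() and seqlevelsStyle() from GenomeInfoDb to compare and harmonize the names before annotating. Pick the EnsDb release that matches the genome build your peaks were called on.

```r
library(ChIPpeakAnno)
library(EnsDb.Hsapiens.v86)
library(GenomeInfoDb)

# Example peaks shipped with ChIPpeakAnno (same file as in the first post).
macs <- system.file("extdata", "MACS_peaks.xls", package = "ChIPpeakAnno")
peaks <- toGRanges(macs, format = "MACS")

# Gene-level annotation from the EnsDb package, as in the reply above.
annoData <- toGRanges(EnsDb.Hsapiens.v86, feature = "gene")

# Assumption: the peaks use UCSC-style names ("chr1") while the Ensembl
# annotation uses "1". List peak chromosomes missing from the annotation.
setdiff(seqlevels(peaks), seqlevels(annoData))

# Rename the peak chromosomes to the Ensembl naming convention.
seqlevelsStyle(peaks) <- "Ensembl"

# Annotate each peak with its nearest feature (default settings).
annotated <- annotatePeakInBatch(peaks, AnnotationData = annoData)
```

If setdiff() still returns names after the renaming step, those peaks lie on sequences that the annotation does not cover, and the warning is expected for them.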