Hi Marc and others,
I am trying to use the wonderful 'AnnotationHub' package to retrieve some annotation, but I ran into a tricky little problem: there is no UniProt information for some other organisms in AnnotationHub, as shown below.
For Homo sapiens:
> library(AnnotationHub)
> hub <- AnnotationHub()
snapshotDate(): 2016-08-15
> query(hub, c("OrgDb","Homo sapiens"))
AnnotationHub with 1 record
# snapshotDate(): 2016-08-15
# names(): AH49582
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Homo sapiens
# $rdataclass: OrgDb
# $title: org.Hs.eg.db.sqlite
# $description: NCBI gene ID based annotations about Homo sapiens
# $taxonomyid: 9606
# $genome: NCBI genomes
# $sourcetype: NCBI/ensembl
# $sourceurl: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/, ftp://ftp.ensembl.org/pub/current_fasta
# $sourcelastmodifieddate: NA
# $sourcesize: NA
# $tags: NCBI, Gene, Annotation
# retrieve record with 'object[["AH49582"]]'
> human<-hub[["AH49582"]]
loading from cache '/Users/RCPA/Documents/AppData/.AnnotationHub/56312'
> keytypes(human)
[1] "ACCNUM" "ALIAS" "ENSEMBL" "ENSEMBLPROT" "ENSEMBLTRANS" "ENTREZID" "ENZYME"
[8] "EVIDENCE" "EVIDENCEALL" "GENENAME" "GO" "GOALL" "IPI" "MAP"
[15] "OMIM" "ONTOLOGY" "ONTOLOGYALL" "PATH" "PFAM" "PMID" "PROSITE"
[22] "REFSEQ" "SYMBOL" "UCSCKG" "UNIGENE" "UNIPROT"
For Solanum lycopersicum:
> query(hub, c("OrgDb","Solanum lycopersicum"))
AnnotationHub with 2 records
# snapshotDate(): 2016-08-15
# $dataprovider: NCBI, ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Solanum lycopersicum
# $rdataclass: OrgDb
# additional mcols(): taxonomyid, genome, description, tags, sourceurl, sourcetype
# retrieve records with, e.g., 'object[["AH13359"]]'
title
AH13359 | org.Solanum_lycopersicum.eg.sqlite
AH48047 | org.Solanum_lycopersicum.eg.sqlite
> tomato<-hub[["AH48047"]]
loading from cache '/Users/RCPA/Documents/AppData/.AnnotationHub/54353'
> keytypes(tomato)
[1] "ACCNUM" "ALIAS" "ENTREZID" "EVIDENCE" "EVIDENCEALL" "GENENAME" "GID" "GO"
[9] "GOALL" "ONTOLOGY" "ONTOLOGYALL" "PMID" "REFSEQ" "SYMBOL" "UNIGENE"
As you can see, there is unexpectedly no "UNIPROT" key type for tomato! I think UniProt IDs are among the most basic pieces of information for any organism, so they should be included.
Could you therefore suggest a way around this, or an approach for adding the "UNIPROT" information to the tomato OrgDb?
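To make the gap explicit, here is a small sketch that compares the two key type lists. The vectors are transcribed by hand from the keytypes() output above (the names human_keys and tomato_keys are just for illustration), so this runs without AnnotationHub and is only as accurate as my copying:

```r
# Key types transcribed by hand from keytypes(human) above
human_keys <- c("ACCNUM", "ALIAS", "ENSEMBL", "ENSEMBLPROT", "ENSEMBLTRANS",
                "ENTREZID", "ENZYME", "EVIDENCE", "EVIDENCEALL", "GENENAME",
                "GO", "GOALL", "IPI", "MAP", "OMIM", "ONTOLOGY", "ONTOLOGYALL",
                "PATH", "PFAM", "PMID", "PROSITE", "REFSEQ", "SYMBOL",
                "UCSCKG", "UNIGENE", "UNIPROT")

# Key types transcribed by hand from keytypes(tomato) above
tomato_keys <- c("ACCNUM", "ALIAS", "ENTREZID", "EVIDENCE", "EVIDENCEALL",
                 "GENENAME", "GID", "GO", "GOALL", "ONTOLOGY", "ONTOLOGYALL",
                 "PMID", "REFSEQ", "SYMBOL", "UNIGENE")

# Key types available for human but absent from the tomato OrgDb
missing_in_tomato <- setdiff(human_keys, tomato_keys)
print(missing_in_tomato)

# Is UNIPROT available for tomato?
has_uniprot <- "UNIPROT" %in% tomato_keys
print(has_uniprot)
```

In the live session the same comparison would be setdiff(keytypes(human), keytypes(tomato)). UNIPROT is not the only key type missing for tomato; the ENSEMBL-related ones are absent as well.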
My sessionInfo():
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomeInfoDb_1.8.3 clusterProfiler_3.0.4 DOSE_2.10.7 org.Hs.eg.db_3.3.0 sqldf_0.4-10
[6] RSQLite_1.0.0 DBI_0.4-1 gsubfn_0.6-6 proto_0.3-10 AnnotationDbi_1.34.4
[11] IRanges_2.6.1 S4Vectors_0.10.2 Biobase_2.32.0 AnnotationHub_2.4.2 BiocGenerics_0.18.0
loaded via a namespace (and not attached):
[1] qvalue_2.4.2 shinyjs_0.6 reshape2_1.4.1
[4] lattice_0.20-33 splines_3.3.1 tcltk_3.3.1
[7] colorspace_1.2-6 miniUI_0.1.1 htmltools_0.3.5
[10] chron_2.3-47 interactiveDisplayBase_1.10.3 XML_3.98-1.4
[13] topGO_2.24.0 matrixStats_0.50.2 plyr_1.8.4
[16] stringr_1.0.0 munsell_0.4.3 GOSemSim_1.30.3
[19] gtable_0.2.0 SparseM_1.7 httpuv_1.3.3
[22] BiocInstaller_1.22.3 curl_1.1 GSEABase_1.34.0
[25] Rcpp_0.12.6 xtable_1.8-2 scales_0.4.0
[28] DO.db_2.9 graph_1.50.0 annotate_1.50.0
[31] mime_0.5 ggplot2_2.1.0 digest_0.6.10
[34] stringi_1.1.1 shiny_0.13.2 grid_3.3.1
[37] tools_3.3.1 magrittr_1.5 tibble_1.1
[40] GO.db_3.3.0 tidyr_0.5.1 rsconnect_0.4.3
[43] assertthat_0.1 httr_1.2.1 R6_2.1.2
[46] igraph_1.0.1
Thanks a lot for helping ^_^
Regards,
Shisheng
Hi Valerie Obenchain, I am facing a similar problem. I want to do a GO enrichment analysis (ORA and/or GSEA) on a proteomics dataset, so I have UniProt IDs. I was thinking of using clusterProfiler, as it looks like an easy and effective package for this analysis. Unfortunately, clusterProfiler only supports the OrgDb packages of model organisms. I am trying to import Solanum lycopersicum from AnnotationHub, but it does not seem to work, and it does not have UNIPROT IDs yet either. Could you suggest anything? Any kind of solution is much appreciated. Thanks, Alberto.
Alberto, you might get more assistance by creating a new post for this. The original post is very old and may not have as many followers as a new post would attract. If you do, I suggest selecting clusterprofiler and annotationhub in Post Tags, as well as any other tags you think useful.