Question

gene enrichment using enrichGO function throuth annotationHub orgDb

0

Entering edit mode

bnajafpour • 0

@bnajafpour-23448

Last seen 5.0 years ago

Hello everyone, I have a question related to gene enrichment analysis using enrichGO function through clusterProfiler. As I am going to analyze nonmodel organism seabream so I used AnnotationHub to build OrgDb. I retrieved the annotation database

library(AnnotationHub) 
hub <- AnnotationHub()
query(hub, "Sparus")
Sparus <- hub[["AH76028"]]
Then I mapped a sample of the gene list as 
> or
  [1] 115567596 115588662 115582432 115571168 115587108 115566188
  [7] 115572180 115566795 115581069 115582779 115588636 115567341

But when I mapped using the enrichGo function I received below error

sample_test <- enrichGO(or, OrgDb=Sparus, pvalueCutoff=1, qvalueCutoff=1)
--> No gene can be mapped....
--> Expected input gene ID: 19555015,19555004,19555016,19555009,19555013,19555007
--> return NULL...

Actually, this is so strange because I could see that my IDs list is in the Sparus database when I check the keys

> keys(Sparus)
   [1] "19555004"  "19555005"  "19555006"  "19555007"  "19555008" 
   [6] "19555009"  "19555010"  "19555011"  "19555012"  "19555013" 
  [11] "19555014"  "19555015"  "19555016"  "115565810" "115565973"
  [16] "115566089" "115566151" "115566188" "115566653" "115566665"
  [21] "115566750" "115566976" "115567004" "115567006" "115567066"

I think the only problem is that in the annotation file it start with some ids that the pattern is different such as

 [1] "19555004"  "19555005"  "19555006"  "19555007"  "19555008" 
 [6] "19555009"  "19555010"  "19555011"  "19555012"  "19555013" 
 [11] "19555014"  "19555015"  "19555016"

Could anyone give a suggestion to me to solve this problem? As I realized editing orgDb class is not easy at least for me! I think if I could remove those first ides then it will work, but I don't know-how!

software error enrichGO annotationHub clusterProfiler Job • 2.4k views

ADD COMMENT • link updated 5.0 years ago by James W. MacDonald 68k • written 5.0 years ago by bnajafpour • 0

score 2 · Answer 1 · 2020-04-29

2

Entering edit mode

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 5 hours ago

United States

Here is your hint:

> Sparus
 OrgDb object:
 | DBSCHEMAVERSION: 2.1
 | DBSCHEMA: NOSCHEMA_DB
 | ORGANISM: Sparus aurata
 | SPECIES: Sparus aurata
 | CENTRALID: GID                     <----------------- That's the hint!
 | Taxonomy ID: 8175
 | Db type: OrgDb
 | Supporting package: AnnotationDbi

And so

> or <- head(keys(Sparus), 300)
> library(clusterProfiler)
> sample_test <- enrichGO(or, Sparus, "GID", pvalueCutoff = 1, qvalueCutoff=1)
> class(sample_test)
 [1] "enrichResult"
 attr(,"package")
 [1] "DOSE"

ADD COMMENT • link 5.0 years ago James W. MacDonald 68k

0

Entering edit mode

FWIW ... I posted something similar this morning but ended up deleting it. I thought that just the keyType was the issue but when I try and run the code it still gives me an ERROR with their particular set of keys

or = c(115567596, 115588662, 115582432, 115571168, 115587108, 115566188)
or %in% keys(Sparus, keyType="GID")
[1] TRUE TRUE TRUE TRUE TRUE TRUE
> sample_test <- enrichGO(or, OrgDb=Sparus, keyType="GID", pvalueCutoff=1, qvalueCutoff=1)
--> No gene can be mapped....
--> Expected input gene ID: 19555016,19555009,19555004,19555007,19555015,19555013
--> return NULL...

so there might be a bug somewhere in the clusterProfiler code

ADD REPLY • link 5.0 years ago shepherl 4.1k

0

Entering edit mode

Indeed.

> select(Sparus, or, "GOALL", "GID")
 'select()' returned 1:1 mapping between keys and columns
         GID GOALL
 1 115567596  <NA>
 2 115588662  <NA>
 3 115582432  <NA>
 4 115571168  <NA>
 5 115587108  <NA>
 6 115566188  <NA>

Can't get GO mappings if the genes don't have GO terms appended

ADD REPLY • link 5.0 years ago James W. MacDonald 68k

0

Entering edit mode

Hello James, Many thanks for your response, so It seems that for my intended species (Sparus aurata) because of annotation lack there is no GO term, then I should apply another approach. I am curious to know if there is no GO term so how some web services such as gprofiler can do a functional analysis? do such web services convert the IDs to their internal well-annotated databases such as human or zebrafish? Although, for seabream, the database exists in the ensemble and in grprofiler I can choose that!!

I appreciate it if anyone has an idea about this as well because it is important to know we get real functional analysis or just based on orthologue search.

ADD REPLY • link 5.0 years ago bnajafpour • 0

0

Entering edit mode

This support site isn't intended to provide information about anything but Bioconductor packages, so ideally you would figure out how g:Profiler does things by for example reading on their webpage. I did that for you, and it says they do mappings via Ensembl. Which you can hypothetically do as well, although I have no idea if you can use that as input for clusterprofiler.

library(biomaRt)
mart <- useEnsembl("ensembl","saurata_gene_ensembl", mirror = "useast")
z <- getBM(c("ensembl_gene_id","entrezgene_id","go_id"), "entrezgene_id", head(keys(Sparus), 50), mart)
> head(z)
      ensembl_gene_id entrezgene_id      go_id
 1 ENSSAUG00010019100     115565810
 2 ENSSAUG00010009929     115565973 GO:0009267
 3 ENSSAUG00010009929     115565973 GO:0032099
 4 ENSSAUG00010009929     115565973 GO:0008343
 5 ENSSAUG00010009929     115565973 GO:0000186
 6 ENSSAUG00010009929     115565973 GO:0005184

ADD REPLY • link 5.0 years ago James W. MacDonald 68k

0

Entering edit mode

Thank you, Games, Indeed, two methods that I have tried for functional analysis as enrichGO() of clusterprofile EX. enrichGO(V$id, OrgDb=, pvalueCutoff=1, qvalueCutoff=1) or params <-new("GOHyperGParams",annotation="org.Hs.eg",geneIds=V$id,universeGeneIds=z,ontology="BP",pvalueCutoff=hgCutoff,testDirection="over") They need to annotation which I think should be supported by Bioconductor such as org.Hs.eg or org.Dr.db so when we replaced the annotation file with our annotation file such as ones that we get through Annotation Hub() it gives error.

ADD REPLY • link 5.0 years ago bnajafpour • 0

0

Entering edit mode

You could try goana from the limma package, which can use a data.frame like what you get from biomaRt.

ADD REPLY • link 5.0 years ago James W. MacDonald 68k

0

Entering edit mode

Uh, that should be kegga. See here for more information.

ADD REPLY • link 5.0 years ago James W. MacDonald 68k