Hello everyone, I have a question related to gene enrichment analysis using enrichGO function through clusterProfiler. As I am going to analyze nonmodel organism seabream so I used AnnotationHub to build OrgDb. I retrieved the annotation database
library(AnnotationHub)
hub <- AnnotationHub()
query(hub, "Sparus")
Sparus <- hub[["AH76028"]]
Then I mapped a sample of the gene list as
> or
[1] 115567596 115588662 115582432 115571168 115587108 115566188
[7] 115572180 115566795 115581069 115582779 115588636 115567341
But when I mapped using the enrichGo function I received below error
sample_test <- enrichGO(or, OrgDb=Sparus, pvalueCutoff=1, qvalueCutoff=1)
--> No gene can be mapped....
--> Expected input gene ID: 19555015,19555004,19555016,19555009,19555013,19555007
--> return NULL...
Actually, this is so strange because I could see that my IDs list is in the Sparus database when I check the keys
> keys(Sparus)
[1] "19555004" "19555005" "19555006" "19555007" "19555008"
[6] "19555009" "19555010" "19555011" "19555012" "19555013"
[11] "19555014" "19555015" "19555016" "115565810" "115565973"
[16] "115566089" "115566151" "115566188" "115566653" "115566665"
[21] "115566750" "115566976" "115567004" "115567006" "115567066"
I think the only problem is that in the annotation file it start with some ids that the pattern is different such as
[1] "19555004" "19555005" "19555006" "19555007" "19555008"
[6] "19555009" "19555010" "19555011" "19555012" "19555013"
[11] "19555014" "19555015" "19555016"
Could anyone give a suggestion to me to solve this problem? As I realized editing orgDb class is not easy at least for me! I think if I could remove those first ides then it will work, but I don't know-how!
FWIW ... I posted something similar this morning but ended up deleting it. I thought that just the keyType was the issue but when I try and run the code it still gives me an ERROR with their particular set of keys
so there might be a bug somewhere in the clusterProfiler code
Indeed.
Can't get GO mappings if the genes don't have GO terms appended
Hello James, Many thanks for your response, so It seems that for my intended species (Sparus aurata) because of annotation lack there is no GO term, then I should apply another approach. I am curious to know if there is no GO term so how some web services such as gprofiler can do a functional analysis? do such web services convert the IDs to their internal well-annotated databases such as human or zebrafish? Although, for seabream, the database exists in the ensemble and in grprofiler I can choose that!!
I appreciate it if anyone has an idea about this as well because it is important to know we get real functional analysis or just based on orthologue search.
This support site isn't intended to provide information about anything but Bioconductor packages, so ideally you would figure out how g:Profiler does things by for example reading on their webpage. I did that for you, and it says they do mappings via Ensembl. Which you can hypothetically do as well, although I have no idea if you can use that as input for
clusterprofiler
.Thank you, Games, Indeed, two methods that I have tried for functional analysis as enrichGO() of clusterprofile EX. enrichGO(V$id, OrgDb=, pvalueCutoff=1, qvalueCutoff=1) or params <-new("GOHyperGParams",annotation="org.Hs.eg",geneIds=V$id,universeGeneIds=z,ontology="BP",pvalueCutoff=hgCutoff,testDirection="over") They need to annotation which I think should be supported by Bioconductor such as org.Hs.eg or org.Dr.db so when we replaced the annotation file with our annotation file such as ones that we get through Annotation Hub() it gives error.
You could try
goana
from thelimma
package, which can use adata.frame
like what you get frombiomaRt
.Uh, that should be
kegga
. See here for more information.