Recently I was analyzing RNA-Seq data from Populus euphratica. When I ran the GO enrichment analysis on the differentially expressed genes, it failed with the error "No gene can be mapped....".
Notably, the gene IDs can be converted between annotation types, but the enrichment analysis itself still fails.
Following the clusterProfiler documentation, I used AnnotationHub to retrieve the genome annotation set. The code is as follows:
library(AnnotationHub)
hub <- AnnotationHub::AnnotationHub()
query(hub, "Populus euphratica")
euphratica.OrgDb <- hub[["AH75896"]]
These are the extracted differentially expressed genes:
> gene
 [1] "105138834" "105135191" "105132258" "105140044" "105134133" "105133575" "105141909" "105123270" "105109891" "105132930" "105124276"
[12] "105113312" "105135724" "105136902" "105140249" "105114638" "105111984" "105110579" "105139159" "105142457" "105124308" "105122018"
[23] "105115857" "105130255" "105123530" "105126014" "105128710" "105127221" "105124911" "105128353" "105137258" "105137322" "105124790"
[34] "105122488" "105134690" "105133317" "105142048" "105139696" "105119632" "105131399" "105113266" "105128458" "105138530" "105137532"
[45] "105137080" "105107906" "105115364" "105133132" "105135049" "105111831" "105141437" "105136170" "105142560" "105142561" "105127580"
[56] "105130295" "105114049" "105110139" "105133472" "105139576" "105112511" "105139018" "105142810" "105113922" "105134758" "105130163"
[67] "105133142" "105125965" "105132391" "105127885" "105109447" "105139648" "105124384" "105128925" "105129037" "105116503" "105132431"
GO enrichment analysis:
library(clusterProfiler)  # provides enrichGO() and bitr()
ego <- enrichGO(
  gene          = gene,
  OrgDb         = euphratica.OrgDb,
  keyType       = "ENTREZID",
  ont           = "BP",
  pAdjustMethod = "BH",
  pvalueCutoff  = 0.01,
  qvalueCutoff  = 0.05
)
Then I got this error message:
--> No gene can be mapped....
--> Expected input gene ID: 20160501,20160469,20160501,20160510,20160473,20160415
--> return NULL...
Then I checked whether the genes exist in the database and whether the ID type matches, and both checks pass.
For example, the first gene, 105138834:
> "105138834" %in% gene
[1] TRUE
> "105138834" %in% keys(euphratica.OrgDb, "ENTREZID")
[1] TRUE
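The same check can be run over the whole vector instead of a single ID. This is only a minimal sketch: `key_coverage` is a hypothetical helper name, not a clusterProfiler or AnnotationDbi function, and the usage part assumes `gene` and `euphratica.OrgDb` are the objects defined above.

```r
# Hypothetical helper: fraction of the input IDs that are valid keys.
key_coverage <- function(ids, valid_keys) {
  mean(ids %in% valid_keys)
}

# Usage with the objects from this post; guarded so the snippet also runs
# in a session where they have not been created.
if (exists("gene") && exists("euphratica.OrgDb")) {
  valid <- AnnotationDbi::keys(euphratica.OrgDb, keytype = "ENTREZID")
  print(key_coverage(gene, valid))  # 1 means every ID is a known ENTREZID key
  print(setdiff(gene, valid))       # IDs that are not known keys, if any
}
```

A coverage of 1 would only confirm that the IDs are known ENTREZID keys. It says nothing about whether those genes carry GO annotation.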
And the gene annotation type can be converted:
df <- bitr(gene,
  fromType = "ENTREZID",
  toType = c("GENENAME", "GID", "SYMBOL"),
  OrgDb = euphratica.OrgDb
)
> head(df)
   ENTREZID                                           GENENAME       GID       SYMBOL
1 105138834                       uncharacterized LOC105138834 105138834 LOC105138834
2 105135191            U-box domain-containing protein 21-like 105135191 LOC105135191
3 105132258                       acid beta-fructofuranosidase 105132258 LOC105132258
4 105140044                           plasma membrane ATPase 4 105140044 LOC105140044
5 105134133     protein ASPARTIC PROTEASE IN GUARD CELL 2-like 105134133 LOC105134133
6 105133575 zinc finger CCCH domain-containing protein 47-like 105133575 LOC105133575
Thanks, Yao
I am not sure, but could it be that no Gene Ontology information is present in euphratica.OrgDb?
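One way to test that guess is to look at which columns the OrgDb exposes. As far as I know, enrichGO needs GO annotation stored in the OrgDb, so if no GO-related column is listed there would be nothing to map the genes to. The snippet below is only a sketch: `go_columns_present` is a hypothetical helper name, and the column names it looks for are the ones OrgDb packages usually provide, which I am assuming also applies to this AnnotationHub record.

```r
# Hypothetical helper (not part of any package): given the column names an
# OrgDb exposes, return the GO-related ones that are present.
go_columns_present <- function(cols) {
  go_cols <- c("GO", "GOALL", "ONTOLOGY", "ONTOLOGYALL")
  intersect(go_cols, cols)
}

# Usage with the object from the question; guarded so the snippet also runs
# in a session where euphratica.OrgDb has not been loaded.
if (exists("euphratica.OrgDb")) {
  print(go_columns_present(AnnotationDbi::columns(euphratica.OrgDb)))
}
```

If this prints `character(0)`, the OrgDb has no GO columns at all. If GO columns are listed, the next thing to look at would be whether the genes of interest actually have entries in them.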