Hello,
I am new to R and Bioconductor, so I apologize for any errors. I am trying to use the MSigDB database to retrieve the KEGG GMT data for a GSEA analysis and convert the human genes to zebrafish genes. I followed a previous post on how to retrieve the data; in that case, the human KEGG gene symbols were converted to human ENSEMBL IDs. I tried replacing "org.Hs.eg.db" with the zebrafish version, "org.Dr.eg.db". However, the resulting collection now reports 0 unique identifiers. I assume this is because I am trying to jump straight from human gene symbols to zebrafish ENSEMBL IDs. Does anyone have a suggestion for converting the human genes to zebrafish genes/ENSEMBL IDs in the GMT format used by the GSEA software?
Thank you.
> setwd("C:/Users/georg/Desktop")
> library(GSEABase)
> library(org.Dr.eg.db)
> keggsym <- getGmt("c2.cp.kegg.v7.1.symbols.gmt", geneIdType=SymbolIdentifier())
> keggsym
GeneSetCollection
  names: KEGG_GLYCOLYSIS_GLUCONEOGENESIS, KEGG_CITRATE_CYCLE_TCA_CYCLE, ..., KEGG_VIRAL_MYOCARDITIS (186 total)
  unique identifiers: ACSS2, GCK, ..., EIF4G1 (5242 total)
  types in collection:
    geneIdType: SymbolIdentifier (1 total)
    collectionType: NullCollection (1 total)
> keggens <- mapIdentifiers(keggsym, ENSEMBLIdentifier("org.Dr.eg.db"))
> keggens
GeneSetCollection
  names: KEGG_GLYCOLYSIS_GLUCONEOGENESIS, KEGG_CITRATE_CYCLE_TCA_CYCLE, ..., KEGG_VIRAL_MYOCARDITIS (186 total)
  unique identifiers: (0 total)
  types in collection:
    geneIdType: ENSEMBLIdentifier (1 total)
    collectionType: NullCollection (1 total)
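
For anyone reading along, the missing step appears to be an explicit human-to-zebrafish ortholog lookup. The transcript above looks up human symbols directly in the zebrafish annotation package, and since nothing matches, every set ends up empty. Below is a minimal base-R sketch of the idea, not a definitive implementation. The gene sets are a toy stand-in for the parsed GMT file, the `orthologs` table and its `ZF_ID_*` values are hypothetical placeholders (a real table would have to come from an orthology resource such as Ensembl), and `map_sets` and `to_gmt_lines` are helper names made up for this illustration.

```r
# Toy human gene sets, shaped like a parsed GMT file (list names are set names).
human_sets <- list(
  SET_A = c("GCK", "ACSS2", "NOHOMOLOG1"),
  SET_B = c("GCK", "EIF4G1")
)

# HYPOTHETICAL ortholog table: placeholder IDs, not real zebrafish Ensembl IDs.
# One human gene may map to several zebrafish genes (see EIF4G1).
orthologs <- data.frame(
  human_symbol = c("GCK", "ACSS2", "EIF4G1", "EIF4G1"),
  zebrafish_id = c("ZF_ID_1", "ZF_ID_2", "ZF_ID_3", "ZF_ID_4"),
  stringsAsFactors = FALSE
)

# Translate every set through the ortholog table.
# Genes without an ortholog are dropped; duplicates are removed.
map_sets <- function(sets, table) {
  lapply(sets, function(genes) {
    unique(table$zebrafish_id[table$human_symbol %in% genes])
  })
}

# Build GMT lines: set name, description, then gene IDs, tab-separated.
to_gmt_lines <- function(sets, description = "na") {
  vapply(names(sets), function(nm) {
    paste(c(nm, description, sets[[nm]]), collapse = "\t")
  }, character(1), USE.NAMES = FALSE)
}

zebrafish_sets <- map_sets(human_sets, orthologs)
gmt_lines <- to_gmt_lines(zebrafish_sets)

# Write to a temporary location; change the path for real use.
writeLines(gmt_lines, file.path(tempdir(), "kegg_zebrafish_sketch.gmt"))
```

Note that this sketch keeps every zebrafish ortholog of a human gene and silently drops genes that have none, so set sizes will differ from the human originals. Whether that is appropriate depends on the analysis.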
Thank you for the detailed response. The code worked right away. I appreciate the quick response.