I have RNA-Seq data from a prokaryotic non-model organism (Microbacterium) and I am doing a gene set enrichment analysis. I first mapped my amino acid sequences to KO annotations and then managed to run the gene set enrichment analysis by using organism = 'ko':
gse_kegg <- gseKEGG(
  geneList     = geneList,
  organism     = 'ko',
  minGSSize    = 120,
  pvalueCutoff = 0.05,
  verbose      = FALSE)
The output is rather unspecific (e.g. the upregulated term is "biosynthesis of secondary metabolites") and thus not very useful.
My second thought is that I could potentially make use of a closely related genome that is KEGG annotated and thus listed in https://www.genome.jp/kegg/catalog/org_list.html. However, I don't have the gene mapping between the reference and my sequences.
For example, when I try:
gse_kegg <- gseKEGG(
  geneList     = geneList,
  organism     = 'mfol',
  keyType      = 'kegg',
  minGSSize    = 120,
  pvalueCutoff = 0.05,
  verbose      = FALSE)
I obviously get an error: Expected input gene ID: ,DXT68_06835,DXT68_14490, because my gene names are not mapped to the KEGG names of the reference.
Does anybody have an idea how I can use the organism mfol with my input genes? How can I map my genes to, e.g., gene ID DXT68_15070?
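To make clearer what I am after: assuming I somehow had a two-column table linking my gene IDs to the mfol gene IDs (for instance from best hits against the reference proteome, which I have not produced yet), I imagine the renaming step would look roughly like the sketch below. The IDs starting with mygene_, the pairings in id_map, the fold-change values and the helper map_gene_list are all made up for illustration; only base R is used.

```r
# Toy ranked list keyed by my own (hypothetical) gene IDs; values are made-up log2 fold changes
my_gene_list <- c(mygene_001 = 2.1, mygene_002 = -1.4,
                  mygene_003 = 0.7, mygene_004 = -0.2)

# Hypothetical mapping table: my gene ID -> reference (mfol) gene ID.
# The pairings are invented; mygene_004 deliberately has no counterpart.
id_map <- data.frame(
  my_id  = c("mygene_001", "mygene_002", "mygene_003"),
  ref_id = c("DXT68_15070", "DXT68_06835", "DXT68_15070"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rename a ranked gene list to reference IDs
map_gene_list <- function(gene_list, id_map) {
  # Keep only genes that have a counterpart in the reference
  mapped <- id_map[id_map$my_id %in% names(gene_list), ]
  mapped$score <- unname(gene_list[mapped$my_id])
  # If several of my genes map to the same reference gene,
  # keep the one with the largest absolute score
  mapped <- mapped[order(-abs(mapped$score)), ]
  mapped <- mapped[!duplicated(mapped$ref_id), ]
  # Return a named numeric vector sorted in decreasing order
  out <- setNames(mapped$score, mapped$ref_id)
  sort(out, decreasing = TRUE)
}

ref_gene_list <- map_gene_list(my_gene_list, id_map)
```

The resulting ref_gene_list would then be what I pass as geneList together with organism = 'mfol'. What I am missing is a sensible way to obtain id_map in the first place.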
Any help would be greatly appreciated.