I have transcriptome data for an in-house sequenced bacterial genome. I made the database for my bacterium using the makeOrgPackageFromNCBI command.
How can I use this database for KEGG and GO analysis (ORA or GSEA)? All the commands I found seem to rely on a KEGG organism database for such an analysis. I tried one of them (the KEGGrich command), but the output indicated that "Gene ids did not match".
Alternatively, I could use the method above with the closest strain to my species from the KEGG database. I have a set of gene IDs (e.g. R30_hybrid_002367) for which I have extracted KO and GO ids using the eggNOG tool. Can I map these KO ids as term2gene, and how can I extract the term2name database for this?
If I use the KEGG database of the closest species (with term2gene and term2name) for the analysis, and I have identified the KO ids of my filtered DEGs using eggNOG, there are still some genes, such as sRNAs or hypothetical proteins, for which I do not have KO ids. Will I miss those gene ids from my analysis?
Please help me solve this issue.
Please show the lines of code you tried, as well as the content of your input files. It seems you have all the annotation information that is required. Also double-check your code against the help pages of each function, because (for example) the function `KEGGrich` doesn't exist. The format of the `term2name` and `term2gene` objects (both being `data.frame`s) is also described there. You may want to check this thread (for KEGG) and this thread (for GO) on how to obtain the `term2gene` and `term2name` objects.

You already found the aforementioned thread on the use of KO ids with `clusterProfiler`, so what doesn't work for you? Again, please show your code.

Yes, if entries do not map to a KO id they will indeed be filtered out and not included in a subsequent analysis.
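To make the expected table format concrete, here is a minimal base-R sketch of how `term2gene` and `term2name` could be built from an eggNOG-style annotation table. Everything in it is an assumption for illustration: the column names `query` and `KEGG_Pathway`, the use of `-` for "no annotation", and the gene ids other than R30_hybrid_002367 are made up, so adapt them to your actual file. It uses KEGG pathway ids as the terms and your own gene ids as the genes, and it shows that a gene without any annotation simply drops out of the table.

```r
# Hypothetical excerpt of an eggNOG-style annotation table.
# Column names and the "-" placeholder are assumptions; check your own file.
eggnog <- data.frame(
  query        = c("R30_hybrid_002367", "R30_hybrid_000001", "R30_hybrid_000002"),
  KEGG_Pathway = c("ko00010,ko00620", "-", "ko00010"),
  stringsAsFactors = FALSE
)

# Genes without an annotation cannot be assigned to any term, so drop them.
annotated <- eggnog[!is.na(eggnog$KEGG_Pathway) & eggnog$KEGG_Pathway != "-", ]

# Split the comma-separated pathway ids into one row per (term, gene) pair.
pathways  <- strsplit(annotated$KEGG_Pathway, ",", fixed = TRUE)
term2gene <- data.frame(
  term = unlist(pathways),
  gene = rep(annotated$query, lengths(pathways)),
  stringsAsFactors = FALSE
)

# term2name maps each term id to a human-readable description.
term2name <- data.frame(
  term = c("ko00010", "ko00620"),
  name = c("Glycolysis / Gluconeogenesis", "Pyruvate metabolism"),
  stringsAsFactors = FALSE
)

print(term2gene)
print(term2name)
```

In both tables the first column is the term and the second is the gene (or the name), which is the column order the help pages describe.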
No disrespect intended, but based on the questions you posted I think you would profit most from asking for guidance from someone more experienced at your local institute.
Thank you so much for answering. I was wondering how I can link the database I created (using makeOrgPackageFromNCBI) for KEGG and GO analysis. Do I need to specify it as the "organism object" (org=) in the KEGG enrichment code, or do I need to link it by the term2gene and term2name method as explained in this thread?
What did you try yourself? Again, please show your code!
For KEGG-based analysis you can make use of the convenience functions `enrichKEGG` and `gseKEGG`. If you use these, you do NOT need an `OrgDb`, but just the KEGG organism code of your organism of interest (= `hsy`). See `?enrichKEGG` and `?gseKEGG`.

For GO-based analysis you can make use of the convenience functions `enrichGO` and `gseGO`. For these 2 functions you will indeed need an `OrgDb`. See `?enrichGO` and `?gseGO`.

The nice thing about these convenience functions is that they automagically take care of retrieving and formatting the required annotation information (= the `TERM2GENE` and `TERM2NAME` files); you only need to provide a list of (selected) genes as input (of course having the proper type of id).

Yet, in the end these convenience functions make use of the generic functions `enricher` and `GSEA`. You could also use these generic functions directly yourself. If you do so, in both cases you will have to provide the `TERM2GENE` and `TERM2NAME` files as well (`TERM2NAME` is technically an optional argument). This may be useful if, for example, you already have a table in which ids are mapped to a GO category, or any other gene set. This will allow for flexibility and also permit you to skip the creation of an `OrgDb`. See also: https://yulab-smu.top/biomedical-knowledge-mining-book/universal-api.html
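As a rough illustration of the generic route, the sketch below calls `enricher` with user-supplied `TERM2GENE` and `TERM2NAME` tables. The gene ids, set names and the selected gene list are all hypothetical toy data, not output from any real annotation; the call itself only runs if `clusterProfiler` is installed.

```r
# Toy annotation: 40 hypothetical genes split over two hypothetical gene sets.
genes <- sprintf("gene_%03d", 1:40)

term2gene <- data.frame(
  term = rep(c("set_A", "set_B"), each = 20),  # first column: term id
  gene = genes,                                # second column: gene id
  stringsAsFactors = FALSE
)

term2name <- data.frame(
  term = c("set_A", "set_B"),
  name = c("Hypothetical set A", "Hypothetical set B"),
  stringsAsFactors = FALSE
)

# Hypothetical selection of genes (e.g. DEGs); all of them belong to set_A.
deg <- genes[1:10]

# Run the over-representation analysis only when clusterProfiler is available.
if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  res <- clusterProfiler::enricher(
    gene      = deg,
    universe  = genes,
    TERM2GENE = term2gene,
    TERM2NAME = term2name
  )
  print(as.data.frame(res))
}
```

The ids in `deg` must be of the same type as those in the `gene` column of `TERM2GENE`; a mismatch between the two is one way to end up with no genes mapped.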