I used both the gage package in R and the GSEA software for KEGG pathway analysis. Since the two tools use different algorithms, I got slightly different lists of gene sets. I was able to find the genes in each GSEA gene set on the GSEA website, but I can't find gene lists there for some of the gene sets returned by the GAGE analysis. A couple of those are:
hsa04659 Th17 cell differentiation
hsa04380 Osteoclast differentiation
I could not find these pathways in the GSEA C2:CP:KEGG collection. I did find the list of genes on the KEGG website (http://www.genome.jp/dbget-bin/www_bget?hsa04659), but I would like to download this list. Does anyone know how to get the list of genes in these gene sets? Is there an R package that lets you download the genes in a gene set from the KEGG database?
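One possible way to do this, as a rough sketch rather than the only approach: the Bioconductor package KEGGREST can fetch a pathway record with keggGet(), and for human pathways the GENE field of that record should come back as a flat character vector alternating between the Entrez Gene ID and a "SYMBOL; gene name" string. The helper names gene_vector_to_matrix and get_pathway_genes, the object name zz, and the output file name are my own choices for illustration; the alternating layout is an assumption worth checking with head() on your own result.

```r
# Reshape KEGG's flat GENE vector into a two-column matrix.
# Assumes entries alternate: Entrez Gene ID, then "SYMBOL; gene name".
gene_vector_to_matrix <- function(gene_vec) {
  matrix(gene_vec, ncol = 2, byrow = TRUE,
         dimnames = list(NULL, c("ENTREZID", "DESCRIPTION")))
}

# Fetch one pathway record from KEGG and return its genes as a matrix.
# Needs the KEGGREST package and internet access.
get_pathway_genes <- function(pathway_id) {
  record <- KEGGREST::keggGet(pathway_id)[[1]]
  gene_vector_to_matrix(record$GENE)
}

# Example (uncomment to run; contacts the KEGG server):
# zz <- get_pathway_genes("hsa04659")   # Th17 cell differentiation
# head(zz)
# write.csv(zz, "hsa04659_genes.csv", row.names = FALSE)
```

The same call with "hsa04380" should give the Osteoclast differentiation genes.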
The first column in the matrix contains the Entrez Gene ID, and the second contains the gene symbol, plus the gene name. You can also use the org.Hs.eg.db package to annotate if you want an arguably cleaner output.
> library(org.Hs.eg.db)
> select(org.Hs.eg.db, head(zz[,1]), c("SYMBOL","ENSEMBL"))
'select()' returned 1:1 mapping between keys and columns
  ENTREZID SYMBOL         ENSEMBL
1     3553   IL1B ENSG00000125538
2     3554  IL1R1 ENSG00000115594
3     3556 IL1RAP ENSG00000196083
4     5600 MAPK11 ENSG00000185386
5     6300 MAPK12 ENSG00000188130
6     5603 MAPK13 ENSG00000156711
Note that, apart from the added Ensembl ID, this is the same information you already have in the matrix.
Thank you, this worked!
Could you also help me with matching the KEGG gene IDs to either gene symbols, Entrez ID, or Ensembl IDs?
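For human pathways the KEGG gene IDs are the Entrez Gene IDs, so no separate lookup is needed for those. If you want the symbol and gene name in their own columns, here is a small sketch that splits the "SYMBOL; gene name" strings in base R. The function name split_kegg_description is made up for this example, and it assumes the first semicolon separates the symbol from the name; entries without a semicolon are left unchanged in both columns.

```r
# Split KEGG's "SYMBOL; gene name" strings into separate columns.
# Assumes the text before the first ";" is the gene symbol.
split_kegg_description <- function(entrez, description) {
  data.frame(
    ENTREZID = entrez,
    SYMBOL   = sub(";.*$", "", description),             # text before first ";"
    GENENAME = trimws(sub("^[^;]*;", "", description)),  # text after first ";"
    stringsAsFactors = FALSE
  )
}

# Example with two entries in the format KEGG returns for human genes:
genes <- split_kegg_description(
  c("3553", "5600"),
  c("IL1B; interleukin 1 beta",
    "MAPK11; mitogen-activated protein kinase 11")
)
print(genes)
```

With the matrix from KEGG you would call it as split_kegg_description(zz[,1], zz[,2]), and the ENTREZID column can then be passed to select() on org.Hs.eg.db if you also need Ensembl IDs.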