Question

How to select the genes mapped to an enriched KEGG pathway (kegga)

0

Entering edit mode

mat149 ▴ 80

@mat149-11450

Last seen 3 months ago

United States

Hello,

I used the kegga function to gather pathway enrichment for my dataset. I would like to obtain a data frame (or any possible format) containing which genes are "up" and "down" for a given kegg pathway. So for example, I want to select all genes from the "N" column, "up" column" and "down" column for the row "path:dre04020".

I would also like to do this for a gene ontology enrichment table returned from goana. For example, I would like to select/extract all genes from the N/up/down columns for row "GO:0022613".

I tried using the select function but haven't had much luck. Does anyone know if this can be done?

Thanks,

Matt

		N	Up	Down	P.Up	P.Down
path:dre01100	Metabolic pathways	1074	291	27	4.29E-85	0.000174
path:dre04020	Calcium signaling pathway	216	70	4	3.03E-24	0.266749
path:dre01200	Carbon metabolism	114	47	2	1.27E-21	0.403579
path:dre04080	Neuroactive ligand-receptor interaction	313	77	13	1.99E-18	0.000105
path:dre00010	Glycolysis / Gluconeogenesis	70	32	1	1.02E-16	0.575509

		Ont	N	Up	Down	P.Up	P.Down
GO:0022613	ribonucleoprotein complex biogenesis	BP	205	0	87	1	1.65E-60
GO:0034660	ncRNA metabolic process	BP	215	0	87	1	2.22E-58
GO:0005730	nucleolus	CC	132	0	68	1	1.93E-54
GO:0042254	ribosome biogenesis	BP	143	0	69	1	1.04E-52
GO:0034470	ncRNA processing	BP	163	0	72	1	1.45E-51

kegga • 5.1k views

ADD COMMENT • link updated 8.4 years ago by Gordon Smyth 52k • written 8.4 years ago by mat149 ▴ 80

score 0 · Answer 1 · 2016-11-19

0

Entering edit mode

Gordon Smyth 52k

@gordon-smyth

Last seen 3 hours ago

WEHI, Melbourne, Australia

This is the sort of thing that we don't provide special functions for in limma, but which you can do reasonably easily for yourself if you are confident in using R.

Suppose you have done a limma linear model analysis resulting in a fit object. You might look at the topTable:

tab <- topTable(fit, n=Inf)

Now you want to a KEGG analysis. You could download the KEGG pathway annotation for zebra fish by:

GK <- getGeneKEGGLinks(species.KEGG = "dre")

Then you could do the KEGG analysis:

k <- kegga(fit, species.KEGG="dre", gene.pathway=GK)
topKEGG(k)

Now if you want to see the top-table genes that belong to a particular pathway (say path:dre01100), it is just a matter of using standard subseting operations:

i <- tab$GeneID %in% GK$GeneID[GK$PathwayID=="path:dre01100"]
tab[i,]

Edit:

The above code assumes your gene IDs are stored in the "GeneID" column. If you have stored them under a different column name, then you need to make the obvious change, such as:

i <- tab$ENTREZID %in% GK$GeneID[GK$PathwayID=="path:dre01100"]
tab[i,]

You might also need to use as.character(tab$ENTREZID) if you have stored the column as a number or as a factor.

ADD COMMENT • link 8.4 years ago Gordon Smyth 52k

0

Entering edit mode

Hey Gordon,

I tried your script but I had to modify a few things to run the KEGG pathway enrichment:

entzzvec<-as.vector(data.fit.eb$genes$ENTREZID)
tab <- topTable(data.fit.eb,lfc=1.5,adjust="BH",p.value=0.01, n=Inf)
GK <- getGeneKEGGLinks(species.KEGG = "dre")
k <- kegga(data.fit.eb,coef=1,geneid=entzzvec,species.KEGG="dre", gene.pathway=GK,FDR=0.01)
topKEGG(k)
i <- tab$GeneID %in% GK$GeneID[GK$PathwayID=="path:dre01100"]
tab[i,]

When I run the last line, it returns:

> tab[i,]
[1] PROBEID ID SYMBOL GENENAME
[5] ENTREZID morphant...control rescue...control morphant...rescue
[9] AveExpr F P.Value adj.P.Val
<0 rows> (or 0-length row.names)

I am not sure what to do from here, am I getting close?

ADD REPLY • link 8.4 years ago mat149 ▴ 80

0

Entering edit mode

You are using tab$GeneID when you don't actually have such a column. So naturally i contains nothing and tab[i,] has zero rows. See my edit above.

ADD REPLY • link 8.4 years ago Gordon Smyth 52k

0

Entering edit mode

Haha thanks for pointing that out (as.character(tab$ENTREZID)), I missed it completely :X

this works, thank you!

ADD REPLY • link 8.4 years ago mat149 ▴ 80

0

Entering edit mode

Hey,

I have been struggling to emulate this with GO categories.  I am stuck b/c I do not know how to generate an "R" object containing all GO term annotations for "Dr" analogous to: GK <- getGeneKEGGLinks(species.KEGG = "dre"). Beyond that, would i <- tab$GeneID %in% GK$GeneID[GK$PathwayID=="path:dre01100"] then need to be changed to: i <- tab$GeneID %in% GK$GeneID[GK$TermID=="GO:0004984"] for gene ontologies?

Here is an example of five categories I am interested in.

	Term	Ont	N	Up	Down	P.Up	P.Down
GO:0004984	olfactory receptor activity	MF	102	48	0	1.15E-30	1
GO:0050877	neurological system process	BP	258	63	47	3.42E-21	0.010118
GO:0007601	visual perception	BP	61	2	32	0.905847	1.7E-13
GO:0008066	glutamate receptor activity	MF	22	0	18	1	4.65E-13
GO:0045202	synapse	CC	167	2	68	0.999791	2.59E-19

entzzvec<-as.vector(data.fit.eb$genes$ENTREZID)
tab1 <- topTable(data.fit.eb,lfc=1.5,adjust="BH",p.value=0.01, n=Inf,coef=1)
??? GG<-GET GO TERM ANNOTATIONS ???
MOgo <- goana(data.fit.eb, coef=1,FDR=0.01,geneid=data.fit.eb$genes$ENTREZID,species="Dr")
topMOgo<-topGO(MOgo,number=50)
i <- tab1$GeneID %in% GG$GeneID[GG$TermID=="GO:0004984"]
tab1[i,]

Thanks,

Matt

ADD REPLY • link 8.3 years ago mat149 ▴ 80