Visualizing GOstats HyperGTest results
@srinivas-iyyer-939
Dear Group,

Apologies for a similar re-post on this issue.

I would like to ask the Bioconductor developers whether there is a function to generate a heatmap of all enriched GO categories after a conditional hyperGTest.

For example, I have time-series data with 5 different time points and 3 different doses. I run hyperGTest and end up with at least 50 categories for each combination of time point and drug treatment:

3 hr @ D1; 3 hr @ D2; 3 hr @ D3: 50, 50, 50 categories respectively.

Likewise for 6 and 12 hrs for D1, D2 and D3.

I end up with many HTML tables after testing all time points and drugs.

If I had a function to generate a heatmap for D1, with all 3 time points on the X-axis and the union of enriched categories on the Y-axis, colored according to FDR or p-values, it would be a nice visualization. If I could somehow plot all time points and all drug effects along with the GO categories, that would be a beautiful depiction of the results.

My questions are:

1. Is there a built-in GOstats function like that? (goCluster has something similar, but it is not easy to combine for someone with basic R skills like me.)

2. If not, is it possible to combine all results and use the standard heatmap function to obtain the figure I described above?

I would appreciate any suggestions or help with this question.

Thank you.

srini
@james-w-macdonald-5106
Hi Srini,

Srinivas Iyyer wrote:
> 2. If not, is it possible to combine all results and
> use standard heatmap function to obtain a figure of
> what I wanted and described above.

Sure. Just use the summary() function with the pvalue argument equal to 1. This will return all the GO terms. You can then select the terms that are significant in any comparison and create a matrix containing the p-values for each term in each comparison. You would probably want to take the -log10 of these values, and then you could feed that matrix to heatmap().

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
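The last two steps described above (keep terms significant in any comparison, then plot -log10 p-values) might look roughly like the sketch below. It is only an illustration: the matrix `pvals`, its GO IDs, its column labels, and the 0.05 cutoff are all invented here, and it assumes the p-values have already been collected into a matrix with one row per GO term and one column per comparison.

```r
# Hypothetical p-value matrix (values invented for illustration):
# rows are GO terms, columns are comparisons.
pvals <- matrix(
  c(0.001, 0.20, 0.03,
    0.40,  0.60, 0.90,
    0.02,  0.01, 0.30),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GO:0000001", "GO:0000002", "GO:0000003"),
                  c("3hr_D1", "6hr_D1", "12hr_D1"))
)

# Keep terms whose smallest p-value across comparisons is below the cutoff
# (0.05 is an assumed threshold, not one from the thread).
sig <- apply(pvals, 1, min) < 0.05

# Transform so that stronger enrichment gives larger values.
logp <- -log10(pvals[sig, , drop = FALSE])

# Colv = NA keeps the columns in their given (time) order;
# scale = "none" plots the -log10 values as they are.
heatmap(logp, Colv = NA, scale = "none")
```

Leaving the columns unclustered seems a reasonable choice for a time series, since the time order is itself informative; rows are still clustered by default.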

When I did what you suggested, it did not return all GO terms: the number of GO terms returned varies with the gene set.

Here is an example:

A <- c("86", "51412", "192669", "8289", "57492", "83858", "10533", "10538", "54880", "1230", "947", "79577", "55038", "64866", "1017", "1026", "1063", "1105", "91851", "55118") 

B <- c("79624", "23192", "85003", "151887", "7555", "9267", "8663", "3692", "51359", "90625", "28982", "2597", "81502", "57470", "4223", "57380", "57552", "54539", "55655", "199713", "79031", "123", "5558", "10635", "81892")

library(org.Hs.eg.db) 
library(Category)
library(GOstats)
library(GO.db)

genesGO <- list(A = A, B = B)
allgenesGO <- mappedkeys(org.Hs.egGO)
onto.overGO.bp <- list()
for (i in seq_along(genesGO)) {
        # Test only the genes in this set that have a GO annotation
        paramsGO <- new("GOHyperGParams", geneIds = genesGO[[i]][genesGO[[i]] %in% allgenesGO], universeGeneIds = allgenesGO, annotation = "org.Hs.eg.db", ontology = "BP", testDirection = "over", pvalueCutoff = 1)
        hypGO <- hyperGTest(paramsGO)
        # pvalue = 1 so that no tested term is filtered out of the summary
        tgo <- summary(hypGO, pvalue = 1)
        # Insert an FDR column after the raw p-values
        onto.overGO <- data.frame(tgo[, 1:2], FDR = p.adjust(tgo[, 2], method = "fdr"), tgo[, 3:7])
        onto.overGO.bp[[i]] <- onto.overGO
}
names(onto.overGO.bp) <- names(genesGO)
# Number of GO terms returned for each gene set
nrow(onto.overGO.bp[[1]])
nrow(onto.overGO.bp[[2]])

The number of GO terms obtained varies with the gene set even with the p-value cutoff set to 1: nrow(onto.overGO.bp[[1]]) is different from nrow(onto.overGO.bp[[2]]), which makes it impossible to build a heatmap directly.

Is something missing in the code?
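One possible workaround, sketched here on toy data rather than on real hyperGTest output, is to align the tables by GO ID instead of by row: take the union of IDs across all results and fill in a placeholder for terms that a given result does not contain. The data frames below are invented stand-ins for the summary() tables. The column names GOBPID and Pvalue are assumed to match what summary() returns for the BP ontology, and filling absent terms with p = 1 is an assumption of this sketch (NA would be the more conservative choice, but heatmap() clusters more easily without missing values).

```r
# Toy stand-ins for the per-gene-set summary() tables (values invented);
# only the GO ID and p-value columns are mimicked.
results <- list(
  A = data.frame(GOBPID = c("GO:0000001", "GO:0000002", "GO:0000003"),
                 Pvalue = c(0.001, 0.20, 0.40),
                 stringsAsFactors = FALSE),
  B = data.frame(GOBPID = c("GO:0000002", "GO:0000003", "GO:0000004"),
                 Pvalue = c(0.03, 0.50, 0.0005),
                 stringsAsFactors = FALSE)
)

# Union of all GO IDs that appear in any result
all_ids <- sort(unique(unlist(lapply(results, function(x) x$GOBPID))))

# One row per GO term, one column per gene set. match() looks up each ID
# in a table; IDs absent from that table come back as NA and are set to 1
# (assumption: an absent term is treated as "not enriched").
pmat <- sapply(results, function(x) {
  p <- x$Pvalue[match(all_ids, x$GOBPID)]
  p[is.na(p)] <- 1
  p
})
rownames(pmat) <- all_ids

# Keep terms significant in at least one gene set (0.05 is an assumed cutoff)
keep <- rowSums(pmat < 0.05) > 0
logp <- -log10(pmat[keep, , drop = FALSE])
```

With the real results, `onto.overGO.bp` would take the place of `results`, and `logp` has the same shape for every gene set, so it can be passed to heatmap().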
