gsea (gene set enrichment analysis) for ranked lists
Asta Laiho ▴ 120
@asta-laiho-4271
Last seen 8.8 years ago
Finland
Hi,

I have been using the Broad Institute's GSEA tool for gene set enrichment analysis of preranked lists. This lets me perform the statistical testing between sample groups separately from the enrichment analysis, so that the two steps stay modular. It also lets me sort the genes according to my preferred logic and then analyze gene set enrichment in a way that ignores the direction of the differential expression (up/down). The drawback of the Broad GSEA implementation is that all the annotations used are human-based.

I have been searching for an alternative approach within R/Bioconductor but have not found one so far that fully meets the following criteria:

- Allows one to test gene set enrichment for preranked gene lists (works with ordered lists of gene symbols/identifiers rather than actual expression value matrices, and thus is not tied to a particular way of testing gene expression between sample groups)
- Is available for a number of organisms and gene set annotations (at least GO and KEGG)
- Allows one to ignore the direction of the regulation and concentrate on generally differentially expressed genes

If someone is aware of a tool that meets all these criteria, I would be very happy to know. Otherwise this can be regarded as a wish for such a method to be implemented in the R/Bioconductor environment.

Greetings,
Asta
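One way to build the kind of direction-agnostic preranked list described in the question is to rank genes by the absolute value of a per-gene statistic. The following is only a minimal base R sketch; the gene names and statistic values are made up for illustration and do not come from the thread.

```r
# Hypothetical per-gene test statistics (sign = direction of change)
stat <- c(geneA = 2.1, geneB = -3.4, geneC = 0.2, geneD = -0.9, geneE = 1.5)

# Drop the sign so that strong up- and down-regulation are treated alike
abs_stat <- abs(stat)

# Preranked list: most strongly changed gene first
preranked <- names(sort(abs_stat, decreasing = TRUE))
print(preranked)
```

Any statistic can be substituted for `stat`, which is what keeps the testing step and the enrichment step modular.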
GO GO • 4.4k views
Simon Noël ▴ 450
@simonnoel-3455
Last seen 10.2 years ago
Hi,

I use the R version of GSEA, but I don't know of any Bioconductor package or other tool that does that. If you find one, let me know. I am looking for that too.

Simon Noël
CdeC
On 04/06/2011 11:41 AM, Simon Noël wrote:
> I use the R version of GSEA, but I don't know of any Bioconductor package or other tool that does that.

The vignette in the Category package provides a reasonable starting point for customizing analyses.

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center
Hi Martin,

Thanks for your advice. I looked at the vignette but couldn't see directly how to carry out my task with this package. Could you point out a bit more specifically which functions I should use, for example, to test for GO BP enrichment (and list p-values for the top GO BP terms) in my ranked list of human genes? The input consists of rank values rather than test statistics.

Many thanks in advance!

- Asta

On Apr 6, 2011, at 9:44 PM, Martin Morgan wrote:
> The vignette in the Category package provides a reasonable starting point for customizing analyses.
Luo Weijun ★ 1.6k
@luo-weijun-1783
Last seen 17 months ago
United States
Hi Asta,

I just came across your post. If I understand correctly, my gage package (with its supporting data package gageData) will do the analysis and meets all your criteria. Here is an example run:

    library(gage)
    library(gageData)
    data(gse16873)
    # gene set data are available for other species; type ?kegg.gs
    data(kegg.gs)
    # generate some pre-ranked gene expression data as you may have
    a <- gse16873[, c(2, 4)] - gse16873[, c(1, 3)]
    a <- apply(a, 2, rank)
    # test with direction: either up or down
    kegg.rk <- gage(a, gsets = kegg.gs, ref = NULL, samp = NULL, rank.test = TRUE)
    names(kegg.rk)
    head(kegg.rk$greater)
    # note that you don't need to pre-rank your genes to do rank.test with gage;
    # the following line would give you the same results as kegg.rk above
    kegg.rk.2 <- gage(gse16873, gsets = kegg.gs, ref = c(1, 3), samp = c(2, 4), rank.test = TRUE)
    # test without direction, i.e. 2-way perturbations
    kegg.rk.2d <- gage(gse16873, gsets = kegg.gs, ref = c(1, 3), samp = c(2, 4), rank.test = TRUE, same.dir = FALSE)
    names(kegg.rk.2d)
    head(kegg.rk.2d$greater)

gage provides many other options for gene set tests; check the package vignette or type ?gage for details.

If your pre-ranked gene data is a vector (or a single-column matrix), you need to create a single-column matrix (with a column name) using cbind. First of all, you need to update to the development version of the gage package (due to a small bug).

    # generate pre-ranked data vector
    a <- gse16873[, c(2, 4)] - gse16873[, c(1, 3)]
    a <- apply(a, 1, mean)
    a <- rank(a)
    kegg.p <- gage(cbind(exp1 = a), gsets = kegg.gs, ref = NULL, samp = NULL)
    names(kegg.p)
    head(kegg.p$greater)

The gage call above for single-column data will NOT work with the current release version (2.2.x) of gage, because of a small bug I just fixed. You need to download and install the development version at http://bioconductor.org/packages/2.9/bioc/html/gage.html sometime tomorrow, when the daily check-build cycle is done.

Hope this helps.

Weijun
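For readers who want to see the general idea of a rank-based gene set test on a preranked vector without installing anything, here is a minimal base R sketch. It is not gage's algorithm: it swaps in a plain one-sided Wilcoxon rank-sum test per gene set. The scores, gene identifiers, gene sets, and the helper name `rank_test` are all hypothetical and chosen only for illustration.

```r
# Hypothetical preranked scores: larger value = more strongly changed
set.seed(1)
scores <- setNames(rank(abs(rnorm(200))), paste0("g", 1:200))

# Hypothetical gene sets (named list of gene identifiers)
top <- names(sort(scores, decreasing = TRUE))
gsets <- list(
  enriched = top[1:15],                          # built from top-ranked genes
  arbitrary = paste0("g", seq(1, 200, by = 13))  # picked without regard to rank
)

# One-sided Wilcoxon rank-sum test (an assumed stand-in, not gage's test):
# are the scores of set members higher than those of all other genes?
rank_test <- function(set, scores) {
  in_set <- names(scores) %in% set
  wilcox.test(scores[in_set], scores[!in_set], alternative = "greater")$p.value
}

pvals <- sapply(gsets, rank_test, scores = scores)

# Adjust for testing several gene sets (Benjamini-Hochberg)
padj <- p.adjust(pvals, method = "BH")
print(padj)
```

Because the scores were ranked on absolute values, this sketch ignores the direction of regulation; ranking signed values instead would test for shifts in one direction only.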