How to carry out Gene Set Enrichment Analysis (GSEA) on an ordered list of Entrez Gene IDs?

anna freni sterrantino ▴ 120

Hi Chanchal,

As long as you have only a list of Entrez IDs, there is not much you can do. With GSEABase you can of course retrieve gene sets and the information related to them (structure, description, and so on), and that should work fine.

But to perform a gene set enrichment analysis in the more general sense, you need a gene-level statistic (usually a t-statistic), which is then used to compute a statistic at the gene set level. As far as I understand from your email, you know only the order of the genes by fold change; that information is not enough for this latter type of analysis.

Hope it helps.

Best regards,
Anna

Anna Freni Sterrantino
Ph.D. Student
Department of Statistics, University of Bologna, Italy
via Belle Arti 41, 40124 BO.

----- Original message -----
From: Chanchal Kumar <chanchal@biochem.mpg.de>
To: bioconductor@stat.math.ethz.ch
Sent: Friday, 25 July 2008, 1:14:20
Subject: [BioC] How to carry out Gene Set Enrichment Analysis(GSEA) on an ordered list of Entrezgene ids?

Dear All,

I have a set of Entrez IDs that have been ordered by fold change in expression from a control experiment. I am now interested in carrying out gene set enrichment analysis using the Bioconductor GSEABase package. I don't have any other statistics for these genes. Is it possible to carry out GSEA on a vector of Entrez IDs that is ordered by, say, fold change?

I attach an example vector and would like to carry out GSEA on this test set to get an idea of how this might work.

library(annotate)
library(hgu95av2.db)
set.seed(12345)
set1 <- unique(getEG(sample(ls(hgu95av2GO), 100), "hgu95av2"))
set1 <- na.omit(set1)  # as I get NAs in the vector before

For GSEA I assume that element set1[1] has the highest fold change and set1[length(set1)] has the lowest fold change.

Any help in this regard will be appreciated. Thanks in advance!

> sessionInfo()
R version 2.7.0 (2008-04-22)
i386-pc-mingw32

locale:
LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] annotate_1.18.0     xtable_1.5-2        hgu95av2.db_2.2.0
[4] AnnotationDbi_1.2.1 RSQLite_0.6-8       DBI_0.2-4
[7] Biobase_2.0.1

loaded via a namespace (and not attached):
[1] splines_2.7.0

Best Regards,
Chanchal

Chanchal Kumar, Ph.D. Candidate
Dept. of Proteomics and Signal Transduction
Max Planck Institute of Biochemistry
Am Klopferspitz 18, 82152 D-Martinsried (near Munich), Germany
e-mail: chanchal@biochem.mpg.de
Phone: (Office) +49 (0) 89 8578 2296
Fax: (Office) +49 (0) 89 8578 2219
http://www.biochem.mpg.de/mann/
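The distinction drawn in the reply above, between a gene-level statistic and a gene-set-level statistic, can be sketched in base R. This is a minimal illustration under stated assumptions, not the GSEABase workflow: the expression matrix is simulated, the gene names, group sizes, and `gene_set` are hypothetical, and the set-level summary (sum of t-statistics divided by the square root of the set size) is only one simple choice among several.

```r
# Simulated data: gene names, group sizes, and the gene set are hypothetical.
set.seed(1)
n_genes <- 200
expr <- matrix(rnorm(n_genes * 6), nrow = n_genes,
               dimnames = list(paste0("g", seq_len(n_genes)), NULL))
group <- factor(rep(c("control", "treated"), each = 3))

# Gene-level statistic: Welch t-statistic (treated vs control) for each gene.
t_stat <- apply(expr, 1, function(x) {
  unname(t.test(x[group == "treated"], x[group == "control"])$statistic)
})

# Gene-set-level statistic: sum of the gene-level statistics in the set,
# scaled by the square root of the set size.
gene_set <- paste0("g", 1:20)
z_set <- sum(t_stat[gene_set]) / sqrt(length(gene_set))
z_set
```

A large absolute value of `z_set` would suggest a coordinated shift of the set between the two groups; with an ordered list alone, the per-gene `t_stat` values in this sketch are exactly what is missing.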

Proteomics hgu95av2 GSEABase • 2.7k views

16.6 years ago • anna freni sterrantino ▴ 120

anna freni sterrantino ▴ 120

I see what you mean, now that you mention the Broad Institute. As far as I know, the pre-ranked list of genes is ranked to reflect differential expression between two classes, and not, as in your case, just by fold change values.

In any case, from Bioconductor, if you know at least the chip from which you got your Entrez IDs, you can easily get OMIM, if it's available.

Good luck,
Anna
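For readers wondering what a score computed from a pre-ranked list alone might look like, here is a rough base-R sketch of an unweighted running-sum (Kolmogorov-Smirnov-like) enrichment score with a permutation p-value. It uses only the order of the identifiers. The helper `running_sum_es` is hypothetical (it is not part of GSEABase or any other package, and it is not the Broad implementation), and the IDs and gene set are made up. Whether such a score is meaningful depends on how the ranking was derived, which is the concern raised in the reply above.

```r
# ranked_ids: identifiers ordered from highest to lowest fold change.
# gene_set:   identifiers belonging to the set of interest.
running_sum_es <- function(ranked_ids, gene_set) {
  hit <- ranked_ids %in% gene_set
  n_hit <- sum(hit)
  n_miss <- length(ranked_ids) - n_hit
  stopifnot(n_hit > 0, n_miss > 0)
  # Step up by 1/n_hit at each set member, down by 1/n_miss otherwise.
  walk <- cumsum(ifelse(hit, 1 / n_hit, -1 / n_miss))
  # Enrichment score: the largest deviation of the walk from zero.
  walk[which.max(abs(walk))]
}

# Hypothetical example: 100 made-up IDs, with a set concentrated near the top.
set.seed(12345)
ranked_ids <- as.character(sample(1000:9999, 100))
gene_set <- ranked_ids[c(1, 2, 3, 5, 8, 13)]
es <- running_sum_es(ranked_ids, gene_set)

# Permutation null: shuffle the ranking and recompute the score.
null_es <- replicate(1000, running_sum_es(sample(ranked_ids), gene_set))
p_value <- (sum(abs(null_es) >= abs(es)) + 1) / (length(null_es) + 1)
c(ES = es, p = p_value)
```

Shuffling the ranking tests the set against random positions in the list; it does not account for correlation between genes, which is one reason sample-level statistics are usually preferred when they are available.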

16.6 years ago • anna freni sterrantino ▴ 120

Hi Anna,

I am sorry for not mentioning this before, but by fold change I meant the following:

fold change = ratio (treated with stimuli / non-treated control)

As for OMIM, I have decided to create my own annotation package, as that's much easier to use.

Best Regards,
Chanchal
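The fold-change definition above, and the ordering it implies, can be written out as a short base-R sketch. The intensities and the five Entrez-style IDs are invented purely for illustration; taking log2 of the ratio is a common convention and an assumption here, not something stated in the thread.

```r
# Hypothetical intensities for five made-up Entrez-style IDs.
treated <- c("1000" = 820, "2000" = 150, "3000" = 400, "4000" = 95, "5000" = 610)
control <- c("1000" = 205, "2000" = 300, "3000" = 400, "4000" = 380, "5000" = 305)

# Fold change as defined in the post: treated / non-treated control.
fold_change <- treated / control
log2_fc <- log2(fold_change)

# Order identifiers from highest to lowest fold change.
ranked_ids <- names(sort(log2_fc, decreasing = TRUE))
ranked_ids
# [1] "1000" "5000" "3000" "2000" "4000"
```

Keeping `log2_fc` alongside `ranked_ids` preserves the magnitudes, which a bare ordered vector of IDs discards.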

16.6 years ago • Chanchal Kumar ▴ 130

Chanchal Kumar ▴ 130

Hi Anna,

Thank you for the reply. Since GSEA (from the Broad Institute) has an option for analyzing a pre-ranked list of genes, I was curious whether similar functionality exists in Bioconductor as well. I would like to use GSEA to get OMIM and related information using Bioconductor, as I don't think that's available in the annotation packages.

Best Regards,
Chanchal

16.6 years ago • Chanchal Kumar ▴ 130
