quick question about romer results
1
0
Entering edit mode
@iain-gallagher-2532
Last seen 9.5 years ago
United Kingdom
Hello list. I have a quick question about the 'romer' function used for GSEA in the limma suite. With guidance (much appreciated!) from Di Wu I have run some data through the algorithm and the results look appropriate for my dataset. The question I have is to do with the p-values returned. I run romer with 9999 rotations (a method suitable for 'permutation' in linear models) and get back ~3200 sig genesets. This is from an examination of ~ 17000 probes across 2 groups (3 adult arrays and 9 fetal arrays - affy data). If a set is significant I get back a p value of 0.0001 - nothing more & nothing less. If a set is not sig I get a p value of 1. I understand that the MINIMUM p value is 1/(nrot+1) so all my significant sets are at the minimum p value. I would have expected a range of p values however. Could someone with a more statistical background shed some light on this? Perhaps my groups are very different (would make some sense I suppose). Thanks again in advance for any help. iain > sessionInfo() R version 2.12.0 (2010-10-15) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_GB.utf8 LC_NUMERIC=C [3] LC_TIME=en_GB.utf8 LC_COLLATE=en_GB.utf8 [5] LC_MONETARY=C LC_MESSAGES=en_GB.utf8 [7] LC_PAPER=en_GB.utf8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] hgu133plus2.db_2.4.5 org.Hs.eg.db_2.4.6 RSQLite_0.9-3 [4] DBI_0.2-5 AnnotationDbi_1.12.0 genefilter_1.32.0 [7] hgu133plus2cdf_2.7.0 affy_1.28.0 Biobase_2.10.0 [10] RColorBrewer_1.0-2 limma_3.6.6 loaded via a namespace (and not attached): [1] affyio_1.18.0 annotate_1.28.0 preprocessCore_1.12.0 [4] splines_2.12.0 survival_2.36-2 tools_2.12.0 [7] xtable_1.5-6 >
hgu133plus2 limma hgu133plus2 limma • 1.3k views
ADD COMMENT
0
Entering edit mode
Di Wu ▴ 190
@di-wu-1837
Last seen 10.3 years ago
Hi Iain, Can you paste your code for romer, the design matrix and the contrast you are interested in? I suspect the contrast has not been the one you are interested in. Cheers, Di On Tue, Nov 30, 2010 at 5:05 AM, Iain Gallagher < iaingallagher@btopenworld.com> wrote: > Hello list. > > I have a quick question about the 'romer' function used for GSEA in the > limma suite. > > With guidance (much appreciated!) from Di Wu I have run some data through > the algorithm and the results look appropriate for my dataset. > > The question I have is to do with the p-values returned. I run romer with > 9999 rotations (a method suitable for 'permutation' in linear models) and > get back ~3200 sig genesets. This is from an examination of ~ 17000 probes > across 2 groups (3 adult arrays and 9 fetal arrays - affy data). > > If a set is significant I get back a p value of 0.0001 - nothing more & > nothing less. If a set is not sig I get a p value of 1. > > I understand that the MINIMUM p value is 1/(nrot+1) so all my significant > sets are at the minimum p value. > > I would have expected a range of p values however. Could someone with a > more statistical background shed some light on this? Perhaps my groups are > very different (would make some sense I suppose). > > Thanks again in advance for any help. > > iain > > > sessionInfo() > R version 2.12.0 (2010-10-15) > Platform: x86_64-pc-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_GB.utf8 LC_NUMERIC=C > [3] LC_TIME=en_GB.utf8 LC_COLLATE=en_GB.utf8 > [5] LC_MONETARY=C LC_MESSAGES=en_GB.utf8 > [7] LC_PAPER=en_GB.utf8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] hgu133plus2.db_2.4.5 org.Hs.eg.db_2.4.6 RSQLite_0.9-3 > [4] DBI_0.2-5 AnnotationDbi_1.12.0 genefilter_1.32.0 > [7] hgu133plus2cdf_2.7.0 affy_1.28.0 Biobase_2.10.0 > [10] RColorBrewer_1.0-2 limma_3.6.6 > > loaded via a namespace (and not attached): > [1] affyio_1.18.0 annotate_1.28.0 preprocessCore_1.12.0 > [4] splines_2.12.0 survival_2.36-2 tools_2.12.0 > [7] xtable_1.5-6 > > > > > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]
ADD COMMENT
0
Entering edit mode
Hello Di Thanks for your offer of help. Below I have pasted the code I am using for romer analysis. Basically I am using the same design and contrast matrix I use in limma for DE gene analysis. rmaFilteredData is an ExpressionSet object with 3 adult arrays (3H...) and 9 fetal arrays (FT...)  colnames(exprs(rmaDataFiltered))  [1] "3H5205-A H0284" "3H5205-B H0697" "3H5205-C H0312" "FT2435 P2 14+3"  [5] "FT2436 P2 16+3" "FT2449 P2 18+4" "FT2454 P6 19+4" "FT2455 P6 14+5"  [9] "FT2460 P6 20+0" "FT2462 P6 16+1" "FT2476 P6 14+5" "FT2480 P6 14+5" ###############~LIMMA~DE CODE##################################### library(limma) TS <- phenoData(rmaDataFiltered)$class #make model matrix design TS <- factor(TS, levels=c("fetal", "adult")) design <- model.matrix(~0+TS) > design    fetal adult 1      0     1 2      0     1 3      0     1 4      1     0 5      1     0 6      1     0 7      1     0 8      1     0 9      1     0 10     1     0 11     1     0 12     1     0 attr(,"assign") [1] 1 1 attr(,"contrasts") attr(,"contrasts")$TS [1] "contr.treatment" colnames(design) <- levels(TS) fit <- lmFit(rmaDataFiltered, design) #DE between fetal and adult cont.matrix <- makeContrasts(fVad = fetal-adult, levels = design) > cont.matrix        Contrasts Levels  fVad   fetal    1   adult   -1 fit2 <- contrasts.fit(fit, cont.matrix) fit2 <- eBayes(fit2) #get results out fetalVadResult <- topTable(fit2, coef=1, p.value=0.05, number=Inf, adjust.method='fdr', lfc=log2(1.5)) #annotate library(hgu133plus2.db) fetalVadResult$SYMBOL <-  unlist(mget(fetalVadResult$ID, hgu133plus2SYMBOL, ifnotfound=NA)) #romer GSEA with Broad Inst genesets # for romer i need: # iset: list of indices specifying the rows of ‘y’ in the gene sets. The list can be made by symbols2indices and the gene sets can be retrieved from the molecular signatures database in Broad Institute. # y the expression data # design: design matrix # contrast: contrast for which the test is required. Can be an integer specifying a column of ‘design’, or else a contrast vector of length equal to the number of columns of ‘design’. #get the symbols for the genes of interest symbols <- fetalVadResult$SYMBOL y <- exprs(rmaDataFiltered)[as.numeric(rownames(fetalVadResult)), ] #get expressionData for genes of interest load("/home/iain/Documents/Work/Results/GSEAData/human_c2.rdata") #ok now set up the index of where these genes are in the genesets c2 = symbols2indices(Hs.gmtl.c2, symbols)#use Hs.gmtl.c3 for motif sets # use design & matrix defined in limma work above test <- romer(c2, y, design=design, contrast=cont.matrix, nrot=9999) ups <- topRomer(test, 100, alternative='up') downs <- topRomer(test, 100, alternative='down') ####~END TEST CODE~##### Thanks for looking this over. iain --- On Mon, 29/11/10, Di Wu <di.wu@med.monash.edu.au> wrote: From: Di Wu <di.wu@med.monash.edu.au> Subject: Re: [BioC] quick question about romer results To: "Iain Gallagher" <iaingallagher@btopenworld.com> Cc: "bioconductor" <bioconductor@stat.math.ethz.ch> Date: Monday, 29 November, 2010, 23:32 Hi Iain, Can you paste your code for romer, the design matrix and the contrast you are interested in? I suspect the contrast has not been the one you are interested in. Cheers, Di On Tue, Nov 30, 2010 at 5:05 AM, Iain Gallagher <iaingallagher@btopenworld.com> wrote: Hello list. I have a quick question about the 'romer' function used for GSEA in the limma suite. With guidance (much appreciated!) from Di Wu I have run some data through the algorithm and the results look appropriate for my dataset. The question I have is to do with the p-values returned. I run romer with 9999 rotations (a method suitable for 'permutation' in linear models) and get back ~3200 sig genesets. This is from an examination of ~ 17000 probes across 2 groups (3 adult arrays and 9 fetal arrays - affy data). If a set is significant I get back a p value of 0.0001 - nothing more & nothing less. If a set is not sig I get a p value of 1. I understand that the MINIMUM p value is 1/(nrot+1) so all my significant sets are at the minimum p value. I would have expected a range of p values however. Could someone with a more statistical background shed some light on this? Perhaps my groups are very different (would make some sense I suppose). Thanks again in advance for any help. iain > sessionInfo() R version 2.12.0 (2010-10-15) Platform: x86_64-pc-linux-gnu (64-bit) locale:  [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C  [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8  [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8  [7] LC_PAPER=en_GB.utf8       LC_NAME=C  [9] LC_ADDRESS=C              LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C attached base packages: [1] stats     graphics  grDevices utils     datasets  methods  base other attached packages:  [1] hgu133plus2.db_2.4.5 org.Hs.eg.db_2.4.6   RSQLite_0.9-3  [4] DBI_0.2-5            AnnotationDbi_1.12.0 genefilter_1.32.0  [7] hgu133plus2cdf_2.7.0 affy_1.28.0          Biobase_2.10.0 [10] RColorBrewer_1.0-2   limma_3.6.6 loaded via a namespace (and not attached): [1] affyio_1.18.0         annotate_1.28.0      preprocessCore_1.12.0 [4] splines_2.12.0        survival_2.36-2       tools_2.12.0 [7] xtable_1.5-6 > _______________________________________________ Bioconductor mailing list Bioconductor@r-project.org https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Hello again Di I must have made some error last night (it was late!) executing the code. All runs fine today and gives a range of p values. Very nice! Thanks for your input though. iain --- On Mon, 29/11/10, Di Wu <di.wu@med.monash.edu.au> wrote: From: Di Wu <di.wu@med.monash.edu.au> Subject: Re: [BioC] quick question about romer results To: "Iain Gallagher" <iaingallagher@btopenworld.com> Cc: "bioconductor" <bioconductor@stat.math.ethz.ch> Date: Monday, 29 November, 2010, 23:32 Hi Iain, Can you paste your code for romer, the design matrix and the contrast you are interested in? I suspect the contrast has not been the one you are interested in. Cheers, Di On Tue, Nov 30, 2010 at 5:05 AM, Iain Gallagher <iaingallagher@btopenworld.com> wrote: Hello list. I have a quick question about the 'romer' function used for GSEA in the limma suite. With guidance (much appreciated!) from Di Wu I have run some data through the algorithm and the results look appropriate for my dataset. The question I have is to do with the p-values returned. I run romer with 9999 rotations (a method suitable for 'permutation' in linear models) and get back ~3200 sig genesets. This is from an examination of ~ 17000 probes across 2 groups (3 adult arrays and 9 fetal arrays - affy data). If a set is significant I get back a p value of 0.0001 - nothing more & nothing less. If a set is not sig I get a p value of 1. I understand that the MINIMUM p value is 1/(nrot+1) so all my significant sets are at the minimum p value. I would have expected a range of p values however. Could someone with a more statistical background shed some light on this? Perhaps my groups are very different (would make some sense I suppose). Thanks again in advance for any help. iain > sessionInfo() R version 2.12.0 (2010-10-15) Platform: x86_64-pc-linux-gnu (64-bit) locale:  [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C  [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8  [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8  [7] LC_PAPER=en_GB.utf8       LC_NAME=C  [9] LC_ADDRESS=C              LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C attached base packages: [1] stats     graphics  grDevices utils     datasets  methods   base other attached packages:  [1] hgu133plus2.db_2.4.5 org.Hs.eg.db_2.4.6   RSQLite_0.9-3  [4] DBI_0.2-5            AnnotationDbi_1.12.0 genefilter_1.32.0  [7] hgu133plus2cdf_2.7.0 affy_1.28.0          Biobase_2.10.0 [10] RColorBrewer_1.0-2   limma_3.6.6 loaded via a namespace (and not attached): [1] affyio_1.18.0         annotate_1.28.0       preprocessCore_1.12.0 [4] splines_2.12.0        survival_2.36-2       tools_2.12.0 [7] xtable_1.5-6 > _______________________________________________ Bioconductor mailing list Bioconductor@r-project.org https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 719 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6