Dear group,
i want to convert all gene sets available from GSEA
(C1,C2,C3 and C4) into a 4 different gene sets, so
that I can use geneSetTest of limma on above 4
different gene sets.
In one of the examples (classic estrogen example, only
one set is described).
What kind of formats geneSetTest will read for gene
sets. ( e.g. every set writting in a single line
format OR a list of lists for each gene set).
Could anyone suggest some steps to make gene sets for
genesettest.
thanks
Srini
Hi Srini --
Srinivas Iyyer wrote:
> Dear group,
>
> i want to convert all gene sets available from GSEA
> (C1,C2,C3 and C4) into a 4 different gene sets, so
> that I can use geneSetTest of limma on above 4
> different gene sets.
>
Hi Srini -- I'm not really sure what you want to do, or if what you
want
to do makes sense. Here's my best guess at 'getting the job done', but
maybe others will give some more advice one whether it's actually a
good
idea.
One possibility is to visit the Broad and download their entire
database
http://www.broad.mit.edu/gsea/downloads.jsp
(look for 'XML database file'). You can then read these into R with
> library(GSEABase)
> gss = getBroadSets('/path/to/msigdb_v2.5.xml')
you might then
> collType = lapply(gss, collectionType)
> catType = sapply(collType, bcCategory)
> table(catType)
catType
c1 c2 c3 c4 c5
386 1892 837 883 1454
to get the category of each gene set, and
> c1sets = gss[catType=="c1"]
to select just the c1 sets.
> In one of the examples (classic estrogen example, only
> one set is described).
>
I'm not really sure what you are referring to (where is the example?)
and I'm not sure that geneSetTest will do what you want.
Say you've performed lmFit. You could make a vector of the relevant
est
statistic, and make sure the vector elements have appropriate names,
e.g., converting probes to Symbol identifiers. Say that vector is x.
If you were interested in a particular gene set, say all genes in
chromosome band 4q27, you could select that
> mySet = gss[["chr4q27"]]
You might then do something like
> geneSetTest(geneIds(mySet), x, ...
(where ... might be addition arguments to geneSetTest). If you wanted
to
test many sets, you might
lapply(gss[catType=="c1"], function(aSet) {
geneSetTest(geneIds(aSet), x, ...)
})
(where again ... are additional arguments to geneSetTest).
The key is to get x to have names that are the same as the geneIds in
the gene sets. The mapIdentifiers function in GSEABase might help.
Hope this helps,
Martin
> What kind of formats geneSetTest will read for gene
> sets. ( e.g. every set writting in a single line
> format OR a list of lists for each gene set).
>
> Could anyone suggest some steps to make gene sets for
> genesettest.
>
> thanks
> Srini
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
>
Dear Martin,
Thank you for your email. I should have explained
straight and simple.
I want to associate my differentially expressed genes
( data from lmfit and ebayes) to 10 different gene
expression studies.
What i have from those 10 studies is list of genes
that are differentially expressed. Ive written them
as the following tab-delim text file (.gmt file as per
GSEA)
Study1 NA G1 G2 G3 G4...
Study2 NA G10,G49, G432.....
I want to test these gene sets on my lmfit object
using geneSetTest().
In the example listed at
http://www.bioconductor.org/workshops/2005/BioC2005/labs/lab01/estroge
n.html.
example;
geneSetTest(knownERgenesOnChip,completeTableEst10$t,alternative="both"
)
In this particular example, the author is testing the
list of genes with sorted t-statistics obtained from
lmfit object on 'knownERgenesOnChip' set that has ~20
genes from that chip only.
However, In my case I want to test my lmfit object on
10 different sets. How do I make a smilar
'knownERgenesOnChip' gene set that will have all my 10
gene lists as sets.
Simply I do not want to run 10 genesetTest's on each
gene set. I want to run only once.
In the case of GSEA we make a .gmt file with each line
as one set.
apologies for not writing my question as simple.
Thank you .
srini
--- Martin Morgan <mtmorgan at="" fhcrc.org=""> wrote:
> Hi Srini --
>
> Srinivas Iyyer wrote:
> > Dear group,
> >
> > i want to convert all gene sets available from
> GSEA
> > (C1,C2,C3 and C4) into a 4 different gene sets, so
> > that I can use geneSetTest of limma on above 4
> > different gene sets.
> >
> Hi Srini -- I'm not really sure what you want to do,
> or if what you want
> to do makes sense. Here's my best guess at 'getting
> the job done', but
> maybe others will give some more advice one whether
> it's actually a good
> idea.
>
> One possibility is to visit the Broad and download
> their entire database
>
> http://www.broad.mit.edu/gsea/downloads.jsp
>
> (look for 'XML database file'). You can then read
> these into R with
>
> > library(GSEABase)
> > gss = getBroadSets('/path/to/msigdb_v2.5.xml')
>
> you might then
>
> > collType = lapply(gss, collectionType)
> > catType = sapply(collType, bcCategory)
> > table(catType)
> catType
> c1 c2 c3 c4 c5
> 386 1892 837 883 1454
>
> to get the category of each gene set, and
>
> > c1sets = gss[catType=="c1"]
>
> to select just the c1 sets.
> > In one of the examples (classic estrogen example,
> only
> > one set is described).
> >
> I'm not really sure what you are referring to (where
> is the example?)
> and I'm not sure that geneSetTest will do what you
> want.
>
> Say you've performed lmFit. You could make a vector
> of the relevant est
> statistic, and make sure the vector elements have
> appropriate names,
> e.g., converting probes to Symbol identifiers. Say
> that vector is x.
>
> If you were interested in a particular gene set, say
> all genes in
> chromosome band 4q27, you could select that
>
> > mySet = gss[["chr4q27"]]
>
> You might then do something like
>
> > geneSetTest(geneIds(mySet), x, ...
>
> (where ... might be addition arguments to
> geneSetTest). If you wanted to
> test many sets, you might
>
> lapply(gss[catType=="c1"], function(aSet) {
> geneSetTest(geneIds(aSet), x, ...)
> })
>
> (where again ... are additional arguments to
> geneSetTest).
>
> The key is to get x to have names that are the same
> as the geneIds in
> the gene sets. The mapIdentifiers function in
> GSEABase might help.
>
> Hope this helps,
>
> Martin
> > What kind of formats geneSetTest will read for
> gene
> > sets. ( e.g. every set writting in a single line
> > format OR a list of lists for each gene set).
> >
> > Could anyone suggest some steps to make gene sets
> for
> > genesettest.
> >
> > thanks
> > Srini
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
>
http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
>
>
Dear Srinivas,
You don't seem to have looked at the help document for the
geneSetTest()
function, and you really do need to read it. That would make it clear
that geneSetTest() produces results for only one set at a time. So,
if
you want results for 10 sets, you have to run geneSetTest() 10 times.
There's no alternative. Running geneSetTest() 10 times is still a
whole
lot quicker than running GSEA once, so it doesn't seem such a problem.
Best wishes
Gordon
----------- Original Message ---------------------------
[BioC] making gene sets for geneSetTest of Limma
Srinivas Iyyer srini_iyyer_bio at yahoo.com
Wed May 28 05:50:30 CEST 2008
Dear Martin,
Thank you for your email. I should have explained
straight and simple.
I want to associate my differentially expressed genes
( data from lmfit and ebayes) to 10 different gene
expression studies.
What i have from those 10 studies is list of genes
that are differentially expressed. Ive written them
as the following tab-delim text file (.gmt file as per
GSEA)
Study1 NA G1 G2 G3 G4...
Study2 NA G10,G49, G432.....
I want to test these gene sets on my lmfit object
using geneSetTest().
In the example listed at
http://www.bioconductor.org/workshops/2005/BioC2005/labs/lab01/estroge
n.html.
example;
geneSetTest(knownERgenesOnChip,completeTableEst10$t,alternative="both"
)
In this particular example, the author is testing the
list of genes with sorted t-statistics obtained from
lmfit object on 'knownERgenesOnChip' set that has ~20
genes from that chip only.
However, In my case I want to test my lmfit object on
10 different sets. How do I make a smilar
'knownERgenesOnChip' gene set that will have all my 10
gene lists as sets.
Simply I do not want to run 10 genesetTest's on each
gene set. I want to run only once.
In the case of GSEA we make a .gmt file with each line
as one set.
apologies for not writing my question as simple.
Thank you .
srini
--- Martin Morgan <mtmorgan at="" fhcrc.org=""> wrote:
> Hi Srini --
>
> Srinivas Iyyer wrote:
> > Dear group,
> >
> > i want to convert all gene sets available from
> GSEA
> > (C1,C2,C3 and C4) into a 4 different gene sets, so
> > that I can use geneSetTest of limma on above 4
> > different gene sets.
> >
> Hi Srini -- I'm not really sure what you want to do,
> or if what you want
> to do makes sense. Here's my best guess at 'getting
> the job done', but
> maybe others will give some more advice one whether
> it's actually a good
> idea.
>
> One possibility is to visit the Broad and download
> their entire database
>
> http://www.broad.mit.edu/gsea/downloads.jsp
>
> (look for 'XML database file'). You can then read
> these into R with
>
> > library(GSEABase)
> > gss = getBroadSets('/path/to/msigdb_v2.5.xml')
>
> you might then
>
> > collType = lapply(gss, collectionType)
> > catType = sapply(collType, bcCategory)
> > table(catType)
> catType
> c1 c2 c3 c4 c5
> 386 1892 837 883 1454
>
> to get the category of each gene set, and
>
> > c1sets = gss[catType=="c1"]
>
> to select just the c1 sets.
> > In one of the examples (classic estrogen example,
> only
> > one set is described).
> >
> I'm not really sure what you are referring to (where
> is the example?)
> and I'm not sure that geneSetTest will do what you
> want.
>
> Say you've performed lmFit. You could make a vector
> of the relevant est
> statistic, and make sure the vector elements have
> appropriate names,
> e.g., converting probes to Symbol identifiers. Say
> that vector is x.
>
> If you were interested in a particular gene set, say
> all genes in
> chromosome band 4q27, you could select that
>
> > mySet = gss[["chr4q27"]]
>
> You might then do something like
>
> > geneSetTest(geneIds(mySet), x, ...
>
> (where ... might be addition arguments to
> geneSetTest). If you wanted to
> test many sets, you might
>
> lapply(gss[catType=="c1"], function(aSet) {
> geneSetTest(geneIds(aSet), x, ...)
> })
>
> (where again ... are additional arguments to
> geneSetTest).
>
> The key is to get x to have names that are the same
> as the geneIds in
> the gene sets. The mapIdentifiers function in
> GSEABase might help.
>
> Hope this helps,
>
> Martin
> > What kind of formats geneSetTest will read for
> gene
> > sets. ( e.g. every set writting in a single line
> > format OR a list of lists for each gene set).
> >
> > Could anyone suggest some steps to make gene sets
> for
> > genesettest.
> >
> > thanks
> > Srini