Dear list,
I am new to the "globaltest" package (version "4.2.0"). I have 10 mouse
arrays from two groups (control and treated). I tested them against all
the KEGG pathways. The result looks strange to me because, among the 171
pathways tested, most have an identical p-value, and that p-value is the
smallest one.
The code I used is listed below. Could someone tell me what is wrong
with it?
Thanks!
/Mike
kegg <- as.list(mouse4302PATH2PROBE)
gtkegg <- globaltest(affy_expression, diagno, kegg)
## The first argument, "affy_expression", is the Affymetrix expression
## data set I got by using the function "exprs()"; each row is one probe
## and each column is one array.
## The second argument, "diagno", is a vector containing the 10 group
## names ("treated" or "control") for the 10 arrays, in the same order
## as the 10 columns of the expression data.
gtkegg <- sort(gtkegg)
## Just list the top 5 results: the p-values are identical. What's wrong?
gtkegg[1:5]
Global Test result:
Data: 10 samples with 45101 genes; 5 pathways tested
Model: logistic
Method: All 210 permutations

      Genes Tested Statistic Q Expected Q sd of Q   P-value
00623    12     12      37.552     9.1318  9.2135 0.0047619
00440    47     47      13.010     3.3143  1.7585 0.0047619
00624    43     43      57.812     9.1819  8.2350 0.0047619
00625    19     19      71.404    12.6820 10.4620 0.0047619
00626    28     28      15.648     3.5587  2.2039 0.0047619
For small groups, globaltest automatically uses a permutation test. You
can see in your result that in this case there are 210 permutations, so
all p-values are multiples of 1/210, with a minimum of 1/210
(= 0.0047619). You can force it to use a more continuous scale with the
argument method="asymptotic".
Jan
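The arithmetic behind the 1/210 floor can be checked in base R. This is only a sketch: the 4-versus-6 group split is inferred from the permutation count rather than stated in the post, and the toy statistic below is a plain difference in group means on simulated numbers, not globaltest's Q.

```r
# Assumption: 4 arrays in one group and 6 in the other; no other split of
# 10 arrays gives exactly 210 labelings (a 5/5 split would give 252).
n_perm <- choose(10, 4)   # number of distinct group labelings
min_p  <- 1 / n_perm      # smallest attainable permutation p-value

# Toy permutation test on simulated scores, one score per array.
# The statistic is a plain difference in group means, NOT globaltest's Q.
set.seed(1)
scores  <- c(rnorm(4, mean = 3), rnorm(6, mean = 0))
treated <- 1:4            # hypothetical: the first four arrays are "treated"

group_diff <- function(idx) abs(mean(scores[idx]) - mean(scores[-idx]))

labelings  <- combn(10, 4)                    # 4 x 210 matrix of index sets
perm_stats <- apply(labelings, 2, group_diff)
p_value    <- mean(perm_stats >= group_diff(treated))

print(n_perm)
print(min_p)
print(p_value * n_perm)   # a whole number: p-values move in steps of 1/210
```

Because the observed labeling is itself one of the 210, the count in the numerator is at least 1, so no pathway can score below 1/210 however strong its signal is. That is why many strongly associated pathways end up tied at the same value.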
-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of mike Ad.
Sent: Tuesday, 12 September 2006 16:23
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] globaltest question
Hi,
Thanks for the reply!
I have two follow-up questions:
1. Is it OK to use globaltest for small groups (10 arrays in total for
   2 groups in my case)?
2. The different methods ("auto", "asymptotic", ...) in globaltest give
   different p-values. How should one set a threshold to pick out the
   significant pathways?
Thanks,
/Mike
>From: "Oosting, J. (PATH)" <j.oosting at lumc.nl>
>To: "mike Ad." <mikeaddr at hotmail.com>, <bioconductor at stat.math.ethz.ch>
>Subject: RE: [BioC] globaltest question
>Date: Tue, 12 Sep 2006 16:42:25 +0200
Dear Mike,
The permutation version of globaltest is safe but conservative for small
sample sizes. It can always be used, even in small groups, but it is not
so useful if you want to test many pathways, because a Bonferroni or FDR
correction may leave you with no significant pathways at all due to the
conservatism of the permutation test.
In that case you may want to use the asymptotic version instead. This is
like using the t-test for small samples when you are not completely sure
that the data are normally distributed, so some care should be taken
when interpreting the results. But for mining pathways for strong
association with your phenotype this works quite well. Except in unusual
situations, the most asymptotically significant pathways will also have
the smallest possible permutation p-value.
For multiple testing correction, see the gt.multtest function.
Jelle
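To put rough numbers on the multiple-testing point, here is a sketch using base R's p.adjust() rather than gt.multtest, whose interface is not shown in this thread. The 171 pathways and the 1/210 floor come from the posts above; the p-value vectors are made up, and treating "FDR" as the Benjamini-Hochberg method is an assumption.

```r
n_pathways <- 171       # pathways tested, from the original post
min_p      <- 1 / 210   # smallest attainable permutation p-value

# Bonferroni: even the best possible p-value is far from 0.05.
bonferroni_floor <- p.adjust(min_p, method = "bonferroni", n = n_pathways)

# Benjamini-Hochberg: the floor depends on how many pathways tie at the
# minimum. Hypothetical setup: k pathways at min_p, all others at 1.
bh_floor <- function(k) {
  p <- c(rep(min_p, k), rep(1, n_pathways - k))
  min(p.adjust(p, method = "BH"))
}

print(bonferroni_floor)   # 171 / 210
print(bh_floor(5))        # 171 / (210 * 5)
print(bh_floor(20))       # 171 / (210 * 20)
```

Under these assumptions no pathway can pass a Bonferroni threshold of 0.05, and a Benjamini-Hochberg threshold of 0.05 is only reachable when at least 17 pathways tie at the minimum p-value. The asymptotic p-values are not bounded below in this way.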
-----Original Message-----
From: mike Ad. [mailto:mikeaddr at hotmail.com]
Sent: Tuesday, 12 September 2006 20:57
To: Oosting, J. (PATH); bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] globaltest question
Dear list,
I am using Rgraphviz ("1.10.0") to lay out a graph. I would like to
change the positions of some nodes (the node centers). But it seems the
node center is not a node attribute, and I cannot set it through
"nodeAttrs" in "agopen".
The reason I want to do this is that when I plot the graph, I get the
warning "zero-length arrow is of indeterminate angle and so skipped". I
checked the resulting graph and found that some nodes are very close
together and the arrows between them were not drawn. So I would like to
change the positions of those nodes and replot the graph.
Could someone help with this?
Thanks,
/Mike
I don't think you can specify the node position. The node positions are
calculated by the selected layout algorithm.
You could try making a bigger plot, or plotting the graph part by part.
Li
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
Hi,
I am trying to use GOstats (1.6.0) for GOSlim analyses, but have just
realised that several of my selected Slim terms (chosen via AmiGO,
http://www.godatabase.org/) are not in GOstats.
> GOTERM$"GO:0031975"
NULL
> GOTERM$"GO:0007067"
GOID = GO:0007067
Term = mitosis
Definition = The division of the eukaryotic cell nucleus to produce two
daughter nuclei that, usually, contain the identical chromosome
complement to their mother.
Ontology = BP
I have tried the text search methods suggested by Nianhua recently, and
also drew a blank. The unifying feature of all the "failures" is that
they have high-numbered GOIDs.
Is there a simple way of finding out what the highest GOID in GOstats
is? Also, I guess, when might GOstats be updated?
Many thanks,
al
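One way to answer the "highest GOID" question is to strip the "GO:" prefix and compare the numeric parts. The sketch below runs on a small made-up vector of identifiers; the assumption that ls(GOTERM) would supply the real list applies to the environment-based GO package of that era and has not been checked here.

```r
# Hypothetical stand-in for the installed GO identifiers. With the
# environment-based GO package of that era, ls(GOTERM) is assumed to
# return the real list.
go_ids <- c("GO:0007067", "GO:0000001", "GO:0030529", "GO:0016020")

go_numbers <- as.integer(sub("^GO:", "", go_ids))  # "GO:0007067" -> 7067
highest_id <- go_ids[which.max(go_numbers)]

wanted  <- c("GO:0031975", "GO:0007067")           # terms from the post
missing <- setdiff(wanted, go_ids)

print(highest_id)
print(missing)
```

Checking the wanted terms with setdiff() is probably more useful in practice than the maximum alone, since it lists exactly which Slim terms the installed annotation lacks.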
> sessionInfo()
Version 2.3.1 (2006-06-01)
i386-pc-mingw32
attached base packages:
[1] "splines"   "tools"     "methods"   "stats"     "graphics"
    "grDevices" "utils"     "datasets"  "base"
other attached packages:
   GOstats   Category   hgu95av2       KEGG   multtest genefilter   survival
   "1.6.0"    "1.4.1"    "1.6.5"    "1.6.5"   "1.10.2"   "1.11.7"     "2.28"
    xtable       RBGL   annotate         GO      graph      Ruuid    Biobase
   "1.3-2"    "1.8.1"   "1.10.0"    "1.6.5"   "1.10.6"   "1.10.0"  "1.11.17"
    gplots      gdata     gtools    lattice       MASS    statmod        sma
   "2.0.2"    "2.0.2"    "2.0.2"  "0.13-10" "7.2-27.1"    "1.2.4"   "0.5.15"
     limma      Hmisc
  "2.7.10"   "3.0-12"
Hi Al,
GOstats depends on the GO package, which is updated at the same time as
the other annotation packages. That happens once per stable release, if
I'm not mistaken.
My suggestion would be to update your annotation packages (e.g. KEGG,
hgu95av2 and GO), because they are quite out of date: you have version
1.6.5, and the current version for BioC 1.8 is 1.12.0.
Running your query on an up-to-date version of the GO annotation gives
me the following:
> GOTERM$"GO:0031975"@Term
[1] "envelope"
I think you have a mix of packages from different releases, because your
Biobase is version 1.11.17, while the current version for BioC 1.8 is
1.10.1. Strange behavior can sometimes happen when packages from
different releases are installed together.
You can update your annotations using biocLite. First source the script:
source("http://www.bioconductor.org/biocLite.R")
and then update the package that you want:
biocLite('GO')
I strongly suggest that you update all of your annotation packages to
the same version, because code generally breaks when, for example, your
chip annotation points to a KEGG pathway that the KEGG annotation does
not know about.
Hope this helps,
Francois
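A quick way to spot outdated annotation packages is to compare version strings with compareVersion() from the utils package. The sketch below hard-codes the versions quoted in this thread; in a live session packageVersion() or installed.packages() would supply them instead, which is not shown here.

```r
# Versions copied from the sessionInfo() output earlier in the thread.
installed <- c(hgu95av2 = "1.6.5", KEGG = "1.6.5", GO = "1.6.5")
current   <- "1.12.0"   # stated above as current for BioC 1.8

# compareVersion() returns -1, 0 or 1; -1 means the first version is older.
is_outdated <- vapply(installed,
                      function(v) compareVersion(v, current) < 0,
                      logical(1))

print(names(installed)[is_outdated])
```

compareVersion() compares the dot-separated components as numbers, so "1.6.5" is correctly treated as older than "1.12.0", which a plain string comparison would get wrong.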
On Tue, 2006-09-12 at 22:43 +0100, Al Ivens wrote:
Hi Francois,
Thanks for the prompt response. I thought it might be a versioning
issue, but I religiously update every Monday AM!
I use:
* Windows GUI Menu
* Melbourne CRAN mirror
* all 4 repositories.
I then run "update packages" from the menu. This should normally do the
trick, shouldn't it?
The Biobase version was to make affyPLM and affyIO work with the latest
version of RMAExpress. I shall
source("http://www.bioconductor.org/biocLite.R") as you suggest.
One other quick question: once one has done a:
require(mylibraryname)
is there a way of unloading it from RAM, so that an update can be done
without restarting the session?
Many thanks,
a
> -----Original Message-----
> From: Francois Pepin [mailto:fpepin at cs.mcgill.ca]
> Sent: 12 September 2006 23:08
> To: Al Ivens
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] GOstats
Hi Al,
> Thanks for the prompt response. I thought it might be a versioning
> issue, but I religiously update every Monday AM!
>
> I use:
> * Windows GUI Menu
> * Melbourne CRAN mirror
> * all 4 repositories.
>
> then run update packages from the menu. This should do the trick,
> shouldn't it, normally?
I'm unfamiliar with the Windows GUI menu, but the annotation and the
experiment data packages each have a repository of their own. I'm
guessing they're not in your list. You might want to add them manually.
Here are the relevant lines from biocLite.R (after sourcing getBioC and
biocinstall):
## CRAN-style Repositories where we'll look for packages
repos <- c(
    "bioc",
    "data/annotation",
    "data/experiment",
    "omegahat",
    "lindsey"
)
repos <- paste("http://bioconductor.org/packages/1.8", repos, sep="/")
repos <- c(repos, "http://cran.fhcrc.org")
This isn't recommended, though, because it is more likely to break when
things change. biocLite gets updated to make sure it keeps working as it
should (for example, by verifying that you have the right R version).
> is there a way of unloading it from RAM, so that an update can be
> done without restarting the session?
You might want to look at ?detach. I am not familiar enough with the R
internals to know whether it frees the RAM or not.
Francois
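As a minimal sketch of the detach route, the example below uses the "splines" package, which ships with R, as a stand-in for the package to be updated. The unload argument is taken from the detach() help page in current R; whether it behaves the same way in R 2.3.1 is an assumption.

```r
# "splines" stands in for the package to be updated; it ships with R.
library(splines)
was_attached <- "package:splines" %in% search()

# unload = TRUE additionally asks R to unload the namespace, which can be
# refused (with a warning) if another loaded package still imports it.
detach("package:splines", unload = TRUE)
still_attached <- "package:splines" %in% search()

print(was_attached)
print(still_attached)
```

This removes the package from the search path, but it does not guarantee that all memory is returned or that compiled code is fully released, so restarting the session before a large update is probably still the safer option.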