I've got a quick newbie question-- if I'm trying to do two-way ANOVA
on a microarray experiment (with a few missing spots) which packages
should I install?
Can Bioconductor do empirical null distributions for the ANOVA F
statistic?
Thank you.
--------------------------------------
Protect yourself from spam,
use http://sneakemail.com
"Alex F. Bokov" <yjih74b02@sneakemail.com> writes:
> I've got a quick newbie question-- if I'm trying to do two-way ANOVA
> on a microarray experiment (with a few missing spots) which packages
> should I install?
It would be good to be clear -- ANOVA at the gene level, or are you
thinking of mixing in experiment-level parameters? For the former,
Biobase would suffice, though other packages (limma, nlme) might be
useful as well. This assumes that you've already got normalized data
(or else you'd need the appropriate packages)
> Can Bioconductor do empirical null distributions for the ANOVA F
> statistic?
R can. See "?pf" for a function computing the distribution of the F
statistic.
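
A minimal sketch of the distinction, in base R only: pf() gives tail areas of the theoretical F distribution, while an empirical null has to be built by resampling. The simulated data, the variable names, and the helper f_interaction() are assumptions for illustration, not anything from a package. Permuting raw values across arrays is only an approximate test for an interaction term.

```r
# Simulated single-gene data for a 2x2 layout (hypothetical values).
set.seed(1)
age      <- factor(rep(c("young", "old"), each = 16))
genotype <- factor(rep(rep(c("wt", "mut"), each = 8), times = 2))
y        <- rnorm(32)
y[c(3, 20)] <- NA                      # a few missing spots

# F statistic for the interaction term; lm() drops NA rows by default.
f_interaction <- function(y, age, genotype) {
  tab <- anova(lm(y ~ age * genotype))
  tab["age:genotype", "F value"]
}
f_obs <- f_interaction(y, age, genotype)

# Parametric p-value from the theoretical F distribution (see ?pf).
ok      <- !is.na(y)
df2     <- sum(ok) - 4                 # residual df: arrays minus 4 cell means
p_param <- pf(f_obs, df1 = 1, df2 = df2, lower.tail = FALSE)

# Empirical null: shuffle the observed values over the non-missing arrays
# and recompute the statistic each time.
B <- 999
f_null <- replicate(B, {
  yp     <- y
  yp[ok] <- sample(y[ok])
  f_interaction(yp, age, genotype)
})
p_perm <- (1 + sum(f_null >= f_obs)) / (B + 1)
```

Comparing p_param with p_perm for a handful of genes is one way to see how far the normal-theory answer is from the resampled one on your own data.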
best,
-tony
---
A.J. Rossini / rossini@u.washington.edu / rossini@scharp.org
Biomedical/Health Informatics and Biostatistics, University of
Washington.
Biostatistics, HVTN/SCHARP, Fred Hutchinson Cancer Research Center.
FHCRC: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email
CONFIDENTIALITY NOTICE: This e-mail message and any attachments ...
{{dropped}}
Wow, that was fast! Thanks for your answer.
rossini-at-blindglobe.net (A.J. Rossini) |bioconductor.org| wrote:
> It would be good to be clear -- ANOVA at the gene level, or are you
> thinking of mixing in experiment-level parameters? For the former,
> Biobase would suffice, though other packages (limma, nlme) might be
> useful as well. This assumes that you've already got normalized
> data (or else you'd need the appropriate packages)
It's a two-way experiment. Young vs. old, mutant vs. wildtype, with 8
to 10 replicate arrays in each of the four possible combinations.
Would that mean it's experiment-level or gene-level (sorry, I don't
yet know the proper terminology)?
The sort of thing that in SAS would involve PROC GLM... except the
data is probably not normally distributed even after log2 (thus
necessitating the empirical null distribution) and of course the
multiple comparison problem (thus necessitating multtest or qvalue).
I'm surprised that a two-way ANOVA using an empirical null dist and
then adjusted p-values isn't the default analysis everyone does on
their microarrays... could it be that everyone in the field is doing
t-test style comparisons only?
Alex,
If I were you I would start by using Huber's
vsn normalisation; it removes the dependence
of the variance on the expression intensity,
which makes for an easier analysis.
In the multtest package you have the option
of using the F statistic or a nonparametric equivalent,
which is what I always do, as the data often are NOT normal.
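
To illustrate the nonparametric idea without depending on any Bioconductor package, here is a rough base-R sketch that runs a Kruskal-Wallis test per gene across the four cells of the 2x2 layout. This is a stand-in for, not a reproduction of, what multtest does; the simulated matrix and all object names are assumptions.

```r
set.seed(2)
age      <- factor(rep(c("young", "old"), each = 16))
genotype <- factor(rep(rep(c("wt", "mut"), each = 8), times = 2))
group    <- interaction(age, genotype)   # the four cells of the 2x2 layout

# Hypothetical matrix: 50 genes x 32 arrays, heavy-tailed noise, some NAs.
x <- matrix(rt(50 * 32, df = 3), nrow = 50)
x[sample(length(x), 20)] <- NA

# Rank-based one-way test per gene; kruskal.test() drops missing values.
p_kw <- apply(x, 1, function(y) kruskal.test(y, group)$p.value)
```

Note that this treats the design as one factor with four levels, so it does not separate main effects from the interaction.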
Best
Susan Holmes
Statistics
Stanford
"Alex F. Bokov" <yjih74b02@sneakemail.com> writes:
> Wow, that was fast! Thanks for your answer.
>
> rossini-at-blindglobe.net (A.J. Rossini) |bioconductor.org| wrote:
>
>> It would be good to be clear -- ANOVA at the gene level, or are you
>> thinking of mixing in experiment-level parameters? For the former,
>> Biobase would suffice, though other packages (limma, nlme) might be
>> useful as well. This assumes that you've already got normalized
>> data (or else you'd need the appropriate packages)
>
> It's a two-way experiment. Young vs. old, mutant vs. wildtype, with
> 8 to 10 replicate arrays in each of the four possible
> combinations. Would that mean it's experiment-level or gene-level
> (sorry, I don't yet know the proper terminology)?
I'm assuming that you want to fit a 2-way ANOVA to each gene or tag.
That would be easy: the overview is to put the data into an exprSet,
use esApply to compute per-gene p-values, and then adjust those
p-values via multtest. The question would be whether you wanted to
include a parameter for modeling/testing that was the same for all genes.
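
The per-gene workflow can be sketched roughly as follows. To keep it self-contained, this uses a plain matrix with apply() in place of an exprSet with esApply, and p.adjust() in place of multtest; the simulated data and the helper two_way_p() are assumptions for illustration.

```r
set.seed(3)
age      <- factor(rep(c("young", "old"), each = 16))
genotype <- factor(rep(rep(c("wt", "mut"), each = 8), times = 2))

# Hypothetical normalized log-expression matrix: 100 genes x 32 arrays.
expr <- matrix(rnorm(100 * 32), nrow = 100,
               dimnames = list(paste0("gene", 1:100), NULL))
expr[1:5, age == "old"] <- expr[1:5, age == "old"] + 2   # built-in age effect
expr[sample(length(expr), 30)] <- NA                      # a few missing spots

# Fit the two-way model to one gene and return the three p-values.
# lm() drops arrays where this gene is missing.
two_way_p <- function(y) {
  tab <- anova(lm(y ~ age * genotype))
  setNames(tab[c("age", "genotype", "age:genotype"), "Pr(>F)"],
           c("age", "genotype", "interaction"))
}

p_raw <- t(apply(expr, 1, two_way_p))                # genes x 3 matrix
p_adj <- apply(p_raw, 2, p.adjust, method = "BH")    # adjust within each term
```

With missing spots the design is unbalanced, so the sequential tests that anova() reports for the main effects depend on the order of the terms in the formula.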
> The sort of thing that in SAS would involve PROC GLM... except the
> data is probably not normally distributed even after log2 (thus
> necessitating the empirical null distribution) and of course the
> multiple comparison problem (thus necessitating multtest or qvalue).
> I'm surprised that a two-way ANOVA using an empirical null dist and
> then adjusted p-values isn't the default analysis everyone does on
> their microarrays... could it be that everyone in the field is doing
> t-test style comparisons only?
Depends on all of the following: the technology and chips used, the
experiment (and different scientific areas seem to have different uses
for expression arrays), and how well normalization works. Not all
experiments are in the format you described (very simple 2-way
layout), and not all observed datasets are appropriate for ANOVA.
best,
-tony