Thanks, Robert. If I am understanding you correctly, you would advocate both
separate normalization AND separate linear modeling in the case where the two
arrays come from different batches and have no common probesets, correct? If I
was reading Gordon's reply to the other gentleman's email correctly, he was
suggesting separate normalization but not separate linear modeling for the
datasets. My question, which in retrospect was unclear, was about the
advantages and disadvantages of combining versus separating the datasets for
linear modeling.

Matt

> Hi,
> I am not sure what you are really asking but here goes.
> References and corresponding R/Bioconductor packages are listed below.
>
> In my opinion separate normalization and expression estimation is
> essential for different experiments (and by experiment I mean a
> collection of identical arrays processed at about the same time by about
> the same people using about the same protocol; and by identical arrays I
> mean from the same batch). While one can often do fancy things to align
> different arrays prior to processing them it does not seem like a good
> idea at all. When it works, so would separate normalization and when it
> does not work you won't know.
>
> After you have normalized and estimated expression values then you
> have the gene matching problem. This is not trivial; there are papers
> around that discuss this (Parmigiani et al). There are some issues
> regarding whether you want to make inference at the gene level or the
> sequence level (Unigene is not the same as Entrez Gene). While many have
> ignored the issues that arise (even on a single chip) where the same
> gene has been probed via several different methods, that does not seem
> to be a "best practice".
>
> If you have no common genes, then life is somewhat easier, you just
> have a bunch more features, and the suggestion to simply use rbind seems
> pretty sensible to me, although there are some potential pitfalls and
> you might want to do some checking to ensure that one set of features is
> not dominating the other for reasons that are not biological.
>
> If you do have genes in common, then life is harder, the models are
> more complicated and IMHO you want to spend a few hours with a local
> statistician sorting out what questions you want to ask. Essentially,
> considering what the right model is on a per-gene basis is a pretty
> good starting point. As I said there are some papers (Choi et al,
> Gentleman et al); sometimes they come under the heading of
> meta-analysis, and other times simply random effects models. For the
> more statistically inclined I recommend the book by Cox and Solomon,
> which directly addresses issues regarding combining microarray experiments.
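
As an illustration of what a per-gene random effects model can look like, here is a minimal sketch in plain Python. It is not the method of any of the papers or packages cited in this message. It assumes each study supplies one effect estimate and its sampling variance for the gene, and it uses the DerSimonian-Laird moment estimator for the between-study variance, which is one common choice among several. The function name and the toy numbers are made up for the example.

```python
def random_effects_combine(effects, variances):
    """Combine one gene's per-study effect estimates (DerSimonian-Laird).

    effects:   one effect estimate per study (e.g. a standardized difference)
    variances: the sampling variance of each estimate, all > 0
    Returns (combined effect, its standard error, between-study variance).
    """
    k = len(effects)
    if k != len(variances) or k < 2:
        raise ValueError("need matching effects and variances from >= 2 studies")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))

    # Moment estimate of the between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to every study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    combined = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    std_err = (1.0 / sum(w_star)) ** 0.5
    return combined, std_err, tau2


# Toy input (made up): two studies that disagree about one gene.
combined, std_err, tau2 = random_effects_combine([0.0, 2.0], [1.0, 1.0])
```

When the studies agree closely, the estimated between-study variance is truncated to zero and the result reduces to the usual inverse-variance (fixed-effect) average; when they disagree, the standard error widens to reflect that.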
>
> Best wishes,
> Robert
>
> G. Parmigiani, E. Garrett-Mayer, R. Anbazhagan, et al. A cross-study
> comparison of gene expression studies for the molecular classification of
> lung cancer. Clinical Cancer Research, 10:2922-2927, 2004.
> R package: MergeMaid
>
> J. K. Choi, U. Yu, S. Kim, et al. Combining multiple microarray studies and
> modeling interstudy variation. Bioinformatics, 19, Suppl. 1:i84-i90, 2003.
> R package: GeneMeta
>
> D. R. Cox and P. J. Solomon. Components of Variance. Chapman and Hall,
> New York, 2003.
>
> R. Gentleman, M. Ruschhaupt and W. Huber. On the Synthesis of Microarray
> Experiments.
> R package: GeneMetaEx
>
> scholz at Ag.arizona.edu wrote:
> > Adrien,
> >
> > Thanks for this response. Unfortunately, there are no oligos in common
> > between the two arrays. If anyone else has a response to my question
> > (below), I'd like to hear it.
> >
> > Matt
> >
> >
> > Matt,
> >
> > I am not familiar with the maize arrays, but I am using the following
> > procedure for Affymetrix moe430 split arrays, which have ~160 probesets
> > in common between A and B:
> > 1) background-correct each chip separately at probe-level
> > 2) get a measure of expression at probeset-level
> > 3) plot the common probesets against each other for each pair of
> > chips. If you observe the same thing I do, you will see that the trend
> > is linear but with intercept != 0 and slope != 1.
> > 4) scale the B chip with the estimated intercept and slope
> >
> > Steps 1 and 2 are easily done with rma( , normalize=FALSE).
> > Wolfgang Huber and I are currently writing a little package which does
> > steps 3 and 4 automatically.
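
Steps 3 and 4 can be sketched roughly as follows, in plain Python so that the example depends on no particular package. This is not the package mentioned above. The function names, the direction of the fit (A-chip values regressed on B-chip values), the use of ordinary least squares, and the toy values are all assumptions made for illustration; a robust fit may be preferable on real data.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired values")
    mean_x, mean_y = fmean(x), fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def scale_b_to_a(expr_a, expr_b):
    """Rescale chip B onto chip A's scale using their shared probesets.

    expr_a, expr_b: dicts mapping probeset ID -> log-scale expression value
    for one sample hybridized to the A chip and to the B chip.
    """
    common = sorted(set(expr_a) & set(expr_b))
    intercept, slope = fit_line(
        [expr_b[p] for p in common],  # x: B-chip values on shared probesets
        [expr_a[p] for p in common],  # y: A-chip values on shared probesets
    )
    # Apply the fitted line to every B-chip probeset, shared or not.
    scaled_b = {p: intercept + slope * v for p, v in expr_b.items()}
    return scaled_b, intercept, slope


# Toy data (made up): on the shared probesets, A = 1.0 + 0.5 * B exactly.
chip_a = {"c1": 2.0, "c2": 3.0, "c3": 4.0, "a_only": 7.5}
chip_b = {"c1": 2.0, "c2": 4.0, "c3": 6.0, "b_only": 10.0}
scaled_b, intercept, slope = scale_b_to_a(chip_a, chip_b)
```

The fit uses only the shared probesets, so its quality depends on how many there are and how well they cover the intensity range. With no probes in common, as with the maize arrays discussed in this thread, the approach has nothing to fit.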
> >
> > I'm not sure whether this procedure could make sense or be adapted
> > somehow to your maize arrays (do they have enough probes in common?),
> > but anyway, some food for thought...
> >
> > Adrien
> >
> >
> >>Gordon,
> >>
> >>Recently you advised someone with a split set of maize arrays
> >>that they could do their analysis by reading all the A slides
> >>into an RGList and normalizing, then doing the same with the
> >>B slides, and then combining the two datasets via
> >>rbind() of the two MAList objects. I have a similar (the
> >>same?) set of arrays and some of the users of these arrays
> >>have noted that the A and B slides perform differently, i.e.
> >>more background on the B slide, for whatever reason. Though
> >>I'm not actually convinced this is true, it makes me wonder
> >>whether the two datasets should be combined at all since
> >>there may be a "between array set"
> >>source of variation. Am I right to segregate these sets or is
> >>there some overwhelming benefit to combining them? I'm no
> >>statistician and would appreciate your take.
> >>
> >>Thanks,
> >>
> >>
> >
> > Matt
> >
> > ---------------------------------------------
> > College of Agriculture and Life Sciences Web Mail.
> > http://ag.arizona.edu
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> >
>
> --
> Robert Gentleman, PhD
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M2-B876
> PO Box 19024
> Seattle, Washington 98109-1024
> 206-667-7700
> rgentlem at fhcrc.org
>