Margaret Gardiner-Garden
Hi,
There has been quite a bit of discussion on this list about dealing
with technical replicates (in sets of arrays where the gene layout is
the same across the experiment).
We have just started analysis of a published Stanford data set where
different print runs have a different set of genes replicated (i.e.
different layouts).
For example, on Chip 1 (treatment) we may have clones A, B, X, Y; on
Chip 2 (treatment) A, A, B, X, X, Y; on Chip 3 (control) A, A, B, X, Y;
and on Chip 4 (control) A, B, X, Y. For simplicity, I have only two
chips for each condition in this example.
Below are the three alternatives for analysis, with the pros and cons
as we see them at the moment. I would really appreciate any comments
anyone may have on our thinking!
(1) Averaging the technical replicates within an array
Pros: (a) they are the same DNA sequence on the array, so an average
is meaningful; (b) it would be easier for downstream analyses to have a
single representation of each clone on a chip.
Cons: the linear model would be fitted to a mix of averaged and
non-averaged values, so the values for a clone are measured with
different precision from chip to chip.
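
To make option (1) concrete, here is a minimal Python sketch of averaging replicate spots within each array. The log-ratio values and the names `spots` and `average_within_array` are made up for illustration and are not taken from the Stanford data set. The sketch also keeps the number of spots behind each average, since that count is what differs between chips.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log-ratios for clone A only: (chip, clone, value).
# Chip 1 and Chip 4 carry one spot for A, Chip 2 and Chip 3 carry two.
spots = [
    ("chip1", "A", 0.50),
    ("chip2", "A", 1.00),
    ("chip2", "A", 1.50),
    ("chip3", "A", -0.25),
    ("chip3", "A", 0.25),
    ("chip4", "A", 0.00),
]

def average_within_array(spots):
    """Return {(chip, clone): (mean value, number of spots averaged)}."""
    grouped = defaultdict(list)
    for chip, clone, value in spots:
        grouped[(chip, clone)].append(value)
    return {key: (mean(vals), len(vals)) for key, vals in grouped.items()}

averaged = average_within_array(spots)
for (chip, clone), (value, n_spots) in sorted(averaged.items()):
    print(chip, clone, value, n_spots)
```

If the replicate spots were independent with equal variance s^2 (an assumption, and probably optimistic for spots on the same array), an average of n spots would have variance s^2/n, so the Chip 2 value for A would be more precise than the Chip 1 value. Carrying `n_spots` forward as a weight is one possible way to acknowledge that.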
(2) Analysing all combinations
e.g. for clone A: Chip 1 has A1, Chip 2 has A2 and A3, Chip 3 has A4
and A5, and Chip 4 has A6. To compare treatment to control you could
compare
A1 & A2 vs A4 & A6,
A1 & A3 vs A5 & A6,
A1 & A2 vs A5 & A6,
A1 & A3 vs A4 & A6.
Pros: you are using all the data individually instead of averaging.
Cons: you can have lots of combinations (depending on the number of
replicates), and the chance of getting a low p-value in at least one of
the combinations is higher than for a gene that has no replicate
probes, so you may be biasing the results.
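
The combinations in option (2) can be enumerated mechanically. The Python sketch below is only an illustration: it uses the spot labels A1 to A6 from the example above, and the function names are hypothetical. It also computes the simple union (Bonferroni) bound k * alpha on the chance that at least one of k tests falls below alpha, which holds even though the combinations share spots and are therefore not independent.

```python
from itertools import product

# Spot labels for clone A on each chip, as in the example above.
replicates_of_A = {
    "chip1": ["A1"],
    "chip2": ["A2", "A3"],
    "chip3": ["A4", "A5"],
    "chip4": ["A6"],
}

def all_combinations(replicates_by_chip):
    """List every way of taking exactly one replicate spot per chip."""
    chips = sorted(replicates_by_chip)
    choices = product(*(replicates_by_chip[chip] for chip in chips))
    return [dict(zip(chips, choice)) for choice in choices]

def union_bound(alpha, n_tests):
    """Upper bound on P(at least one of n_tests has p < alpha) under the null."""
    return min(1.0, alpha * n_tests)

combos = all_combinations(replicates_of_A)
for combo in combos:
    print(combo)
print(len(combos), union_bound(0.05, len(combos)))
```

The number of combinations is the product of the replicate counts per chip, so it grows quickly as more chips carry duplicated clones.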
(3) Randomly choose one of the technical replicates to represent a gene
on a chip, i.e. randomly choose one of the combinations above for the
analysis.
Pros: it does not give some genes a higher chance of a low p-value just
by chance.
Cons: it does not use all the data.
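
For option (3), a minimal Python sketch of picking one replicate per chip at random might look like this. The spot labels follow the clone A example above; the function name and the `seed` parameter are hypothetical, with the seed included only so that a particular selection can be reproduced.

```python
import random

# Spot labels for clone A on each chip, as in the example above.
replicates_of_A = {
    "chip1": ["A1"],
    "chip2": ["A2", "A3"],
    "chip3": ["A4", "A5"],
    "chip4": ["A6"],
}

def pick_one_per_chip(replicates_by_chip, seed=None):
    """Choose one replicate spot per chip, uniformly at random."""
    rng = random.Random(seed)  # private generator, so the seed is local
    return {
        chip: rng.choice(reps)
        for chip, reps in sorted(replicates_by_chip.items())
    }

selection = pick_one_per_chip(replicates_of_A, seed=1)
print(selection)
```

Fixing the seed makes the choice repeatable, but the result of the analysis still depends on which spots happened to be drawn.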
I was wondering if anyone had any thoughts on which of these
alternatives is the best, or whether there is another alternative we
haven't considered.
Any ideas would be really appreciated!
Thanks and Regards
Marg
Dr Margaret Gardiner-Garden
Garvan Institute of Medical Research,
Sydney Australia