Jun Yin
Hi, all,
I have a problem normalizing my Affymetrix microarray data. We
are using the Affymetrix Zebrafish Genome Array. The experimental
design includes three treatments, namely A, C and G, with three
biological replicates each: A1, A2, A3, C1, C2, C3 and G1, G2,
G3.
A1, C1 and G1 were from the first batch (the microarray experiment
was performed earlier). A2, A3, C2, C3, G2 and G3 were from the second
batch. We have a very strong batch effect. If I use gcrma to normalize
all the data together, the only effect I can see is the batch effect;
e.g. in the hierarchical clustering, A1, C1 and G1 cluster together.
Then, no matter which comparison I make, I cannot get any
differentially expressed genes from the data. Presumably this is
because the batch effect (or between-batch background noise) swamps
the treatment effects.
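To illustrate the clustering pattern described above, here is a minimal sketch on simulated data (not the real arrays). The gene count, noise level and size of the batch shift are all made-up assumptions, and no treatment effect is simulated.

```r
# Simulated log2 expression matrix: 500 hypothetical genes x 9 arrays.
set.seed(42)
samples <- c("A1", "A2", "A3", "C1", "C2", "C3", "G1", "G2", "G3")
batch   <- ifelse(grepl("1$", samples), "batch1", "batch2")  # A1, C1, G1 = batch 1

n_genes <- 500
expr <- matrix(rnorm(n_genes * length(samples), mean = 8, sd = 0.3),
               nrow = n_genes, dimnames = list(NULL, samples))

# Assumed batch effect: a gene-specific shift shared by all batch-1 arrays.
shift <- rnorm(n_genes, sd = 1.5)
expr[, batch == "batch1"] <- expr[, batch == "batch1"] + shift

# Cluster the arrays (columns) on Euclidean distance, then cut into 2 groups.
hc       <- hclust(dist(t(expr)), method = "average")
clusters <- cutree(hc, k = 2)
print(table(cluster = clusters, batch = batch))
```

With a shift this large, the two clusters follow the batches rather than the treatments, which is the same pattern I see in the real data.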
By accident, I used gcrma to normalize the three replicates of each
treatment separately, and the result changed dramatically:
library(affy)   # ReadAffy(), exprs()
library(gcrma)  # gcrma()
data <- ReadAffy()
data1.gcrma <- gcrma(data[, 1:3])  # A samples
data2.gcrma <- gcrma(data[, 4:6])  # C samples
data3.gcrma <- gcrma(data[, 7:9])  # G samples
data.gcrma.exprs <- cbind(exprs(data1.gcrma), exprs(data2.gcrma), exprs(data3.gcrma))
After that, all the batch effects were gone, and the variance within
each group/treatment was dramatically reduced. But then I realized
that gcrma/rma uses median polish to summarize the probe set values,
which iteratively subtracts row medians and column medians. The probe
set signal is calculated by adding the global median to the column
median, and thus depends heavily on the original column median of the
probe set. Normalizing the groups separately has probably introduced
an artifact.
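To make the summarization step concrete, here is a small sketch using `medpolish()` from base R's stats package on one made-up probe set. The probe effects, array levels and noise are invented for illustration. It covers only the median-polish step, not the background correction or quantile normalization that gcrma also performs.

```r
# Toy probe set: rows = probes, columns = arrays, log2 scale (values are made up).
set.seed(1)
probe_effect <- c(-1, 0, 0.5, 1.5)   # hypothetical probe affinities
array_level  <- c(A1 = 8, A2 = 8.2, A3 = 7.9, C1 = 10, C2 = 10.1, C3 = 9.8)
pm <- outer(probe_effect, array_level, "+") +
  matrix(rnorm(24, sd = 0.1), nrow = 4, ncol = 6)

# Probe set summary per array = overall (global) median + column effect.
summarise_probeset <- function(m) {
  fit <- medpolish(m, trace.iter = FALSE)
  fit$overall + fit$col
}

joint     <- summarise_probeset(pm)                 # all six arrays fitted together
groupwise <- c(summarise_probeset(pm[, 1:3]),       # A arrays alone
               summarise_probeset(pm[, 4:6]))       # C arrays alone

print(rbind(joint = joint, groupwise = groupwise))
```

In the group-wise fit the row (probe) effects and the global median are estimated from three arrays only, so the summaries of the A and C arrays no longer share a common fit.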
The most interesting thing is that the genes we expected were in the
gene list generated by the group-wise gcrma normalization. So I
wonder whether there is any justification for treating this
group-wise gcrma as acceptable. I am rather desperate to decide
whether to use the data or discard everything. Because the batch
effect is so strong and the sample size is so small, no normalization
has worked so far (gcrma, rma, mas5, loess/quantile/contrasts/scale
normalization).
Thanks in advance.
Jun Yin
Ph.D. student in U.C.D.
2009-02-25
Bioinformatics Laboratory
Conway Institute
University College Dublin