Hi,
I'm trying to do an analysis of some tag3 Affy microarray data with
the limma package, and I've run into a problem I'm not sure how to
solve.
The data comes from a series of arrays recorded at different time
points. Each probe on the array represents a DNA tag from a yeast gene
deletion library. There are several different time points and
technical replicates for some of them. All I need to do is a linear
regression of the expression levels against time and find the slope of
that line.
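
To make the per-probe fit concrete, here is a minimal sketch in plain Python (not limma) of the regression I have in mind for a single probe. The time points and log-expression values are made up for illustration, and `ols_slope` is a hypothetical helper name. Note that it treats technical replicates as independent points, which is exactly the part I'm unsure about.

```python
def ols_slope(times, values):
    """Ordinary least-squares fit of values against times.

    Returns (slope, intercept).
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    # Sum of squares of time and cross-product with the values
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return slope, intercept


# Hypothetical data for one probe: time in hours, with technical
# replicates at t = 0 and t = 4 (values are invented log2 intensities).
times = [0, 0, 2, 4, 4, 6]
log_expr = [10.0, 10.2, 9.1, 8.0, 8.2, 7.1]

slope, intercept = ols_slope(times, log_expr)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```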
So far so easy, but the problem I've run into is that each yeast gene
deletion mutant is represented on the array by multiple probes
(sometimes 2, sometimes more). What I'd like to do is fit all the data
for each deletion mutant simultaneously rather than on a probe-by-
probe basis. Hopefully this will improve the quality of the linear
regression.
So, is there any trick in limma (or in the preprocessing step prior to
limma) for combining probe level data (i.e. rows of the expression
matrix)? I could just average the data across the probes for each
deletion mutant, but that seems like it wouldn't be as powerful as
fitting all the data (I think?).
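
As a sketch of what "fitting all the data" could mean, the plain-Python example below (again not limma) estimates one common slope for a deletion mutant while letting each probe keep its own intercept. It assumes every probe is measured at the same time points; `pooled_slope` and the data are hypothetical. Under that assumption the pooled slope should coincide with the slope of the probe-averaged values, so any gain would come from the extra residual degrees of freedom rather than from a different point estimate.

```python
def pooled_slope(times, probes):
    """Common slope across probes, with a separate intercept per probe.

    times  : time points shared by all probes
    probes : list of per-probe value lists, each aligned with times
    """
    mean_t = sum(times) / len(times)
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy_total = 0.0
    for values in probes:
        # Centering each probe on its own mean absorbs its intercept
        mean_y = sum(values) / len(values)
        sxy_total += sum(
            (t - mean_t) * (y - mean_y) for t, y in zip(times, values)
        )
    return sxy_total / (len(probes) * sxx)


# Hypothetical data: two probes for the same deletion mutant,
# with different baseline intensities (invented log2 values).
times = [0, 2, 4, 6]
probe_a = [10.0, 9.0, 8.1, 6.9]
probe_b = [12.0, 11.2, 9.9, 9.0]

print(pooled_slope(times, [probe_a, probe_b]))
```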
Alternatively, am I better off just taking the expression values and
doing the linear regression using the standard R lm function? In that
case, could anyone point me towards a method for accounting for the
technical replicates (which limma knows how to handle, doing the
'right thing')? I've tried reading up on gls.series, but it's a bit
scary for a biologist.
Alex Gutteridge
Systems Biology Centre
University of Cambridge