Apologies if this topic has already come up (I suspect it has!). I did
try searching the mailing list, but to no avail.
I am combining some old Affymetrix datasets from 2002, which I obtained
via GEO. They are in the "Gene Expression Data Matrix" format, i.e. each
array has been background corrected, summarised, etc. However, no
differential expression analysis has been carried out.
The datasets are of the "RGU-34" type, so there are 3 arrays with 8,000
features each: RGU34A, RGU34B and RGU34C. I want to combine this data
with some more recent data from RAT2302 arrays. I understand I can
combine the identifiers using some spreadsheets downloaded from the
Affymetrix website.
However, my problem is how to work out differential expression for
comparison. Is it correct to combine the Gene Expression Data matrices
for A, B and C to make big esets of (8,000 * 3 =) 24,000 identifiers,
and then use these in limma to work out differential expression? Or
would it be better to treat the arrays separately, then combine the
results, ordering by p-value?
The reason I want to combine the 3 RGU arrays is because I wish to
(amongst other things) perform category analysis to make a comparison
between the old data and the new data.
Any information on the best way to combine these arrays would be much
appreciated.
Kindest regards,
James Perkins
On Mon, Nov 3, 2008 at 5:58 AM, James Perkins
<jperkins@biochem.ucl.ac.uk> wrote:
> Apologies if this topic has already come up (I suspect it has!), I did
> try searching the mailing list but to no avail;
>
> I am combining some old affy datasets from 2002, which I have obtained
> via GEO. They are of the "Gene Expression Data Matrix" format; i.e. each
> array has been background processed, summarised etc. However no
> differential expression has been undertaken.
>
> The datasets are of the "RGU-34" type, so there are 3 arrays with 8,000
> features, RGU34A, RGU34B and RGU34C respectively. I want to combine this
> data with some more recent data from RAT2302 arrays. I understand I can
> combine the identifiers using some spreadsheets downloaded from the
> affymetrix website.
>
> However my problem is how I work out differential expression for
> comparison. Is it correct to combine the Gene Expression Data matrices
> for A B and C to make big esets of (8,000 * 3 =) 24,000 identifiers, and
> then use these in Limma to work out differential expression? Or would it
> be better to treat the arrays separately, then combine the results,
> ordering by p-value.
If the three arrays have very few probesets in common, then normalizing
separately and then combining for differential expression will work
just fine. Just FYI, there are numerous posts in the archives on this
topic.
Sean
>
>
> The reason I want to combine the 3 RGU arrays is because I wish to
> (amongst other things) perform category analysis to make a comparison
> between the old data and the new data.
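
As a rough illustration of the combining step discussed above (stacking
the separately processed A, B and C matrices into one before testing for
differential expression), here is a minimal Python sketch. It is not
from the thread: the function name combine_chips, the sample names and
the probeset IDs are all hypothetical toy values, and it assumes each
chip was hybridised with the same samples in the same order. It also
reports any probeset IDs that occur on more than one chip, since the
advice above depends on there being very few of these.

```python
from typing import Dict, List, Tuple

Matrix = Dict[str, List[float]]   # probeset ID -> one value per sample
Chip = Tuple[List[str], Matrix]   # (sample names, expression matrix)


def combine_chips(chips: Dict[str, Chip]) -> Tuple[List[str], Matrix, List[str]]:
    """Stack per-chip matrices (rows = probesets) into one combined matrix.

    Returns the shared sample names, the combined matrix, and the sorted
    list of probeset IDs that were found on more than one chip.
    """
    names = list(chips)
    reference = chips[names[0]][0]
    # Each column must refer to the same sample on every chip.
    for name in names[1:]:
        if chips[name][0] != reference:
            raise ValueError(f"samples on {name} do not match {names[0]}")

    # Record which chips each probeset ID appears on.
    owners: Dict[str, List[str]] = {}
    for name, (_, matrix) in chips.items():
        for probeset in matrix:
            owners.setdefault(probeset, []).append(name)
    shared = sorted(p for p, found_on in owners.items() if len(found_on) > 1)

    combined: Matrix = {}
    for name, (_, matrix) in chips.items():
        for probeset, values in matrix.items():
            if len(values) != len(reference):
                raise ValueError(f"{probeset} on {name} has the wrong number of values")
            # Prefix shared IDs with the chip name so no row is overwritten.
            key = f"{name}:{probeset}" if len(owners[probeset]) > 1 else probeset
            combined[key] = list(values)
    return reference, combined, shared


# Hypothetical toy data: two samples, a handful of made-up probeset IDs.
chips = {
    "RGU34A": (["s1", "s2"], {"a_1": [7.1, 7.4], "ctrl_1": [10.0, 10.1]}),
    "RGU34B": (["s1", "s2"], {"b_1": [5.2, 5.0], "ctrl_1": [9.9, 10.2]}),
    "RGU34C": (["s1", "s2"], {"c_1": [6.3, 6.8]}),
}
samples, combined, shared = combine_chips(chips)
print(len(combined), "rows;", "shared probesets:", shared)
```

The combined matrix would then be passed to whatever differential
expression tool is in use (limma in this thread); the sketch only covers
the bookkeeping of joining the three matrices.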