Dear Stefanie,
> Date: 30 April 2014
> From: Pekka Kohonen <pkpekka at gmail.com>
> From: Stefanie Busch <stefanie.busch2 at web.de>
> To: Bioconductor <bioconductor at r-project.org>
> Subject: Re: [BioC] 1. comparing chip Information in meta analysis /
> Rankprod and 2. two color normalization
>
> Hello,
>
> I have two questions and I hope you can help me.
>
> I want to compare several studies with similar design but different
> arrays. The first step was to quantile normalize all data, which works
> well except for the two color experiment with an Agilent chip.
As you seem to have realized already, quantile normalization is not
usually appropriate for a two colour Agilent array. Loess normalization
is generally used for two colour arrays, and I recommend a normexp
background correction step before that.
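A minimal sketch of that preprocessing with limma, under stated assumptions: the RGList below is simulated so that the code is self-contained (real data would normally be read with read.maimages), and offset=50 is only an illustrative choice, not a recommendation for any particular data set.

```r
library(limma)

# Simulated two-colour intensities (made-up data, for illustration only):
# exponential signal plus roughly normal background noise.
set.seed(1)
n <- 500
sim <- function() matrix(rexp(n * 2, rate = 1/1000) + rnorm(n * 2, mean = 100, sd = 20), n, 2)
RG <- new("RGList", list(R = sim(), G = sim(),
                         Rb = matrix(100, n, 2), Gb = matrix(100, n, 2)))

# normexp background correction, then loess normalization within each array
RGb <- backgroundCorrect(RG, method = "normexp", offset = 50)
MA  <- normalizeWithinArrays(RGb, method = "loess")

# MA$M holds the normalized log-ratios, MA$A the average log-intensities
dim(MA$M)
```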
> I read the limma User Guide and found out that I must preprocess with the
> function normalizeBetweenArrays. So I get M- and A-values and my
> question is which one shows the expression values for this experiment?
Two colour arrays don't return expression values. Instead they return
log-ratios, which are stored in M. The A-values are the average
log-intensities of the two channels.
When you compare Agilent to Affymetrix chips and Illumina BeadArrays, you
need to compare log-fold-changes and DE results, not expression values.
> For comparing the results of the different studies I want to use the
> package: RankProd.
As far as I know, RankProd assesses differential expression and does not
in itself help you compare one study to another.
The usual methods to compare one study to another are (i) to make a
scatterplot of logFC from the two experiments or (ii) to use a gene set
test such as roast() in the limma package. The limma package can compute
logFC for whatever comparison you are making.
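Option (i) can be sketched in base R as follows. The two data frames are simulated stand-ins (hypothetical names study1 and study2, made-up IDs and values); in practice each would hold the EntrezID and logFC columns taken from that study's own analysis, for example from topTable() in limma.

```r
# Simulated per-study results, one row per Entrez ID (all values are made up).
set.seed(1)
ids     <- as.character(1001:1100)        # hypothetical Entrez IDs
true_fc <- rnorm(length(ids))
study1  <- data.frame(EntrezID = ids,
                      logFC = true_fc + rnorm(length(ids), sd = 0.3))
keep    <- sample(length(ids), 80)        # second platform covers fewer genes
study2  <- data.frame(EntrezID = ids[keep],
                      logFC = true_fc[keep] + rnorm(80, sd = 0.3))

# Match the studies on Entrez ID; only genes present in both are kept.
both <- merge(study1, study2, by = "EntrezID", suffixes = c(".1", ".2"))

# Scatterplot of logFC from the two experiments, with the line of equality.
plot(both$logFC.1, both$logFC.2,
     xlab = "logFC, study 1", ylab = "logFC, study 2")
abline(0, 1, lty = 2)
r <- cor(both$logFC.1, both$logFC.2)
```

This assumes each study has already been reduced to one row per Entrez ID, otherwise merge() would pair every probe of a gene in one study with every probe of it in the other.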
> For a better comparison between the studies I used
> the Entrez IDs and I downloaded the latest chip information directly from
> Affymetrix and Illumina. This revealed a new problem. For example, on
> the chip Affymetrix Mouse Genome 430 2.0 Array the ID 1449880_s_at
> stands for three gene names and Entrez IDs: Bglap /// Bglap2 /// Bglap3 -
> 12095 /// 12096 /// 12097. On the Illumina chip each gene has a single
> Array ID:
> Bglap-rs1 - ILMN_1233122 - 12095
> Bglap1 - ILMN_2610166 - 12096
> Bglap2 - ILMN_2944508 - 12097
>
> So I don't know what I should do to compare the results of these two
> experiments. When I paste the expression values of 1449880_s_at three
> times with the three different Entrez IDs, the ranking
> calculated by the RankProd package changes.
> Example:
> Chip ID Entrez-Id Control1 control 2 etc.
> 1449880_s_at - 12095 - 3,855 - 4,211 ...
> 1449880_s_at - 12096 - 3,855 - 4,211 ...
> 1449880_s_at - 12097 - 3,855 - 4,211 ...
>
> The other possibility is to combine the three expression values of the
> Illumina chip into one value. But I don't know if that is the right way.
> What is the better way?
For this purpose, I always recommend that, for each Entrez ID, you use the
probe on each platform with the highest overall expression level. The
rationale for this is that you are using the probe that represents the
dominant transcript for that gene in the cell type. This method has been
used for many published studies by now, the first of which may have been:
http://www.biomedcentral.com/1471-2105/7/511
For example, you can proceed like this for the Agilent data, assuming you
have put the Entrez IDs into the genes component of the object, in a
column called EntrezID:
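# Loess normalization ("loess" is a method of normalizeWithinArrays)
MA <- normalizeWithinArrays(RG, method="loess")
# Order probes by average log-intensity, highest first
A <- rowMeans(MA$A)
o <- order(A,decreasing=TRUE)
MA2 <- MA[o,]
# In the reordered object, keep the first probe seen for each Entrez ID
d <- duplicated(MA2$genes$EntrezID)
MA2 <- MA2[!d,]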
Now you have a data object with a unique probe for each EntrezID.
Simply averaging the probes or probe-sets is not generally recommended,
because different probes for the same gene can have quite different
behaviour. A common situation is that one probe successfully probes an
expressed transcript while another probe is essentially unexpressed.
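The same selection can be sketched for a single-channel platform stored as a plain matrix. This is only an illustration under stated assumptions: the probe names, Entrez IDs and expression values below are made up, and expr and entrez are hypothetical object names.

```r
# Toy log2 expression matrix: rows are probes, columns are samples (made-up values).
expr <- matrix(c( 8.1,  8.3,  7.9,
                  4.2,  4.0,  4.4,
                 10.5, 10.2, 10.8,
                  6.0,  6.1,  5.9),
               nrow = 4, byrow = TRUE,
               dimnames = list(c("probeA", "probeB", "probeC", "probeD"),
                               c("s1", "s2", "s3")))
# Hypothetical probe-to-gene map: probeA and probeB target the same gene.
entrez <- c("101", "101", "102", "103")

# Order probes by average expression, highest first.
o       <- order(rowMeans(expr), decreasing = TRUE)
expr2   <- expr[o, , drop = FALSE]
entrez2 <- entrez[o]

# Keep the first (highest-expressed) probe for each Entrez ID.
keep    <- !duplicated(entrez2)
expr2   <- expr2[keep, , drop = FALSE]
entrez2 <- entrez2[keep]

rownames(expr2)   # probeB, the low-expressed duplicate for gene 101, is dropped
```

The duplicated() call is applied to the reordered IDs, so the probe retained for each gene is the one with the highest average expression.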
Best wishes
Gordon
> Kind regards
> Stefanie Busch