Stephen Henderson
MDS and PCA are exploratory analyses, and PCA in particular accentuates
differences. If you are trying to build a diagnostic, I would be pretty
worried if your signatures were too specific to one very narrow
protocol: they might be biased. Bear in mind that the bias you describe
is not in the biological samples (the real target) but in the protocol.
Try quantile or loess normalizing, then train on one series and test on
the other. Then try putting both series together and cross-validating.
By using two datasets with slightly different biases you are probably
ensuring that the signature is not over-fit and is robust enough for
practical use.
No?
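For what it is worth, the workflow above could be sketched roughly as follows in Python with NumPy. This is only an illustration on simulated data, not an analysis of the real arrays: the helper names (quantile_normalize, fit_centroids, predict, cross_validate) are hypothetical, the nearest-centroid classifier is a stand-in for whatever signature method is actually used, and the protocol effect is assumed to be a simple per-series shift and scale, which is the easy case for quantile normalization.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_signal = 200, 20
baseline = rng.normal(8.0, 1.0, size=n_genes)  # assumed per-gene log-intensity

def simulate(n_per_class, shift, scale):
    """Simulate one series (genes x samples); shift/scale mimic a protocol effect."""
    y = np.repeat([0, 1], n_per_class)
    x = baseline[:, None] + rng.normal(0.0, 0.3, size=(n_genes, y.size))
    x[:n_signal, y == 1] += 1.0  # class effect on the first n_signal genes
    return scale * x + shift, y

def quantile_normalize(x, reference=None):
    """Map every array (column) onto a common reference distribution."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    if reference is None:
        reference = np.sort(x, axis=0).mean(axis=1)
    return reference[ranks], reference

def fit_centroids(x, y):
    """Mean expression profile of each class (genes x 2)."""
    return np.stack([x[:, y == c].mean(axis=1) for c in (0, 1)], axis=1)

def predict(centroids, x):
    """Assign each sample to the class with the nearest centroid."""
    d = ((x[:, :, None] - centroids[:, None, :]) ** 2).sum(axis=0)
    return d.argmin(axis=1)

def cross_validate(x, y, k=5, seed=1):
    """Plain k-fold cross-validation accuracy."""
    idx = np.random.default_rng(seed).permutation(y.size)
    correct = 0
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        c = fit_centroids(x[:, train], y[train])
        correct += (predict(c, x[:, fold]) == y[fold]).sum()
    return correct / y.size

x1, y1 = simulate(50, shift=0.0, scale=1.0)  # first protocol
x2, y2 = simulate(50, shift=1.5, scale=1.2)  # "upgraded" protocol

# Step 1: train on series 1, test on series 2, with and without normalization.
acc_raw = (predict(fit_centroids(x1, y1), x2) == y2).mean()
x1n, ref = quantile_normalize(x1)
x2n, _ = quantile_normalize(x2, reference=ref)
acc_cross = (predict(fit_centroids(x1n, y1), x2n) == y2).mean()

# Step 2: pool both series, normalize together, and cross-validate.
x_all, _ = quantile_normalize(np.hstack([x1, x2]))
y_all = np.concatenate([y1, y2])
acc_cv = cross_validate(x_all, y_all)

print(f"raw: {acc_raw:.2f}  normalized: {acc_cross:.2f}  pooled CV: {acc_cv:.2f}")
```

With real data any gene selection would have to be repeated inside each cross-validation fold, otherwise the pooled accuracy will look better than it is.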
Stephen Henderson
Wolfson Inst. for Biomedical Research
Cruciform Bldg., Gower Street
University College London
United Kingdom, WC1E 6BT
+44 (0)207 679 6827
-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor-bounces@stat.math.ethz.ch] On Behalf Of Malick.PAYE@eu.biomerieux.com
Sent: 16 February 2006 17:01
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] problem to compare two series of arrays hybridized with two different protocols
Hello all,
I work for an in vitro diagnostics company, and we are analysing
microarray gene expression data to identify a molecular signature that
discriminates two populations (healthy and cancer).
We collected 100 samples, hybridized them with a given protocol, and
identified a molecular signature from these data.
To assess the performance of this signature, we then collected 100 new
samples and hybridized them with an "upgraded" version of the first
protocol.
According to the biologists, there is no big difference between the two
protocols.
But when we compare the two series (first and second protocol) with
classical exploratory techniques (MDS, PCA), we see a clear difference
between them.
I have tried to explain to the biologists that the two series are quite
different, that normalizing the data will not undo the change of
protocol, and that this is therefore not a normalization problem.
I proposed that the biologists take a few samples and hybridize each of
them with both protocols, to see whether some relationship explains the
difference between the two protocols.
My questions are:
How best to analyse these 10 hybridizations (5 samples x 2 protocols)?
Is there a way to make the two series comparable? (I tried normalizing
the data with quantile and invariant-set methods, but we still see the
two groups.)
Is it reasonable to combine the two series to identify a new signature?
Any help will be greatly appreciated.
Thanks in advance,
Malick
Malick Paye | bioMérieux | Biomathematician
Phone: (+33)4 78 87 70 97 | Fax: (+33)4 78 87 53 40
[Parc Polytec, 5 Rue des Berges, 38004 Cedex 01 Grenoble, France]