Dear Sean, I agree completely with you. But in both cases what we will
probably have in our data set is a mixture of distributions, possibly
with more than one level of 'normal' copy number. So, although there
may be biological reasons behind it, the effect is statistical and
shows up as a mixture of distributions.
Oscar
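
To make the "mixture of distributions" idea concrete, here is a minimal sketch that fits a two-component Gaussian mixture to simulated segment means with plain EM. It is not the method of any package mentioned in this thread; the function names, the simulated values (two levels at -0.15 and +0.15 with SD 0.05) and the crude initialisation are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, n_iter=200, min_sd=1e-3):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, sds) with components ordered by mean.
    Needs at least two values.
    """
    xs = sorted(values)
    half = len(xs) // 2
    # Crude initialisation (an assumption): lower and upper half of the data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    spread = max((xs[-1] - xs[0]) / 4.0, min_sd)
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for every value.
        resp0 = []
        for x in xs:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            total = p0 + p1
            resp0.append(p0 / total if total > 0.0 else 0.5)
        resp = [resp0, [1.0 - r for r in resp0]]
        # M-step: update weight, mean and SD of each component.
        for k in (0, 1):
            nk = sum(resp[k])
            if nk < 1e-12:
                continue
            weights[k] = nk / len(xs)
            means[k] = sum(r * x for r, x in zip(resp[k], xs)) / nk
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp[k], xs)) / nk
            sds[k] = max(math.sqrt(var), min_sd)
    order = sorted((0, 1), key=lambda k: means[k])
    return ([weights[k] for k in order],
            [means[k] for k in order],
            [sds[k] for k in order])


# Simulated segment means with two 'normal' levels (made-up numbers).
rng = random.Random(0)
simulated = ([rng.gauss(-0.15, 0.05) for _ in range(300)]
             + [rng.gauss(0.15, 0.05) for _ in range(300)])
weights, means, sds = fit_two_component_mixture(simulated)
```

The fit only says that two components describe the data; it cannot tell whether they come from aneuploidy, tissue heterogeneity or non-Gaussian noise.
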
-----Original Message-----
From: seandavi at gmail.com on behalf of Sean Davis
Sent: Wed 02/07/2008 21:41
To: Benjamin Otto
CC: Rueda.Oscar_Manuel; bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] gains and losses via mode shifting
Oscar and Benjamin,
I do not think one needs to assume that having two modes is a
statistical issue caused by a non-normal noise distribution. There are
at least two perfectly plausible biological interpretations of this
situation.
1) Aneuploidy of a large proportion of the genome (but homogeneous
population)
2) Tissue heterogeneity (two or more distinct populations with
different copy number profiles)
Sean
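
As a rough illustration of the two interpretations above, the following sketch computes the log2 ratio one would expect in each case. It assumes an idealised array (no ratio compression, no normalization shift) and that the reference level is either a diploid control or the sample's own ploidy after centering; the function names are made up for this example.

```python
import math


def expected_log2_ratio(copies, reference_copies=2.0):
    """Expected log2 ratio of a region with `copies` copies.

    `reference_copies` is the level that maps to zero: 2 for a diploid
    reference, or the sample ploidy if the profile has been centered on
    the most common state (an assumption of this sketch).
    """
    return math.log2(copies / reference_copies)


def mixed_population_log2_ratio(copy_numbers, fractions, reference_copies=2.0):
    """Expected log2 ratio when several cell populations are mixed.

    `copy_numbers[i]` is the copy number of the region in population i and
    `fractions[i]` is that population's share of the DNA (must sum to 1).
    """
    if len(copy_numbers) != len(fractions):
        raise ValueError("copy_numbers and fractions must have equal length")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    mean_copies = sum(c * f for c, f in zip(copy_numbers, fractions))
    return math.log2(mean_copies / reference_copies)


# 1) Homogeneous tetraploid population that lost one copy of a chromosome,
#    profile centered on the tetraploid level.
tetraploid_single_loss = expected_log2_ratio(3, reference_copies=4)

# 2) Heterogeneity: half of the cells carry 3 copies, half carry 2.
half_trisomic = mixed_population_log2_ratio([3, 2], [0.5, 0.5])
```

Under these assumptions both situations give levels that are shifted only slightly from zero, which is why a second peak close to the baseline is hard to interpret from the density alone.
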
On Wed, Jul 2, 2008 at 10:31 AM, Benjamin Otto
<b.otto at uke.uni-hamburg.de> wrote:
> Good point. But then, if I do assume a statistical effect, what would I
> expect for expression arrays of the same sample? Would the same noise
> distribution fit those data? In other words: would this distribution type
> be a feature of the sample or a feature of the technique (aCGH-specific)
> itself?
>
> Benjamin
>
>
>
> -----Original Message-----
> From: Oscar Rueda [mailto:omrueda at cnio.es]
> Sent: Wednesday, July 02, 2008 4:18 PM
> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
> Subject: Re: AW: AW: AW: [BioC] gains and losses via mode shifting
>
> Yes, I have always thought that was the more plausible biological reason.
> The other (statistical) possibility is that the noise distribution for
> normal-copy probes is not Gaussian, so depending on its shape a single
> smoothed mean might not be the best summary.
>
> Oscar
>
>
> On Wed, 02 Jul 2008 15:57:16 +0200, Benjamin Otto
> <b.otto at uke.uni-hamburg.de> wrote:
>
>> Hmm, maybe you can help me understand that point a little better. I'm
>> still not sure I really understand what I see in this sample.
>>
>> Let me assume, even if it might not be true, that we are talking about
>> tetraploid tumor cells. I take tetraploids to have a somewhat bigger
>> range of loss levels, so the level changes might not always be so clear.
>> From a technical point of view, if I don't have any gains or losses then
>> I would expect all the segment means to be on one level, right? That's
>> because the oligos for all chromosomes are distributed over the whole
>> chip, so any noise should apply to all chromosomes equally. There should
>> be no bias "per complete chromosome" in terms of physical position of
>> oligos on the chip, hybridization quality or affinity, or dye effects.
>> All these should apply equally to all chromosomes. How can I observe
>> clear shifts right on the border between chromosomes, even if they are
>> small, which would not correspond to a biological difference in copy
>> number? Why should the break be exactly between the single chromosomes?
>> Is there a systematic technical effect which can result in such profiles?
>>
>> The only thing that comes to my mind is a heterogeneous mixture of cells
>> which have different copy numbers for certain chromosomes. So a segment
>> mean would not correspond to a defined number of copies but to something
>> in between. But is there another explanation?
>>
>>
>> Best regards,
>>
>> Benjamin
>>
>>
>>
>>
>> -----Original Message-----
>> From: Oscar Rueda [mailto:omrueda at cnio.es]
>> Sent: Wednesday, July 02, 2008 11:10 AM
>> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
>> Subject: Re: AW: AW: [BioC] gains and losses via mode shifting
>>
>> Well, setting aside biological reasons for having these two modes, from
>> a statistical point of view there is no problem in having two normal
>> levels. In the case of Gaussian mixtures, this could occur if, for
>> example, the distribution of the normal probes had negative kurtosis, so
>> that two normal distributions would be needed to model it. In the case
>> of DNAcopy it is not so clear, because it is just a smoothing method,
>> but what I would do is consider both levels as normal levels if
>> mergeLevels does not merge them.
>>
>> Best,
>>
>> Oscar M. Rueda
>> Structural Computational Biology Group
>> Spanish National Cancer Centre (CNIO)
>> Madrid, SPAIN.
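
The link between negative kurtosis and a two-component description can be checked with a small sketch. For an equal mixture of N(-d, s^2) and N(+d, s^2) the excess kurtosis is -2 d^4 / (s^2 + d^2)^2, which is negative whenever d > 0. The function names below are made up for illustration, and the equal-weight, equal-variance mixture is an assumption chosen to keep the formula simple.

```python
def mixture_excess_kurtosis(half_distance, sd):
    """Excess kurtosis of the mixture 0.5*N(-d, sd^2) + 0.5*N(+d, sd^2)."""
    d2 = half_distance ** 2
    s2 = sd ** 2
    variance = s2 + d2
    # Fourth central moment of the mixture (its mean is zero by symmetry).
    fourth_moment = 3.0 * s2 ** 2 + 6.0 * s2 * d2 + d2 ** 2
    return fourth_moment / variance ** 2 - 3.0


def sample_excess_kurtosis(values):
    """Plain moment estimate of the excess kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


single_level = mixture_excess_kurtosis(0.0, 0.05)   # one Gaussian level
two_levels = mixture_excess_kurtosis(0.15, 0.05)    # two well-separated levels
```

A single Gaussian level gives an excess kurtosis of zero, while two symmetric levels push it below zero, down to -2 in the limit of two sharp peaks.
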
>>
>>
>>
>>
>> On Tue, 01 Jul 2008 18:17:39 +0200, Benjamin Otto
>> <b.otto at uke.uni-hamburg.de> wrote:
>>
>>> I'm not sure that changing to one of these methods will solve my
>>> problem. Here is one of the samples I mean. The first picture shows the
>>> CBS segmentation. The second displays the density distribution of the
>>> segments on the right and the segments only on the left. The segments
>>> are colored black at their original level and red after shifting by the
>>> mode of the highest peak of the distribution.
>>>
>>> However, the distribution is what troubles me! I do agree with Sean
>>> that usually the lower mode seems preferable. But this distribution
>>> looks nearly mirrored about the y-axis. Have a look at the log-ratios
>>> and the segments. Even if you merge some of the smaller segments that
>>> lie close together, you will end up with a similar distribution of
>>> segments on both sides of the x-axis.
>>>
>>> Or do I misjudge the capabilities of the methods you mentioned?
>>>
>>> If higher picture quality is needed, send me a note.
>>>
>>> Thanks for your replies until now. :)
>>>
>>> Best regards,
>>>
>>> Benjamin
>>>
>>>
>>> -----Original Message-----
>>> From: Oscar Rueda [mailto:omrueda at cnio.es]
>>> Sent: Tuesday, July 01, 2008 5:13 PM
>>> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
>>> Subject: Re: AW: [BioC] gains and losses via mode shifting
>>>
>>> Well, smoothed means are difficult to translate into alterations such as
>>> 'loss' or 'gain'. They are on the scale of the log-ratios, so they
>>> depend a lot on the variability of each array. I wouldn't expect the
>>> same levels across a set of arrays, even if they are normalized. The
>>> main problem with methods such as DNAcopy is that you only have a
>>> smoothed mean, but not a measure of the precision of that mean. You can
>>> use the MergeLevels algorithm (Willenbrock and Fridlyand 2005) to reduce
>>> the number of possible smoothed means in a hypothesis-test fashion until
>>> you only have levels for 'loss', 'normal' and 'gain', but this approach
>>> does not always produce good results.
>>> Methods based on hidden Markov models, such as the aCGH package, BIOHMM
>>> or our package RJaCGH, use hidden states and Gaussian distributions to
>>> represent copy numbers. Even in this case, each state does not have to
>>> correspond to a different biological copy number, because we are
>>> fitting a mixture of normal distributions, and if the normal probes have
>>> a skewed distribution we will need several components to model that
>>> distribution. But in this case we can use the means and the variances
>>> of these states to infer whether they are well above zero (in which case
>>> we could classify them as gains) or well below zero (in which case we
>>> could classify them as losses). This is what our algorithm
>>> relabelStates() in the RJaCGH package does.
>>>
>>> Hope this helps,
>>>
>>> Oscar M. Rueda
>>> Structural Computational Biology Group
>>> Spanish National Cancer Centre (CNIO)
>>> Madrid, SPAIN.
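
The idea of using a state's mean and variance to decide whether it lies "well above" or "well below" zero can be sketched as follows. This is not the code of relabelStates() in RJaCGH; the function name, the z-score rule and the threshold of 2 are assumptions made only to illustrate the principle.

```python
def classify_state(mean, sd, threshold=2.0):
    """Label a hidden state from its fitted mean and standard deviation.

    The state is called 'gain' if its mean lies more than `threshold`
    standard deviations above zero, 'loss' if it lies more than
    `threshold` standard deviations below zero, and 'normal' otherwise.
    Both the rule and the default threshold are illustrative assumptions.
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = mean / sd
    if z > threshold:
        return "gain"
    if z < -threshold:
        return "loss"
    return "normal"


# Hypothetical fitted states: (mean, sd) on the log-ratio scale.
states = [(-0.40, 0.10), (-0.05, 0.10), (0.10, 0.10), (0.50, 0.10)]
labels = [classify_state(m, s) for m, s in states]
```

With a rule of this kind two states can both end up labelled 'normal', which matches the point that not every state has to correspond to a different biological copy number.
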
>>>
>>>
>>>
>>> On Tue, 01 Jul 2008 12:54:59 +0200, Benjamin Otto
>>> <b.otto at uke.uni-hamburg.de> wrote:
>>>
>>>> The log-ratios are loess-normalized with limma and the
>>>> smoothing/segmentation is done with DNAcopy.
>>>>
>>>>
>>>> The problem is that some of the samples seem to belong to maniac
>>>> tumors. The intriguing point for some samples is not really chromosomes
>>>> 1-3, which I only use as a kind of clue, but rather that I observe two
>>>> possible baselines which show nearly comparable peaks in my density
>>>> function. Each of them looks as if it could be the real zero line, but
>>>> I don't know which one. If I used some criterion like
>>>> 2*SD(50% quantile) for the detection of gains or losses, then the shift
>>>> direction would make a difference.
>>>>
>>>>
>>>> Benjamin
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Oscar Rueda [mailto:omrueda at cnio.es]
>>>> Sent: Tuesday, July 01, 2008 11:46 AM
>>>> To: Benjamin Otto; bioconductor at stat.math.ethz.ch
>>>> Subject: Re: [BioC] gains and losses via mode shifting
>>>>
>>>> Dear Benjamin,
>>>>
>>>> I'm not sure I understand your problem correctly, but are your
>>>> samples normalized to have the same median?
>>>>
>>>> Oscar M. Rueda
>>>> Structural Computational Biology Group
>>>> Spanish National Cancer Centre (CNIO)
>>>> Madrid, SPAIN.
>>>>
>>>> On Mon, 30 Jun 2008 13:02:46 +0200, Benjamin Otto
>>>> <b.otto at uke.uni-hamburg.de> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> After segmentation of CGH data, several papers shift the results by
>>>>> the density mode; more precisely, the mode of the highest peak is
>>>>> used. However, this procedure depends on the condition that one
>>>>> prominent peak clearly dominates the density function.
>>>>>
>>>>> Currently, in some of my samples, I have the problem of two prominent
>>>>> peaks flanking the y-axis, which makes the decision about the correct
>>>>> shift direction a difficult one. Moreover, in some cases a shift in
>>>>> one direction seems obvious, in others a shift in the other direction
>>>>> seems preferable, and in a third group the preference is not quite
>>>>> clear. But in all groups the segmentation profile in chromosomes 1-3
>>>>> is nearly identical, which suggests that I observe the same gain or
>>>>> loss (depending on the shift direction) in all these samples.
>>>>>
>>>>> Does anyone have an idea how to assess this problem and how to solve
>>>>> it? Is there another frequently used procedure besides density mode
>>>>> shifting for such data?
>>>>>
>>>>> I have pictures of some samples displaying the problem, but they are
>>>>> too big for the mailing list. Is there an official repository I can
>>>>> upload them to?
>>>>>
>>>>> Thanks in advance, best regards,
>>>>>
>>>>> Benjamin
>>>>>
>>>>> ======================================
>>>>> Benjamin Otto
>>>>> University Hospital Hamburg-Eppendorf
>>>>> Institute For Clinical Chemistry
>>>>> Martinistr. 52
>>>>> D-20246 Hamburg
>>>>>
>>>>> Tel.: +49 40 42803 1908
>>>>> Fax.: +49 40 42803 4971
>>>>> ======================================
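
The density-mode shift described in the question above can be sketched as follows: estimate the density of the per-probe segment means with a Gaussian kernel, take the highest peak as the baseline, and flag the sample when a second peak is almost as high. The bandwidth of 0.05, the 0.8 ambiguity ratio and all function names are assumptions chosen for illustration, not values taken from any of the papers or packages discussed.

```python
import math


def kde_on_grid(values, bandwidth, n_grid=401):
    """Gaussian kernel density estimate evaluated on a regular grid."""
    lo = min(values) - 3.0 * bandwidth
    hi = max(values) + 3.0 * bandwidth
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    norm = len(values) * bandwidth * math.sqrt(2.0 * math.pi)
    density = [
        sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in values) / norm
        for g in grid
    ]
    return grid, density


def find_peaks(grid, density):
    """Return (position, height) of all local maxima, highest first."""
    peaks = [
        (grid[i], density[i])
        for i in range(1, len(grid) - 1)
        if density[i - 1] < density[i] >= density[i + 1]
    ]
    return sorted(peaks, key=lambda p: p[1], reverse=True)


def shift_by_mode(values, bandwidth=0.05, ambiguity_ratio=0.8):
    """Center values on the highest density peak.

    Returns (shifted_values, mode, ambiguous). `ambiguous` is True when the
    second-highest peak reaches at least `ambiguity_ratio` of the highest,
    i.e. when the choice of baseline is not clear-cut.
    """
    grid, density = kde_on_grid(values, bandwidth)
    peaks = find_peaks(grid, density)
    mode = peaks[0][0]
    ambiguous = len(peaks) > 1 and peaks[1][1] >= ambiguity_ratio * peaks[0][1]
    return [v - mode for v in values], mode, ambiguous


# One dominant level: the shift is unambiguous.
clear_values = [0.0] * 50 + [0.3] * 10
_, clear_mode, clear_flag = shift_by_mode(clear_values)

# Two nearly equal levels flanking zero: the shift direction is ambiguous.
bimodal_values = [-0.2] * 30 + [0.2] * 28
_, bimodal_mode, bimodal_flag = shift_by_mode(bimodal_values)
```

The flag does not resolve the question of which peak is the true baseline; it only identifies the samples in which mode shifting alone should not be trusted.
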
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> **CONFIDENTIALITY NOTICE** This email communication and any attachments
>>>> may contain confidential and privileged information for the sole use of
>>>> the designated recipient named above. Distribution, reproduction or any
>>>> other use of this transmission by any party other than the intended
>>>> recipient is prohibited. If you are not the intended recipient please
>>>> contact the sender and delete all copies.
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
> --
> Oscar M. Rueda
> Structural Computational Biology Group
> Spanish National Cancer Centre (CNIO)
> Madrid, SPAIN.
>
>
>
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor