Hi everybody.
Imagine my ideal, conceptual pipeline for working with my
Illumina 27k data:
A) Bead level intensities
---> (color adjustment, background subtraction, normalization, ...)
---> B) Meth and Unmeth intensities
---> (get ratios)
---> C) Beta values or M-values
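For concreteness, the last step of this pipeline (B to C, "get ratios") can be sketched as below. This is a minimal illustration, assuming the commonly used definitions Beta = M / (M + U + offset) and M-value = log2((M + alpha) / (U + alpha)). The function names, the offset of 100, the alpha of 1 and the example intensities are assumptions made for the sketch; it is not code from lumi or minfi.

```python
import math


def beta_value(meth, unmeth, offset=100.0):
    """Beta = M / (M + U + offset); negative intensities are clipped to 0."""
    m = max(meth, 0.0)
    u = max(unmeth, 0.0)
    return m / (m + u + offset)


def m_value(meth, unmeth, alpha=1.0):
    """M-value = log2((M + alpha) / (U + alpha)); same clipping as above."""
    m = max(meth, 0.0)
    u = max(unmeth, 0.0)
    return math.log2((m + alpha) / (u + alpha))


# Hypothetical Meth/Unmeth intensities for one locus in one sample.
meth, unmeth = 3000.0, 1000.0
print("beta:", beta_value(meth, unmeth))
print("M-value:", m_value(meth, unmeth))
```

Both values are computed from the same pair of intensities, so anything done upstream (background subtraction, normalization) affects them together.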
As I understand it, proper normalization, if needed, should be done on
the Red and Green signals, i.e. the bead level signals. (I know there
are two types of beads, M and U, for each locus, and that each locus
is assigned to a fixed color channel. I have studied the code in the
minfi package, although it is for 450k, just to understand how these
values get converted.)
Q1) Am I right? Or is there any way to normalize using the M and U
signals?
Q2) As an example, I downloaded the raw data for a GEO dataset and
found only the M and U data (Signal A and Signal B, in their
nomenclature). At that point, I think I can do neither normalization
nor QC, can I? What happens if somebody tries to normalize these
signals and, for example, the color balance was not adjusted? Or does
GenomeStudio take care of that?
Q3) People from another lab want us to take a look at some methylation
datasets they have, and I am planning (given the points above) to ask
them for the .txt or .idat files in order to do QC and normalization
with lumi. Is that OK? Or is there an alternative? They say they are
having a lot of trouble finding DMRs, and I suspect they have
low-quality data and perhaps are not following a correct pipeline.
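
To make the color-balance concern in Q2 concrete, here is a minimal sketch of one simple form of adjustment: rescaling the green channel so that its median matches the red channel's median. Median scaling is an assumption chosen only for illustration, and the function name and intensities are hypothetical. It is not a description of what GenomeStudio or lumi actually does.

```python
from statistics import median


def balance_channels(red, green):
    """Rescale green intensities so their median equals the red median."""
    red_med = median(red)
    green_med = median(green)
    if green_med <= 0:
        raise ValueError("green median must be positive to rescale")
    factor = red_med / green_med
    # Red is returned unchanged; only green is rescaled.
    return list(red), [g * factor for g in green]


# Hypothetical intensities: loci read in the red vs. the green channel.
red = [1200.0, 800.0, 1500.0, 1000.0]
green = [600.0, 400.0, 750.0, 500.0]
red_adj, green_adj = balance_channels(red, green)
print(median(red_adj), median(green_adj))
```

If only already-summarized M and U values are available, a step like this cannot be redone unless it is known which channel each locus was read in.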
Regards,
Gustavo
---------------------------
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
Hi,
I was wondering whether there is a way to use an alternative CDF
(Brainarray) and PLIER normalization with the FARMS (Factor Analysis
for Robust Microarray Summarization) package. I would then like to use
the I/NI filter to retain informative probe sets. Any help would be
appreciated.
Thanks,
Som.
Hi Gustavo,
Normalization, at least when you do it the "lumi" way, is done at the
probe level, and as each locus is assigned to a fixed color, the
normalization is actually done over the M and U values. So if you have
real raw data, you can pull it through your pipeline: color adjustment,
background subtraction, normalization. The color adjustment takes the
red and green signals into account and, from my understanding, this is
done to make the loci more comparable.
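
As an illustration of what between-sample normalization on probe-level intensities can look like, here is a minimal sketch of quantile normalization, one common method. It is an assumption that this is a fair stand-in for the idea; the function name and the intensities are hypothetical, and this is not lumi's implementation.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length columns (one per sample)."""
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same number of probes")
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)
    ]
    normalized = []
    for col in columns:
        # Probe indices ordered by intensity (ties keep their input order).
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical intensities for three probes in two samples.
sample_a = [100.0, 400.0, 250.0]
sample_b = [900.0, 300.0, 600.0]
norm_a, norm_b = quantile_normalize([sample_a, sample_b])
print(norm_a)
print(norm_b)
```

After this step every sample has the same intensity distribution, while each probe keeps its rank within its own sample.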
Q3) Sure, that is all right. If you suspect low quality, you can use
HumMeth27QCReport (a CRAN package) to plot some of the control probe
values, or you can ask whether they have some control plots from
GenomeStudio. Of course, it could also be that the sample population
is not suitable for finding DMRs (for example, a very heterogeneous
disease group vs. control).
Disclaimer: I'm still a bachelor student, so don't hold me accountable
for any mistakes ;-) I'm sure there is someone out there with more
wisdom who can answer your questions more thoroughly.
Cheers,
Djie
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
On Thursday, 19 July 2012 at 10:06, Djie Tjwan Thung wrote:
> Hi Gustavo,
>
> Normalization, at least when you do it the "lumi" way is done at the
> probe level, and as each locus is assigned to a fixed color, the
> normalization is actually done over the M and U values. So if you have
> real raw data you can pull it through your pipeline: color adjustment,
> background subtraction, normalization. The color adjustment takes the
> red and green signals into account and from my understanding this is
> done to make the loci more comparable.
That is what I was thinking, but Tim's answer has made me think that
on the 27k it might not be as important, as the probes are all of the
same type. Unless, of course, we plan to take the residual intensities
in the wrong channel into account in more advanced methods.
>
> Q3) Sure that is alright. If you suspect low quality, you can use
> HumMeth27QCReport (a CRAN package)
Thanks for the link!
> to plot some of the control probe values or you can ask if they have
> some control plots from GenomeStudio.
That's a good idea too. I'll ask them ASAP.
> Of course it could also be possible that the sample population is not
> suitable for finding DMRs (like very heterogeneous disease vs
> control)
Yes, that might be a possibility. We currently have no clear idea of
what they are trying to do, nor the global structure of their
experiment.
> Disclaimer: I'm still a bachelor student, so don't make me accountable
> for any mistakes ;-) I'm sure there is someone around there with more
> wisdom, who can answer your questions more thoroughly.
I hold a PhD in Machine Learning, although I have been away from the
research world for 4 years. But when it comes to bioinformatics, I am
a complete beginner, willing to learn from everybody, including you,
of course. :) And if your answers are going to be as informative as
this one, I am sure I'll learn a lot from them.
>
> Cheers,
>
> Djie
Regards,
Gus