We are analyzing some Illumina data. There are three kinds of
normalization. The first is rank-invariant normalization, which
Illumina recommends, and we would like to apply it:
BSData.bgnorm = backgroundNormalise(BSData)
T = apply(exprs(BSData.bgnorm), 1, mean)
BSData.rankinv = assayDataElementReplace(BSData.bgnorm, "exprs",
rankInvariantNormalise(exprs(BSData.bgnorm), T))
But BSData.rankinv contains negative values, so I cannot use lmFit to
analyze differential expression, because the log2 transformation
applied beforehand fails on them:
fit = lmFit(log2(exprs(BSData.rankinv)), design)
Are these two methods (rank-invariant normalization and lmFit)
incompatible? What kind of normalization should I use to search for
differentially expressed genes in Illumina microarrays?
Thanks
Do you really need to do background subtraction with Illumina data?
Our experience is that this step is not necessary.
Lynn Amon
Research Scientist
University of Washington
Be careful not to confuse terminology here. 'Background correction' of
Illumina data occurs at the raw bead level, and is typically the
default setting in the Illumina software.
'Background normalisation' occurs at the bead summary level, and makes
use of the negative controls to try to calibrate the data between
arrays (see
https://stat.ethz.ch/pipermail/bioconductor/2006-March/012178.html for
further discussion). Our experience is that background correction can
be worthwhile, provided that it is done carefully, while background
normalisation is unhelpful if you want to analyse data on the log
scale, because it often produces negatives. Also note that the
BeadStudio software (we have version 2.3.47) has a pop-up message
warning against background normalisation for expression data.
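To illustrate why negatives appear, here is a toy sketch with made-up numbers. It assumes that background normalisation amounts to subtracting the mean negative-control intensity from every probe on an array, which is a simplification for illustration and not necessarily what backgroundNormalise does internally. The object names (signal, negative.controls) are hypothetical.

```r
# Hypothetical bead-summary intensities for four probes on one array.
signal <- c(probeA = 95, probeB = 110, probeC = 540, probeD = 2300)

# Hypothetical negative-control intensities on the same array.
negative.controls <- c(98, 105, 101, 96)

# Assumed background estimate: the mean of the negative controls.
background <- mean(negative.controls)

# Subtracting it pushes any probe below the background estimate negative.
corrected <- signal - background

# log2 of a negative number is NaN, so such probes are lost on the log scale.
log2.corrected <- suppressWarnings(log2(corrected))
print(corrected)
print(log2.corrected)
```

Any probe that is not expressed sits near the negative controls, so roughly half of such probes would be expected to end up below zero under this assumption.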
At the moment we use quantile normalisation on the log2 scale to
normalise BeadStudio summary data which hasn't been normalised
already. You could also try rank.invariant without doing the
background normalisation (I'm not sure if this is better done on the
original or log2 scale?), i.e.
T = apply(exprs(BSData), 1, mean)
BSData.rankinv = assayDataElementReplace(BSData, "exprs",
rankInvariantNormalise(exprs(BSData), T))
fit = lmFit(log2(exprs(BSData.rankinv)), design)
If this fails, Sean's suggestion of replacing the negative values with
small positive values (or even NAs) should work.
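For the quantile normalisation on the log2 scale mentioned above, a minimal base-R sketch of the idea follows. In practice one would more likely call normalizeBetweenArrays(log2(x), method = "quantile") from limma; the helper quantile_normalise and the matrix raw below are hypothetical, use made-up intensities, and assume no missing values.

```r
# Minimal quantile normalisation: force every array (column) to share the
# same empirical distribution. Assumes a numeric matrix without NAs.
quantile_normalise <- function(x) {
  sorted <- apply(x, 2, sort)          # sort each column independently
  target <- rowMeans(sorted)           # mean intensity at each rank
  out <- apply(x, 2, function(col) {
    # Replace each value by the target value at its rank
    # (ties are broken by order of appearance, a simplification).
    unname(target)[rank(col, ties.method = "first")]
  })
  dimnames(out) <- dimnames(x)
  out
}

# Hypothetical unnormalised summary intensities: 5 probes x 3 arrays.
raw <- matrix(c(100, 250, 80, 1200, 400,
                130, 300, 60, 1500, 380,
                 90, 210, 95, 1000, 450),
              nrow = 5,
              dimnames = list(paste0("probe", 1:5), paste0("array", 1:3)))

# Take logs first, then normalise, so no negatives are ever introduced.
log.qn <- quantile_normalise(log2(raw))
print(round(log.qn, 2))
```

Because the data are logged before normalising and nothing is subtracted, the result can go straight into lmFit(log.qn, design).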
I hope this helps. Best wishes,
Matt
On 15/2/07 19:14, "Lynn Amon" <lynnamon at u.washington.edu> wrote:
> Do you really need to do background subtraction with Illumina data?
> Our experience is that this step is not necessary.
>
> Lynn Amon
> Research Scientist
> University of Washington
On Thursday 15 February 2007 10:45, Nieves Velez de Mendizabal wrote:
> But in BSData.rankinv I have negative values so I cannot apply the
> method lmFit in order to analyze the differential expression because
> of the log2 transformation applied.
>
> fit = lmFit(log2(exprs(BSData.rankinv)), design)
>
> Are these two methods (rank inv method and lmFit) incompatible?
> What kind of normalization should I use in order to search
> differentially expressed genes in micro arrays of Illumina?
Unfortunately, the rank-invariant method of normalization does produce
negative values, regardless of background correction. The problem here
is not the lmFit function but the log2 function. You need to either
set your negative values to some small positive value or not use
rank-invariant normalization.
Sean
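A minimal sketch of the first option, on a made-up matrix standing in for exprs(BSData.rankinv): values below an assumed floor are raised to that floor before taking log2. The floor of 1 is an arbitrary choice for illustration, not a recommendation from this thread, and the object names are hypothetical. The NA variant is shown as well; lmFit accepts missing values.

```r
# Hypothetical normalised intensities containing zero and negative values.
x <- matrix(c(120.5, -3.2, 45.0,
                0.0, 880.1, -0.7),
            nrow = 3,
            dimnames = list(paste0("probe", 1:3), c("array1", "array2")))

# Option 1: raise everything below an assumed floor to that floor.
floor.value <- 1                      # arbitrary small positive value
x.floored <- x
x.floored[x.floored < floor.value] <- floor.value
log.floored <- log2(x.floored)        # finite everywhere

# Option 2: treat non-positive values as missing instead.
x.na <- x
x.na[x.na <= 0] <- NA
log.na <- log2(x.na)                  # NA where the input was <= 0

print(log.floored)
print(log.na)
```

Either matrix could then be put back into the object with assayDataElementReplace and passed to lmFit as in the original code. Note that flooring compresses all low-intensity probes to the same value, which shrinks their apparent variance.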