Normalization
@dr-narendra-k-kaushik-npm_nmd-446
I have lots of negative values. What is the best way to get rid of them or to normalize the data? I am working with Avg_Diff values.

Narendra Kaushik
Imperial College of Medicine, London SW3 6NP, UK
@michael-watson-iah-c-378
I assume you mean you have lots of negative values after subtracting background? There are many options, none ideal.

The problem is that microarrays really don't handle data very well where the gene is off in one channel and on in another. By definition, off is zero, and we are obsessed by calculating ratios, where a zero value really screws things up :-(

Anyway, your options are:

- set all negative values to zero (makes most sense, but this will screw up ratios)
- set all negative values to one, or some other nominally small figure (this won't screw up ratios but it is, after all, simply inventing numbers)
- adjust your whole distribution such that 95% of spots are > 0 (adjust by adding/subtracting the 5th percentile value from your distribution); this is quite popular, though again, dubious in its validity
- do not subtract background; after all, no one has proved that the relationship of background to foreground intensity is an additive one, nor that it has any effect whatsoever. So if you have what appears to be a nice uniform background, both within and across your slides, then why bother subtracting background at all?

This is a problem that troubles me greatly too, and I have yet to find a suitable answer. Personally, I set all negative values to 1 and then create a flag that basically says "don't trust the magnitude of this ratio" :-)

I do none of this in Bioconductor, by the way. I use Perl and/or a SQL database and munge it before putting it in BioC.

Cheers
Mick

-----Original Message-----
From: Kaushik, Narendra K [mailto:n.kaushik@imperial.ac.uk]
Sent: 08 October 2003 15:23
To: 'bioconductor@stat.math.ethz.ch'
Subject: [BioC] Normalization

I have lots of negative values. What is the best way to get rid of them or to normalize the data? I am working with Avg_Diff values.

Narendra Kaushik
Imperial College of Medicine, London SW3 6NP, UK

_______________________________________________
Bioconductor mailing list
Bioconductor@stat.math.ethz.ch
https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
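For readers who want to try the options from Mick's reply, here is a minimal Python sketch of three of them: replacing non-positive values with zero or one, flagging the replaced spots, and shifting the distribution by its 5th percentile. Mick's own code (Perl/SQL) is not shown in the thread, so the function names, the nearest-rank percentile rule, and the sample intensities below are all assumptions made for illustration.

```python
import math


def replace_nonpositive(values, replacement=1.0):
    """Replace every value <= 0 with `replacement`.

    Returns (adjusted, flags); flags[i] is True where a value was replaced,
    i.e. where the magnitude of any ratio built from it should not be trusted.
    """
    flags = [v <= 0 for v in values]
    adjusted = [replacement if flagged else v for v, flagged in zip(values, flags)]
    return adjusted, flags


def shift_by_percentile(values, pct=5.0):
    """Subtract the pct-th percentile (nearest-rank rule, an assumption here)
    so that roughly (100 - pct)% of the values end up above zero."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100.0))
    cutoff = ordered[rank - 1]
    return [v - cutoff for v in values]


# Hypothetical background-subtracted intensities for two channels.
red = [120.0, -15.0, 300.0]
green = [60.0, 40.0, -5.0]

red_adj, red_flags = replace_nonpositive(red, replacement=1.0)
green_adj, green_flags = replace_nonpositive(green, replacement=1.0)

# log2 ratios are now defined everywhere; a spot is flagged if either channel was replaced.
log_ratios = [math.log2(r / g) for r, g in zip(red_adj, green_adj)]
untrusted = [rf or gf for rf, gf in zip(red_flags, green_flags)]
```

Calling `replace_nonpositive(values, replacement=0.0)` gives the "set to zero" option instead, at which point the ratio step fails for the affected spots, which is exactly the problem described above.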
Hi -

> ... The problem is that microarrays really don't handle data very well
> where the gene is off in one channel and on in another. By definition,
> off is zero, and we are obsessed by calculating ratios, where a zero
> value really screws things up :-(

Microarrays are pretty fine at measuring small intensities; as you say, it is just our taking ratios or logarithms that can potentially make it problematic.

> Anyway, your options are:
> - adjust your whole distribution such that 95% of spots are > 0
> ... this is quite popular, though again, dubious in its validity

Another option that is a little more valid is implemented in the vsn package; see the accompanying Bioinformatics/ISMB 2002 paper for details. Instead of shifting the distribution, we adjust the transformation (which in some sense is equivalent), and we have some statistical arguments for why this is a valid thing to do, and a proper model-based parameter estimation procedure for actually doing it.

Best regards
Wolfgang
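To illustrate what "adjusting the transformation" can look like, here is a small Python sketch of an arsinh-type transform, which, as far as I understand, is the family the vsn paper works with. It is defined for zero and negative intensities and behaves like a logarithm for large ones. The parameters `a` and `b` below are arbitrary placeholders; the actual vsn package estimates its calibration parameters from the data, and that estimation step is not reproduced here.

```python
import math


def glog(y, a=0.0, b=1.0):
    """arsinh(a + b*y): defined for all real y, and close to
    log(2 * (a + b*y)) when a + b*y is large and positive."""
    return math.asinh(a + b * y)


# Hypothetical background-subtracted intensities, including a negative and a zero.
intensities = [-50.0, 0.0, 50.0, 5000.0]

# No shifting or clipping is needed before transforming.
transformed = [glog(y, a=0.0, b=0.01) for y in intensities]

# For a large intensity the transform is nearly indistinguishable from a log.
gap = abs(glog(5000.0) - math.log(2 * 5000.0))
```

Differences of such transformed values play the role that log-ratios play for strictly positive data, without the blow-up near zero.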
@adaikalavan-ramasamy-437
The Average Difference value you describe is generated by the MAS 4.0 algorithm, which is fairly outdated. The newer algorithm, MAS 5.0, uses Ideal Mismatch values to force the resulting signal to be a very small positive value. [ Even then, there is some evidence that this algorithm is suboptimal. ]

Some references (below) treat any value less than +20 (or any other small positive number) as unreliable/missing.

If you have the raw experiment files (CEL files), you might be able to get outputs according to newer algorithms which do not produce negative values.

http://expression.gnf.org/faq.html#avgdiff
http://www.nature.com/cgi-taf/DynaPage.taf?file=/ng/journal/v33/n1/full/ng1061.html

Regards,
--
Adaikalavan Ramasamy
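The "treat anything below +20 as unreliable/missing" rule mentioned in this reply could be sketched in Python as follows. The function name and the use of `None` for missing values are assumptions for illustration; the threshold of 20 is the one quoted above, and any other small positive number could be substituted.

```python
def mask_low_values(values, threshold=20.0):
    """Return a copy of `values` with everything below `threshold`
    replaced by None, marking it as unreliable/missing."""
    return [v if v >= threshold else None for v in values]


# Hypothetical Average Difference values for a handful of probe sets.
avg_diff = [-12.0, 5.0, 20.0, 150.0]
masked = mask_low_values(avg_diff)
```

Downstream steps then have to skip the `None` entries explicitly, which keeps the unreliable values from silently entering ratios or averages.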