Microarray data normalization

Levi Waldron ▴ 80

@levi-waldron-6357

Last seen 10.6 years ago

Depending on which Affy platform you have, you may also be able to use Frozen RMA (Bioconductor package frma). Then it doesn't matter which option you choose, since fRMA preprocesses each array against a fixed reference rather than against the other arrays in your batch.

On Wed, Jul 30, 2014 at 5:32 AM, Wolfgang Huber <whuber@embl.de> wrote:

> Dear Bernard
>
> my preference would be option 2, but the first thing to do if you're
> unsure is to try both and see if it makes any difference. Presumably the
> differences are minimal and within the uncertainty of your analysis.
>
> If option 2 were the right thing to do, then with the same logic you could
> go out to the internet (ArrayExpress, GEO), download a few thousand more
> arrays, throw them in, and get even better results.
>
> The view "purpose of normalization is to remove batch effects" is not
> quite right, as batch effects can affect the data in all sorts of ways, but
> e.g. rma only addresses those types of effects that affect all the data on
> an array in the same way, i.e. overall higher or lower background, or
> overall more or less cDNA used, or overall longer or shorter exposure to
> the scanner. What it does not remove is, for instance, if the way that the
> signal depends on probe GC content or cDNA length changes (and this can
> happen as reagents & material change).
>
> Best wishes
> Wolfgang
>
> On Jul 29, 2014, at 20:32 EDT, Bernard Lee Kok Bang
> <bernard.lee@carif.com.my> wrote:
>
> > Dear all, I would like to ask a question regarding microarray data
> > normalization.
> >
> > Scenario:
> > I have in hand a collection of 300 cancer cell lines (multiple cancer
> > types) raw '.CEL' files, all from the same study/batch. My aim is to
> > obtain the gene expression values and use them downstream. However, I am
> > only interested in a subset of these .CEL files; for example, I am only
> > interested in NON-blood cancer cell lines (n=250).
> >
> > I'm wondering which of these two options is more appropriate for my
> > scenario:
> >
> > Option 1:
> > 1) Normalize all 300 .CEL by rma.
> > 2) After normalization, manually remove the 50 blood samples I am NOT
> > interested in
> > 3) Use the normalized data of 250 samples for downstream analysis
> >
> > Option 2:
> > 1) Normalize ONLY the 250 .CEL by rma (imagine as if the 50 blood
> > samples do not exist)
> > 2) Use the normalized data of 250 samples for downstream analysis
> >
> > My downstream analysis simply involves ranking the genes from highest
> > expression to the lowest.
> >
> > From my point of view, I am favoring the first option. This is because,
> > since I have all the solid tumor and blood cell line data, I might as
> > well normalize them all together first before manually excluding the
> > blood cell lines, as to my knowledge the purpose of normalization is to
> > remove batch effects?? So the larger the sample size during rma
> > normalization the better??
> >
> > Thanks in advance.
> >
> > Bernard Lee
> > Research Assistant
> > Cancer Research Initiatives Foundation (CARIF)
> > University of Malaya (UM)
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor

--
Levi Waldron
Assistant Professor of Biostatistics
City University of New York School of Public Health, Hunter College
2180 3rd Ave Rm 538
New York NY 10035-4003
phone: 212-396-7747
www.waldronlab.org
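For anyone who wants to see the "try both and see if it makes any difference" advice in miniature, here is a rough sketch in plain Python. It is not rma: it only mimics the quantile-normalization step on simulated data, and it skips background correction and probe-set summarization entirely. The array counts, the simulated "solid" and "blood" distributions, and all function names are made up for illustration.

```python
import random

def quantile_normalize(arrays):
    """Give every array the same distribution: the mean of the sorted arrays."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # gene indices, lowest first
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        result.append(out)
    return result

def ranks(values):
    """Rank position (0 = lowest) of each value; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order):
        r[idx] = rank
    return r

def spearman(x, y):
    """Spearman correlation as Pearson correlation of ranks (no ties assumed)."""
    rx, ry = ranks(x), ranks(y)
    mean = (len(x) - 1) / 2
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)
    return cov / var

def gene_means(arrays):
    """Mean expression of each gene across arrays."""
    return [sum(col) / len(col) for col in zip(*arrays)]

# Simulated data (hypothetical): 25 "solid" arrays and 5 "blood" arrays whose
# intensities follow a different overall distribution.
random.seed(0)
n_genes, n_solid, n_blood = 200, 25, 5
gene_level = [random.gauss(8, 2) for _ in range(n_genes)]
solid = [[g + random.gauss(0, 0.5) for g in gene_level] for _ in range(n_solid)]
blood = [[0.6 * g + 4 + random.gauss(0, 0.5) for g in gene_level]
         for _ in range(n_blood)]

# Option 1: normalize everything, then drop the blood arrays.
option1 = quantile_normalize(solid + blood)[:n_solid]
# Option 2: normalize the solid arrays only.
option2 = quantile_normalize(solid)

same_within_array_ranks = all(ranks(a) == ranks(b) for a, b in zip(option1, option2))
max_abs_diff = max(abs(x - y) for a, b in zip(option1, option2) for x, y in zip(a, b))
rho = spearman(gene_means(option1), gene_means(option2))

print("within-array gene ranks identical:", same_within_array_ranks)
print("largest difference in normalized values:", max_abs_diff)
print("Spearman correlation of per-gene means:", rho)
```

In this toy setting the normalized values do change between the two options, but the ordering of genes within each array does not, because quantile normalization maps each array through a monotone function. With real rma output the summarization step also depends on which arrays are included, so it is still worth running both options on the actual CEL files and comparing the resulting rankings.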

Normalization GO Cancer probe affy • 1.0k views

10.7 years ago Levi Waldron ▴ 80
