RMA vs gcRMA on 2 groups of samples
Bogdan ▴ 670
@bogdan-2367
Last seen 13 months ago
Palo Alto, CA, USA
Hi folks,

I would like to ask for your opinions on the following.

I have expression profiles for 60 samples (cells and organs in resting conditions). I normalized these arrays in several ways, including RMA.

Given the biological distinction between cell samples and organ samples, I am planning to do the normalization separately: once on the group of cell samples and once on the group of organ samples.

My questions are:

- After RMA normalization on the separate groups of samples (cells vs. organs), the results are different, but are they better? GO analysis does not show major differences.
- Would gcRMA work better than RMA? The majority of opinions in SoCal are pro-RMA.

Thanks,

Bogdan
Normalization GO gcrma • 1.7k views
Naomi Altman ★ 6.0k
@naomi-altman-380
Last seen 3.6 years ago
United States
Dear Bogdan,

I do not have an opinion on gcRMA versus RMA. But if you are doing differential expression analysis comparing the cell samples with the organ samples, you need to normalize all the samples together.

--Naomi

At 11:31 AM 11/1/2007, Bogdan Tanasa wrote:
> Considering the biological arguments (cells samples vs organs samples), I am planning to do the normalization separately, on the group of cell samples, and on the group of organ samples.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

Naomi S. Altman, 814-865-3791 (voice)
Associate Professor
Dept. of Statistics, 814-863-7114 (fax)
Penn State University, 814-865-1348 (Statistics)
University Park, PA 16802-2111
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071102/1134ab38/attachment.pl
Naomi Altman ★ 6.0k
@naomi-altman-380
Last seen 3.6 years ago
United States
Dear Bogdan,

Any normalization method that uses a set of arrays reduces the variability among those arrays.

So, if you have 2 sets of arrays and normalize them separately, you will find that the within-set variability is smaller than the between-set variability, i.e. you induce significant differential expression simply by the normalization. To avoid this effect when you are doing differential expression analysis (or sample clustering), you must either use a method that normalizes each array separately (MAS) or normalize all the arrays together.

--Naomi

At 12:01 PM 11/2/2007, Bogdan Tanasa wrote:
> Greetings Naomi,
>
> thanks for the reply. To generalize my question: when dealing with 2 sets of samples, let's say X1, X2, ..., Xn and Y1, Y2, ..., Yn, I could run the normalization in 2 ways: A. only X(1,n) and only Y(1,n), or B. both X(1,n),Y(1,n). Are there any a priori statistical criteria that favor one way or the other? If I take biological criteria into consideration (the things I am interested in), the results from A may sometimes look better than B, or vice versa. Thanks!
>
> Bogdan
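The effect described above can be illustrated with a small sketch. It assumes quantile normalization (the normalization step used inside RMA) and leaves out RMA's background correction and summarization steps. The function name `quantile_normalize` and the toy intensities are made up for illustration only.

```python
from statistics import mean

def quantile_normalize(arrays):
    """Quantile-normalize a set of equal-length arrays (one list per chip).

    The k-th smallest value of each array is replaced by the mean of the
    k-th smallest values across all arrays in the set. Ties are broken by
    position, which is a simplification.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays at each rank.
    reference = [mean(s[k] for s in sorted_arrays) for k in range(n_probes)]
    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Hypothetical log2 intensities: 2 "cell" arrays and 2 "organ" arrays.
cells = [[5.0, 7.0, 9.0, 11.0], [5.5, 7.5, 9.5, 11.5]]
organs = [[6.0, 8.0, 10.0, 14.0], [6.5, 8.5, 10.5, 14.5]]

together = quantile_normalize(cells + organs)
separate = quantile_normalize(cells) + quantile_normalize(organs)

# Sorted values per array: identical for all arrays normalized as one set.
print([sorted(a) for a in together])
print([sorted(a) for a in separate])
```

When all four arrays are normalized together they end up with one common distribution. When the two groups are normalized separately, arrays within a group become identical in distribution while the two groups keep different reference distributions, so any group-level shift is preserved and the within-group spread is squeezed relative to it.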
Naomi Altman wrote:
> To avoid this effect, when you are doing differential expression analysis (or sample clustering) you must either use methods that normalize each array separately (MAS) or normalize all together.

An alternative (and the one that I prefer) is to do separate normalizations, and then to use some sort of batch effect term in the model used to assess differentially expressed genes.

Normalization is intended to clean up the relatively minor issues that arise due to slightly different conditions etc. for arrays that are essentially the same. As far as I can see, it is not intended to adjust for batch effects, and in my experience it generally does a bad job of that. Just because you can normalize (or fit any statistical model) does not mean that you should.

best wishes
Robert

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org
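As a rough sketch of what a "batch effect term in the model" can look like, the example below fits an ordinary least-squares model for a single gene with and without a batch column. This is an assumption about the kind of model meant, not the specific model used by any package. The helper names (`solve_linear`, `fit_ols`) and the data are hypothetical, and the values are constructed as 8 + 1*group + 2*batch with no noise.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear(xtx, xty)

# Hypothetical single-gene data, unbalanced across batches:
# group 1 samples sit mostly in batch 1.
group = [0, 0, 0, 1, 0, 1, 1, 1]
batch = [0, 0, 0, 0, 1, 1, 1, 1]
y = [8.0, 8.0, 8.0, 9.0, 10.0, 11.0, 11.0, 11.0]

# Columns: intercept, group, batch.
with_batch = fit_ols([[1, g, b] for g, b in zip(group, batch)], y)
# Columns: intercept, group (batch ignored).
without_batch = fit_ols([[1, g] for g in group], y)

print("group effect with batch term:   ", with_batch[1])
print("group effect without batch term:", without_batch[1])
```

In this toy setup the model with the batch column recovers the group effect of 1, while the model without it attributes part of the batch shift to the group and reports roughly 2. Note that the sketch only works because group and batch are not identical: if every cell sample were in one batch and every organ sample in the other, the two columns would coincide and the two effects could not be separated.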
Yes, but if I am not mistaken, the OP had a situation in which the samples were simply different cell or tissue types, rather than different batches. In this case I would favor normalizing all together rather than doing things in batches.

Best,

Jim

Robert Gentleman wrote:
> An alternative (and the one that I prefer) is to do separate normalizations, and to then use some sort of batch effect term in the model used to assess differentially expressed genes.

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
Hi,

If they were assayed at approximately the same time, using approximately the same protocols, then yes, one normalization is likely to be better than two. I think that there may also be issues if the set of genes that are expressed is very different in the different tissue types (as their being the same is one of the basic assumptions in most normalization methods). But if very much is different, then it is better not to try to normalize everything together, but rather to adjust after normalization.

best wishes
Robert

James W. MacDonald wrote:
> Yes but if I am not mistaken, the OP had a situation in which the samples were simply different cell or tissue types, rather than different batches. In this case I would favor normalizing all together rather than doing things in batches.