Separate quantile normalization but common probe summary by median polish (oligo package)?


Guest User ★ 13k


Dear list,

I am pre-processing Affymetrix Mouse Gene 1.0 ST Arrays and use the oligo package. I do not want to quantile normalize them all together, because my samples come from different polysome fractions or compartments of the cell, and therefore show consistent and biologically meaningful differences in signal distribution.

For separate probe summary by median polish, however, the groups are too small: the smallest group has only 3 microarrays, which leads to identical values within many probe sets across the three samples.

My idea is to perform quantile normalization for the individual groups, but probe summary for all microarrays (30) together, to get a more reliable estimate of the probe effect and to avoid losing the variability of my samples when a group consists of only 3 microarrays.

Is this reasonable, or is anyone aware of artifacts that I would introduce by performing median polish for probe summary on microarrays that have not been quantile normalized together?

Here is some code to illustrate what I am doing:

# I load the required packages:
library("oligo")
library("pd.mogene.1.0.st.v1")

# The CEL files are opened twice, once in groups (here only group 1 as an example), and once all together.
# full.names = TRUE returns complete paths, so the files are found regardless of the working directory:
list_cel <- list.celfiles("group1", full.names = TRUE)
group1 <- read.celfiles(list_cel)

list_cel <- list.celfiles("all_groups", full.names = TRUE)
all_groups <- read.celfiles(list_cel)

# I perform background correction and quantile normalization for the pm values of the individual groups (here only group1):
pms_group1 <- pm(group1)
bg_group1 <- backgroundCorrect(pms_group1)
norm_group1 <- normalize(bg_group1)

# I replace the pm values in the GeneFeatureSet all_groups by the normalized values of group 1
# (this assumes that the group 1 arrays are columns 1 to 3 of all_groups):
exprs(all_groups)[pmindex(all_groups), 1] <- norm_group1[, 1]
exprs(all_groups)[pmindex(all_groups), 2] <- norm_group1[, 2]
exprs(all_groups)[pmindex(all_groups), 3] <- norm_group1[, 3]

# After having done this for ALL the groups, I perform only the probe summary on all_groups:
pp_all <- rma(all_groups, background = FALSE, normalize = FALSE, target = "core")

I guess that fRMA together with frmaTools would be an alternative for pre-processing my microarrays in small groups?

Thank you very much in advance for warning me if my idea is wrong!

Johanna Schott

-- output of sessionInfo():

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 LC_NUMERIC=C LC_TIME=German_Germany.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] mogene10sttranscriptcluster.db_8.0.1 org.Mm.eg.db_2.7.1 AnnotationDbi_1.18.1 Biobase_2.16.0
[5] BiocGenerics_0.2.0 pd.mogene.1.0.st.v1_3.6.0 RSQLite_0.11.1 DBI_0.2-5
[9] oligo_1.20.4 oligoClasses_1.18.0

loaded via a namespace (and not attached):
[1] affxparser_1.28.1 affyio_1.24.0 BiocInstaller_1.4.7 Biostrings_2.24.1 bit_1.1-8 codetools_0.2-8 ff_2.2-7 foreach_1.4.0
[9] IRanges_1.14.4 iterators_1.0.6 preprocessCore_1.18.0 splines_2.15.1 stats4_2.15.1 tools_2.15.1 zlibbioc_1.2.0

--
Sent via the guest posting facility at bioconductor.org.
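For readers who want to try the idea without CEL files, the procedure described above (quantile normalization within each group, then one median polish across all arrays) can be sketched on simulated data with base R alone. This is only a toy illustration and not the oligo implementation: the data are simulated and treated as log2 values throughout, the two groups and their distributions are made up, and `quantile_normalize` is a hypothetical helper written for this example.

```r
# Toy illustration with simulated data; base R only (no oligo, no CEL files).
set.seed(1)

n_probes <- 200                          # probes on the toy "array"
groups   <- rep(c("A", "B"), each = 3)   # two hypothetical groups of 3 arrays

# Simulated log2 PM intensities; group B has a shifted, wider distribution
x <- cbind(
  matrix(rnorm(n_probes * 3, mean = 8, sd = 1), ncol = 3),
  matrix(rnorm(n_probes * 3, mean = 9, sd = 2), ncol = 3)
)
colnames(x) <- paste0(groups, 1:3)

# Hypothetical helper: plain quantile normalization of the columns of a matrix
quantile_normalize <- function(m) {
  ranks  <- apply(m, 2, rank, ties.method = "first")
  target <- rowMeans(apply(m, 2, sort))  # mean of the sorted columns
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(m)
  out
}

# Normalize each group separately, then put the columns back in place
x_norm <- x
for (g in unique(groups)) {
  cols <- groups == g
  x_norm[, cols] <- quantile_normalize(x[, cols, drop = FALSE])
}

# Joint median polish over one toy probe set (first 10 probes, all 6 arrays)
fit <- medpolish(x_norm[1:10, ], trace.iter = FALSE)
summary_values <- fit$overall + fit$col  # one summarized value per array
```

After the loop, arrays within a group share one intensity distribution, while the two groups keep different distributions, and the median polish fits a single set of probe (row) effects across both. Inspecting `fit$residuals` by group is one way to see how well that single set of probe effects fits each group.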

Normalization probe frma frmaTools

Written 12.6 years ago by Guest User • updated 12.6 years ago by James W. MacDonald


James W. MacDonald 68k


Hi Johanna,

On 8/2/2012 5:30 AM, Johanna Schott [guest] wrote:
> Is this reasonable, or is anyone aware of artifacts that I would introduce by performing median polish for probe summary on microarrays that have not been quantile normalized together?

I don't think that is a good idea. When you fit a model using median polish, the underlying assumptions are of similar distributions and variances of the data, which will clearly not be the case if you normalize separately.

I assume you are planning to compare the different polysome fractions. In addition, I assume that extracting the polysomes is a relatively laborious process that is likely to introduce technical variability.

Given the above assumptions, this will likely be a difficult data set to analyze, and I would think you will have to make some pretty strong assumptions. I (and most others who answer questions on this list) am hesitant to offer any statistical advice: without data in hand it is very difficult to say what you should do. In addition, most of us make a living by analyzing data, so doing our work for free isn't a viable strategy.

That said, I would tend to start off simple, processing all samples together and seeing if I have evidence that doing so was a bad idea, rather than the converse.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
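One simple way to look for the kind of evidence mentioned above is to compare per-array distribution summaries between groups before any normalization. The sketch below does this on simulated log2 intensities with base R only; the group labels, the simulated values, and the choice of median and interquartile range as summaries are all assumptions made for illustration, not a recommended test. With real data, the same summaries could presumably be computed on the log2 PM intensities, e.g. `log2(pm(all_groups))`.

```r
# Toy diagnostic on simulated data; base R only.
set.seed(2)

groups <- factor(rep(c("fraction1", "fraction2"), each = 3))  # hypothetical labels

# Simulated log2 intensities: 200 probes per array, 3 arrays per group
log_int <- cbind(
  matrix(rnorm(600, mean = 8, sd = 1), ncol = 3),
  matrix(rnorm(600, mean = 9, sd = 2), ncol = 3)
)
colnames(log_int) <- paste0(groups, "_", 1:3)

# Quartiles of every array (one row per array after transposing)
qs <- t(apply(log_int, 2, quantile, probs = c(0.25, 0.5, 0.75)))

array_summary <- data.frame(
  group  = groups,
  median = qs[, "50%"],
  iqr    = qs[, "75%"] - qs[, "25%"]
)

# Average location and spread per group
group_means <- aggregate(cbind(median, iqr) ~ group, data = array_summary, FUN = mean)
print(group_means)

# A visual check of the same thing:
# boxplot(log_int, col = as.integer(groups))
```

Large, consistent differences between groups in these summaries, compared with the spread among arrays of the same group, would suggest that forcing all arrays onto one common distribution removes real signal; small differences would support processing everything together.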

Written 12.6 years ago by James W. MacDonald
