quantile robust and RMA in xps
@mayte-suarez-farinas-2068
Hi everybody,

I am working with xps and I have to admit I still don't get all the nuances, but I am trying my best. To summarize the data, I want to use RMA but with an alteration to the normalization step, so I need to run the three steps separately: bgcorrect, normalize and summarize. I ran into two problems trying to do so.

1. In background correction, the default RMA background is:

    data.bg.rma <- bgcorrect(G1ST_data2, "tmp_bg", method="rma", exonlevel="core+affx", select="none", option="pmonly:epanechnikov", params=c(16384))

but I got the following error:

    > g.rma <- bgcorrect(G1ST_data2, "tmp_bg", method="rma", exonlevel="all", select="none", option="pmonly:epanechnikov", params=c(16384))
    Error in .local(object, ...) : error in function ‘BgCorrect’
    Opening file </users> in <read> mode...
    Creating new temporary file </volumes>...
    Preprocessing data using method <adjustbgrd>...
    Background correcting raw data...
    calculating background for <1_HuGene 1_0 ST_050409.cel>...
    Error: Number of PMs or MMs is zero.
    An error has occured: Need to abort current process.

So I tried:

    data.bg.rma <- bgcorrect(G1ST_data2, "tmp_bg2", method="rma", exonlevel="core+affx", select="antigenomic", option="pmonly:epanechnikov", params=c(16384))

which runs without error, but I don't know whether it is the correct setting.

After that I want to use the normalize.quantiles.robust function from affy (it is not available in xps), so I did:

    data.bg.rma <- attachInten(data.bg.rma)
    data.int <- intensity(data.bg.rma)
    detach(package:xps)
    library(affy)
    data.int.norm <- normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]), n.remove=5, remove.extreme='both')

which shows that the data is normalized. Then I have to update the intensities in the xps object data.bg.rma, which I did as follows:

    library(xps)
    str(data.int)
    data.int[,-c(1,2)] <- data.int.norm
    intensity(data.bg.rma) <- data.int
    boxplot(data.bg.rma)   # boxplot is OK

The problem comes when I summarize the resulting data using median polish: the resulting data is not normalized.

    data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma", exonlevel="core+affx")
    boxplot(data.mp.rma)   # boxplot is not OK

I don't know if I made a mistake, especially in updating the intensities after the normalization step. I would really appreciate any insight on this. Below is my session info.

    > sessionInfo()
    R version 2.8.1 (2008-12-22)
    i386-apple-darwin8.11.1

    locale:
    en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
    [1] grid splines tools stats graphics grDevices utils datasets methods base

    other attached packages:
     [1] xps_1.2.10 affy_1.20.2 arrayQualityMetrics_1.8.1 marray_1.20.0 latticeExtra_0.5-4 vsn_3.8.0
     [7] beadarray_1.10.0 sma_0.5.15 hwriter_1.0 affycoretools_1.14.1 annaffy_1.14.0 KEGG.db_2.2.5
    [13] biomaRt_1.16.0 GOstats_2.8.0 Category_2.8.4 RBGL_1.18.0 GO.db_2.2.5 RSQLite_0.7-1
    [19] DBI_0.2-4 graph_1.20.0 limma_2.16.5 affyQCReport_1.20.0 geneplotter_1.20.0 annotate_1.20.1
    [25] AnnotationDbi_1.5.18 lattice_0.17-17 RColorBrewer_1.0-2 affyPLM_1.18.1 preprocessCore_1.4.0 xtable_1.5-4
    [31] simpleaffy_2.18.0 gcrma_2.14.1 matchprobes_1.14.1 genefilter_1.22.0 survival_2.34-1 Biobase_2.2.2

    loaded via a namespace (and not attached):
    [1] GSEABase_1.4.0 KernSmooth_2.22-22 RCurl_0.94-1 XML_2.1-0 affyio_1.10.1 cluster_1.11.11
cstrato ★ 3.9k
@cstrato-908
Dear Mayte,

Although not recommended, this is possible in principle. However, your xps version is too old: you need version "xps_1.4.x", where I have modified the replacement method "intensity()<-" for this purpose; see the help file "?intensity". See my further comments below.

Mayte Suarez-Farinas wrote:
> 1. In background correction:
> the default RMA background is:
> data.bg.rma <- bgcorrect(G1ST_data2, "tmp_bg", method="rma", exonlevel="core+affx", select="none", option="pmonly:epanechnikov", params=c(16384))
> but I got the following error:
> g.rma <- bgcorrect(G1ST_data2, "tmp_bg", method="rma", exonlevel="all", select="none", option="pmonly:epanechnikov", params=c(16384))
> Error in .local(object, ...) : error in function ‘BgCorrect’
> Opening file </users> Scheme_HuGene10stv1r4_na28.root> in <read> mode...
> Creating new temporary file </volumes>...
> Preprocessing data using method <adjustbgrd>...
> Background correcting raw data...
> calculating background for <1_HuGene 1_0 ST_050409.cel>...
> Error: Number of PMs or MMs is zero.
> An error has occured: Need to abort current process.

Please note that the default settings are always for expression arrays, so the error tells you that there are no MMs.

> So, I try:
> data.bg.rma <- bgcorrect(G1ST_data2, "tmp_bg2", method="rma", exonlevel="core+affx", select="antigenomic", option="pmonly:epanechnikov", params=c(16384))
> which works OK but I dont know if it is OK.

This is the correct setting for whole genome and exon arrays. select="antigenomic" tells the program to use the antigenomic background probes as MMs, e.g. if you use option "mmonly" instead of "pmonly".

> After that I want to use normalize.quantiles.robust function from affy (is not available in xps) so I did:
> data.bg.rma <- attachInten(data.bg.rma)
> data.int <- intensity(data.bg.rma)
> detach(package:xps)
> library(affy)
> data.int.norm <- normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]), n.remove=5, remove.extreme='both')

In R-2.9.0, which I am using, this function has moved to package "preprocessCore", but it does not seem to work:

    library(preprocessCore)
    data.int.norm <- normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]), n.remove=1, remove.extreme='both')

I get the following error message:

    Error in normalize.quantiles.robust(as.matrix(data.int[, -c(1, 2)]), n.remove = 1, :
      VECTOR_ELT() can only be applied to a 'list', not a 'character'

Thus, to simulate your setting I use the function "normalize.quantiles" and delete one sample by hand:

    data.int.norm <- normalize.quantiles(as.matrix(data.int[,-c(1,2)]))
    data.int.norm <- data.int.norm[,-4]
    colnames(data.int.norm) <- c("Breast01","Breast02","Breast03","Prostate02","Prostate03")

Note that (at least for me) the output is a matrix without column names, so you need to set the correct column names manually. (In my example I am using the breast/prostate triplicates from the Affy dataset.)

> which shows that the data is normalized. Then I have to update the intensitities in the xps object data.bg.rma, which I did and after
> library(xps)
> str(data.int)
> data.int[,-c(1,2)] <- data.int.norm
> intensity(data.bg.rma) <- data.int
> boxplot(data.bg.rma) #boxplot is OK

The new replacement method "intensity()<-" has an option to create a new ROOT file (see ?intensity), so you need to do:

    library(xps)
    str(data.int)
    data.int.norm <- as.data.frame(cbind(data.int[,c(1,2)], data.int.norm))

Here you see that I added the (x,y) coordinates, but it is up to you to make sure that the order is correct. I am using cbind() to prevent recycling of the samples, which is what I get when assigning to "data.int[,-c(1,2)]" after one sample has been removed.

Now I can use the replacement method:

    intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
    str(data.bg.rma)
    boxplot(data.bg.rma)   # boxplot is OK

Please note that this will take some time, since the background-corrected intensities are first saved as CEL files, which are then imported into the new ROOT file "tmp_int2_cel.root".

> The problem comes when I sumarized the resulting data using median polish, the resulting data is not normalized:
> data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma", exonlevel="core+affx")
> boxplot(data.mp.rma) #boxplot is not OK.

Now you can summarize the data using xps, but you need to replace the setname first:

    setName(data.bg.rma) <- "DataSet"
    data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma", exonlevel="core+affx")
    boxplot(data.mp.rma)   # boxplot is now OK

I hope this helps.

Best regards
Christian
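The reassembly step described above (normalize the sample columns only, restore the column names, then put the coordinate columns back in front with cbind()) can be tried on toy data without xps or ROOT. This is only a minimal sketch under stated assumptions: the data frame below merely imitates what intensity() returns, the sample names are made up, and quantile.normalize() is a hypothetical base-R stand-in for plain quantile normalization, not the robust variant and not the preprocessCore implementation.

```r
# Toy stand-in for the data frame returned by intensity(): X/Y coordinates
# followed by one intensity column per sample (all names are hypothetical).
set.seed(1)
data.int <- data.frame(X = rep(0:3, times = 3), Y = rep(0:2, each = 4),
                       Breast01   = rexp(12, 1/100),
                       Breast02   = rexp(12, 1/150),
                       Prostate01 = rexp(12, 1/80))

# Plain quantile normalization in base R (assumption: a simple substitute for
# normalize.quantiles, not the robust variant; ties are broken by position).
quantile.normalize <- function(m) {
  ranks  <- apply(m, 2, rank, ties.method = "first")  # rank within each sample
  target <- rowMeans(apply(m, 2, sort))               # mean of sorted columns
  out    <- apply(ranks, 2, function(r) target[r])    # map ranks to target
  dimnames(out) <- NULL   # mimic a result that comes back without column names
  out
}

m <- as.matrix(data.int[, -c(1, 2)])   # drop the X/Y coordinate columns
data.int.norm <- quantile.normalize(m)

# Restore the sample names by hand, then put the coordinates back in front.
colnames(data.int.norm) <- colnames(m)
data.int.norm <- as.data.frame(cbind(data.int[, c(1, 2)], data.int.norm))
```

Building a new data frame with cbind() rather than assigning into data.int[,-c(1,2)] keeps the number of sample columns explicit, which matters when a sample has been dropped during normalization.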
I'm not seeing any issue with this in preprocessCore. Perhaps you are running an old version?

    > library(preprocessCore)
    > data.int <- matrix(rexp(1000000), ncol=10)
    > data.int.norm <-
    + normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]), n.remove=1,
    + remove.extreme='both')
    > sessionInfo()
    R version 2.9.0 RC (2009-04-10 r48319)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] preprocessCore_1.6.0

cstrato wrote:
> In version R-2.9.0 which I am using, this function has moved to package "preprocessCore" but it seems not to work:
> library(preprocessCore)
> data.int.norm <- normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]), n.remove=1, remove.extreme='both')
> I get the following error message:
> Error in normalize.quantiles.robust(as.matrix(data.int[, -c(1, 2)]), n.remove = 1, :
> VECTOR_ELT() can only be applied to a 'list', not a 'character'
You are right, my preprocessCore version was 1.4.0, although I thought I had updated all packages. When running the code with 1.6.0, everything now works.

Best regards
Christian

Ben Bolstad wrote:
> I'm not seeing any issue with this in preprocessCore. Perhaps you are running an old version?
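Since the error here came down to an outdated package, a quick version check before running the normalization may save some time. The following is a hedged sketch, not part of either package: version.ok() and the minimum version "1.6.0" are illustrative assumptions taken from the versions mentioned in this thread, and the check against the installed package only runs if preprocessCore is actually present.

```r
# Hypothetical helper: TRUE if 'installed' is at least 'minimum'.
version.ok <- function(installed, minimum) {
  package_version(installed) >= package_version(minimum)
}

# The two versions mentioned in this thread.
version.ok("1.4.0", "1.6.0")   # the version that produced the error
version.ok("1.6.0", "1.6.0")   # the version that worked

# Check the installed copy, if there is one (skipped otherwise).
if (requireNamespace("preprocessCore", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("preprocessCore"))
  if (!version.ok(installed, "1.6.0")) {
    warning("preprocessCore ", installed, " is older than 1.6.0; consider updating")
  }
}
```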
Hi Christian,

Thanks for your answer. Regarding my first question, I am sorry, but I am still confused; I don't know what the correct answer is. I am working with HuGene 1_0 ST, measuring expression, and I thought I had to use the common (default) RMA with PMs only, but that does not work. The option that works, with "antigenomic", is using MMs. Is this option then the right one for my case?

Best,
Mayte

On May 23, 2009, at 11:42 AM, cstrato wrote:
> Please note that the default settings are always for expression arrays, so the error tells you that there are no MMs.
> [...]
> This is the correct setting for whole genome and exon arrays. select="antigenomic" tells the program to use the antigenomic background probes as MMs, e.g. if you use option "mmonly" instead of "pmonly".
Dear Mayte,

I must admit that it may be confusing, so I need to update the help files.

Please note that "bgcorrect()" is a general function. There are specific functions for the different background methods, such as "bgcorrect.rma()" and "bgcorrect.mas5()" (see ?bgcorrect). Each of these methods has a parameter "select" to choose the probes to be used for background computation. For expression arrays you can select c("pmonly", "mmonly", "both"). For whole genome arrays you can select only "antigenomic". For exon arrays you can select c("antigenomic", "genomic").

Thus, for "bgcorrect.mas5()" the parameter "select" tells the function which probes to use for background computation.

The rma background is special, since rma normally uses PM probes ("pmonly") for background computation. In this case I am therefore using select="antigenomic" only to indicate that a whole genome or exon array is used. However, in the case of MM probes ("mmonly"), the parameter "select" tells the function to use "antigenomic" probes as MM probes.

Here are three examples of how to use the function "bgcorrect()" to compute the background:

1. Expression array, PM probes are used for background computation:

    > bg.rma <- bgcorrect(data, "tmp_bg2", method="rma", exonlevel="", select="none", option="pmonly:epanechnikov", params=c(16384))

2. Whole genome array, PM probes are used for background computation:

    > bg.rma <- bgcorrect(data, "tmp_bg2", method="rma", exonlevel="core+affx", select="antigenomic", option="pmonly:epanechnikov", params=c(16384))

Please note that in this case "core+affx" probes will be used for background computation, and "antigenomic" is only an indicator that a whole genome or exon array is used.

3. Whole genome array, antigenomic MM probes are used for background computation:

    > bg.rma <- bgcorrect(data, "tmp_bg2", method="rma", exonlevel="core+affx", select="antigenomic", option="mmonly:epanechnikov", params=c(16384))

In this case, "antigenomic" probes will be used for background computation, since "option" tells the function to use "mmonly" probes.

I hope this explains how to use the function "bgcorrect()".

Best regards
Christian

Mayte Suarez-Farinas wrote:
> Hi Christian.
> Tx for your answer. For my first question, I am sorry but I am still confused, I dont know what the correct answer is. I am working with HuGene 1_0 ST, measuring expression, I though I had to used the common (default) RMA with PM's only. But it does not work. the option that works with "antigenomic" is using MM's. Then, is this option right for my case?
> best,
> Mayte
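The three cases above can be collected in a small lookup so the settings are not mixed up between array types. This is only an illustrative sketch: rma.bg.args() is a hypothetical helper, not an xps function, and it simply returns the argument values quoted in the three examples. The commented call at the end shows how the result could be passed on, assuming xps is installed and "data" is an imported data set.

```r
# Hypothetical helper: argument values for an RMA background correction,
# taken from the three examples above. Not part of xps.
rma.bg.args <- function(arraytype = c("expression", "genome", "exon"),
                        probes    = c("pmonly", "mmonly"),
                        exonlevel = "core+affx") {
  arraytype <- match.arg(arraytype)
  probes    <- match.arg(probes)
  if (arraytype == "expression") {
    # Expression arrays: exonlevel must be empty and no probe selection is made.
    list(method = "rma", exonlevel = "", select = "none",
         option = paste(probes, "epanechnikov", sep = ":"), params = c(16384))
  } else {
    # Whole genome and exon arrays: select="antigenomic" marks the array type;
    # with "mmonly" the antigenomic probes are also used as MM probes.
    list(method = "rma", exonlevel = exonlevel, select = "antigenomic",
         option = paste(probes, "epanechnikov", sep = ":"), params = c(16384))
  }
}

args.hugene <- rma.bg.args("genome")           # example 2
args.mm     <- rma.bg.args("genome", "mmonly") # example 3
str(args.hugene)

# With xps loaded, the list could be passed on like this (not run here):
# bg.rma <- do.call(bgcorrect, c(list(data, "tmp_bg2"), args.hugene))
```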
Dear Christian,

Thanks for your answer.

> 1. Expression array, PM probes are used for background computation:
>
> bg.rma <- bgcorrect(data, "tmp_bg2", method="rma", exonlevel="",
>     select="none", option="pmonly:epanechnikov", params=c(16384))

I updated to the latest version of xps, but I still get an error in background correction with both bgcorrect and bgcorrect.rma:

library(xps)
scmdir <- "/Users/Mayte/Rlibrary/AffyDB/ROOTSchemes"
scheme.hugene10stv1r4 <- root.scheme(paste(scmdir, "Scheme_HuGene10stv1r4_na28.root", sep = "/"))

###########################################
outchip <- c(16,20,21)
G1ST_data2 <- import.data(scheme.hugene10stv1r4, "MyCEL_dataxps_162021x", celdir=getwd(),
    celfiles = as.character(PD[-outchip,'File']), verbose = FALSE)

bg.rma <- bgcorrect.rma(G1ST_data2, "tmp_bg1")
Error in exonLevel(exonlevel, chiptype) : invalid argument ‘exonlevel’

bg.rma <- bgcorrect(G1ST_data2, "tmp_bg1", method="rma", exonlevel="", select="none",
    option="pmonly:epanechnikov", params=c(16384))
Error in exonLevel(exonlevel, chiptype) : invalid argument ‘exonlevel’

Any idea what is going on?

> sessionInfo()
R version 2.8.1 (2008-12-22)
i386-apple-darwin8.11.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] tools stats graphics grDevices utils datasets methods base

other attached packages:
[1] hugene10st.db_1.0.2 RSQLite_0.7-1 DBI_0.2-4 AnnotationDbi_1.5.18 Biobase_2.2.2 preprocessCore_1.4.0 xps_1.4.3

loaded via a namespace (and not attached):
[1] affy_1.20.2 affyio_1.10.1 annotate_1.20.1 genefilter_1.22.0 simpleaffy_2.18.0 splines_2.8.1 survival_2.34-1
Dear Mayte,

For HuGene arrays the parameter exonlevel must have one of the values described in "?exonLevel", for example exonlevel="core+affx". Setting exonlevel="" is necessary for expression arrays but is not allowed for HuGene arrays.

The correct background setting for your case is the one you used initially:

data.bg.rma <- bgcorrect(G1ST_data2, "tmp_bg2", method="rma", exonlevel="core+affx",
    select="antigenomic", option="pmonly:epanechnikov", params=c(16384))

Best regards
Christian
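The rule for exonlevel can be written down as a small guard, which may help when the same script is reused across array types. This is only a sketch: the helper name exonlevel_for and the labels "expression" and "gene" are hypothetical and not part of xps; only the two values "" and "core+affx" come from this thread, and ?exonLevel lists the other values that are allowed.

```r
# Hypothetical helper (not part of xps): choose the 'exonlevel' argument
# discussed in this thread from a user-supplied array family label.
exonlevel_for <- function(family = c("expression", "gene")) {
  family <- match.arg(family)
  switch(family,
         expression = "",           # expression arrays: exonlevel must be empty
         gene       = "core+affx")  # HuGene arrays: one valid choice, see ?exonLevel
}

lvl <- exonlevel_for("gene")
print(lvl)
```

The returned string would then be passed as exonlevel= to bgcorrect() or summarize.rma().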
Dear Christian,

I am sorry that I need to bother you again!

Everything worked fine with the background correction, the quantile normalization and the substitution using the function "intensity()<-". When I draw the boxplot after this, the data is normalized. But after I use summarize.rma, the data is not normalized anymore.

intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
boxplot(data.bg.rma) ## Boxplot is perfect!
setName(data.bg.rma) <- "DataSet"
data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma", exonlevel="core+affx")
boxplot(data.mp.rma) # Boxplot is NOT ok

Any hint?

Best
Mayte
Dear Mayte,

Can you please send me your complete code from the beginning, so that I can test it? In particular, did you add the (x,y)-coordinates to data.int.norm:

> data.int.norm <- as.data.frame(cbind(data.int[,c(1,2)], data.int.norm))

Best regards
Christian
Dear Mayte,

Thank you for sending me your code. I have just run it on the Affymetrix breast, heart and prostate HuGene data, and everything is OK, including the final boxplot. However, the results are slightly different when I replace "normalize.quantiles()" from package preprocessCore with "normalize.quantiles()" from xps. Furthermore, I have found two potential problems in your code:

1. To simulate the removal of chips as output from "normalize.quantiles.robust()" I have deleted one column. Thus I also needed to delete the corresponding column name. Since you have set "n.remove=5", up to 5 columns may be removed, so you would need to remove these column names manually.

2. Since you store your original CEL-files in your working directory, the replacement function "intensity<-" replaced your original CEL-files with the background-corrected CEL-files of the same name. There are two ways to prevent this. The best way is to store the original CEL-files in another directory, e.g. in "raw". The second possibility is to use parameter "celnames" of function "import.data()" to rename the imported CEL-files.

A third problem may be that function "normalize.quantiles.robust()" re-orders the matrix, so that the (x,y)-coordinates are no longer correct. Although this should not be the case, I cannot exclude this possibility.

Here is the complete code that I used for testing.

- - - - - - - - - - - - - - - - - - - - - - - - -
### new R session: load library xps
library(xps)
scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes"
scheme.hugene10stv1r4 <- root.scheme(paste(scmdir, "Scheme_HuGene10stv1r4_na28.root", sep = "/"))

celfiles <- c("Breast_01.CEL", "Breast_02.CEL", "Breast_03.CEL",
              "Heart_01.CEL", "Heart_02.CEL", "Heart_03.CEL",
              "Prostate_01.CEL", "Prostate_02.CEL", "Prostate_03.CEL")
G1ST_data2 <- import.data(scheme.hugene10stv1r4, "Pamela_NOMID_dataxps_162021",
    celdir=getwd(), celfiles=celfiles, verbose=TRUE)

## RMA background
data.bg.rma <- bgcorrect(G1ST_data2, "tmp_bg", method="rma", exonlevel="core+affx",
    select="antigenomic", option="pmonly:epanechnikov", params=c(16384))

# get intensities
data.bg.rma <- attachInten(data.bg.rma)
data.int <- intensity(data.bg.rma)

# normalize with preprocessCore (the package behind the affy normalization functions)
detach(package:xps)
library(preprocessCore)
data.int.norm <- normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]),
    n.remove=2, remove.extreme='both')
# manually remove one chip
data.int.norm <- data.int.norm[,-4]

# replace intensity slot
library(xps)
aaa <- as.data.frame(cbind(data.int[,c(1,2)], data.int.norm))
# Problem: need to remove colname of chip which was removed
colnames(aaa) <- colnames(data.int)[-6]
intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- aaa
# Problem: does overwrite original CEL-files
boxplot(data.bg.rma) #boxplot is OK

## summarize medianpolish
setName(data.bg.rma) <- "DataSet"
data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma", exonlevel="core+affx")
boxplot(data.mp.rma)
- - - - - - - - - - - - - - - - - - - - - - - - -

Best regards
Christian
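The bookkeeping in the test code (re-attaching the (x,y) columns and keeping the column names in step with the remaining chips) can be checked on a toy data frame before touching any real object. The sketch below uses base R only. The helper rebuild_intensity and the toy column names are hypothetical, and here dropped counts chip columns only, unlike the index 6 in the test code, which counts the two coordinate columns as well. It assumes the normalized matrix still has its rows in the original probe order, which is the point raised as a possible third problem.

```r
# Hypothetical helper: rebuild the data frame for the replacement step from
# the (x, y) coordinate columns and a normalized intensity matrix.
rebuild_intensity <- function(data.int, norm, dropped = integer(0)) {
  coords <- data.int[, c(1, 2)]
  chip_names <- colnames(data.int)[-c(1, 2)]
  if (length(dropped) > 0) chip_names <- chip_names[-dropped]
  stopifnot(nrow(norm) == nrow(data.int),      # same probes, same order assumed
            ncol(norm) == length(chip_names))  # one name per remaining chip
  out <- as.data.frame(cbind(coords, norm))
  colnames(out) <- c(colnames(coords), chip_names)
  out
}

# Toy stand-in for the output of intensity(): X, Y, then one column per chip.
data.int <- data.frame(X = c(0, 1, 0, 1), Y = c(0, 0, 1, 1),
                       A.cel = c(10, 20, 30, 40),
                       B.cel = c(12, 18, 33, 41),
                       C.cel = c(9, 25, 28, 44))
norm <- unname(as.matrix(data.int[, -c(1, 2)]))  # placeholder for a normalized matrix without names
norm <- norm[, -2, drop = FALSE]                 # simulate removing the second chip
aaa <- rebuild_intensity(data.int, norm, dropped = 2)
print(colnames(aaa))
```

If no chip is removed, dropped can be left at its default and the stopifnot() checks still guard against a matrix of the wrong shape.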
Dear Christian,

> 1, To simulate the removal of chips as output from
> "normalize.quantiles.robust()" I have deleted one column. Thus I also
> needed to delete the corresponding column name. Since you have set
> "n.remove=5", up to 5 columns may be removed, so you would need to
> remove these column names manually.

I do not want to remove those chips from the output. The remove option of normalize.quantiles.robust does not remove the columns per se; it only uses the remaining chips to calculate the reference distribution to which all the chips (including the 'removed' ones) are normalized. This step produces a matrix with the same number of columns as the original. I don't want to remove those chips, only to exclude them from the estimation of the reference distribution. If I include them, the reference distribution has a very low range, and I want to avoid deleting those chips because their QC is not that bad.

> 2, Since you store your original CEL-files in your working directory,
> replacement function "intensity<-" did replace your original CEL-files
> with the background corrected CEL-files of the same name. There are two
> ways to prevent this. The best way is to store the original CEL-files in
> another directory, e.g. in "raw". The second possibility is to use
> parameter "celnames" of function "import.data()" to rename the imported
> CEL-files.

Do you mean that my CEL files were changed? They are no longer the raw data? I was not aware of that. I hope that the facility kept a copy of my CEL files, otherwise I am in deep trouble!

> A third problem may be that function "normalize.quantiles.robust()"
> re-orders the matrix, so that the (x,y)-coordinates are no longer
> correct. Although this should not be the case I cannot exclude this
> possibility.

Maybe Bolstad can answer this...

Mayte Suarez Farinas
Research Associate
The Rockefeller University Hospital
1230 York Avenue, Box 178
New York, NY 10021
(212) 327-8213 - phone
(212) 327-7422 - fax
farinam at rockefeller.edu
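The behaviour described in this reply (some chips are left out when the reference distribution is estimated, but every chip is still normalized to it and the matrix keeps all its columns) can be sketched in base R. This is an illustration only and not the preprocessCore implementation: the function name quantile_normalize_subset and its exclude argument are hypothetical, and the chips to exclude are chosen by the caller here, whereas in the thread that choice is controlled through n.remove and remove.extreme.

```r
# Quantile normalization in which only some columns define the reference
# distribution, while every column (excluded ones too) is mapped onto it.
quantile_normalize_subset <- function(x, exclude = integer(0)) {
  stopifnot(is.matrix(x), nrow(x) >= 2, !anyNA(x))
  use <- setdiff(seq_len(ncol(x)), exclude)
  stopifnot(length(use) >= 1)
  # Reference: mean of the sorted values across the retained columns.
  sorted <- apply(x[, use, drop = FALSE], 2, sort)
  reference <- rowMeans(sorted)
  # Map each column onto the reference by rank; row order is left untouched.
  out <- apply(x, 2, function(col) reference[rank(col, ties.method = "first")])
  dimnames(out) <- dimnames(x)
  out
}

set.seed(1)
x <- cbind(a = rnorm(100, 8), b = rnorm(100, 8.5), c = rnorm(100, 9),
           lowrange = rnorm(100, 4, 0.2))
norm <- quantile_normalize_subset(x, exclude = 4)
print(dim(norm))  # same shape as x: no column is dropped
```

Because each column is replaced value by value according to its own ranks, the rows stay in their original order, so coordinate columns taken from the unnormalized data frame would still line up in this sketch.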
Dear Mayte,

Yes, your CEL-files were modified, as is clearly stated in the help file "?intensity<-" of the modified replacement function "intensity<-":

"Warning: Do not use replacement method intensity<- until you really know what you are doing!

Note: If you do not want to replace your current object, create first a copy of type DataTreeSet by simply writing newobj <- oldobj, and use newobj for replacement. This is important since intensity<- does also update slots rootfile, filedir and treenames when a new filename was chosen.

Warning: The CEL-files created WILL REPLACE THE ORIGINAL CEL-files, if they have identical names to the original CEL-files and the original CEL-files are located in the working directory. Thus the original CEL-files should preferable be located in directory celdir of function import.data."

To my knowledge, microarray facilities usually store all CEL-files they create in a common place, and people either copy the CEL-files of interest to their working directories or create links to the original CEL-files. When importing CEL-files into ROOT using function "import.data()" even this is not necessary, as the help file "?import.data" says:

"To import CEL-files from different directories, vector celfiles must contain the full path for each CEL-file and celdir must be celdir=NULL."

At the moment I can only apologize and ask people who start using xps to read the corresponding help files, especially when using advanced features. For this reason I often mention the corresponding help files in my replies to questions, as I did in my original reply to your question.

Best regards
Christian
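One way to follow the advice about keeping the original CEL-files out of the working directory is to copy them into a separate directory once and make the copies read-only. The sketch below uses base R only; the helper name protect_cel_files, the directory name "raw" and the placeholder files are assumptions for illustration, and the demo writes dummy text files into a temporary directory rather than real CEL-files.

```r
# Hypothetical helper: copy CEL-files into a separate directory, make the
# copies read-only, and return their full paths.
protect_cel_files <- function(celfiles, raw_dir = "raw") {
  stopifnot(all(file.exists(celfiles)))
  dir.create(raw_dir, showWarnings = FALSE, recursive = TRUE)
  dest <- file.path(raw_dir, basename(celfiles))
  # overwrite = FALSE: copies that already exist are never replaced.
  file.copy(celfiles, dest, overwrite = FALSE)
  stopifnot(all(file.exists(dest)))
  Sys.chmod(dest, mode = "0444")  # read-only (limited effect on Windows)
  normalizePath(dest)
}

# Demo with placeholder files in a temporary directory.
work <- tempfile("celdemo")
dir.create(work)
cel <- file.path(work, c("A.CEL", "B.CEL"))
for (f in cel) writeLines("placeholder, not a real CEL file", f)
raw_paths <- protect_cel_files(cel, raw_dir = file.path(work, "raw"))
print(basename(raw_paths))
```

The full paths returned here are the kind of vector that the quoted help text for import.data() describes for celfiles when celdir=NULL.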