RMA/QuantileNormalization results difference between oligo and aroma.affymetrix for Hugene
1
0
Entering edit mode
@mikhail-pachkov-2806
Last seen 10.3 years ago
Dear All, I am new in microarray analysis and need your expertise. I need to develop procedure for producing expression values from CEL files. Data should processed with RMA and quantile normalized. I have tried two packages - oligo and aroma.affymetrix. Obtained results are quite different form my point of view. Moreover aroma.affymetrix::QuantileNormalization function produce dta which do not look like they were quantile normalized. I have made density plots of data after RMA and after quantile normalization which are available here http://test.swissregulon.unibas.ch/bioc/index.html There are also links to two CEL files I have used. I have a few questions: Why RMA results are so different? Which RMA implementation is correct? Why does quantile normalization in aroma.affymetrics produce two different distributions? Thank you in advance! Here are R scripts I have used: ################################ #aroma.affymetrix library(aroma.affymetrix); verbose <- Arguments$getVerbose(-8, timestamp=TRUE); # read files cdf <- AffymetrixCdfFile('annotationData/chipTypes/HuGene-1_0-st-v1 /HuGene-1_0-st-v1.cdf'); cs <- AffymetrixCelSet$byPath("rawData/mine/HuGene-1_0-st-v1/"); # RMA bc <- RmaBackgroundCorrection(cs); csBC <- process(bc,verbose=verbose); # QuantileNormalization qn <- QuantileNormalization(csBC, typesToUpdate="pm"); csN <- process(qn); # Plots image_file <- ("aroma.affymetrix.RMA.png"); png(image_file,width=1028,height=768); plotDensity(csBC); title("aroma.affymetrix RMA data"); dev.off(); image_file <- ("aroma.affymetrix.QN.png"); png(image_file,width=1028,height=768); plotDensity(csN); title("aroma.affymetrix QN data"); dev.off() ################################ ################################ # oligo library(oligo); rawdata=read.celfiles(c("rawData/mine/HuGene- 1_0-st-v1/sample1.CEL","rawData/mine/HuGene-1_0-st-v1/sample2.CEL")); rmadata=rma(rawdata); qndata=normalize.quantiles(exprs(rmadata)) library(affy) # Plots image_file <- ("oligo.RMA.png"); png(image_file,width=1028,height=768); plotDensity(exprs(rmadata)); title("oligo RMA data"); dev.off(); image_file <- ("oligo.QN.png"); png(image_file,width=1028,height=768); plotDensity(qndata); title("oligo QN data"); dev.off() ############################### Kind regards, Mikhail Pachkov
Microarray Normalization cdf oligo DTA Microarray Normalization cdf oligo DTA • 3.1k views
ADD COMMENT
0
Entering edit mode
@benilton-carvalho-1375
Last seen 4.7 years ago
Brazil/Campinas/UNICAMP
Quantile normalization is already one step in the RMA workflow. Therefore, there's no need to normalize the data again once you've gone RMA, ie. (regarding oligo) your call "normalize.quantiles(exprs(rmadata))" should be dropped. Using the defaults, rma() in oligo will: 1) Background correct (via the RMA convolution model) 2) Quantile normalize 3) Summarize via median-polish. b On Fri, Feb 26, 2010 at 10:46 AM, Mikhail Pachkov <pachkov at="" gmail.com=""> wrote: > Dear All, > > I am new in microarray analysis and need your expertise. > I need to develop procedure for producing expression values from CEL > files. Data should processed with RMA and quantile normalized. I have > tried two packages - oligo and aroma.affymetrix. Obtained results are > quite different form my point of view. Moreover > aroma.affymetrix::QuantileNormalization function produce dta which do > not look like they were quantile normalized. > ?I have made density plots of data after RMA and after quantile > normalization which are available here > http://test.swissregulon.unibas.ch/bioc/index.html There are also > links to two CEL files I have used. > > I have a few questions: > Why RMA results are so different? > Which RMA implementation is correct? > Why does quantile normalization in aroma.affymetrics produce two > different distributions? > > Thank you in advance! > > Here are R scripts I have used: > > ################################ > #aroma.affymetrix > library(aroma.affymetrix); > verbose <- Arguments$getVerbose(-8, timestamp=TRUE); > > # read files > cdf <- AffymetrixCdfFile('annotationData/chipTypes/HuGene-1_0-st-v1 /HuGene-1_0-st-v1.cdf'); > cs <- AffymetrixCelSet$byPath("rawData/mine/HuGene-1_0-st-v1/"); > > # RMA > bc <- RmaBackgroundCorrection(cs); > csBC <- process(bc,verbose=verbose); > > # QuantileNormalization > qn <- QuantileNormalization(csBC, typesToUpdate="pm"); > csN <- process(qn); > > # Plots > image_file <- ("aroma.affymetrix.RMA.png"); > png(image_file,width=1028,height=768); > plotDensity(csBC); > title("aroma.affymetrix RMA data"); > dev.off(); > > image_file <- ("aroma.affymetrix.QN.png"); > png(image_file,width=1028,height=768); > plotDensity(csN); > title("aroma.affymetrix QN data"); > dev.off() > ################################ > > ################################ > # oligo > library(oligo); > rawdata=read.celfiles(c("rawData/mine/HuGene- 1_0-st-v1/sample1.CEL","rawData/mine/HuGene-1_0-st-v1/sample2.CEL")); > rmadata=rma(rawdata); > qndata=normalize.quantiles(exprs(rmadata)) > > library(affy) > # Plots > image_file <- ("oligo.RMA.png"); > png(image_file,width=1028,height=768); > plotDensity(exprs(rmadata)); > title("oligo RMA data"); > dev.off(); > > image_file <- ("oligo.QN.png"); > png(image_file,width=1028,height=768); > plotDensity(qndata); > title("oligo QN data"); > dev.off() > ############################### > > Kind regards, > > Mikhail Pachkov > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD COMMENT
0
Entering edit mode
Hi, you concerns about reproducibility are very important. Luckily your observations are based on mistakes as explained below. On Fri, Feb 26, 2010 at 12:05 PM, Benilton Carvalho <beniltoncarvalho at="" gmail.com=""> wrote: > Quantile normalization is already one step in the RMA workflow. > Therefore, there's no need to normalize the data again once you've > gone RMA, ie. (regarding oligo) your call > "normalize.quantiles(exprs(rmadata))" should be dropped. > > Using the defaults, rma() in oligo will: > > 1) Background correct (via the RMA convolution model) > 2) Quantile normalize > 3) Summarize via median-polish. Yes, as Benilton points out it seems like you've misunderstood what RMA does. The author of RMA (Ben Bolstad) defines the term RMA to mean the complete preprocessing suite including summarization. > > b > > On Fri, Feb 26, 2010 at 10:46 AM, Mikhail Pachkov <pachkov at="" gmail.com=""> wrote: >> Dear All, >> >> I am new in microarray analysis and need your expertise. >> I need to develop procedure for producing expression values from CEL >> files. Data should processed with RMA and quantile normalized. I have >> tried two packages - oligo and aroma.affymetrix. Obtained results are >> quite different form my point of view. Moreover >> aroma.affymetrix::QuantileNormalization function produce dta which do >> not look like they were quantile normalized. What is 'dta'? >> ?I have made density plots of data after RMA and after quantile >> normalization which are available here >> http://test.swissregulon.unibas.ch/bioc/index.html There are also >> links to two CEL files I have used. >> >> I have a few questions: Below, I will take that you mean "RMA background correct" when you say "RMA". >> Why RMA results are so different? The RMA-style background correction in aroma.affymetrix utilizes affy::bg.adjust() [and normalizes PM probes only]. I'm not sure what algorithm/implementation oligo is using for this step, but they should give very similar corrected probe signals. >> Which RMA implementation is correct? So, aroma.affymetrix is basically just a wrapper for affy::bg.adjust(), which I think was the original implementation of RMA background correction. I let Benilton comment on the oligo implementation and it's origin. >> Why does quantile normalization in aroma.affymetrics produce two >> different distributions? Because you first run quantile normalization on PMs only, then you look at the density plot for all (PMs & MMs). More below. >> >> Thank you in advance! >> >> Here are R scripts I have used: >> >> ################################ >> #aroma.affymetrix >> library(aroma.affymetrix); >> verbose <- Arguments$getVerbose(-8, timestamp=TRUE); >> >> # read files >> cdf <- AffymetrixCdfFile('annotationData/chipTypes/HuGene-1_0-st-v1 /HuGene-1_0-st-v1.cdf'); >> cs <- AffymetrixCelSet$byPath("rawData/mine/HuGene-1_0-st-v1/"); Have to bring it up: Please, do not setup your annotation data and and data sets like this. An aroma.affymetrix script should not contain paths/pathnames, cf. "Dos and Don'ts": http://aroma-project.org/node/102 The correct way to do the above is: cs <- AffymetrixCelSet$byName("mine", chipType="HuGene-1_0-st-v1"); alternatively, if you wish to be explicit in what CDF is used, you can do: cdf <- AffymetrixCdfFile$byChipType("HuGene-1_0-st-v1"); cs <- AffymetrixCelSet$byName("mine", cdf=cdf); >> >> # RMA >> bc <- RmaBackgroundCorrection(cs); >> csBC <- process(bc,verbose=verbose); >> >> # QuantileNormalization >> qn <- QuantileNormalization(csBC, typesToUpdate="pm"); >> csN <- process(qn); Note, the argument 'typesToUpdate' says that it is only PM probes that will be updated. The default is that MMs are left "as is". >> >> # Plots >> image_file <- ("aroma.affymetrix.RMA.png"); >> png(image_file,width=1028,height=768); >> plotDensity(csBC); Here you are plotting all probes on the array. Since RmaBackgroundCorrection is only correcting PM probes, you probably want to do: plotDensity(csBC, types="pm"); >> title("aroma.affymetrix RMA data"); >> dev.off(); >> >> image_file <- ("aroma.affymetrix.QN.png"); >> png(image_file,width=1028,height=768); >> plotDensity(csN); plotDensity(csN, types="pm"); This is the key to why you get different density plots. For a thorough explaination of the various QN approaches, see Page 'Empirical probe-signal densities and rank-based quantile normalization': http://aroma-project.org/node/141 >> title("aroma.affymetrix QN data"); >> dev.off() What you haven't compared yet, because you misunderstood the RMA pipeline, are the summarized probe signals from fitting the log-additive RMA model. FYI, it is part of our (24 hours) redundancy testing to assert that the aroma.affymetrix RMA pipeline can reproduce the RMA pipeline and RMA summary estimates of the affyPLM package. You can see how well this is done on Page 'Replication: RMA (background, normalization & summarization)': http://www.aroma-project.org/replication/RMA Hope this helps. Henrik >> ################################ >> >> ################################ >> # oligo >> library(oligo); >> rawdata=read.celfiles(c("rawData/mine/HuGene- 1_0-st-v1/sample1.CEL","rawData/mine/HuGene-1_0-st-v1/sample2.CEL")); >> rmadata=rma(rawdata); >> qndata=normalize.quantiles(exprs(rmadata)) >> >> library(affy) >> # Plots >> image_file <- ("oligo.RMA.png"); >> png(image_file,width=1028,height=768); >> plotDensity(exprs(rmadata)); >> title("oligo RMA data"); >> dev.off(); >> >> image_file <- ("oligo.QN.png"); >> png(image_file,width=1028,height=768); >> plotDensity(qndata); >> title("oligo QN data"); >> dev.off() >> ############################### >> >> Kind regards, >> >> Mikhail Pachkov >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY
0
Entering edit mode
regarding oligo's implementation of RMA, it's the same as the one in affy. b On Fri, Feb 26, 2010 at 12:47 PM, Henrik Bengtsson <hb at="" stat.berkeley.edu=""> wrote: > Hi, > > you concerns about reproducibility are very important. ?Luckily your > observations are based on mistakes as explained below. > > On Fri, Feb 26, 2010 at 12:05 PM, Benilton Carvalho > <beniltoncarvalho at="" gmail.com=""> wrote: >> Quantile normalization is already one step in the RMA workflow. >> Therefore, there's no need to normalize the data again once you've >> gone RMA, ie. (regarding oligo) your call >> "normalize.quantiles(exprs(rmadata))" should be dropped. >> >> Using the defaults, rma() in oligo will: >> >> 1) Background correct (via the RMA convolution model) >> 2) Quantile normalize >> 3) Summarize via median-polish. > > Yes, as Benilton points out it seems like you've misunderstood what > RMA does. ?The author of RMA (Ben Bolstad) defines the term RMA to > mean the complete preprocessing suite including summarization. > >> >> b >> >> On Fri, Feb 26, 2010 at 10:46 AM, Mikhail Pachkov <pachkov at="" gmail.com=""> wrote: >>> Dear All, >>> >>> I am new in microarray analysis and need your expertise. >>> I need to develop procedure for producing expression values from CEL >>> files. Data should processed with RMA and quantile normalized. I have >>> tried two packages - oligo and aroma.affymetrix. Obtained results are >>> quite different form my point of view. Moreover >>> aroma.affymetrix::QuantileNormalization function produce dta which do >>> not look like they were quantile normalized. > > What is 'dta'? > >>> ?I have made density plots of data after RMA and after quantile >>> normalization which are available here >>> http://test.swissregulon.unibas.ch/bioc/index.html There are also >>> links to two CEL files I have used. >>> >>> I have a few questions: > > Below, I will take that you mean "RMA background correct" when you say "RMA". > >>> Why RMA results are so different? > > The RMA-style background correction in aroma.affymetrix utilizes > affy::bg.adjust() [and normalizes PM probes only]. ?I'm not sure what > algorithm/implementation oligo is using for this step, but they should > give very similar corrected probe signals. > >>> Which RMA implementation is correct? > > So, aroma.affymetrix is basically just a wrapper for > affy::bg.adjust(), which I think was the original implementation of > RMA background correction. ?I let Benilton comment on the oligo > implementation and it's origin. > >>> Why does quantile normalization in aroma.affymetrics produce two >>> different distributions? > > Because you first run quantile normalization on PMs only, then you > look at the density plot for all (PMs & MMs). ?More below. > >>> >>> Thank you in advance! >>> >>> Here are R scripts I have used: >>> >>> ################################ >>> #aroma.affymetrix >>> library(aroma.affymetrix); >>> verbose <- Arguments$getVerbose(-8, timestamp=TRUE); >>> >>> # read files >>> cdf <- AffymetrixCdfFile('annotationData/chipTypes/HuGene- 1_0-st-v1/HuGene-1_0-st-v1.cdf'); >>> cs <- AffymetrixCelSet$byPath("rawData/mine/HuGene-1_0-st-v1/"); > > Have to bring it up: Please, do not setup your annotation data and and > data sets like this. An aroma.affymetrix script should not contain > paths/pathnames, cf. "Dos and Don'ts": > > ?http://aroma-project.org/node/102 > > The correct way to do the above is: > > cs <- AffymetrixCelSet$byName("mine", chipType="HuGene-1_0-st-v1"); > > alternatively, if you wish to be explicit in what CDF is used, you can do: > > cdf <- AffymetrixCdfFile$byChipType("HuGene-1_0-st-v1"); > cs <- AffymetrixCelSet$byName("mine", cdf=cdf); > >>> >>> # RMA >>> bc <- RmaBackgroundCorrection(cs); >>> csBC <- process(bc,verbose=verbose); >>> >>> # QuantileNormalization >>> qn <- QuantileNormalization(csBC, typesToUpdate="pm"); >>> csN <- process(qn); > > Note, the argument 'typesToUpdate' says that it is only PM probes that > will be updated. The default is that MMs are left "as is". > >>> >>> # Plots >>> image_file <- ("aroma.affymetrix.RMA.png"); >>> png(image_file,width=1028,height=768); >>> plotDensity(csBC); > > Here you are plotting all probes on the array. ?Since > RmaBackgroundCorrection is only correcting PM probes, you probably > want to do: > > plotDensity(csBC, types="pm"); > >>> title("aroma.affymetrix RMA data"); >>> dev.off(); >>> >>> image_file <- ("aroma.affymetrix.QN.png"); >>> png(image_file,width=1028,height=768); >>> plotDensity(csN); > > plotDensity(csN, types="pm"); > > This is the key to why you get different density plots. ?For a > thorough explaination of the various QN approaches, see Page > 'Empirical probe-signal densities and rank-based quantile > normalization': > > ?http://aroma-project.org/node/141 > >>> title("aroma.affymetrix QN data"); >>> dev.off() > > What you haven't compared yet, because you misunderstood the RMA > pipeline, are the summarized probe signals from fitting the > log-additive RMA model. > > FYI, it is part of our (24 hours) redundancy testing to assert that > the aroma.affymetrix RMA pipeline can reproduce the RMA pipeline and > RMA summary estimates of the affyPLM package. ?You can see how well > this is done on Page 'Replication: RMA (background, normalization & > summarization)': > > ?http://www.aroma-project.org/replication/RMA > > Hope this helps. > > Henrik > >>> ################################ >>> >>> ################################ >>> # oligo >>> library(oligo); >>> rawdata=read.celfiles(c("rawData/mine/HuGene- 1_0-st-v1/sample1.CEL","rawData/mine/HuGene-1_0-st-v1/sample2.CEL")); >>> rmadata=rma(rawdata); >>> qndata=normalize.quantiles(exprs(rmadata)) >>> >>> library(affy) >>> # Plots >>> image_file <- ("oligo.RMA.png"); >>> png(image_file,width=1028,height=768); >>> plotDensity(exprs(rmadata)); >>> title("oligo RMA data"); >>> dev.off(); >>> >>> image_file <- ("oligo.QN.png"); >>> png(image_file,width=1028,height=768); >>> plotDensity(qndata); >>> title("oligo QN data"); >>> dev.off() >>> ############################### >>> >>> Kind regards, >>> >>> Mikhail Pachkov >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at stat.math.ethz.ch >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >> >
ADD REPLY
0
Entering edit mode
On Fri, Feb 26, 2010 at 2:29 PM, Benilton Carvalho <beniltoncarvalho at="" gmail.com=""> wrote: > regarding oligo's implementation of RMA, it's the same as the one in affy. Just in case there are future bug fixes etc, is it based on the same code base or are you utilizing/calling the code base in the affy package? If, say, oligo changes/fixes something in the RMA code, is there a risk that the code then will be different from affy? /Henrik > > b > > On Fri, Feb 26, 2010 at 12:47 PM, Henrik Bengtsson <hb at="" stat.berkeley.edu=""> wrote: >> Hi, >> >> you concerns about reproducibility are very important. ?Luckily your >> observations are based on mistakes as explained below. >> >> On Fri, Feb 26, 2010 at 12:05 PM, Benilton Carvalho >> <beniltoncarvalho at="" gmail.com=""> wrote: >>> Quantile normalization is already one step in the RMA workflow. >>> Therefore, there's no need to normalize the data again once you've >>> gone RMA, ie. (regarding oligo) your call >>> "normalize.quantiles(exprs(rmadata))" should be dropped. >>> >>> Using the defaults, rma() in oligo will: >>> >>> 1) Background correct (via the RMA convolution model) >>> 2) Quantile normalize >>> 3) Summarize via median-polish. >> >> Yes, as Benilton points out it seems like you've misunderstood what >> RMA does. ?The author of RMA (Ben Bolstad) defines the term RMA to >> mean the complete preprocessing suite including summarization. >> >>> >>> b >>> >>> On Fri, Feb 26, 2010 at 10:46 AM, Mikhail Pachkov <pachkov at="" gmail.com=""> wrote: >>>> Dear All, >>>> >>>> I am new in microarray analysis and need your expertise. >>>> I need to develop procedure for producing expression values from CEL >>>> files. Data should processed with RMA and quantile normalized. I have >>>> tried two packages - oligo and aroma.affymetrix. Obtained results are >>>> quite different form my point of view. Moreover >>>> aroma.affymetrix::QuantileNormalization function produce dta which do >>>> not look like they were quantile normalized. >> >> What is 'dta'? >> >>>> ?I have made density plots of data after RMA and after quantile >>>> normalization which are available here >>>> http://test.swissregulon.unibas.ch/bioc/index.html There are also >>>> links to two CEL files I have used. >>>> >>>> I have a few questions: >> >> Below, I will take that you mean "RMA background correct" when you say "RMA". >> >>>> Why RMA results are so different? >> >> The RMA-style background correction in aroma.affymetrix utilizes >> affy::bg.adjust() [and normalizes PM probes only]. ?I'm not sure what >> algorithm/implementation oligo is using for this step, but they should >> give very similar corrected probe signals. >> >>>> Which RMA implementation is correct? >> >> So, aroma.affymetrix is basically just a wrapper for >> affy::bg.adjust(), which I think was the original implementation of >> RMA background correction. ?I let Benilton comment on the oligo >> implementation and it's origin. >> >>>> Why does quantile normalization in aroma.affymetrics produce two >>>> different distributions? >> >> Because you first run quantile normalization on PMs only, then you >> look at the density plot for all (PMs & MMs). ?More below. >> >>>> >>>> Thank you in advance! >>>> >>>> Here are R scripts I have used: >>>> >>>> ################################ >>>> #aroma.affymetrix >>>> library(aroma.affymetrix); >>>> verbose <- Arguments$getVerbose(-8, timestamp=TRUE); >>>> >>>> # read files >>>> cdf <- AffymetrixCdfFile('annotationData/chipTypes/HuGene- 1_0-st-v1/HuGene-1_0-st-v1.cdf'); >>>> cs <- AffymetrixCelSet$byPath("rawData/mine/HuGene-1_0-st-v1/"); >> >> Have to bring it up: Please, do not setup your annotation data and and >> data sets like this. An aroma.affymetrix script should not contain >> paths/pathnames, cf. "Dos and Don'ts": >> >> ?http://aroma-project.org/node/102 >> >> The correct way to do the above is: >> >> cs <- AffymetrixCelSet$byName("mine", chipType="HuGene-1_0-st-v1"); >> >> alternatively, if you wish to be explicit in what CDF is used, you can do: >> >> cdf <- AffymetrixCdfFile$byChipType("HuGene-1_0-st-v1"); >> cs <- AffymetrixCelSet$byName("mine", cdf=cdf); >> >>>> >>>> # RMA >>>> bc <- RmaBackgroundCorrection(cs); >>>> csBC <- process(bc,verbose=verbose); >>>> >>>> # QuantileNormalization >>>> qn <- QuantileNormalization(csBC, typesToUpdate="pm"); >>>> csN <- process(qn); >> >> Note, the argument 'typesToUpdate' says that it is only PM probes that >> will be updated. The default is that MMs are left "as is". >> >>>> >>>> # Plots >>>> image_file <- ("aroma.affymetrix.RMA.png"); >>>> png(image_file,width=1028,height=768); >>>> plotDensity(csBC); >> >> Here you are plotting all probes on the array. ?Since >> RmaBackgroundCorrection is only correcting PM probes, you probably >> want to do: >> >> plotDensity(csBC, types="pm"); >> >>>> title("aroma.affymetrix RMA data"); >>>> dev.off(); >>>> >>>> image_file <- ("aroma.affymetrix.QN.png"); >>>> png(image_file,width=1028,height=768); >>>> plotDensity(csN); >> >> plotDensity(csN, types="pm"); >> >> This is the key to why you get different density plots. ?For a >> thorough explaination of the various QN approaches, see Page >> 'Empirical probe-signal densities and rank-based quantile >> normalization': >> >> ?http://aroma-project.org/node/141 >> >>>> title("aroma.affymetrix QN data"); >>>> dev.off() >> >> What you haven't compared yet, because you misunderstood the RMA >> pipeline, are the summarized probe signals from fitting the >> log-additive RMA model. >> >> FYI, it is part of our (24 hours) redundancy testing to assert that >> the aroma.affymetrix RMA pipeline can reproduce the RMA pipeline and >> RMA summary estimates of the affyPLM package. ?You can see how well >> this is done on Page 'Replication: RMA (background, normalization & >> summarization)': >> >> ?http://www.aroma-project.org/replication/RMA >> >> Hope this helps. >> >> Henrik >> >>>> ################################ >>>> >>>> ################################ >>>> # oligo >>>> library(oligo); >>>> rawdata=read.celfiles(c("rawData/mine/HuGene- 1_0-st-v1/sample1.CEL","rawData/mine/HuGene-1_0-st-v1/sample2.CEL")); >>>> rmadata=rma(rawdata); >>>> qndata=normalize.quantiles(exprs(rmadata)) >>>> >>>> library(affy) >>>> # Plots >>>> image_file <- ("oligo.RMA.png"); >>>> png(image_file,width=1028,height=768); >>>> plotDensity(exprs(rmadata)); >>>> title("oligo RMA data"); >>>> dev.off(); >>>> >>>> image_file <- ("oligo.QN.png"); >>>> png(image_file,width=1028,height=768); >>>> plotDensity(qndata); >>>> title("oligo QN data"); >>>> dev.off() >>>> ############################### >>>> >>>> Kind regards, >>>> >>>> Mikhail Pachkov >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at stat.math.ethz.ch >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >>>> >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at stat.math.ethz.ch >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >> >
ADD REPLY
0
Entering edit mode
Dear Henrik and Benilton, Thank you for explanations. I have tried plotDensity with types="pm" and results now look consistent. I have also found that oligo::rma perform summarization to the *probeset* level not probe level. I guess that is why my plots looked so different. I have used rma.background.correct from preprocessCore and now data look close to each other. I would like to ask Henrik, how can I get probe id list within aroma.affymetrix in order to print table of probe id and corresponding expression values? Kind regards, Mikhail
ADD REPLY
0
Entering edit mode
Dear Mikhail, when you read the CEL files, you have the information at the probe level. So, when you summarize, you will summarize to a 'less granular' level (in this case probeset). Maybe what you're trying to achieve is the comparison of the background correction (or possibly normalization) method *before* summarization? b On Fri, Feb 26, 2010 at 4:38 PM, Mikhail Pachkov <pachkov at="" gmail.com=""> wrote: > Dear Henrik and Benilton, > > Thank you for explanations. I have tried plotDensity with types="pm" > and results now look consistent. > I have also found that oligo::rma perform summarization to the > *probeset* level not probe level. I guess that is why my plots looked > so different. I have used rma.background.correct from preprocessCore > and now data look close to each other. > > I would like to ask Henrik, how can I get probe id list within > aroma.affymetrix in order to print table of probe id and corresponding > expression values? > > Kind regards, > > Mikhail > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY
0
Entering edit mode
On Fri, Feb 26, 2010 at 5:48 PM, Benilton Carvalho <beniltoncarvalho at="" gmail.com=""> wrote: > Dear Mikhail, > > when you read the CEL files, you have the information at the probe > level. So, when you summarize, you will summarize to a 'less granular' > level (in this case probeset). > > Maybe what you're trying to achieve is the comparison of the > background correction (or possibly normalization) method *before* > summarization? You are right we need expression data on *probe-level* without summarization. Comparison is not my goal. I am just trying to learn how to use these packages and choose one for our project. The discrepancy between results was very confusing so I asked for help. Obviously the reason was my ignorance. In our project we need to read cel files and get expression values (bg corrected and normalized) on *probe-level* for further analysis. Best regards, Mikhail
ADD REPLY
0
Entering edit mode
If you're using the latest oligo, you can use the (not exposed - still working on further details) function getFidProbeset() : probeInfo = oligo:::getFidProbeset(rawdata) idx = probeInfo[["fid"]] ## probeset names are in probeInfo[["fsetid"]] intensities = exprs(rawdata)[idx,] and work with 'intensities' (which includes the PMs and controls). If you rather use only PMs: pms = pm(rawdata) pns = probeNames(rawdata) these are now regular matrices and you can use rma.background.correct() and normalize.quantiles(). best, b On Fri, Feb 26, 2010 at 5:13 PM, Mikhail Pachkov <pachkov at="" gmail.com=""> wrote: > On Fri, Feb 26, 2010 at 5:48 PM, Benilton Carvalho > <beniltoncarvalho at="" gmail.com=""> wrote: >> Dear Mikhail, >> >> when you read the CEL files, you have the information at the probe >> level. So, when you summarize, you will summarize to a 'less granular' >> level (in this case probeset). >> >> Maybe what you're trying to achieve is the comparison of the >> background correction (or possibly normalization) method *before* >> summarization? > > You are right we need expression data on *probe-level* without summarization. > > Comparison is not my goal. I am just trying to learn how to use these > packages and choose one for our project. The discrepancy between > results was very confusing so I asked for help. Obviously the reason > was my ignorance. > > In our project we need to read cel files and get expression values (bg > corrected and normalized) on *probe-level* for further analysis. > > Best regards, > > Mikhail >
ADD REPLY
0
Entering edit mode
On Fri, Feb 26, 2010 at 6:13 PM, Mikhail Pachkov <pachkov at="" gmail.com=""> wrote: > On Fri, Feb 26, 2010 at 5:48 PM, Benilton Carvalho > <beniltoncarvalho at="" gmail.com=""> wrote: >> Dear Mikhail, >> >> when you read the CEL files, you have the information at the probe >> level. So, when you summarize, you will summarize to a 'less granular' >> level (in this case probeset). >> >> Maybe what you're trying to achieve is the comparison of the >> background correction (or possibly normalization) method *before* >> summarization? > > You are right we need expression data on *probe-level* without summarization. > > Comparison is not my goal. I am just trying to learn how to use these > packages and choose one for our project. The discrepancy between > results was very confusing so I asked for help. Obviously the reason > was my ignorance. > > In our project we need to read cel files and get expression values (bg > corrected and normalized) on *probe-level* for further analysis. FYI, if you do: library("aroma.affymetrix"); cs <- AffymetrixCelSet$byName("mine", chipType="HuGene-1_0-st-v1"); bc <- RmaBackgroundCorrection(cs); csBC <- process(bc,verbose=verbose); qn <- QuantileNormalization(csBC, typesToUpdate="pm"); csN <- process(qn); then ou will end up with a new set of CEL files that are background corrected and quantile normalized according to the RMA pipeline. These files are located in getPath(csN). These CEL files are regular CEL files meaning you can use them in whatever software understanding CEL files. /Henrik > > Best regards, > > Mikhail > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY
0
Entering edit mode
Hi. On Fri, Feb 26, 2010 at 5:38 PM, Mikhail Pachkov <pachkov at="" gmail.com=""> wrote: > Dear Henrik and Benilton, > > Thank you for explanations. I have tried plotDensity with types="pm" > and results now look consistent. Great. There is almost always a simple explanation to everything. > I have also found that oligo::rma perform summarization to the > *probeset* level not probe level. I guess that is why my plots looked > so different. I have used rma.background.correct from preprocessCore > and now data look close to each other. Good. > > I would like to ask Henrik, how can I get probe id list within > aroma.affymetrix in order to print table of probe id and corresponding > expression values? This is really a question for the aroma.affymetrix mailing list, http://www.aroma-project.org/forum and it has been asked and answered there before. I make an exception and answer it here. First, you probably mean "probset id" not "probe id" (big difference and there is no such thing as "probe id"). Second, in the aroma.affymetrix world we use the term "unit names" instead of "probeset id" (there is good reason for this, but I won't explain it here). You are probably looking for what extractDataFrame() provides, cf. http://www.aroma-project.org/howtos/extractDataFrame Make sure to look through http://www.aroma-project.org/ - that's the number one source for all documentation of aroma.affymetrix and friends. Hope this helps Henrik /Henrik > > Kind regards, > > Mikhail > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY

Login before adding your answer.

Traffic: 729 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6