An error in the normalize.quantiles {preprocessCore} function?
697820169 ▴ 20
Hi all,

I am using normalize.quantiles from the preprocessCore package on my data. When I averaged the expression values of each chip to visualize the result of quantile normalization, I found that one chip seems to have a different average expression value from the others. I have uploaded the image to imageshack: http://img685.imageshack.us/img685/8320/mean.gif
The average expression value of case no. 184 is clearly away from the other cases.

After checking the normalized data, I found two cells that apparently should be 2.287524785 and 2.287870326 but are both replaced with 2.28769392. I am not sure what is causing the problem. I have run the normalization on two different computers, one with R 2.9.1 and preprocessCore 1.6 on an x64 system, and the other with R 2.10.1 and preprocessCore 1.8.0 on an x86 system. The results are identical.

My code is as follows:

    library(preprocessCore)
    alldata.q=as.matrix(alldata)
    alldata.q=normalize.quantiles(alldata.q)
    alldata.q=data.frame(alldata.q)
    row.names(alldata.q)=row.names(alldata)
    names(alldata.q)=names(alldata)
    plot(mean(alldata.q))

To identify which chip is different, I used one more line:

    mean(alldata.q)==mean(alldata.q)[1]

The results are all TRUE, except for one FALSE for case no. 184. I am not sure whether there is an error in my code or in the function itself.

To make the problem reproducible, I have uploaded the data elsewhere, since I doubt it is possible to attach a file as big as 31 MB. Please find the file at the following URL:
http://webhd.ndmctsgh.edu.tw/invite/tw/webhd/bNzQ5OS85ODkxNi8xMjcxNDExNjEw

If any further information is needed to clarify the problem, please let me know.

Kindest regards,
Tseng, Chih-hao
Ben Bolstad ★ 1.2k
Without actually looking at your data, a reasonable explanation for what you have observed is the handling of ties. The algorithm ensures that values that are equal on input in a given column are also equal on output. As a consequence, a column containing tied values can end up with a slightly different mean from the other columns.

With rounded input, which creates many non-unique values, the column means differ slightly:

    > set.seed(1)
    > X <- rnorm(100000)
    > X <- round(X,3)   ### This creates a bunch of non-unique values.
    > X <- matrix(X,ncol=10)
    > library(preprocessCore)
    > X.norm <- normalize.quantiles(X)
    > colMeans(X.norm)
     [1] -0.00224544 -0.00224477 -0.00224519 -0.00224287 -0.00224322 -0.00224448
     [7] -0.00224846 -0.00224612 -0.00224265 -0.00224595

Without rounding, every value is unique and the column means agree:

    > set.seed(1)
    > X <- rnorm(100000)
    > X <- matrix(X,ncol=10)   ## no rounding here so every value is unique
    > X.norm <- normalize.quantiles(X)
    > colMeans(X.norm)
     [1] -0.002244083 -0.002244083 -0.002244083 -0.002244083 -0.002244083
     [6] -0.002244083 -0.002244083 -0.002244083 -0.002244083 -0.002244083
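To make the tie effect concrete, here is a minimal base-R sketch of quantile normalization with tie handling. The function name `quantile_normalize_sketch` and the toy matrix are hypothetical, and the interpolation at averaged ranks is an assumption chosen for illustration; this is not the preprocessCore implementation, only a small model of the behaviour described above.

```r
# Hypothetical helper: a naive quantile normalization in base R.
# NOT the preprocessCore code; tie handling here is an illustrative assumption.
quantile_normalize_sketch <- function(x) {
  # Target distribution: rank-by-rank mean of the sorted columns.
  target <- rowMeans(apply(x, 2, sort))
  apply(x, 2, function(col) {
    # Tied inputs share one averaged rank.
    r <- rank(col, ties.method = "average")
    # Look up (interpolating if needed) the target at that rank,
    # so values tied on input stay tied on output.
    approx(seq_along(target), target, xout = r)$y
  })
}

# Toy data: column "a" has a three-way tie, columns "b" and "c" have none.
x <- cbind(a = c(2, 2, 2, 9),
           b = c(1, 4, 3, 8),
           c = c(7, 6, 0, 3))

norm <- quantile_normalize_sketch(x)
print(norm)
print(colMeans(norm))
```

In this toy case the three tied entries of column `a` all receive the same normalized value, so that column no longer contains the full target distribution and its mean comes out different from the means of `b` and `c`, which match each other.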
@benilton-carvalho-1375
Brazil/Campinas/UNICAMP
You're looking at differences of the order of 1e-10. Look at boxplot(alldata.q), and if you must compare the means, compare them within a tolerance rather than with ==. Note that all.equal returns a character string instead of FALSE when the values differ, so wrap it in isTRUE:

    means <- colMeans(alldata.q)
    names(means) <- NULL
    sapply(means, function(x) isTRUE(all.equal(x, means[1])))

b

(In reply to 697820169's message of Fri, Apr 16, 2010, 3:15 PM.)
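As a small illustration of why an exact comparison flags such a difference while a tolerance-based one does not, the sketch below uses made-up chip means (the names and values are hypothetical, not taken from the posted data). It relies only on base R, where `all.equal` compares numbers with a default tolerance of about 1.5e-8.

```r
# Hypothetical column means: chip2 differs from the others by 1e-10.
means <- c(chip1 = 7.25, chip2 = 7.25 + 1e-10, chip3 = 7.25)

# Exact floating-point comparison: chip2 is reported as different.
exact <- means == means[1]

# Tolerance-based comparison: all.equal() returns TRUE or a description
# of the difference, so isTRUE() turns it into a plain logical.
close <- vapply(unname(means),
                function(m) isTRUE(all.equal(m, means[[1]])),
                logical(1))

print(exact)
print(close)
```

Here `exact` marks the second chip as different, while `close` treats all three means as equal within tolerance.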