Unable to 'standardise' log-transformed dataset of contrasts using Mfuzz package
Guest User ★ 13k
@guest-user-4897
Last seen 10.2 years ago
Dear Maintainer,

I'm analyzing metabolomic LC-MS intensity values. To reduce these large numbers, I log-transformed the dataset. To replace any Inf/-Inf/NA with a zero:

> logtransfo[!is.finite(logtransfo)] <- 0

and checked for NAs:

> is.na(logtransfo)
            CA.2wk11.H11 CA.4wk11.CA.2wk11 CA.8wk11.CA.4wk11 CA.12wk11.CA.8wk11
123.1166295        FALSE             FALSE             FALSE              FALSE
109.1012434        FALSE             FALSE             FALSE              FALSE

All cells printed FALSE. I made contrasts using the limma package.

> head(wCA12m)
            CA.2wk11.H11   CA.4wk11.CA.2wk11 CA.8wk11.CA.4wk11 CA.12wk11.CA.8wk11
123.1166295 " 0.018961357" "-0.091637119"    " 3.268257162"    "-1.025643391"
109.1012434 " 0.146168274" "-0.055655014"    " 3.172041095"    "-0.969301615"

Made an expression set:

> wCA12me
ExpressionSet (storageMode: lockedEnvironment)
assayData: 124 features, 4 samples
  element names: exprs
protocolData: none
phenoData: none
featureData: none
experimentData: use 'experimentData(object)'
Annotation:

> is.na(exprs(wCA12me))
            CA.2wk11.H11 CA.4wk11.CA.2wk11 CA.8wk11.CA.4wk11 CA.12wk11.CA.8wk11
123.1166295        FALSE             FALSE             FALSE              FALSE
109.1012434        FALSE             FALSE             FALSE              FALSE

I loaded library(Mfuzz) and went through the steps as indicated in the manual.

> wCA12me.r=filter.NA(wCA12me)
0 genes excluded.
> wCA12me.f=fill.NA(wCA12me.r, mode="knn")

After failing to standardise, I also tried the other mode options. I could get a nice plot with "knn" and "knnw", but using "mean" and "median" gave an error for fill.NA.

> tmp=filter.std(wCA12me, min.std=0)
0 genes excluded.

I also changed the min.std value.

> tmp=filter.std(wCA12me, min.std=2)
67 genes excluded.

Whichever mode or min.std value I use, I always get the same error message when calling 'standardise':

> wCA12me.s=standardise(wCA12me.f)
Error in data[i, ] - mean(data[i, ], na.rm = TRUE) :
  non-numeric argument to binary operator
In addition: Warning message:
In mean.default(data[i, ], na.rm = TRUE) :
  argument is not numeric or logical: returning NA

I checked my file several times and found no data points containing NA. I think I understand what the error is saying, but I didn't expect negative values to affect the clustering algorithm. I was able to run the package workflow to completion with non-transformed values; however, the transformed values give slightly different results, and I wanted to compare the non-transformed and log-transformed datasets.

This being LC-MS metabolomic data, could I use a different function to transform the data to not get negative values?

Thanks for your attention.
Regards,
Franklin

-- output of sessionInfo():

R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] tcltk     parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] limma_3.16.5         Mfuzz_2.18.0         DynDoc_1.38.0
[4] widgetTools_1.38.0   e1071_1.6-1          class_7.3-7
[7] Biobase_2.20.0       BiocGenerics_0.6.0   BiocInstaller_1.10.2

loaded via a namespace (and not attached):
[1] tkWidgets_1.38.0 tools_3.0.1

--
Sent via the guest posting facility at bioconductor.org.
Clustering limma • 1.1k views
@matthias-futschik-6015
Last seen 4.9 years ago
University of Algarve
Dear Franklin,

it looks like your numbers are in fact characters

> head(wCA12m)
            CA.2wk11.H11   CA.4wk11.CA.2wk11 CA.8wk11.CA.4wk11 CA.12wk11.CA.8wk11
123.1166295 " 0.018961357" "-0.091637119"    " 3.268257162"    "-1.025643391"
109.1012434 " 0.146168274" "-0.055655014"    " 3.172041095"    "-0.969301615"

since quotation marks are appearing. Once you convert them into real numbers, it should be ok.

hth,
Matthias.

On 26-06-2013 22:17, FRANKLIN JOHNSON [guest] wrote:
> [...]

--
************************************************************
Dr. Matthias E. Futschik
Principal Investigator in Systems Biology and Bioinformatics
Centre for Molecular and Structural Biomedicine
University of Algarve, Campus of Gambelas
8005-139 Faro, Portugal
url: www.sysbiolab.eu
email: mfutschik at ualg.pt
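
For readers hitting the same error, here is a minimal base-R sketch of the conversion suggested above. The matrix is a hypothetical stand-in built from the two rows printed by head(wCA12m); how the real object ended up as character is not shown in the thread, so treat this as an illustration under that assumption rather than the exact fix for the original data. It also shows why the is.na() check did not catch the problem: a character string is never NA, so every cell prints FALSE.

```r
# Hypothetical stand-in for the character matrix shown by head(wCA12m);
# the values are copied from the post, everything else is assumed.
wCA12m <- matrix(
  c(" 0.018961357", " 0.146168274",
    "-0.091637119", "-0.055655014",
    " 3.268257162", " 3.172041095",
    "-1.025643391", "-0.969301615"),
  nrow = 2,
  dimnames = list(
    c("123.1166295", "109.1012434"),
    c("CA.2wk11.H11", "CA.4wk11.CA.2wk11",
      "CA.8wk11.CA.4wk11", "CA.12wk11.CA.8wk11")
  )
)

# is.na() cannot reveal the problem: strings are not NA.
any(is.na(wCA12m))     # FALSE
is.numeric(wCA12m)     # FALSE
storage.mode(wCA12m)   # "character"

# Convert on a copy; storage.mode<- keeps dimensions and dimnames,
# which a plain as.numeric() call would drop.
wCA12m.num <- wCA12m
storage.mode(wCA12m.num) <- "double"

is.numeric(wCA12m.num) # TRUE

# Strings that cannot be parsed as numbers would become NA (with a warning),
# so it is worth re-checking after the conversion.
any(is.na(wCA12m.num))

# Row-wise centring, the operation that failed inside standardise(),
# now works on the numeric matrix.
centred <- wCA12m.num - rowMeans(wCA12m.num)
```

The ExpressionSet would then presumably need to be rebuilt from the numeric matrix (for example with Biobase's ExpressionSet(assayData = wCA12m.num)) before calling filter.NA, fill.NA and standardise again.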