Limma normalization error and loess.R segmentation type fault (windows)
@gordon-smyth
Last seen 3 minutes ago
WEHI, Melbourne, Australia
Hi Marcus,

I haven't seen this problem myself. I've just tried running the Weaver case study in the limma User's Guide, and it still runs correctly for me, using either R 2.15.1 or R-devel on Windows.

There weren't any changes to that part of the limma code between R 2.14.1 and R 2.15.1, so the change you are seeing may be in the stats package.

Best wishes
Gordon

---------------- original message ---------------
[BioC] Limma normalization error and loess.R segmentation type fault (windows)
Marcus Davy mdavy86 at gmail.com
Tue Sep 4 07:11:43 CEST 2012

Is anyone having recent problems after upgrading to R-2.15.1 on Windows with limma or other Bioconductor packages that use loess functions in the stats core package?

The normalization function normalizeWithinArrays(..., method="printtiploess") has been crashing on two independent Windows machines since I upgraded R from 2.14.1 to 2.15.1 using the stable release packages on Bioconductor. Rgui.exe dies with a segmentation-type fault when running this code snippet:

system.time(MA <- normalizeWithinArrays(RG, method="printtiploess", bc.method="none"))

Process R exited abnormally with code 148 at Tue Sep 04 14:42:30 2012

I can get a more meaningful error if I install the development release of limma:

useDevel()
biocLite("limma")

system.time(MA <- normalizeWithinArrays(RG, method="printtiploess", bc.method="none"))

Error in stats:::simpleLoess(y = yobs, x = xobs, weights = wobs, span = span, :
  NA/NaN/Inf in foreign function call (arg 1)
Timing stopped at: 12.34 2.11 14.5

The numbers of NAs and the weights in the offending array are:

> sum(is.na(tmp$M))
[1] 717
> sum(is.na(tmp$A))
[1] 717
> table(tmp$weights)
    0     1
 8362 10070

The error appears to arise in R core's stats:::simpleLoess (not in limma) when interfacing with C code, and I can see there have been recent commits to the stats/loess.R file in the R core subversion repository. In the stable release, limma calls normalizeWithinArrays() -> loessFit() -> .vsimpleLoess(), which appears to have been changed to normalizeWithinArrays() -> loessFit() -> stats:::simpleLoess in the development release of limma. That explains the error I get with the development release.

A colleague independently tested normalizing the one offending microarray slide I identified and got the same segmentation-type fault on his machine. Sorry I have not provided a reproducible example, but I can provide an RSave file to load just that array of data, plus the code snippets to reproduce the error, by email. This bug appears to be data dependent, Windows dependent (my example code works fine without crashing on Linux using R version 2.15.0 and limma 3.12.0) and probably infrequent to trigger.

cheers,
Marcus

## Stable release
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_New Zealand.1252 LC_CTYPE=English_New Zealand.1252
[3] LC_MONETARY=English_New Zealand.1252 LC_NUMERIC=C
[5] LC_TIME=English_New Zealand.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Biobase_2.16.0 BiocGenerics_0.2.0 limma_3.12.1

## development release
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_New Zealand.1252 LC_CTYPE=English_New Zealand.1252
[3] LC_MONETARY=English_New Zealand.1252 LC_NUMERIC=C
[5] LC_TIME=English_New Zealand.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] limma_3.13.17

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}
Normalization limma • 2.1k views
@gordon-smyth
Last seen 3 minutes ago
WEHI, Melbourne, Australia
Hi Marcus,

I guess the fundamental question then is, what should loess normalization return (for an array or print-tip group or whatever) when all the weights are zero?

Strict mathematics would suggest that the solution should be NA.

Practical considerations suggest to me that normalizeWithinArrays() might do better to perform ordinary unweighted loess normalization in this case, because no probe is weighted as more reliable than any other.

Or alternatively, one might say that the loess curve can't be estimated, so the raw expression values should be returned without adjustment. So loess normalization with zero weights is equivalent to no normalization. That is what loessFit() and normalizeWithinArrays() have been doing up to R 2.14.1.

If you make the weights all zero for a print tip, do you want normalizeWithinArrays() to return NAs for all probes in that print-tip group on that array? Or do ordinary unweighted normalization? Or do no normalization?

Regards
Gordon

On Wed, 5 Sep 2012, Marcus Davy wrote:
> It looks like this error is related to a particular print tip having all
> weights=0 as input into stats:::simpleLoess. I am trying to construct a
> simple reproducible example.
>
> Marcus
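For readers following the thread, the three candidate behaviours for an all-zero-weight group can be sketched in a few lines of base R. This is only an illustration under stated assumptions: normalize_group() and its when_all_zero argument are hypothetical names, not limma functions, and the curve is fitted with stats::loess() rather than with limma's loessFit().

```r
# Illustrative only: normalize_group() and when_all_zero are hypothetical
# names, not part of limma. The curve is fitted with stats::loess().
normalize_group <- function(M, A, w,
                            when_all_zero = c("none", "unweighted", "na"),
                            span = 0.3) {
  when_all_zero <- match.arg(when_all_zero)
  if (all(w == 0)) {
    # Option 1: the curve cannot be estimated, so return NA for every probe.
    if (when_all_zero == "na") return(rep(NA_real_, length(M)))
    # Option 2: return the raw log-ratios without adjustment.
    if (when_all_zero == "none") return(M)
    # Option 3: fall back to ordinary unweighted loess.
    w <- rep(1, length(M))
  }
  d <- data.frame(M = M, A = A, w = w)
  fit <- loess(M ~ A, data = d, weights = w, span = span, degree = 1)
  # Subtract the fitted intensity-dependent trend from the log-ratios.
  M - predict(fit, newdata = data.frame(A = A))
}

# Simulated print-tip group (made-up data) with every spot flagged as bad.
set.seed(1)
A <- runif(50, 6, 14)
M <- 0.1 * (A - 10) + rnorm(50, sd = 0.2)
w0 <- rep(0, 50)

out_none <- normalize_group(M, A, w0, "none")
out_unw  <- normalize_group(M, A, w0, "unweighted")
out_na   <- normalize_group(M, A, w0, "na")
```

The conditional switch makes the choice explicit at the call site, which is one way to keep the historical "no normalization" behaviour available as a default.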
Hi Gordon,

some good points you make here.

I have checked back to the original slide, and there was a background smear over one corner covering an entire print-tip region. A researcher would have subjectively flagged all those spots as bad, which is why they were all allocated zero weight. For global loess normalization, such as with Agilent arrays, a researcher would potentially also remove the entire biological replicate from the analysis.

I am leaning towards the view that the solution should be NA for the situation above. Maybe a conditional switch could be used to cover the other options (ordinary unweighted normalization, no normalization, etc.) for historical/backward compatibility reasons.

Marcus
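A quick way to spot this situation before normalizing is to check each print-tip group for weights that are all zero. The sketch below uses made-up data; the group vector is a hypothetical stand-in for whatever print-tip grouping the array layout defines, and none of the names come from limma.

```r
# Made-up example: 4 print-tip groups of 5 spots each.
group <- rep(1:4, each = 5)
w <- rep(1, 20)
w[group == 3] <- 0   # an entire print-tip group flagged as bad
w[c(1, 7)] <- 0      # a few isolated bad spots elsewhere

# TRUE for every group in which no spot has a non-zero weight.
all_zero <- vapply(split(w, group), function(g) all(g == 0), logical(1))
bad_groups <- names(all_zero)[all_zero]
```

Groups listed in bad_groups are the ones for which a weighted loess curve cannot be estimated from the flagged data alone.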
Marcus Davy ▴ 390
@marcus-davy-5153
Last seen 6.6 years ago
Hi Gordon,

yes, I believe the crashes are due to changes in the file http://svn.r-project.org/R/branches/R-2-15-branch/src/library/stats/R/loess.R within the stats package for R-2.15.1. I will email you a simple example off list to see if you can reproduce the crash on Windows.

The change to loessFit() to use stats:::simpleLoess appears to be the reason for the second error message I got when using the development version of limma on Windows.

> packageDescription("limma")$Version
[1] "3.13.17"

> changeLog(n=7)
17 Aug 2012: limma 3.3.17

- limma license upgraded to GPL-2 instead of LGPL to match R itself.

- loessFit() no longer makes direct calls to foreign language functions in the stats package. Same values are returned as before, but now take 25-30% longer whenever weights are used.

svn diff -r 68536:68077

cheers,
Marcus
On Tue, Sep 4, 2012 at 5:38 PM, Marcus Davy <mdavy86 at gmail.com> wrote:
> Hi Gordon,
> yes, I believe the cause of the crashes are due to changes in the file
> http://svn.r-project.org/R/branches/R-2-15-branch/src/library/stats/R/loess.R
> within the stats package for R-2.15.1. I will email you a simple example
> off list to see if you can reproduce the crash on windows.

Marcus, a similar problem seems to occur with the charm package. Have you tracked down the approximate changes to R which cause it? (I think the svn revision numbers you quoted are for Bioconductor.) Specifically, does it go away in R-2.15.1-patched, and when was it introduced?

Thanks,
Kasper

_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
I know that it was introduced in R-2.15.1, and the same error also appears to exist in R-2.15.1-patched, which I tested this morning. The subversion revision numbers were in the limma change log.

I have a specific simple example to illustrate the *first* error, which is a core dump on Windows with R-2.15.1 and R-2.15.1-patched:

set.seed(42)
n <- 10
y <- rnorm(n)
x <- rnorm(n)
w <- rep(0, n)
sessionInfo()

lm.wfit(cbind(1, x), y, w)

Process R exited abnormally with code 148 at Wed Sep 05 12:23:58 2012

Try that and see if you get a core dump. If so, that makes this bug pretty critical on Windows, given that it affects weighted linear model fits. I am still checking out R-2.15.1 to identify what revision changes have taken place.

Marcus

> sessionInfo()
R version 2.15.1 Patched (2012-09-01 r60539)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_New Zealand.1252 LC_CTYPE=English_New Zealand.1252
[3] LC_MONETARY=English_New Zealand.1252 LC_NUMERIC=C
[5] LC_TIME=English_New Zealand.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
On 09/04/2012 05:25 PM, Marcus Davy wrote: > I know that it was introduced in R-2.15.1, and the same error also appears > to also exist on R-2.15.1-patched > which I have tested this morning. The subversion revision numbers were in > the limma change log. > > I have a specific simple example to illustrate the *first* error which is a > core dump on windows R-2.15.1, and R-2.15.1-patched > > set.seed(42) > n <- 10 > y <- rnorm(n) > x <- rnorm(n) > w <- rep(0, n) > sessionInfo() > > lm.wfit(cbind(1, x), y, w) under R -d gdb -f test.R there is > set.seed(42) > n <- 10 > y <- rnorm(n) > x <- rnorm(n) > w <- rep(0, n) > lm.wfit(cbind(1, x), y, w) Program received signal SIGFPE, Arithmetic exception. 0x00007ffff4d6d975 in Cdqrls (x=0xc47eb8, y=0xc47d68, tol=0xe29b68) at /home/mtmorgan/src/R-devel/src/library/stats/src/lm.c:51 51 ny = LENGTH(y)/n; /* n x ny, or a vector */ with > R.version.string [1] "R Under development (unstable) (2012-09-04 r60559)" so the reproducible example should be reported to R-devel or R-bugs https://bugs.r-project.org Martin > > Process R exited abnormally with code 148 at Wed Sep 05 12:23:58 2012 > > > Try that and see if you get a core dump, if so that makes this bug pretty > ciritical on windows given that it is for weighted linear model fits. I am > still checking out R-2.15.1 to identify what revision changes have taken > place. 
> > Marcus > >> sessionInfo() > R version 2.15.1 Patched (2012-09-01 r60539) > Platform: i386-w64-mingw32/i386 (32-bit) > > locale: > [1] LC_COLLATE=English_New Zealand.1252 LC_CTYPE=English_New > Zealand.1252 > [3] LC_MONETARY=English_New Zealand.1252 > LC_NUMERIC=C > [5] LC_TIME=English_New Zealand.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > > On Wed, Sep 5, 2012 at 9:44 AM, Kasper Daniel Hansen < > kasperdanielhansen at gmail.com> wrote: > >> On Tue, Sep 4, 2012 at 5:38 PM, Marcus Davy <mdavy86 at="" gmail.com=""> wrote: >>> Hi Gordon, >>> yes, I believe the cause of the crashes are due to changes in the file >>> >> http://svn.r-project.org/R/branches/R-2-15-branch/src/library/stats /R/loess.Rwithin >>> the stats package for R-2.15.1. I will email you a simple example >>> off list to see if you can reproduce the crash on windows. >> >> Markus, a similar problem seem to occur with the charm package. Have >> you tracked down the approximate changes to R which causes it (I think >> the svn revision numbers below are for bioconductor). Specifically, >> does it go away in R-2.15.1-patched and/or when was it introduced? >> >> Thanks, >> Kasper >> >>> >>> The change to loessFit to use stats:::simpleLoess appears to be the >> reason >>> for the second error message I got when using the development version of >>> limma on windows. >>> >>>> packageDescription("limma")$Version >>> [1] "3.13.17" >>> >>>> changeLog(n=7) >>> 17 Aug 2012: limma 3.3.17 >>> >>> - limma license upgraded to GPL-2 instead of LGPL to match R itself. >>> >>> - loessFit() no longer makes direct calls to foreign language >>> functions in the stats package. Same values are returned as before, >>> but now take 25-30% longer whenever weights are used. 
>>> >>> >>> svn diff -r 68536:68077 >>> >>> >>> cheers, >>> >>> Marcus >>> >>> >>> On Tue, Sep 4, 2012 at 10:09 PM, Gordon K Smyth <smyth at="" wehi.edu.au=""> >> wrote: >>> >>>> Hi Marcus, >>>> >>>> I haven't seen this problem myself. I've just tried running the Weaver >>>> case study in the limma User's Guide, and it still runs correctly for >> me, >>>> using either R 2.15.1 or R-devel on Windows. >>>> >>>> There weren't any changes to that part of the limma code between R >> 2.14.1 >>>> and R 2.15.1, so the change you are seeing may be in the stats package. >>>> >>>> Best wishes >>>> Gordon >>>> >>>> ---------------- original message --------------- >>>> [BioC] Limma normalization error and loess.R segmentation type fault >>>> (windows) >>>> Marcus Davy mdavy86 at gmail.com >>>> Tue Sep 4 07:11:43 CEST 2012 >>>> >>>> Is anyone having recent problems after upgrading to R-2.15.1 on windows >>>> with limma or other bioconductor packages that use loess functions in >> the >>>> stats core package.? >>>> >>>> The normalization function normalizeWithinArrays(..., >>>> method=printtiploess) is crashing on two independent windows machines >>>> since upgrading R from 2.14.1 to 2.15.1 using the stable release >> packages >>>> on bioconductor. 
> I have a segmentation type fault of Rgui.exe when running this code snippet:
>
>   system.time(MA <- normalizeWithinArrays(RG, method="printtiploess", bc.method="none"))
>
>   Process R exited abnormally with code 148 at Tue Sep 04 14:42:30 2012
>
> I can get a more meaningful error if I install the development release of limma:
>
>   useDevel()
>   biocLite("limma")
>
>   system.time(MA <- normalizeWithinArrays(RG, method="printtiploess", bc.method="none"))
>
>   Error in stats:::simpleLoess(y = yobs, x = xobs, weights = wobs, span = span, :
>     NA/NaN/Inf in foreign function call (arg 1)
>   Timing stopped at: 12.34 2.11 14.5
>
> The numbers of NAs and the weights in the offending array are:
>
>   > sum(is.na(tmp$M))
>   [1] 717
>   > sum(is.na(tmp$A))
>   [1] 717
>   > table(tmp$weights)
>       0     1
>    8362 10070
>
> The error appears to be caused in R core -> stats:::simpleLoess (not limma) when interfacing C code, and I can see there have been recent commits to the stats/loess.R file in the R core subversion repository. In the stable release, limma calls normalizeWithinArrays() -> loessFit() -> .vsimpleLoess(), which appears to have been changed to
>
>   normalizeWithinArrays() -> loessFit() -> stats:::simpleLoess
>
> in the development release of limma, which explains the error I get with the development release of limma.
>
> A colleague independently tested normalizing the one offending microarray slide I identified, and it caused the same segmentation type fault on his machine. Sorry I have not provided a reproducible example, but I can send an RSave file that loads just that array of data, together with the code snippets to reproduce the error, by email. This bug appears to be data dependent, Windows dependent (my example code works fine without crashing on Linux using R version 2.15.0 and limma 3.12.0) and probably infrequent.
>
> cheers,
> Marcus
>
> ## Stable release
>
>   > sessionInfo()
>   R version 2.15.1 (2012-06-22)
>   Platform: i386-pc-mingw32/i386 (32-bit)
>
>   locale:
>   [1] LC_COLLATE=English_New Zealand.1252  LC_CTYPE=English_New Zealand.1252
>   [3] LC_MONETARY=English_New Zealand.1252 LC_NUMERIC=C
>   [5] LC_TIME=English_New Zealand.1252
>
>   attached base packages:
>   [1] stats graphics grDevices utils datasets methods base
>
>   other attached packages:
>   [1] Biobase_2.16.0 BiocGenerics_0.2.0 limma_3.12.1
>
> ## Development release
>
>   > sessionInfo()
>   R version 2.15.1 (2012-06-22)
>   Platform: i386-pc-mingw32/i386 (32-bit)
>
>   locale:
>   [1] LC_COLLATE=English_New Zealand.1252  LC_CTYPE=English_New Zealand.1252
>   [3] LC_MONETARY=English_New Zealand.1252 LC_NUMERIC=C
>   [5] LC_TIME=English_New Zealand.1252
>
>   attached base packages:
>   [1] stats graphics grDevices utils datasets methods base
>
>   other attached packages:
>   [1] limma_3.13.17
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
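One way to check whether an array contains a print-tip group with no usable spots, which the later messages in this thread identify as the trigger, is to tabulate the weights by group. The following is a minimal sketch in base R on simulated data. The layout values and the helper `printtip_group()` are hypothetical, and the sketch assumes spots are stored grid by grid, which may not hold for every image-analysis export.

```r
# Hypothetical layout: 4 x 4 print-tip grids, each with 10 x 10 spots.
layout <- list(ngrid.r = 4, ngrid.c = 4, nspot.r = 10, nspot.c = 10)

# Assumption: spots are ordered grid by grid, so consecutive blocks of
# nspot.r * nspot.c rows belong to the same print-tip group.
printtip_group <- function(layout) {
  ngrids <- layout$ngrid.r * layout$ngrid.c
  nspots <- layout$nspot.r * layout$nspot.c
  rep(seq_len(ngrids), each = nspots)
}

group <- printtip_group(layout)

# Simulated spot weights for one array: everything usable except
# print-tip group 7, where every spot has been flagged (weight 0).
w <- rep(1, length(group))
w[group == 7] <- 0

# Number of spots with positive weight in each print-tip group.
n_usable <- tapply(w > 0, group, sum)

# Groups in which no spot has a positive weight.
empty_groups <- as.integer(names(n_usable)[n_usable == 0])
print(empty_groups)
```

With real data, `w` would be one column of the weights matrix and the layout would come from the object's printer information. Any group reported here has nothing to anchor a weighted loess curve.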
@gordon-smyth
WEHI, Melbourne, Australia
Hi Marcus,

I have committed a fix to limma in the Bioc devel repository to work around problems with the lm.wfit() and loess() functions when weights are zero. normalizeWithinArrays() now runs smoothly on the data example you sent me. Please try it out and see if it solves your problems.

My solution has been to disallow zero weights in loessFit() and instead to reset zero weights to a very small positive value, so as to avoid instabilities in defining the fit.

My thinking on weights is that normalizeWithinArrays() should not introduce NA values for zero-weight observations. Rather, the philosophy is that weights define each probe's influence on the normalizing curve, with the curve being extended to all observations regardless of their influence. We use this sort of strategy for upweighting control probes, as for example in:

http://genomebiology.com/2007/8/1/R2

Of course, the weights still enter into downstream functions such as lmFit(), and zero weights at that stage have the effect of removing observations entirely.

Best wishes
Gordon

> Date: Wed, 5 Sep 2012 16:16:40 +1200
> From: Marcus Davy <mdavy86 at gmail.com>
> To: Gordon K Smyth <smyth at wehi.edu.au>
> Cc: Bioconductor mailing list <bioconductor at r-project.org>
> Subject: Re: [BioC] Limma normalization error and loess.R segmentation type fault (windows)
>
> Hi Gordon,
> some good points you make here.
>
> I have checked back to the original slide and there was a background smear over one corner covering an entire print-tip region, so a researcher would have subjectively flagged all those spots as bad, which is why they were all allocated zero weight. For global loess normalization, such as with Agilent arrays, a researcher would also potentially remove the entire biological replicate from analysis.
>
> I am leaning towards the view that the solution should also be NA in the situation above. Maybe a conditional switch could cover the other options (ordinary unweighted normalization, or no normalization) for historical/backward compatibility reasons.
>
> Marcus
>
> On Wed, Sep 5, 2012 at 11:21 AM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>
>> Hi Marcus,
>>
>> I guess the fundamental question then is, what should loess normalization return (for an array or print-tip group or whatever) when all the weights are zero?
>>
>> Strict mathematics would suggest that the solution should be NA.
>>
>> Practical considerations suggest to me that normalizeWithinArrays() might be better performing ordinary unweighted loess normalization in this case, because no probe is weighted as more reliable than any other.
>>
>> Or alternatively, one might say that the loess curve can't be estimated, so the raw expression values should be returned without adjustment. So loess normalization with zero weights is equivalent to no normalization. That is what loessFit() and normalizeWithinArrays() have been doing up to R 2.14.1.
>>
>> If you make the weights all zero for a print tip, do you want normalizeWithinArrays() to return NAs for all probes in that print-tip group on that array? Or do ordinary unweighted normalization? Or do no normalization?
>>
>> Regards
>> Gordon
>>
>> On Wed, 5 Sep 2012, Marcus Davy wrote:
>>
>>> It looks like this error is related to a particular print tip having all weights=0 as input into stats:::simpleLoess. I am trying to construct a simple reproducible example.
>>>
>>> Marcus
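The idea of resetting zero weights to a very small positive value before fitting can be illustrated with base R alone. This is only a sketch on simulated data and is not limma's actual code: the helper `floor_weights()` and the threshold `1e-5` are assumptions chosen for illustration.

```r
# Hypothetical helper (not limma's code): raise zero or tiny weights
# to a small positive floor so that every observation keeps a defined,
# if negligible, influence on the fit.
floor_weights <- function(w, min.weight = 1e-5) {
  w[!is.na(w) & w < min.weight] <- min.weight
  w
}

set.seed(1)
n <- 200
A <- runif(n, 6, 14)                        # simulated average log-intensity
M <- 0.1 * (A - 10) + rnorm(n, sd = 0.2)    # simulated log-ratio with an intensity trend
w <- rep(1, n)
w[1:50] <- 0                                # spots flagged as bad

# Weighted loess of M on A; the curve is evaluated at every spot,
# including the heavily down-weighted ones.
fit <- loess(M ~ A, weights = floor_weights(w), span = 0.3, degree = 1)
M.norm <- M - fitted(fit)
```

Because the flagged spots keep a tiny weight, they barely affect the curve but still receive a normalized value instead of NA. The original weights can then be passed unchanged to downstream functions such as lmFit().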
Hi Gordon,

thanks for looking at this. I will test the devel version of limma on the dataset. I assume that normalizations from previous limma releases are more likely to be reproducible if loess normalization does not introduce NA values for zero-weight observations; as you've noted, observations can still be removed at lmFit().

cheers,
Marcus
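The three behaviours discussed in this thread for a group of spots whose weights are all zero (return NA, normalize without weights, or leave the values unadjusted) could be exposed through a switch, as suggested above. The sketch below is hypothetical and uses base R only: `normalize_group()` and its `all.zero` argument are invented for illustration and are not part of limma. It assumes the weights contain no NA values and does not treat groups with only some zero weights specially.

```r
# Hypothetical helper, not part of limma: loess-normalize one group of
# spots, with a switch for the case where every weight is zero.
# Assumes w contains no NA values.
normalize_group <- function(M, A, w,
                            all.zero = c("unweighted", "none", "na"),
                            span = 0.3) {
  all.zero <- match.arg(all.zero)
  if (all(w == 0)) {
    if (all.zero == "na") return(rep(NA_real_, length(M)))  # no defined answer
    if (all.zero == "none") return(M)                       # leave values unadjusted
    w <- rep(1, length(M))                                  # ordinary unweighted loess
  }
  fit <- loess(M ~ A, weights = w, span = span, degree = 1)
  M - fitted(fit)
}

set.seed(2)
A <- runif(100, 6, 14)
M <- 0.1 * (A - 10) + rnorm(100, sd = 0.2)
w0 <- rep(0, 100)   # an entire print-tip group flagged as bad

res_na   <- normalize_group(M, A, w0, all.zero = "na")
res_none <- normalize_group(M, A, w0, all.zero = "none")
res_unw  <- normalize_group(M, A, w0, all.zero = "unweighted")
```

Here `"none"` corresponds to the behaviour described for R 2.14.1 and earlier, `"na"` to the strict mathematical answer, and `"unweighted"` to treating all probes as equally reliable.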