Re: Recurrent error with Flashclust
@manca-marco-path-4295
Thanks Peter!

Indeed your answer hasn't made it through the Bioconductor server, but it is attached to this email so other interested people will be able to read about it.

I am looking forward to seeing your update out there... I appreciate your support... and your work!

All the best, Marco

--
Marco Manca, MD
University of Maastricht
Faculty of Health, Medicine and Life Sciences (FHML)
Cardiovascular Research Institute (CARIM)

Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands)
Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room 5.08, Maastricht University Medical Center, P. Debyelaan 25, 6229 HX Maastricht

E-mail: m.manca at maastrichtuniversity.nl
Office telephone: +31(0)433874633
Personal mobile: +31(0)626441205
Twitter: @markomanka

**********************************************************************
This email and any files transmitted with it are confidential and solely for the use of the intended recipient. It may contain material protected by privacy or attorney-client privilege. If you are not the intended recipient or the person responsible for delivering to the intended recipient, be advised that you have received this email in error and that any use is STRICTLY PROHIBITED. If you have received this email in error please notify us by telephone on +31626441205 Dr Marco MANCA
**********************************************************************

________________________________________
From: Peter Langfelder [peter.langfelder at gmail.com]
Sent: Thursday, 3 February 2011 17:50
To: Manca Marco (PATH)
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: Recurrent error with Flashclust

Hi Marco,

I'm not on the Bioconductor list so I suspect my reply there won't be let through, but please feel free to forward it if there's interest.

I suspect you are running into a limitation of 32-bit integers.
Your code shows

>> dim(dataIRDam.cont)
> [1] 48803 7

which means the distance structure has length roughly 48803^2/2, and 48803^2 is above 2^31. At present the Fortran code uses integers (which I assume are 4 bytes long) and indexing such long arrays may be the root of the problem.

I'll have to look into this issue. Of course, I can modify the Fortran code to use long integers for indexing, but I need to figure out how to arrange the interaction with R. Please give me a few days or maybe a week as I am busy with other stuff as well.

Thanks for alerting me to this issue.

Peter

On Thu, Feb 3, 2011 at 5:15 AM, Manca Marco (PATH) <m.manca at maastrichtuniversity.nl> wrote:
>
> Good afternoon Peter, fellow bioconductors
>
> I have a help request for you.
>
> While planning to use WGCNA on a data set produced through Illumina Humanv3 I am experiencing a recurrent failure of the function flashClust. The output, which leaves me without any clue, is the following:
>
> *** caught segfault ***
> address 0x7f0222e0d848, cause 'memory not mapped'
>
> *** caught segfault ***
> address 0x3e80000074f, cause 'unknown'
>
> Below I am pasting all the code and output up to that point. A sessionInfo() output is also pasted at the bottom. I am working on an up-to-date installation of Ubuntu 10.04 64-bit and I have updated R (and packages) from the CRAN repository and Bioconductor packages using:
>
> source("http://bioconductor.org/biocLite.R")
> update.packages(repos=biocinstallRepos(), ask=FALSE, checkBuilt=TRUE)
>
> Thank you in advance for any direction/feedback you will share with me.
>
> All the best, Marco
>
> ###@Workstation1:~$ R
>
> R version 2.12.1 (2010-12-16)
> Copyright (C) 2010 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
> Natural language support but running in an English locale
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
>> getwd();
> [1] "/home/###"
>> workingDir = "/home/###/Documents/working";
>> setwd(workingDir);
>> library(WGCNA);
> Loading required package: impute
> Loading required package: dynamicTreeCut
> Loading required package: flashClust
>
> Attaching package: 'flashClust'
>
> The following object(s) are masked from 'package:stats':
>
> hclust
>
> Loading required package: qvalue
> Loading Tcl/Tk interface ... done
> Loading required package: Hmisc
> Loading required package: survival
> Loading required package: splines
>
> Attaching package: 'Hmisc'
>
> The following object(s) are masked from 'package:survival':
>
> untangle.specials
>
> The following object(s) are masked from 'package:base':
>
> format.pval, round.POSIXt, trunc.POSIXt, units
>
>
> Package WGCNA version 0.99 loaded.
>
>
> Attaching package: 'WGCNA'
>
> The following object(s) are masked from 'package:stats':
>
> cor
>
>> options(stringsAsFactors = FALSE);
>> library(lumi)
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
> Vignettes contain introductory material. To view, type
> 'openVignette()'. To cite Bioconductor, see
> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>
> Attaching package: 'Biobase'
>
> The following object(s) are masked from 'package:Hmisc':
>
> combine, contents
>
> KernSmooth 2.23 loaded
> Copyright M. P. Wand 1997-2009
>
>> fileName <- "FinalReport_nobckgrnd_nonorm.csv";
>>
>> library(illuminaHumanv3BeadID.db)
> Loading required package: AnnotationDbi
> Loading required package: org.Hs.eg.db
> Loading required package: DBI
>>
>> IRDam.dirtyImmature <- lumiR(fileName, convertNuID=FALSE, lib="illuminaHumanv3BeadID.db");
> Perform Quality Control assessment of the LumiBatch object ...
>>
>> IRDam.dirty <- lumiExpresso(IRDam.dirtyImmature, QC.evaluation=TRUE);
> Background Correction: bgAdjust
> Variance Stabilizing Transform method: vst
> Normalization method: quantile
>
>
> Background correction ...
> Perform bgAdjust background correction ...
> There is no control probe information in the LumiBatch object!
> No background adjustment will be performed.
> done.
>
> Variance stabilizing ...
> Perform vst transformation ...
> 2011-02-03 13:11:44 , processing array 1
> ###-Removed for the sake of brevity-###
> done.
>
> Normalizing ...
> Perform quantile normalization ...
> done.
>
> Quality control after preprocessing ...
> Perform Quality Control assessment of the LumiBatch object ...
> done.
>> summary(IRDam.dirty);
> ###-Removed for the sake of brevity-###
>
>> summary(IRDam.dirty, 'QC');
> ###-Removed for the sake of brevity-###
>> dataIRDam.dirty <- exprs(IRDam.dirty)
>>
>> dataIRDam.cont = dataIRDam.dirty[, c(10,11,6,13,22,20,19)]
>> is.matrix(dataIRDam.cont)
> [1] TRUE
>> dim(dataIRDam.cont)
> [1] 48803 7
>> ControlTree1 = flashClust(dist(dataIRDam.cont), method = "average");
>
> *** caught segfault ***
> address 0x7f0222e0d848, cause 'memory not mapped'
>
> *** caught segfault ***
> address 0x3e80000074f, cause 'unknown'
>
> Traceback:
> 1: .Fortran("hc", n = as.integer(n), len = as.integer(len), method = as.integer(method), ia = integer(n), ib = integer(n), crit = double(n), membr = as.double(members), nn = integer(n), disnn = double(n), flag = logical(n), diss = as.double(d))
> 2: hclust(d, method, members)
> 3: flashClust(dist(dataIRDam.cont), method = "average")
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
> Traceback:
> 1: .Fortran("hc", n = as.integer(n), len = as.integer(len), method = as.integer(method), ia = integer(n), ib = integer(n), crit = double(n), membr = as.double(members), nn = integer(n), disnn = double(n), flag = logical(n), diss = as.double(d))
> 2: hclust(d, method, members)
> 3: flashClust(dist(dataIRDam.cont), method = "average")
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Selection: Selection:
>
> My session info is:
>
>> sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
> [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
> [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
> [7] LC_PAPER=en_US.utf8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] splines stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] illuminaHumanv3BeadID.db_1.8.0 org.Hs.eg.db_2.4.6
> [3] RSQLite_0.9-4 DBI_0.2-5
> [5] AnnotationDbi_1.12.0 lumi_2.2.1
> [7] Biobase_2.10.0 WGCNA_0.99
> [9] Hmisc_3.8-3 survival_2.36-2
> [11] qvalue_1.24.0 flashClust_1.00-2
> [13] dynamicTreeCut_1.21 impute_1.24.0
>
> loaded via a namespace (and not attached):
> [1] affy_1.28.0 affyio_1.18.0 annotate_1.28.0
> [4] cluster_1.13.2 grid_2.12.1 hdrcde_2.15
> [7] KernSmooth_2.23-4 lattice_0.19-17 MASS_7.3-9
> [10] Matrix_0.999375-46 methylumi_1.6.1 mgcv_1.7-2
> [13] nlme_3.1-97 preprocessCore_1.12.0 tcltk_2.12.1
> [16] tools_2.12.1 xtable_1.5-6
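
For anyone who wants to check the sizes behind the suspected 32-bit problem, the arithmetic can be sketched in a few lines of R. This is only an illustration: the variable names are made up, and the overflowing product shown is an assumption about the kind of index computation that could fail, not the actual flashClust Fortran code.

```r
# Illustrative arithmetic only; the names below are hypothetical and are not
# taken from flashClust or WGCNA.
n <- 48803                         # number of rows reported by dim() in the session

dist_len <- n * (n - 1) / 2        # length of the dist() vector (double arithmetic)
int_max  <- .Machine$integer.max   # 2^31 - 1, largest 4-byte signed integer

fits_length  <- dist_len < int_max # TRUE: the vector length itself still fits
product_over <- n^2 > int_max      # TRUE: a product of order n^2 does not fit

# Smallest n for which n^2 no longer fits in a 4-byte signed integer.
first_bad_n <- ceiling(sqrt(int_max))

# The same product in R's own 32-bit integer arithmetic overflows to NA
# (R also emits a warning, suppressed here).
overflowed <- suppressWarnings(as.integer(n) * as.integer(n))

print(c(dist_len = dist_len, first_bad_n = first_bad_n))
print(c(fits_length = fits_length, product_over = product_over,
        overflow_is_NA = is.na(overflowed)))
```

With 48803 rows the distance vector has about 1.19e9 entries, which is under the 4-byte limit, while any intermediate value of order n^2 is over it once n reaches 46341.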
@manca-marco-path-4295
Dear Peter,

You have been SUPER! Thank you for your quick reaction. I will try the new code straight away.

I am also forwarding this to Bioconductor for general interest.

All the best, Marco

________________________________________
From: Peter Langfelder [peter.langfelder at gmail.com]
Sent: Sunday, 6 February 2011 20:41
To: Manca Marco (PATH)
Subject: Re: Recurrent error with Flashclust

Hi Marco,

I found the error and I am now able to run a clustering of 50000 objects (takes about 10 minutes on my machine). I hope your computer has enough memory to handle such large problems (my 32GB were pretty much exhausted and needed some swapping at the end of the function).
I will submit the fixed package to CRAN soon but in the meantime you can install it from the attached source bundle.

It is my understanding that because of inherent R limitations, you won't be able to cluster more than 2^16 = 65536 objects, since a distance structure for n objects has roughly n^2/2 entries and R doesn't seem to allow objects of length more than 2^31.

Let me know if anything doesn't work, and, as before, feel free to forward this to the Bioconductor mailing list.

Best,

Peter

On Fri, Feb 4, 2011 at 12:40 AM, Manca Marco (PATH) <m.manca at maastrichtuniversity.nl> wrote:
>
> Thanks Peter!
>
> Indeed your answer hasn't made it through the Bioconductor server, but it is attached to this email so other interested people will be able to read about it.
>
> I am looking forward to seeing your update out there... I appreciate your support... and your work!
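
The 65536-object ceiling and the memory pressure mentioned above can be estimated with a short R sketch. The helper names are hypothetical, the vector-length limit is taken as 2^31 - 1 as described in the message, and the memory figure covers a single copy of the distance vector only. How many copies exist at once during clustering is not stated in the thread, so treat the total as an open assumption.

```r
# Hypothetical helpers; a sketch of the arithmetic, not part of flashClust or WGCNA.
dist_length <- function(n) n * (n - 1) / 2         # entries in a dist object for n items

max_len <- 2^31 - 1                                 # vector length limit assumed from the message

# Largest n whose distance vector fits: solve n * (n - 1) / 2 <= max_len for n.
max_objects <- floor((1 + sqrt(1 + 8 * max_len)) / 2)

# Size in GiB of ONE copy of the distance vector, at 8 bytes per double.
dist_gib <- function(n) dist_length(n) * 8 / 2^30

print(max_objects)                                  # 65536
print(round(c(n48803 = dist_gib(48803), n50000 = dist_gib(50000)), 2))
```

A single distance vector for 50000 objects is already a little over 9 GiB, so a few temporary copies would be enough to approach the 32GB reported above.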