Recurrent error with flashClust
Good afternoon Peter, fellow bioconductors,

I have a help request for you. While planning to use WGCNA on a data set produced on the Illumina Humanv3 platform, I am experiencing a recurrent failure of the function flashClust. The output, which leaves me without any clue, is the following:

 *** caught segfault ***
address 0x7f0222e0d848, cause 'memory not mapped'

 *** caught segfault ***
address 0x3e80000074f, cause 'unknown'

Below I am pasting all the code and output up to that point. The sessionInfo() output is pasted at the bottom.

I am working on an up-to-date installation of Ubuntu 10.04 64-bit, and I have updated R (and packages) from the CRAN repository and the Bioconductor packages using:

source("")
update.packages(repos=biocinstallRepos(), ask=FALSE, checkBuilt=TRUE)

Thank you in advance for any direction/feedback you will share with me.

###@Workstation1:~$ R

R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

Type 'q()' to quit R.

> getwd();
[1] "/home/###"
> workingDir = "/home/###/Documents/working";
> setwd(workingDir);
> library(WGCNA);
Loading required package: impute
Loading required package: dynamicTreeCut
Loading required package: flashClust

Attaching package: 'flashClust'

The following object(s) are masked from 'package:stats':

    hclust

Loading required package: qvalue
Loading Tcl/Tk interface ... done
Loading required package: Hmisc
Loading required package: survival
Loading required package: splines

Attaching package: 'Hmisc'

The following object(s) are masked from 'package:survival':

    untangle.specials

The following object(s) are masked from 'package:base':

    format.pval, round.POSIXt, trunc.POSIXt, units

Package WGCNA version 0.99 loaded.

Attaching package: 'WGCNA'

The following object(s) are masked from 'package:stats':

    cor

> options(stringsAsFactors = FALSE);
> library(lumi)
Loading required package: Biobase

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'openVignette()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation(pkgname)'.

Attaching package: 'Biobase'

The following object(s) are masked from 'package:Hmisc':

    combine, contents

KernSmooth 2.23 loaded
Copyright M. P. Wand 1997-2009
> fileName <- "FinalReport_nobckgrnd_nonorm.csv";
>
> library(illuminaHumanv3BeadID.db)
Loading required package: AnnotationDbi
Loading required package:
Loading required package: DBI
>
> IRDam.dirtyImmature <- lumiR(fileName, convertNuID=FALSE, lib="illuminaHumanv3BeadID.db");
Perform Quality Control assessment of the LumiBatch object ...
>
> IRDam.dirty <- lumiExpresso(IRDam.dirtyImmature, QC.evaluation=TRUE);
Background Correction: bgAdjust
Variance Stabilizing Transform method: vst
Normalization method: quantile

Background correction ...
Perform bgAdjust background correction ...
There is no control probe information in the LumiBatch object!
No background adjustment will be performed.
done.

Variance stabilizing ...
Perform vst transformation ...
2011-02-03 13:11:44 , processing array 1
###-Removed for the sake of brevity-###
done.

Normalizing ...
Perform quantile normalization ...
done.

Quality control after preprocessing ...
Perform Quality Control assessment of the LumiBatch object ...
done.
> summary(IRDam.dirty);
###-Removed for the sake of brevity-###
> summary(IRDam.dirty, 'QC');
###-Removed for the sake of brevity-###
> dataIRDam.dirty <- exprs(IRDam.dirty)
>
> dataIRDam.cont = dataIRDam.dirty[, c(10,11,6,13,22,20,19)]
> is.matrix(dataIRDam.cont)
[1] TRUE
> dim(dataIRDam.cont)
[1] 48803 7
> ControlTree1 = flashClust(dist(dataIRDam.cont), method = "average");

 *** caught segfault ***
address 0x7f0222e0d848, cause 'memory not mapped'

 *** caught segfault ***
address 0x3e80000074f, cause 'unknown'

Traceback:
 1: .Fortran("hc", n = as.integer(n), len = as.integer(len), method = as.integer(method), ia = integer(n), ib = integer(n), crit = double(n), membr = as.double(members), nn = integer(n), disnn = double(n), flag = logical(n), diss = as.double(d))
 2: hclust(d, method, members)
 3: flashClust(dist(dataIRDam.cont), method = "average")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

Traceback:
 1: .Fortran("hc", n = as.integer(n), len = as.integer(len), method = as.integer(method), ia = integer(n), ib = integer(n), crit = double(n), membr = as.double(members), nn = integer(n), disnn = double(n), flag = logical(n), diss = as.double(d))
 2: hclust(d, method, members)
 3: flashClust(dist(dataIRDam.cont), method = "average")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: Selection:

My session info is:

> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] illuminaHumanv3BeadID.db_1.8.0
 [3] RSQLite_0.9-4                  DBI_0.2-5
 [5] AnnotationDbi_1.12.0           lumi_2.2.1
 [7] Biobase_2.10.0                 WGCNA_0.99
 [9] Hmisc_3.8-3                    survival_2.36-2
[11] qvalue_1.24.0                  flashClust_1.00-2
[13] dynamicTreeCut_1.21            impute_1.24.0

loaded via a namespace (and not attached):
 [1] affy_1.28.0           affyio_1.18.0         annotate_1.28.0
 [4] cluster_1.13.2        grid_2.12.1           hdrcde_2.15
 [7] KernSmooth_2.23-4     lattice_0.19-17       MASS_7.3-9
[10] Matrix_0.999375-46    methylumi_1.6.1       mgcv_1.7-2
[13] nlme_3.1-97           preprocessCore_1.12.0 tcltk_2.12.1
[16] tools_2.12.1          xtable_1.5-6

--
Marco Manca, MD
University of Maastricht
Faculty of Health, Medicine and Life Sciences (FHML)
Cardiovascular Research Institute (CARIM)

Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands)
Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room5.08, Maastricht University Medical Center, P. Debyelaan 25, 6229 HX Maastricht

E-mail: m.manca at
Office telephone: +31(0)433874633
Personal mobile: +31(0)626441205
Twitter: @markomanka

Hi Marco,

I'm not on the Bioconductor list, so I suspect my reply there won't be let through, but please feel free to forward it if there's interest.

I suspect you are running into a limitation of 32-bit integers. Your code shows

> > dim(dataIRDam.cont)
> [1] 48803 7

which means the distance structure has length roughly 48803^2/2, or about 1.19 × 10^9. That length by itself is still below 2^31 (about 2.15 × 10^9), but any index arithmetic that forms a product of order 48803^2 (about 2.38 × 10^9) would exceed it. At present the Fortran code uses integers (which I assume are 4 bytes long), and indexing such long arrays may be the root of the problem. I'll have to look into this issue.

Of course, I can modify the Fortran code to use long integers for indexing, but I need to figure out how to arrange the interaction with R. Please give me a few days, or maybe a week, as I am busy with other stuff as well.

Thanks for alerting me to this issue.
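
The sizes involved can be checked directly in R. The following is a minimal sketch that assumes only n = 48803 from the dim() output quoted above. The variable names are illustrative, and nothing here calls flashClust or inspects its Fortran source.

```r
# Number of rows (probes) passed to dist(), taken from dim(dataIRDam.cont).
n <- 48803

# dist() stores one value per unordered pair of rows.
# Computed in double precision here, so this line cannot overflow.
len <- n * (n - 1) / 2          # 1190842003 pairwise distances

# Largest value a signed 32-bit integer can hold.
int_max <- .Machine$integer.max # 2147483647

# The vector length itself still fits in a 32-bit integer ...
len_fits <- len <= int_max

# ... but a product of order n^2 does not.
square_exceeds <- n * n > int_max

# The same product in R's integer type overflows to NA (with a warning),
# which mimics what 4-byte integer arithmetic cannot represent.
overflowed <- suppressWarnings(as.integer(n) * as.integer(n))

# Approximate memory for the distances alone, at 8 bytes per double.
gib_needed <- len * 8 / 2^30

print(c(len_fits = len_fits, square_exceeds = square_exceeds,
        overflow_is_NA = is.na(overflowed)))
print(round(gib_needed, 1))
```

If the clustering routine forms offsets of order n^2 in 4-byte integers (an assumption, not verified against the flashClust source), the overflow shown here would be consistent with the segfault. Independently of that, the distance vector alone needs close to 9 GiB of memory for this many probes.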