Hi,
I seem to have a problem with openWorkspace() and closeWorkspace().
openWorkspace() occupies a lot of RAM on my system, about 20 MB. This memory is not freed when I close the workspace [using closeWorkspace()] or when I remove it [using rm()].
As a consequence, more and more of my memory is occupied and my computer finally crashes ;( I am using RStudio with R 3.1.1 on Windows 7.
Best regards, Silke
> i <- wsp_file
> ws <- openWorkspace(i)
> closeWorkspace(ws)
What package is this openWorkspace function defined in? I'm guessing it comes from flowWorkspace but it would be best if the original poster would provide this information along with the complete code needed to reproduce the problem (how is wsp_file defined?) and the output of sessionInfo().
The openWorkspace() function is defined in the flowWorkspace package. wsp_file is a character string giving the path to the FlowJo workspace file that is "opened" and afterwards parsed using parseWorkspace(ws). The FlowJo workspace file is an XML file. The sessionInfo() output is posted below. My current assumption is that the memory leak is caused by the XML package, which seems to be used by openWorkspace(). Currently, I have libxml2 2.9.1 installed. There is an update (2.9.2) from 2 weeks ago, which is not yet available for Windows.
The details from the closeWorkspace help page tell me:
Open an XML flowJo workspace file and return a flowJoWorkspace object. The workspace is represented using a XMLInternalDocument object. Close a flowJoWorkspace after finishing with it. This is necessary to explicitly clean up the C-based representation of the XML tree. (See the XML package).
However, this cleanup does not seem to work, since the 20 MB of memory are not freed afterwards. gc() does not help, since R seems to have no control over the C-based representation. I can open the wsp file, parse the workspace, and read the necessary information. I do not get errors or warnings; however, the memory fills up.
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] xlsx_0.5.7               xlsxjars_0.6.1           rJava_0.9-6              dplyr_0.3.0.2
 [5] plyr_1.8.1               car_2.0-21               stringr_0.6.2            flowWorkspace_3.12.0
 [9] gridExtra_0.9.1          ncdfFlow_2.12.0          BH_1.54.0-4              RcppArmadillo_0.4.450.1.0
[13] flowViz_1.30.0           lattice_0.20-29          flowCore_1.32.0

loaded via a namespace (and not attached):
 [1] assertthat_0.1      Biobase_2.26.0      BiocGenerics_0.12.0 chron_2.3-45        corpcor_1.6.7       data.table_1.9.4
 [7] DBI_0.3.1           DEoptimR_1.0-2      graph_1.44.0        hexbin_1.27.0       IDPmisc_1.1.17      KernSmooth_2.23-13
[13] latticeExtra_0.6-26 magrittr_1.0.1      MASS_7.3-35         mvtnorm_1.0-0       nnet_7.3-8          parallel_3.1.1
[19] pcaPP_1.9-50        RColorBrewer_1.0-5  Rcpp_0.11.3         reshape2_1.4        Rgraphviz_2.10.0    robustbase_0.91-1
[25] rrcov_1.3-4         stats4_3.1.1        tools_3.1.1         XML_3.98-1.1        zlibbioc_1.12.0
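Since the help text says the workspace has to be closed explicitly, one way to make sure the close call is never skipped (for example when parsing fails halfway) might look like the sketch below. with_resource is a hypothetical helper name, not part of flowWorkspace or XML, and it only guarantees that the close function is called; it cannot fix a leak inside the C library itself.

```r
# Hypothetical helper (not part of flowWorkspace or XML): open a resource,
# run `use` on it, and guarantee that `close_fn` is called afterwards,
# even if `use` stops with an error.
with_resource <- function(path, open_fn, close_fn, use) {
  res <- open_fn(path)
  on.exit(close_fn(res), add = TRUE)
  use(res)
}

# Intended usage with the functions from this thread (needs flowWorkspace):
# with_resource(wsp_file, openWorkspace, closeWorkspace, function(ws) {
#   # read whatever is needed from ws here
# })
```

The open and close functions are passed in as arguments so the same wrapper could be tried with xmlTreeParse() and free() from the XML package as well.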
To verify your assumption, try this to see if there is still 20M leaking,
I tried a systematic evaluation of the problem: I wrote 4 different functions and applied each of them with lapply to 10 workspace files. The RAM occupied (in MB) is noted after each lapply statement. It seems that the more XML objects are generated (even implicitly), the more RAM is occupied.
extract_WSP_info_1 <- function(wsp_file) {
  doc <- xmlTreeParse(wsp_file, useInternalNodes = TRUE)
  free(doc)
  rm(doc)
}

extract_WSP_info_2 <- function(wsp_file) {
  ws <- openWorkspace(wsp_file)
  closeWorkspace(ws)
}

extract_WSP_info_3 <- function(wsp_file) {
  xml_content <- xmlTreeParse(wsp_file, useInternalNodes = TRUE)
  wsp_version <- xmlAttrs(xmlRoot(xml_content))["version"]
  fj_version <- xmlAttrs(xmlRoot(xml_content))["flowJoVersion"]
  free(xml_content)
  rm(xml_content)
}

extract_WSP_info_4 <- function(wsp_file) {
  xml_content <- xmlTreeParse(wsp_file, useInternalNodes = TRUE)
  xml_root <- xmlRoot(xml_content)
  wsp_version <- xmlAttrs(xml_root)["version"]
  fj_version <- xmlAttrs(xml_root)["flowJoVersion"]
  free(xml_content)
  rm(xml_content, xml_root)
}

lapply(wsp_files[1:10], extract_WSP_info_1)  # RAM: 4.228 -> 4.237, +9 MB
lapply(wsp_files[1:10], extract_WSP_info_2)  # RAM: 4.237 -> 4.323, +86 MB
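
The before/after comparison above could be wrapped in a small helper so that every function is measured the same way. This is only a sketch: measure_growth and r_heap_mb are hypothetical names, not from any package. The default probe reads R's own heap via gc(), which by design does not see memory allocated by C libraries such as libxml2, so to reproduce the numbers above an OS-level reading of the R process would have to be passed in as probe instead.

```r
# Default probe: memory currently used by R's own heap, in MB.
# gc() returns a matrix whose second column is the "used" size in Mb
# for Ncells and Vcells. Memory held by C libraries is NOT included.
r_heap_mb <- function() {
  sum(gc()[, 2])
}

# Hypothetical helper: call `f` on every file and report a memory reading
# taken before and after. `probe` must be a function returning MB.
measure_growth <- function(files, f, probe = r_heap_mb) {
  before <- probe()
  for (file in files) f(file)
  after <- probe()
  c(before = before, after = after, growth = after - before)
}

# Intended usage with the functions from this thread:
# measure_growth(wsp_files[1:10], extract_WSP_info_1)
# measure_growth(wsp_files[1:10], extract_WSP_info_2)
```

If the growth reported by the gc()-based probe stays near zero while the process size seen by the operating system keeps rising, that would be consistent with the memory being held outside of R's control.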