GAIA package Error: cannot allocate vector of size 852.1 Mb


I am running 64-bit R (RStudio) on Windows 7, on a PC with 16 GB of RAM. Following the TCGA tutorial to check for copy number variations, I used the code below:

query.lgg.nocnv <- GDCquery(project="TCGA-LGG", data.category = "Copy number variation",
                            file.type="nocnv_hg19.seg", legacy = TRUE, access = "open")

lgg.nocnv <- GDCprepare(query.lgg.nocnv, save = TRUE, save.filename = "LGGnocnvhg19.rda")

for(cancer in c("LGG")){
  message(paste0("Starting ", cancer))
  # Prepare CNV matrix
  cnvMatrix <- get(load(paste0 (cancer,"nocnvhg19.rda")))
  # Add label (0 for loss, 1 for gain)
  cnvMatrix <- cbind(cnvMatrix, Label=NA)
  cnvMatrix[cnvMatrix[,"Segment_Mean"] < -0.3, "Label" ] <- 0
  cnvMatrix[cnvMatrix[,"Segment_Mean"] > 0.3,"Label"] <- 1
  cnvMatrix <- cnvMatrix[!is.na(cnvMatrix$Label),]
  # Remove " Segment_Mean" and change col.names
  cnvMatrix <-cnvMatrix[,-6]
  colnames(cnvMatrix) <- c( "Sample.Name", "Chromosome", "Start", "End", "Num.of.Markers", "Aberration")
  # Substitute Chromosomes "X" and "Y" with "23" and "24"
  xidx <- which(cnvMatrix$Chromosome=="X")
  yidx <- which(cnvMatrix$Chromosome=="Y")
  cnvMatrix[xidx,"Chromosome"] <- 23
  cnvMatrix[yidx,"Chromosome"] <- 24
  cnvMatrix$Chromosome <- as.integer(cnvMatrix$Chromosome)
  # Recurrent CNV identification with GAIA
  gdac.root <- ""
  file <- paste0(gdac.root, "")
  # Retrieve probes meta file from broadinstitute website
  if(!file.exists(basename(file))) download(file, basename(file))
  markersMatrix <- readr::read_tsv(basename(file), col_names = FALSE, col_types = "ccn", progress = TRUE)
  colnames(markersMatrix) <- c("Probe.Name", "Chromosome", "Start")
  xidx <- which(markersMatrix$Chromosome=="X")
  yidx <- which(markersMatrix$Chromosome=="Y")
  markersMatrix[xidx,"Chromosome"] <- 23
  markersMatrix[yidx,"Chromosome"] <- 24
  markersMatrix$Chromosome <- as.integer(markersMatrix$Chromosome)
  markerID <- apply(markersMatrix,1,function(x) paste0(x[2],":",x[3]))
  print(table(duplicated(markerID)))
  ## FALSE    TRUE
  ## 1831041     186
  # There are 186 duplicated markers
  print(table(duplicated(markersMatrix$Probe.Name)))
  ## FALSE
  ## 1831227
  #  ... with different names!
  # Removed duplicates
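  # Logical indexing also works when there are no duplicates,
  # whereas -which(...) would then drop every row
  markersMatrix <- markersMatrix[!duplicated(markerID),]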
  # Filter markersMatrix for common CNV
  markerID <- apply(markersMatrix,1,function(x) paste0(x[2],":",x[3]))
  file <- paste0(gdac.root, "CNV.hg19.bypos.111213.txt")
  if(!file.exists(basename(file))) download(file, basename(file))
  commonCNV <- readr::read_tsv(basename(file), progress = TRUE)
  commonID <- apply(commonCNV,1,function(x) paste0(x[2],":",x[3]))
  print(table(commonID %in% markerID))
  print(table(markerID %in% commonID))
  markersMatrix_fil <- markersMatrix[!markerID %in% commonID,]
  markers_obj <- load_markers(as.data.frame(markersMatrix_fil))
  nbsamples <- length(get(paste0("query.",tolower(cancer),".nocnv"))$results[[1]]$cases)
  cnv_obj <- load_cnv(cnvMatrix, markers_obj, nbsamples) 


The error occurs at the final load_cnv call. I am not sure whether this is because R is reaching the RAM limit (memory.limit() returns 16235) or for some other reason.
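
For context on reading the message: the figure in "cannot allocate vector of size 852.1 Mb" is the size of the single allocation that failed, not R's total memory use, so the session may already be holding several gigabytes when the request is refused. The base-R sketch below converts that figure into a number of double-precision values and lists the largest objects in an environment. The helper name largest_objects and the demo data are illustrative assumptions, not part of gaia or TCGAbiolinks.

```r
# Size of the failed request, converted to a count of doubles.
# R reports "Mb" in units of 2^20 bytes; a double takes 8 bytes.
size_mb   <- 852.1
n_doubles <- size_mb * 2^20 / 8
n_doubles  # roughly 1.1e8 values in one contiguous vector

# Hypothetical helper: sizes (in Mb) of the largest objects in an environment
largest_objects <- function(env = globalenv(), n = 5) {
  nms   <- ls(env)
  sizes <- vapply(nms,
                  function(nm) as.numeric(object.size(get(nm, envir = env))),
                  numeric(1))
  head(sort(sizes, decreasing = TRUE), n) / 2^20
}

# Made-up stand-in objects so the sketch runs on its own
demo_env <- new.env()
assign("big",   numeric(1e6), envir = demo_env)
assign("small", 1:10,         envir = demo_env)
largest_objects(demo_env)
```

Calling largest_objects() with no arguments just before load_cnv would show which objects from the script dominate the workspace at that point.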


Session info:

R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TCGAbiolinks_2.6.1 downloader_0.4     readr_1.1.1        gaia_2.22.0       

loaded via a namespace (and not attached):
  [1] colorspace_1.3-2            selectr_0.3-1               rjson_0.2.15                hwriter_1.3.2              
  [5] circlize_0.4.2              XVector_0.18.0              GenomicRanges_1.30.0        GlobalOptions_0.0.12       
  [9] ggpubr_0.1.6                matlab_1.0.2                ggrepel_0.7.0               bit64_0.9-7                
 [13] AnnotationDbi_1.40.0        xml2_1.1.1                  codetools_0.2-15            splines_3.4.2              
 [17] R.methodsS3_1.7.1           mnormt_1.5-5                doParallel_1.0.11           DESeq_1.30.0               
 [21] geneplotter_1.56.0          knitr_1.17                  jsonlite_1.5                Rsamtools_1.30.0           
 [25] km.ci_0.5-2                 broom_0.4.3                 annotate_1.56.1             cluster_2.0.6              
 [29] R.oo_1.21.0                 compiler_3.4.2              httr_1.3.1                  assertthat_0.2.0           
 [33] Matrix_1.2-11               lazyeval_0.2.1              limma_3.34.1                prettyunits_1.0.2          
 [37] tools_3.4.2                 bindrcpp_0.2                gtable_0.2.0                glue_1.2.0                 
 [41] GenomeInfoDbData_0.99.1     reshape2_1.4.2              dplyr_0.7.4                 ggthemes_3.4.0             
 [45] ShortRead_1.36.0            Rcpp_0.12.13                Biobase_2.38.0              Biostrings_2.46.0          
 [49] nlme_3.1-131                rtracklayer_1.38.0          iterators_1.0.8             psych_1.7.8                
 [53] stringr_1.2.0               rvest_0.3.2                 devtools_1.13.4             XML_3.98-1.9               
 [57] edgeR_3.20.1                zoo_1.8-0                   zlibbioc_1.24.0             scales_0.5.0               
 [61] aroma.light_3.8.0           hms_0.4.0                   parallel_3.4.2              SummarizedExperiment_1.8.0 
 [65] RColorBrewer_1.1-2          curl_3.0                    ComplexHeatmap_1.17.1       yaml_2.1.14                
 [69] memoise_1.1.0               gridExtra_2.3               KMsurv_0.1-5                ggplot2_2.2.1              
 [73] biomaRt_2.34.0              latticeExtra_0.6-28         stringi_1.1.6               RSQLite_2.0                
 [77] genefilter_1.60.0           S4Vectors_0.16.0            foreach_1.4.3               RMySQL_0.10.13             
 [81] GenomicFeatures_1.30.0      BiocGenerics_0.24.0         BiocParallel_1.12.0         shape_1.4.3                
 [85] GenomeInfoDb_1.14.0         rlang_0.1.4                 pkgconfig_2.0.1             matrixStats_0.52.2         
 [89] bitops_1.0-6                lattice_0.20-35             purrr_0.2.4                 bindr_0.1                  
 [93] cmprsk_2.2-7                GenomicAlignments_1.14.1    bit_1.1-12                  plyr_1.8.4                 
 [97] magrittr_1.5                R6_2.2.2                    IRanges_2.12.0              DelayedArray_0.4.1         
[101] DBI_0.7                     foreign_0.8-69              withr_2.1.0                 survival_2.41-3            
[105] RCurl_1.95-4.8              tibble_1.3.4                EDASeq_2.12.0               survMisc_0.5.4             
[109] GetoptLong_0.1.6            progress_1.1.2              locfit_1.5-9.1              grid_3.4.2                 
[113] data.table_1.10.4-3         blob_1.1.0                  ConsensusClusterPlus_1.42.0 digest_0.6.12              
[117] xtable_1.8-2                tidyr_0.7.2                 R.utils_2.6.0               stats4_3.4.2               
[121] munsell_0.4.3               survminer_0.4.1          



Any help will be appreciated.



It ran successfully on a subset of the data and markers:

cnv_obj <- load_cnv(cnvMatrix[1:15000,], markers_obj[1:6], nbsamples)
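
A side note on the script itself, offered as a sketch rather than a confirmed fix for the error: apply(markersMatrix, 1, ...) first coerces the whole data frame to a character matrix and then calls the function once per row, which is slow and creates large temporary copies for roughly 1.8 million markers. Pasting the columns directly builds the same kind of "chromosome:position" key without that copy. The small data frame below is a made-up stand-in for markersMatrix.

```r
# Made-up stand-in for markersMatrix (the real table has ~1.8 million rows)
markers_demo <- data.frame(
  Probe.Name = c("p1", "p2", "p3", "p4"),
  Chromosome = c(1L, 1L, 23L, 1L),
  Start      = c(100000, 2500, 100000, 100000),
  stringsAsFactors = FALSE
)

# Vectorised key: no row-wise apply(), no character-matrix copy.
# as.integer() keeps positions such as 100000 from printing as "1e+05".
marker_id <- paste0(markers_demo$Chromosome, ":",
                    as.integer(markers_demo$Start))

# Keep the first occurrence of each position
markers_unique <- markers_demo[!duplicated(marker_id), ]
markers_unique
```

If this approach is used, the common-CNV keys should be built the same way so that the two sets of keys are formatted identically before they are compared with %in%.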

Do I need more system RAM (e.g., 32 GB) to be able to use Bioconductor? What hardware do people usually run this software on?
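
Before adding RAM, it may be worth checking how much of the 16 GB is held by intermediates that are no longer needed by the time load_cnv runs (for example the unfiltered marker table and the ID vectors). The base-R sketch below uses made-up stand-in objects whose names mirror the script; it only demonstrates rm() followed by gc() and makes no claim about how much memory load_cnv itself requires.

```r
# Stand-ins for large intermediates that are no longer needed once the
# filtered marker table has been built (sizes are made up)
markersMatrix <- data.frame(x = numeric(1e6))
commonCNV     <- data.frame(x = numeric(1e6))
markerID      <- character(1e6)
commonID      <- character(1e6)

# Column 2 of gc() is the memory in use, in Mb (Ncells + Vcells)
before <- sum(gc()[, 2])

# Drop the intermediates and trigger a garbage collection
rm(markersMatrix, commonCNV, markerID, commonID)
after <- sum(gc()[, 2])

before - after  # Mb released; the exact value varies by session
```

Whether this frees enough memory depends on the size of the real objects, which is not measured here.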

