Hi all,
I get a memory error ('Error: cannot allocate vector of size 344.5 Mb') when running summarizeOverlaps from the GenomicAlignments package. I have 4 GB RAM (with about 3.8 GB free) and I use 64-bit R. I also increased memory.limit() to 3500 and I tried --vanilla as well. Nothing seems to work. Do you have any ideas? Thanks a lot!
Hi James,
Thanks for your answer! Yes, I am reading BAM files. I know about the yieldSize argument, but the file itself is only about 500 MB, so isn't the memory error a bit strange? What could explain it besides low RAM (which is not the case here)?
You say you are reading BAM files, but then you say 'the file itself', so it's not clear if you are reading one or more files. Anyway, having a computer with 4 GB RAM doesn't mean you actually have that much RAM to allocate to R. It may be much less, depending on what else you have running. And reading a 500 MB file will probably take more RAM than you would expect, given the underlying copies that may be created. If you are on Windows, which sometimes has problems releasing memory, that may be exacerbated.
I wouldn't use a Windows box with 4 GB RAM for anything but really basic stuff (16 GB RAM is about the lowest I would go, even for casual use), so it's not surprising to me at all that you would run out of RAM trying to do something real.
You say that you 'know the yieldSize argument'. Does that mean you are using it, or just that you know it exists?
Sorry, currently I am reading in one file. As for the yieldSize argument, I know it exists. I haven't tried it yet, as I assumed it would take a very long time to read the whole file in separate chunks. I will try it with a yieldSize of 2000000 to start with.
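A minimal sketch of that approach, assuming a hypothetical BAM file reads.bam and a GRangesList geneRanges (for example from exonsBy(txdb, by = "gene")):

    library(Rsamtools)
    library(GenomicAlignments)

    ## yieldSize only controls how many records are read per chunk;
    ## summarizeOverlaps still processes the whole file.
    bfl <- BamFileList("reads.bam", yieldSize = 2000000)

    ## geneRanges is assumed here, e.g. geneRanges <- exonsBy(txdb, by = "gene")
    se <- summarizeOverlaps(geneRanges, bfl, mode = "Union",
                            singleEnd = TRUE, ignore.strand = TRUE)
    head(assay(se))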
Hi, I still have one question about reduceByYield. I have the following code:
However, I get the following error:
My next steps are counting reads with summarizeOverlaps and performing a differential expression analysis with edgeR. This works fine with my current yieldSize of 2000000, but I want to perform these analyses on complete BAM files. Do you know how I can make reduceByYield work?
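For reference, reduceByYield() comes from the GenomicFiles package; it is a function, not an argument of summarizeOverlaps. A minimal sketch of how it is typically wired up, assuming the same hypothetical reads.bam and geneRanges objects as in the sketch above:

    library(GenomicFiles)
    library(GenomicAlignments)

    bf <- BamFile("reads.bam", yieldSize = 1000000)

    YIELD  <- function(x) readGAlignments(x)          # read one chunk of alignments
    MAP    <- function(reads)                         # count the chunk against the genes
        assay(summarizeOverlaps(geneRanges, reads, mode = "Union",
                                ignore.strand = TRUE))
    REDUCE <- `+`                                     # add the per-chunk count matrices
    DONE   <- function(reads) length(reads) == 0L     # stop when a yield is empty

    counts <- reduceByYield(bf, YIELD = YIELD, MAP = MAP,
                            REDUCE = REDUCE, DONE = DONE)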
Why are you doing that? Simply passing a BamFileList to summarizeOverlaps, where you have specified the yieldSize for the BamFileList, will cause the data to be read in chunks.

Really? So simply running that summarizeOverlaps call will actually count all reads? That would be great... But how is it possible that tail(assay(se)) gives 9997 as the last row and rowRanges(se) gives an object of length 25892? I am sorry for asking these probably basic questions...
I think you might be confused. The row names for a SummarizedExperiment are the underlying gene IDs (which in your case might be Entrez Gene IDs?). The yieldSize argument simply sets the chunk size for the data being read in, not the total amount of data to read in:

Please note that the dim for both SummarizedExperiments is identical, and that the rownames are (in this case) Ensembl Gene IDs.
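As an illustration of that point (not the original code from the thread), two runs that differ only in yieldSize should give the same result, again assuming the hypothetical reads.bam and geneRanges from the earlier sketches:

    bfl_small <- BamFileList("reads.bam", yieldSize = 100000)
    bfl_large <- BamFileList("reads.bam", yieldSize = 2000000)

    se1 <- summarizeOverlaps(geneRanges, bfl_small, mode = "Union")
    se2 <- summarizeOverlaps(geneRanges, bfl_large, mode = "Union")

    identical(dim(se1), dim(se2))      # same number of genes and samples
    all.equal(assay(se1), assay(se2))  # identical counts, regardless of chunk size
    head(rownames(se1))                # gene IDs taken from the annotation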