Dear list,
I have a question about processing large BAM files (for instance,
reading them in via readGAlignments or computing the coverage).
I know about the option of iterative processing, as shown in the
example below.
library(GenomicAlignments)   # also attaches Rsamtools for BamFile()

## Open the file so that each read pulls in at most 2e6 records
mybam <- open(BamFile(bamfile, yieldSize = 2000000))
gAln  <- GAlignments()
while (length(chunk <- readGAlignments(mybam))) {
    gAln <- c(gAln, chunk)   # append the current chunk
}
close(mybam)
Obviously, the efficiency of iterating depends on (i) the file size of
the BAM file and (ii) the available memory.
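For reference, both quantities are easy to query up front; a minimal
sketch (countBam comes from Rsamtools, and bamfile is the same path as
above):

library(Rsamtools)

bam.bytes   <- file.info(bamfile)$size     # compressed size on disk, in bytes
bam.records <- countBam(bamfile)$records   # total number of alignment records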
Can I somehow pinpoint (e.g. from the file size, the number of
alignments, or the memory requirements) when it is more efficient
(i.e. faster, with feasible memory requirements) to process the BAM
file in one batch rather than iteratively?
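To make the question concrete, the kind of heuristic I have in mind
would extrapolate the in-memory cost of one chunk to the whole file;
in the sketch below, the 50% safety margin and the available-memory
figure are arbitrary placeholders of mine:

library(GenomicAlignments)

## Read a single chunk and extrapolate its footprint to the whole file
mybam <- open(BamFile(bamfile, yieldSize = 1000000))
chunk <- readGAlignments(mybam)
close(mybam)

bytes.per.record <- as.numeric(object.size(chunk)) / length(chunk)
est.total.bytes  <- bytes.per.record * countBam(bamfile)$records

## Placeholder decision rule: one batch only if the estimate fits comfortably
mem.available <- 8 * 1024^3                  # e.g. 8 GB; placeholder value
one.batch.ok  <- est.total.bytes < 0.5 * mem.available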
Best,
Stefanie