Hello!
I would like to use FRASER on a set of 72 BAM files with RNA-seq data for 2 groups aligned using STAR.
I have been attempting to run countRNAData on our cluster with a maximum allocation of 800 GB of memory, but that does not seem to be enough. It does work if I run each sample individually; however, I believe FRASER requires the fds object to be created from a table of all BAM files. When I look at the resulting fds object for one of my samples, I see a large number of junctions (>300,000) and splice sites (>200,000).
Would it be possible to create a separate fds object for each sample and combine them? Or would someone who has used this package before know a better way I could get this running?
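library(FRASER)

# Sample table with one row per BAM file; BAM paths are in the bamFile column
sampleTable <- fread("FRASERsampleTable.txt")
bamFiles <- sampleTable[,bamFile]
sampleTable[,bamFile:=bamFiles]

# Create the FraserDataSet describing all samples
settings <- FraserDataSet(colData=sampleTable,
                          workingDir='./fraserCountsOutput',
                          strandSpecific=2L)

# Use up to 8 workers on Unix-like systems
if(.Platform$OS.type == "unix") {
    register(MulticoreParam(workers=min(8, multicoreWorkers())))
}

# Count split and non-split reads for all samples (the step that runs out of memory)
fds <- countRNAData(settings)
save(fds, file="./fraserCountsOutput/jhh_FRASER_counts.RData")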