Hi Julie and Kai,
I am testing the GUIDEseq package on the modified GUIDE-seq protocol. While analyzing a new data set, I looked into a specific off-target that was predicted by another program but not by the GUIDEseq method. This sample was generated by concatenating the reads from other samples, so I wondered whether the reads are being filtered out as duplicates. If so, is there a way to override this behavior? Please see the code below for your reference.
Appreciate all the help.
GUIDEseqAnalysis(alignment.inputfile = bamfile, umi.inputfile = umifile,
    alignment.format = c("bam"),
    BSgenomeName = Hsapiens,
    gRNA.file = gRNAs, n.cores.max = 4,
    min.mapping.quality = 15L, max.R1.len = 150L, max.R2.len = 150L,
    min.reads = 2, min.SNratio = 2, maxP = 0.01,
    window.size = 25L, step = 25L,
    plus.strand.start.gt.minus.strand.end = FALSE, distance.threshold = 1000,
    upstream = 50L, downstream = 50L, PAM.size = 3, gRNA.size = 20,
    PAM = "NGG", PAM.pattern = "(NGG)$", max.mismatch = 6,
    outputDir = outputDir,
    orderOfftargetsBy = "predicted_cleavage_score",
    allowed.mismatch.PAM = 2, overwrite = TRUE)
Remove duplicate reads ...
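As a rough way to gauge how much duplication the concatenated BAM carries, one could count reads with the duplicate flag set using Rsamtools. This is only a sketch: it assumes `bamfile` is the same path passed to GUIDEseqAnalysis(), and note that GUIDEseq's own "Remove duplicate reads" step is UMI-based, so flag counts give a ballpark rather than the exact reads it drops.

```r
## Rough check of duplicate levels in the concatenated BAM (a sketch).
## Assumes 'bamfile' is the same BAM passed to GUIDEseqAnalysis().
library(Rsamtools)

total <- countBam(bamfile)$records
dups  <- countBam(bamfile,
                  param = ScanBamParam(flag = scanBamFlag(isDuplicate = TRUE)))$records
message(sprintf("%d of %d reads (%.1f%%) carry the duplicate flag",
                dups, total, 100 * dups / total))
```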
Hi Julie,
Thank you for the reply.
To run the analysis in a less stringent mode, I tried keepPeaksInBothStrandsOnly = FALSE and min.peak.score.1strandOnly = 1L; however, the off-target that is predicted in the individual sample is still not present in the concatenated sample.
The version of the package I installed has no min.read.coverage or min.umi.count options. Can you please let me know if there is an option to keep duplicate reads?
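If min.read.coverage and min.umi.count exist only in newer GUIDEseq releases (an assumption worth verifying against the package changelog), updating through Bioconductor and inspecting the function's formal arguments would confirm whether they are available:

```r
## Check the installed GUIDEseq version, then update via Bioconductor.
packageVersion("GUIDEseq")

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("GUIDEseq")

## Confirm whether the arguments are present in this release:
c("min.read.coverage", "min.umi.count") %in%
    names(formals(GUIDEseq::GUIDEseqAnalysis))
```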
Thank you for all the help.