I have been trying to leverage derfinder's ability to detect expression without the need for an annotation, in order to identify broad regions of expression that we observe in certain conditions:
(not shown but there is no annotated gene in this region)
The issue is that the expressed regions reported by derfinder seem smaller than what would be expected from visual inspection. As you can see above, there is quite a large expressed region of which only a fraction is detected. This doesn't seem to be an issue of low coverage because, to the left, there is an area with similarly high coverage which is not reported as expressed. This is one example, but I found several of these in my data.
The question is whether this is a setting that needs to be changed/tweaked, or whether this is something to be expected.
The analysis was run with the following settings:
fullCov <- fullCoverage(
    files = files,
    chrs = chrom,
    cutoff = 0,
    L = read_length,
    verbose = TRUE,
    totalMapped = total_mapped,
    filter = "one",
    mc.cores = nproc
)
regionMat <- regionMatrix(
    fullCov,
    cutoff = min_cov,
    L = read_length,
    maxClusterGap = 3000L,
    returnBP = TRUE,
    verbose = TRUE,
    filter = "one",
    targetSize = targetSize,
    mc.cores = nproc
)
Coverage was taken directly from bam files, read length = 75 and cutoff = 5. Session info can be found here.
That seems to have been the major issue. Once I removed the filtering from fullCoverage, it all started to make sense.
No, my apologies, and yes - it did give an error. My initial thought was that the filter option (one / mean) was the culprit; I started playing around with it and ended up posting the wrong code.
targetSize is total mapped reads since I was reading from the bam files. I have switched to normalized bigwigs.
23, but I am not sure why it matters. That said, since posting this, I started splitting the expressed-region filtering by condition, because I am interested not only in which regions are differentially expressed, but also in how much of the genome is expressed in each condition and how large those regions are. So right now the length of files is 3 :)
Cheers Leonardo.
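For anyone following along, here is a toy base-R sketch of the difference between the two filter rules (one / mean) mentioned above. It is not derfinder's code: the coverage numbers are made up, the helper names (keep_one, keep_mean, to_regions) are hypothetical, and the rules are written the way I understand the derfinder documentation to describe them ("one": at least one sample above the cutoff; "mean": mean coverage above the cutoff).

```r
# Toy per-base coverage: rows are genomic positions, columns are samples.
# All numbers are invented for illustration.
cov <- cbind(
    s1 = c(0, 8, 9, 7, 0, 0, 0, 0),
    s2 = c(0, 0, 1, 2, 6, 7, 0, 0),
    s3 = c(0, 0, 0, 1, 0, 9, 8, 0)
)
cutoff <- 5

# Assumed rule "one": keep a base if at least one sample exceeds the cutoff.
keep_one <- function(m, cutoff) rowSums(m > cutoff) >= 1

# Assumed rule "mean": keep a base if the mean across samples exceeds the cutoff.
keep_mean <- function(m, cutoff) rowMeans(m) > cutoff

# Collapse runs of kept bases into start/end regions (hypothetical helper).
to_regions <- function(keep) {
    r <- rle(keep)
    ends <- cumsum(r$lengths)
    starts <- ends - r$lengths + 1
    data.frame(start = starts[r$values], end = ends[r$values])
}

print(to_regions(keep_one(cov, cutoff)))   # one region, positions 2 to 7
print(to_regions(keep_mean(cov, cutoff)))  # one region, position 6 only

# With a single sample the two rules select the same bases.
single <- cov[, "s1", drop = FALSE]
print(identical(keep_one(single, cutoff), keep_mean(single, cutoff)))  # TRUE
```

In this toy case the same coverage yields a broad region under one rule and a single base under the other, while with a single sample the two rules coincide. That may be why the number of files changes how much the choice of filter matters.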
The length of files mattered regarding the filter option. Looks like it's all resolved now though =) Though I'll say it again: you don't need to use totalMapped and targetSize more than once (so either in fullCoverage() or in regionMatrix(), but not both).
Got it! For future reference, and in case someone else stumbles upon this, this is the final version of the commands I used:
Awesome, thanks for sharing the code! =)
Best, Leo