Question

Restricting differential binding analysis to only TSS-proximal regions

0

Entering edit mode

Luca • 0

@79698011

Last seen 12 hours ago

Canada

Hello,

I came across a paper recently that divided their analysis between the TSS-proximal region and some other distal sites (I believe they used DiffBind with called peaks, and just filtered their peaks based on the peak annotations). I was wondering if a similar approach is possible using Csaw and generally sliding window methods.

Here is what I've tried, I was wondering if this is a sound approach, it seems to work.

#Get TSS proximal region of transcripts  
promoters_mm10 <- promoters(TxDb.Mmusculus.UCSC.mm10.knownGene, upstream = 1000, downstream = 1000)

#Get length of each chr/contig for respective genome (mm10 in this case)
genome_mm10 <- seqlengths(BSgenome.Mmusculus.UCSC.mm10)
genome_ranges <- GRanges(seqnames = names(genome_mm10), ranges = IRanges(start = 1, end = genome_mm10))

#Subset the regions that do not fall within the promoter +-1kb
promoters_mm10_reduce <- reduce(promoters_mm10)
nonTSS <- setdiff(genome_gr, promoters_mm10_reduce, ignore.strand = T)

#Add this to the discard argument in readParam, and restrict to only the chr's
restriction_TSS_pe <- readParam(
pe = "both", minq = 20, dedup = T, discard = nonTSS,
restrict = paste0("chr", c(1:19, "X", "Y")), max.frag = 400
)

And then as per the book, apply a windowCounts and regionCounts as follows, with a local filter for low FC windows

WindowCounts_Males_TSSOnly <- windowCounts(
Updated_Male_4Months$bamReads, ext = Frag.LengthsM_IP.max,
width = 150, param = restriction_TSS_pe, BPPARAM = SnowParam(workers = 20)
)

TSSLocal <- resize(rowRanges(WindowCounts_Males_TSSOnly), 1000, fix = "center")
Window_Males_TSSOnly <- regionCounts(
Updated_Male_4Months$bamReads, regions = TSSLocal, ext = Frag.LengthsM_IP.max,
param = restriction_TSS_pe, BPPARAM = SnowParam(workers = 20)
)

filter.stat.TSS <- filterWindowsLocal(WindowCounts_Males_TSSOnly, Window_Males_TSSOnly)
WindowCounts_Males_TSSOnly <- WindowCounts_Males_TSSOnly[ filter.stat.TSS$filter > log2(3), ]

In this case I applied a composition bias normalization since I'm expecting a systematic change in binding, but generally does this approach make sense? Would it make sense if I took the complement set of the TSS regions (i.e.: all regions but the TSS)? And lastly, since we're interested in what's going on at enhancer sites, would it make sense to use peaks called from a separate H3K27ac ChIP in a readParam() setup like this?

ChIP csaw • 42 views

ADD COMMENT • link updated 2 hours ago by Aaron Lun ★ 28k • written 1 day ago by Luca • 0

score 0 · Answer 1 · 2025-01-31

Seems reasonable to filter out windows outside of the TSS if you don't care about those events. IIRC the discard option will ignore any fragment that is fully contained within the nonTSS regions, meaning that the associated windows get a count of zero and are mostly filtered out. However, there may be some fragments at the boundary of the TSS/non-TSS that won't get filtered out, so you'll probably still get a few windows that technically lie inside the nonTSS regions. If you want to ensure that every window lies inside a TSS, you can use overlapsAny() to filter the windows more accurately.

From a statistical perspective, this is also fine, as the TSS/non-TSS status of each window is prior knowledge that is independent of your data and experimental design.