Hi,
I'm trying to use dba.contrast and dba.analyze to find differential binding sites within specific regions (ie promoter region ranging from 1000bp upstream of TSS and 500 bp downstream of TSS for a set of genes). I realized that dba.count can use pre-defined regions with the peaks argument so I did something like the following:
# create dba object
TF.peaks <- dba(sampleSheet = "experiments.csv",peakCaller = "macs",peakFormat = "bed",scoreCol = 5 );
# create a GRanges object with specific region
genes <- read.delim('gene_list.txt', as.is = TRUE);
grange <- makeGRangesFromDataFrame(genes);
promoter <- promoters(grange, upstream = 1000, downstream = 500);
# count the reads in the pre-defined region before running dba.contrast and dba.analyze
TF.counts <- dba.count(DBA = TF.peaks, peaks = promoter, score = DBA_SCORE_RPKM);
I got no error message, however, my TF.counts chromosome regions are beyond the range of some chromosome sizes. For example:
range(TF.counts$peaks[[1]]$Start[TF.counts$peaks[[1]]$Chr == 'chr22'])
[1] 22082728 102264308
I double checked my input TF.peaks and promoter which all have the correct chr22 size
range(TF.peaks$peaks[[1]]$V2[TF.peaks$peaks[[1]]$V1 == 'chr22'])
[1] 17639813 51222046
promoter.sub <- promoter[promoter@seqnames == 'chr22',]
range(promoter.sub@ranges@start)
[1] 17081777 51221591
Any idea what's causing the issue? Or is there a better way to look at a differential binding site for specific regions? (I'm using R v3.3.1 and DiffBind v2.2.12)
Many thanks,
Sylvia
Thanks Rory! Just sent.
Best,
Sylvia