Question

ChIPpeakAnno overlap peaks with TSS returns more than TSS overlap

94133 • 0

@94133-14305

Last seen 4.8 years ago

USA, Stanford

I want ChIP peaks that overlap gene TSSs. However, output from ChIPpeakAnno returns peaks that do not overlap, which requires extra filtering. Is there a better way?

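library(ChIPpeakAnno)
library(TxDb.Mmusculus.UCSC.mm10.knownGene)
library(org.Mm.eg.db)
library(dplyr)  # for %>% and filter()

# Annotate peaks against whole-gene ranges
ChIP_peaks_annoTSS <- annotatePeakInBatch(res1_ChIP,
    AnnotationData = genes(TxDb.Mmusculus.UCSC.mm10.knownGene),
    output = "overlapping",
    featureType = "TSS",
    select = "all",
    ignore.strand = TRUE,
    FeatureLocForDistance = "TSS")
# Add gene symbols and convert to a data frame
ChIP_peaks_annoTSS <- addGeneIDs(annotatedPeak = ChIP_peaks_annoTSS,
    orgAnn = "org.Mm.eg.db",
    feature_id_type = "entrez_id",
    IDs2Add = "symbol") %>% as.data.frame()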

The fromOverlappingOrNearest column says Overlapping even when insideFeature is inside or overlapEnd, which is NOT a TSS overlap.

So I then filter on the insideFeature column to keep only the TSS overlaps:

TSSpatterns = c("overlapStart","includeFeature")
ChIP_peaks_annoTSS <- filter(ChIP_peaks_annoTSS, grepl(paste(TSSpatterns, collapse="|"), insideFeature))
ChIP_peaks_annoTSS_cond <- condenseMatrixByColnames(as.matrix(as.data.frame(ChIP_peaks_annoTSS)), "peak")
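
A note on the filter above: grepl() with a pasted pattern does substring matching, so an exact comparison with %in% may be slightly safer. A minimal sketch in base R, assuming a toy data frame (`anno`, with invented peak names) standing in for the real annotation table:

```r
# Toy stand-in for the annotated peak table; the peak names are invented.
anno <- data.frame(
  peak = c("p1", "p2", "p3", "p4", "p5"),
  insideFeature = c("overlapStart", "inside", "overlapEnd",
                    "includeFeature", "upstream"),
  stringsAsFactors = FALSE
)

# Classes treated as TSS overlaps in the filter above.
tss_classes <- c("overlapStart", "includeFeature")

# Exact matching: a row is kept only if its class equals one of the two.
anno_tss <- anno[anno$insideFeature %in% tss_classes, ]
anno_tss$peak
```

The same idea should carry over to dplyr as filter(ChIP_peaks_annoTSS, insideFeature %in% TSSpatterns).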

Can you show me the proper way?

Thanks!!!!

chippeakanno chipseq R • 2.0k views

updated 6.6 years ago by Ou, Jianhong ★ 1.3k • written 6.6 years ago by 94133 • 0

Could you please try the following code and see if that meets your need? Thanks!

tss <- promoters(TxDb.Mmusculus.UCSC.mm10.knownGene, upstream=0, downstream=1)

ChIP_peaks_annoTSS <- annotatePeakInBatch(res1_ChIP,
    AnnotationData = tss,
    output = "overlapping",
    featureType = "TSS",
    select = "all",
    ignore.strand = TRUE,
    FeatureLocForDistance = "TSS")
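
The idea behind this suggestion is that promoters() with upstream = 0 and downstream = 1 shrinks each range to the single base at its strand-aware start, so only peaks covering that base can overlap. A small sketch of that idea, assuming toy genes and peaks (all coordinates and names are invented) and using plain GenomicRanges overlap functions rather than annotatePeakInBatch(), so it illustrates the principle and not ChIPpeakAnno internals:

```r
library(GenomicRanges)

# Two toy genes: geneA on the plus strand, geneB on the minus strand.
genes <- GRanges("chr1",
                 IRanges(start = c(1000, 5000), end = c(2000, 6000)),
                 strand = c("+", "-"))
names(genes) <- c("geneA", "geneB")

# 1-bp TSS ranges: position 1000 for geneA, position 6000 for geneB.
tss <- promoters(genes, upstream = 0, downstream = 1)

# Four toy peaks: p1 and p3 cover a TSS, p2 lies inside geneA,
# p4 covers the 3' end of geneB.
peaks <- GRanges("chr1",
                 IRanges(start = c(900, 1500, 5900, 4900),
                         end   = c(1100, 1700, 6100, 5100)))
names(peaks) <- c("p1", "p2", "p3", "p4")

# Keep only the peaks that overlap a 1-bp TSS range.
tss_peaks <- subsetByOverlaps(peaks, tss, ignore.strand = TRUE)
names(tss_peaks)
```

Against the whole-gene ranges, p2 and p4 would also count as overlapping, which matches the extra inside and overlapEnd rows described in the question.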

Best regards,

Julie

6.6 years ago Julie Zhu ★ 4.3k

score 0 · Answer 1 · 2018-09-19

Ou, Jianhong ★ 1.3k

@ou-jianhong-4539

Last seen 3 months ago

United States

Did you try setting output = "upstream"?

6.6 years ago Ou, Jianhong ★ 1.3k

No. Are you suggesting this is the best way to do this? I don't understand why one would use upstream for TSS overlap. Can you explain?

Thanks!

6.6 years ago 94133 • 0

This will find the peaks that overlap the TSS, because we set maxgap = -1 and FeatureLocForDistance = "TSS".

However, this may not answer your biological question. Are you perhaps looking for annotation of the promoter region? If so, please try setting output = "overlapping", FeatureLocForDistance = "TSS" and bindingRegion = c(-5000, 3000). Here bindingRegion means 5 kb upstream and 3 kb downstream of the TSS.
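
To make the promoter-window idea concrete, here is a rough sketch with toy data (gene and peak names and coordinates are invented). It builds a strand-aware window of 5 kb upstream and 3 kb downstream of each TSS with promoters() from GenomicRanges and pairs peaks with genes via findOverlaps(). This is only an approximation of what bindingRegion describes; whether ChIPpeakAnno treats the window boundaries in exactly the same way is not checked here.

```r
library(GenomicRanges)

# Two toy genes: geneA on the plus strand, geneB on the minus strand.
genes <- GRanges("chr1",
                 IRanges(start = c(10000, 50000), end = c(12000, 52000)),
                 strand = c("+", "-"))
names(genes) <- c("geneA", "geneB")

# Strand-aware window around each TSS: 5 kb upstream, 3 kb downstream.
prom <- promoters(genes, upstream = 5000, downstream = 3000)

# Toy peaks: q1 is 4 kb upstream of geneA, q2 is far from both genes,
# q3 is 4 kb upstream of geneB, q4 is 4 kb downstream of the geneB TSS.
peaks <- GRanges("chr1",
                 IRanges(start = c(6000, 20000, 56000, 48000),
                         end   = c(6200, 20200, 56200, 48200)))
names(peaks) <- c("q1", "q2", "q3", "q4")

# Pair each peak with every promoter window it overlaps.
hits <- findOverlaps(peaks, prom, ignore.strand = TRUE)
res <- data.frame(peak = names(peaks)[queryHits(hits)],
                  gene = names(prom)[subjectHits(hits)],
                  stringsAsFactors = FALSE)
res
```

Note that for the minus-strand gene the upstream side lies at higher coordinates, which is why q3 is kept and q4 is not.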

6.6 years ago Ou, Jianhong ★ 1.3k

I tried your suggestion as follows, but got an error:

ChIP_peaks_annoTSS <- annotatePeakInBatch(res1_ChIP,
    AnnotationData = genes(TxDb.Mmusculus.UCSC.mm10.knownGene),
    output = "overlapping",
    featureType = "TSS",
    select = "all",
    ignore.strand = TRUE,
    FeatureLocForDistance = "TSS",
    bindingRegion = c(-2000, 2000))
ChIP_peaks_annoTSS <- addGeneIDs(annotatedPeak = ChIP_peaks_annoTSS,
    orgAnn = "org.Mm.eg.db",
    feature_id_type = "entrez_id",
    IDs2Add = "symbol")
ChIP_peaks_annoTSS_cond <- condenseMatrixByColnames(as.matrix(as.data.frame(ChIP_peaks_annoTSS)), "peak")

Error in data.frame(seqnames = as.factor(seqnames(x)), start = start(x), :
duplicate row.names: X12, X39, X45, X52, X67, X71, X137, X144, X179, X184, X215, X228, X232, X240, X244, X246, X255, X262, X265, X284, X287, X379, X384, X391, X393, X404, X420, X451, X533, X534, X536, X553, X556, X574, X575, X60 ... ... ...

6.6 years ago 94133 • 0

Try:

ChIP_peaks_annoTSS_cond <- condenseMatrixByColnames(as.matrix(as.data.frame(unname(ChIP_peaks_annoTSS))), "peak")
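
The error message suggests that the range names were being used as row names during the data frame conversion, and with select = "all" the same peak can appear once per matching feature, so the names repeat. unname() drops them before conversion. A small sketch of that reasoning on a toy GRanges (the feature column and coordinates are invented; the peak names are borrowed from the error message for illustration only):

```r
library(GenomicRanges)

# Toy annotated peaks: peak "X12" is listed twice, once per matching feature.
anno <- GRanges("chr1",
                IRanges(start = c(100, 100, 500), end = c(200, 200, 600)),
                feature = c("geneA", "geneB", "geneC"))
names(anno) <- c("X12", "X12", "X39")

# TRUE: the names repeat, so they cannot serve as unique row names.
has_dup_names <- anyDuplicated(names(anno)) > 0

# Keep the peak identifier as a regular column, then drop the names
# so the conversion no longer depends on them.
anno$peak <- names(anno)
df <- as.data.frame(unname(anno))
df
```

In the real annotation output the peak identifier should already be available as the peak column, which is what condenseMatrixByColnames() is given above, so dropping the names should not lose that information.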