calculating TSS enrichment from ATACseq data
0
0
Entering edit mode
tangming2005 ▴ 200
@tangming2005-6754
Last seen 8 weeks ago
United States

How to calculate the TSS enrichment score according to the ENCODE standard https://www.encodeproject.org/data-standards/terms/#enrichment?

I was reading the source code https://github.com/jianhong/ATACseqQC/blob/master/R/TSSEscore.R#L80 but think it is not doing it correctly.

library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txs <- transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene)
sel.center <- promoters(txs, upstream =  1000 downstream = 1000)
sel.left.flank <- flank(sel.center, width=endFlank, both=FALSE)
sel.right.flank <- flank(sel.center, width=endFlank, start=FALSE, both = FALSE)

cvg<-cov(gr)

sel.center<- as(sel.center, "IntegerRangesList")
vws.center <- Views(cvg, sel.center)

vws.center$chr9
Views on a 140500037-length Rle subject

views:
           start       end width
   [1]     10987     12986  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [2]     71698     73697  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [3]    213865    215864  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [4]    213865    215864  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [5]    272048    274047  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [6]    272048    274047  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [7]    272048    274047  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [8]    272048    274047  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [9]    364591    366590  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   ...       ...       ...   ... ...
[2420] 140458452 140460451  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2421] 140472388 140474387  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2422] 140458607 140460606  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2423] 140472388 140474387  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2424] 140472388 140474387  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2425] 140472374 140474373  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2426] 140483938 140485937  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2427] 140512309 140514308  2000 [ ]
[2428] 140786023 140788022  2000 [ ]

> vws.center$chr9[[2428]]
Error in getListElement(x, i, ...) : view is out of limits
> vws.center$chr9[[2426]]
integer-Rle of length 2000 with 3 runs
  Lengths: 969  78 953
  Values :   0   1   0
> vws.center$chr9[[2427]]
Error in getListElement(x, i, ...) : view is out of limits
> vws.center$chr9[[2425]]
integer-Rle of length 2000 with 7 runs
  Lengths: 782 186  11  13   7 173 828
  Values :   0   1   0   1   2   1   0

I want to get the matrix out and better to keep the tx id for each row (how should I do it?) but this gives me an error:

sapply(1:length(vws.center$chr9), function(i) as.numeric(vws.center$chr9[[i]]))
Error in getListElement(x, i, ...) : view is out of limits

essentially, I want a matrix of #tss x 2000, 2kb around the tss and each value is the number of read counts overlap with that base. (I also want to flip the rows of the matrix when the strand is -)

BTW, I have read https://support.bioconductor.org/p/56390/ very old thread and https://stat.ethz.ch/pipermail/bioc-devel/2016-June/009446.html

Thanks very much.

GenomicRanges Views ATACseq TSS • 1.4k views
ADD COMMENT
0
Entering edit mode

or if there is any package there for this purpose, I want to take a look at the source code and learn.

ADD REPLY
0
Entering edit mode

genomation has some functions https://bioconductor.org/packages/devel/bioc/vignettes/genomation/inst/doc/GenomationManual.html I am reading the source code. Thanks.

ADD REPLY

Login before adding your answer.

Traffic: 461 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6