UcscTrack and BED without headers
5
0
Entering edit mode
Andy91 ▴ 60
@andy91-8905
Last seen 3.1 years ago
Netherlands

Dear all,

I am trying to use Gviz to plot the H3K27Ac mark as retrieved from the UCSC Genome Browser. So I based my code off the Gviz tutorial for DataTrack-like tracks:

gcContent <- UcscTrack(genome = "mm9",
                       chromosome = "chrX",
                       track = "Conservation",
                       table = "phyloP30wayPlacental",
                       from = from,
                       to = to,
                       trackType = "DataTrack",
                       start = "start",
                       end = "end",
                       data = "score",
                       type = "mountain",
                       window = -1,
                       windowSize = 1500,
                       fill.histogram = "darkblue",
                       name = "Conservation")

This works fine. However, when I try my own data, I get the following error:

#Error in DataTrack(chromosome = "chr22", genome = "hg19", name = "H3K27Ac",  :
#  Columns indices for the data section are only allowed when 'range' is of class 'GRanges'

 

chr <- "chr22"
genome_version <- "hg19"
start <- 50528213
end <- 50528312

h3k27actrack <- UcscTrack(genome = genome_version,
                          chromosome = chr,
                          track = "Layered H3K27Ac",
                          table = "wgEncodeBroadHistoneGm12878H3k27acStdSig",
                          from = start,
                          to = end,
                          trackType = "DataTrack",
                          start = "start",
                          end = "end",
                          data = "score",
                          type = "hist",
                          window = -1,
                          windowSize = 1500,
                          fill.histogram = "black",
                          name = "H3K27Ac")

 

I suspect it has to do with the way the data is saved on the UCSC servers. If I look at the UCSC Table Browser for my specific region of interest (track: "Layered H3K27Ac", table: "wgEncodeBroadHistoneGm12878H3k27acStdSig"), I get the following table:


track type=wiggle_0 name="H1-hESC" description="H3K27Ac Mark (Often Found Near Regulatory Elements) on H1-hESC Cells from ENCODE"
#bedGraph section chr12:9198625-9231075
chr12    9217500    9217525    34.04
chr12    9217525    9217550    33.48
chr12    9217550    9217575    34.36
chr12    9217575    9217600    32.32
chr12    9217600    9217625    29.84
chr12    9217625    9217650    26.28
chr12    9217650    9217675    20.64

This table looks like some sort of BED format table but unfortunately does not have any column names. I believe therefore that

start = "start",
end = "end",
data = "score",


will not be able to find anything. Is there a way to force UcscTrack to read the "start" as the column 2, "end" as column 3 and "data" as column 4?

sessionInfo()

# R version 3.2.3 (2015-12-10)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 15.04
#
# locale:
#   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=nl_NL.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=nl_NL.UTF-8  
# [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=nl_NL.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C           
# [11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C      
#
# attached base packages:
#   [1] grid      stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base    
#
# other attached packages:
#   [1] Gviz_1.14.0          BiocInstaller_1.20.1 derfinder_1.5.19     GenomicRanges_1.22.3 GenomeInfoDb_1.6.1   IRanges_2.4.6        S4Vectors_0.8.6    
# [8] ggbio_1.18.1         ggplot2_2.0.0        BiocGenerics_0.16.1  devtools_1.9.1     
#
# loaded via a namespace (and not attached):
#   [1] Biobase_2.30.0             httr_1.0.0                 splines_3.2.3              foreach_1.4.3              GenomicFiles_1.6.2       
# [6] Formula_1.2-1              bumphunter_1.10.0          latticeExtra_0.6-26        doRNG_1.6                  RBGL_1.46.0              
# [11] BSgenome_1.38.0            Rsamtools_1.22.0           RSQLite_1.0.0              lattice_0.20-33            biovizBase_1.18.0        
# [16] digest_0.6.8               RColorBrewer_1.1-2         XVector_0.10.0             qvalue_2.2.0               colorspace_1.2-6         
# [21] Matrix_1.2-3               plyr_1.8.3                 OrganismDbi_1.12.1         XML_3.98-1.3               biomaRt_2.26.1           
# [26] zlibbioc_1.16.0            xtable_1.8-0               scales_0.3.0               BiocParallel_1.4.3         pkgmaker_0.22            
# [31] SummarizedExperiment_1.0.2 GenomicFeatures_1.22.7     nnet_7.3-11                survival_2.38-3            magrittr_1.5             
# [36] memoise_0.2.1              GGally_1.0.0               foreign_0.8-66             graph_1.48.0               tools_3.2.3              
# [41] registry_0.3               matrixStats_0.50.1         stringr_1.0.0              locfit_1.5-9.1             munsell_0.4.2            
# [46] cluster_2.0.3              rngtools_1.2.4             AnnotationDbi_1.32.3       lambda.r_1.1.7             Biostrings_2.38.3        
# [51] futile.logger_1.4.1        RCurl_1.95-4.7             dichromat_2.0-0            iterators_1.0.8            VariantAnnotation_1.16.4 
# [56] bitops_1.0-6               derfinderHelper_1.4.1      gtable_0.1.2               codetools_0.2-14           DBI_0.3.1                
# [61] reshape_0.8.5              curl_0.9.4                 reshape2_1.4.1             R6_2.1.1                   GenomicAlignments_1.6.3  
# [66] gridExtra_2.0.0            knitr_1.11                 rtracklayer_1.30.1         Hmisc_3.17-1               futile.options_1.0.0     
# [71] stringi_1.0-1              Rcpp_0.12.2                rpart_4.1-10               acepack_1.3-3.3           

 

 

gviz BED Ucsc UcscTrack • 1.9k views
ADD COMMENT
1
Entering edit mode
@james-w-macdonald-5106
Last seen 18 hours ago
United States

Your call to UcscTrack() differs from what you did at the genome browser, which is critical to understanding what the problem is. If you used chr22:50528213-50528312 in the genome browser, you would have been able to diagnose this yourself, as there are no histone marks in that range, and you are returned no data. If you increase the range, you get results:

> start <- 50510000
> end <- 50530000
> h3k27actrack <- UcscTrack(genome = genome_version,
                          chromosome = chr,
                          track = "Layered H3K27Ac",
                          table = "wgEncodeBroadHistoneGm12878H3k27acStdSig",
                          from = start,
                          to = end,
                          trackType = "DataTrack",
                          start = "start",
                          end = "end",
                          data = "score",
                          type = "hist",
                          window = -1,
                          windowSize = 1500,
                          fill.histogram = "black",
                          name = "H3K27Ac")

> h3k27actrack
DataTrack 'H3K27Ac'
| genome: hg19
| active chromosome: chr22
| positions: 305
| samples:1
| strand: *

> range(h3k27actrack)
IRanges of length 305
         start      end width
[1]   50510601 50510625    25
[2]   50510626 50510650    25
[3]   50510651 50510675    25
[4]   50510676 50510700    25
[5]   50510701 50510725    25
...        ...      ...   ...
[301] 50528701 50528725    25
[302] 50528726 50528750    25
[303] 50528751 50528775    25
[304] 50528776 50528800    25
[305] 50528801 50528825    25
ADD COMMENT
1
Entering edit mode
@florianhahnenovartiscom-3784
Last seen 6.2 years ago
Switzerland

I guess that one would still wants to get back an empty DataTrack in these cases, rather than a cryptic error message. I'll take a look at what is causing the error an fix for the next release. 

ADD COMMENT
1
Entering edit mode
@florianhahnenovartiscom-3784
Last seen 6.2 years ago
Switzerland

Patched in both devel and release branch. Should become available soon with the next update.

ADD COMMENT
0
Entering edit mode
Andy91 ▴ 60
@andy91-8905
Last seen 3.1 years ago
Netherlands

I see my problem now, I should have paid more attention to the Browser.  If an empty DataTrack could be returned, that would be perfect.

Thanks for the answers!

ADD COMMENT
0
Entering edit mode
Andy91 ▴ 60
@andy91-8905
Last seen 3.1 years ago
Netherlands

Thanks for the great work!

ADD COMMENT

Login before adding your answer.

Traffic: 833 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6