Extracting Score/Coverage From GRange Object
1
0
Entering edit mode
vanbelj ▴ 30
@vanbelj-21216
Last seen 9 months ago
United States

Overall Goal: Use ggplot to create individual gene plots of average signal ± SEM for biological replicates of a given gene. After I finishing troubleshooting code, I'll create a function/loop to take a csv of gene names and return plots for each gene.

I'm using rtracklayer to import the region of the BigWig file corresponding to the gene of interest, in this case a 1938 bp section of Chromosome XII. Right now I'm ignoring strand information, but this will be coded eventually.

Issue: The extracted list containing the coverage information is not always the expected length (1938 bp). I believe this is due to the way I create Views, selecting regions that contain coverage greater than 0 (some replicates have more than 1 View - see Rep2). However, I cannot find a way to create the View explicitly for a given range (Chromosome XII, 805387 - 807325).

I'm new to GenomicRanges and the associated data structures. Please point out if you see a better way of accomplishing my overall goal or changes to the way I'm currently importing and selecting the the range of interest.

Thanks for any help!

library(rtracklayer)
library(trackViewer)

#Import region of bigwig file for desired gene
Control_Rep1 <- import("/path/to/file/B28-JV31_S31_R1_001_MaskedCPM.bigwig", format="BigWig",
                   which=GRanges("chrXII",
                                 IRanges(805387, 807325)))

Control_Rep2 <- import("/path/to/file/B28-JV32_S32_R1_001_MaskedCPM.bigwig", format="BigWig",
                   which=GRanges("chrXII",
                                 IRanges(805387, 807325)))

Control_Rep3 <- import("/path/to/file/B28-JV33_S33_R1_001_MaskedCPM.bigwig", format="BigWig",
                   which=GRanges("chrXII",
                                 IRanges(805387, 807325)))

#Get coverage information
cRep1_cov <- coverage(Control_Rep1, weight = "score")
cRep2_cov <- coverage(Control_Rep2, weight = "score")
cRep3_cov <- coverage(Control_Rep3, weight = "score")

#Create views for regions with coverage > 0
cRep1_view <- Views(cRep1_cov, cRep1_cov > 0)
cRep2_view <- Views(cRep2_cov, cRep2_cov > 0)
cRep3_view <- Views(cRep3_cov, cRep3_cov > 0)

#Extract values as a list.  [[#A]][[#B]] where #A is the chromosome and #B is the view number
cRep1_Values <- as.list(cRep1_view[[12]][[1]])
cRep2_Values <- as.list(cRep2_view[[12]][[1]])
cRep3_Values <- as.list(cRep3_view[[12]][[1]])

> length(cRep1_Values)
[1] 1939
> length(cRep2_Values)
[1] 1239
> length(cRep3_Values)
[1] 1939

> length(cRep1_view$chrXII)
[1] 1
> length(cRep2_view$chrXII)
[1] 2
> length(cRep3_view$chrXII)
[1] 1
> sessionInfo( )
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6.5

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] grid      stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] trackViewer_1.28.1   Rcpp_1.0.8.3         rtracklayer_1.52.1   GenomicRanges_1.44.0 GenomeInfoDb_1.28.4 
[6] IRanges_2.26.0       S4Vectors_0.30.2     BiocGenerics_0.38.0 

loaded via a namespace (and not attached):
  [1] colorspace_2.0-3            rjson_0.2.21                ellipsis_0.3.2              biovizBase_1.40.0          
  [5] htmlTable_2.4.0             XVector_0.32.0              base64enc_0.1-3             dichromat_2.0-0            
  [9] rstudioapi_0.13             bit64_4.0.5                 AnnotationDbi_1.54.1        fansi_1.0.3                
 [13] xml2_1.3.3                  splines_4.1.0               cachem_1.0.6                knitr_1.38                 
 [17] Formula_1.2-4               Rsamtools_2.8.0             cluster_2.1.3               dbplyr_2.1.1               
 [21] png_0.1-7                   grImport_0.9-5              graph_1.70.0                compiler_4.1.0             
 [25] httr_1.4.2                  backports_1.4.1             assertthat_0.2.1            Matrix_1.4-1               
 [29] fastmap_1.1.0               lazyeval_0.2.2              cli_3.2.0                   htmltools_0.5.2            
 [33] prettyunits_1.1.1           tools_4.1.0                 gtable_0.3.0                glue_1.6.2                 
 [37] GenomeInfoDbData_1.2.6      dplyr_1.0.8                 rappdirs_0.3.3              Biobase_2.52.0             
 [41] vctrs_0.4.0                 Biostrings_2.60.2           rhdf5filters_1.4.0          xfun_0.30                  
 [45] stringr_1.4.0               lifecycle_1.0.1             restfulr_0.0.13             ensembldb_2.16.4           
 [49] XML_3.99-0.9                InteractionSet_1.20.0       zlibbioc_1.38.0             scales_1.1.1               
 [53] BSgenome_1.60.0             VariantAnnotation_1.38.0    hms_1.1.1                   MatrixGenerics_1.4.3       
 [57] ProtGenerics_1.24.0         SummarizedExperiment_1.22.0 rhdf5_2.36.0                AnnotationFilter_1.16.0    
 [61] RColorBrewer_1.1-3          yaml_2.3.5                  curl_4.3.2                  memoise_2.0.1              
 [65] gridExtra_2.3               ggplot2_3.3.5               biomaRt_2.48.3              rpart_4.1.16               
 [69] latticeExtra_0.6-29         stringi_1.7.6               RSQLite_2.2.12              BiocIO_1.2.0               
 [73] plotrix_3.8-2               checkmate_2.0.0             GenomicFeatures_1.44.2      filelock_1.0.2             
 [77] BiocParallel_1.26.2         rlang_1.0.2                 pkgconfig_2.0.3             matrixStats_0.61.0         
 [81] bitops_1.0-7                lattice_0.20-45             purrr_0.3.4                 Rhdf5lib_1.14.2            
 [85] GenomicAlignments_1.28.0    htmlwidgets_1.5.4           bit_4.0.4                   tidyselect_1.1.2           
 [89] magrittr_2.0.3              R6_2.5.1                    generics_0.1.2              Hmisc_4.6-0                
 [93] DelayedArray_0.18.0         DBI_1.1.2                   pillar_1.7.0                foreign_0.8-82             
 [97] survival_3.3-1              KEGGREST_1.32.0             RCurl_1.98-1.6              nnet_7.3-17                
[101] tibble_3.1.6                crayon_1.5.1                utf8_1.2.2                  BiocFileCache_2.0.0        
[105] jpeg_0.1-9                  progress_1.2.2              data.table_1.14.2           blob_1.2.2                 
[109] Rgraphviz_2.36.0            digest_0.6.29               munsell_0.5.0               Gviz_1.36.2
IRanges rtracklayer GenomicRanges • 1.9k views
ADD COMMENT
0
Entering edit mode

Added response as an Answer so thread can be closed.

ADD REPLY
1
Entering edit mode
vanbelj ▴ 30
@vanbelj-21216
Last seen 9 months ago
United States

Figured this out with a much simpler code. Still open to suggestions on how to easily generate these plots, but the code below at least gets me to extracting the per-base coverage information from a range of interest.

library(rtracklayer)
library(trackViewer)

Control_Rep1 <- import("/Volumes/EasyStore/Brickner28/BigwigMaskedCPM_Yeast/B28-JV31_S31_R1_001_MaskedCPM.bigwig", format="BigWig",
                   which=GRanges("chrXII",
                                 IRanges(805387, 807325)))

Control_Rep2 <- import("/Volumes/EasyStore/Brickner28/BigwigMaskedCPM_Yeast/B28-JV32_S32_R1_001_MaskedCPM.bigwig", format="BigWig",
                   which=GRanges("chrXII",
                                 IRanges(805387, 807325)))

Control_Rep3 <- import("/Volumes/EasyStore/Brickner28/BigwigMaskedCPM_Yeast/B28-JV33_S33_R1_001_MaskedCPM.bigwig", format="BigWig",
                   which=GRanges("chrXII",
                                 IRanges(805387, 807325)))

#Get Coverage Information - NEW METHOD
cRep1_cov_vB <- rep(Control_Rep1$score, width(Control_Rep1))
cRep2_cov_vB <- rep(Control_Rep2$score, width(Control_Rep2))
cRep3_cov_vB <- rep(Control_Rep3$score, width(Control_Rep3))

> length(cRep1_cov_vB)
[1] 1939
> length(cRep2_cov_vB)
[1] 1939
> length(cRep3_cov_vB)
[1] 1939
ADD COMMENT

Login before adding your answer.

Traffic: 868 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6