Using GenomicRanges / plyranges to calculate the score of peaks within intervals
@alorsonmethyle-16837

Hi there

I'd like to know whether there is a straightforward way, using GenomicRanges and maybe plyranges, to group the peaks of one dataset (gr.unique) into the tiles of another one (tiles) and compute a score per tile.

Here is my code and my attempt so far:

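library(GenomicRanges)
library(plyranges)

# Toy data: three positions on each strand of chr22, with a score each
df_chr22_meth <- data.frame(seqname = "chr22",
                            start   = c(1, 10, 100, 1, 10, 100),
                            end     = c(1, 10, 100, 1, 10, 100),
                            strand  = c("+", "+", "+", "-", "-", "-"),
                            score   = c(30, 33, 32, 90, 95, 98))

gr.unique <- makeGRangesFromDataFrame(df_chr22_meth, keep.extra.columns = TRUE)

# 10 bp tiles over the range of the data, one set per strand
# (built from gr.unique, assumed to be what df_chr22_meth_fake referred to)
tiles <- tile(range(gr.unique), width = 10)

# Group the tiles by the peaks they overlap
grouping <- unlist(tiles) %>% group_by_overlaps(gr.unique)

# Average the score within each tile
grouping %>% mutate(mean_o = mean(score))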

GRanges object with 12 ranges and 3 metadata columns:
Groups: query [4]
       seqnames    ranges strand |     score     query    mean_o
          <Rle> <IRanges>  <Rle> | <numeric> <integer> <numeric>
   [1]    chr22      1-10      + |        30         1        62
   [2]    chr22      1-10      + |        90         1        62
   [3]    chr22      1-10      + |        33         1        62
   [4]    chr22      1-10      + |        95         1        62
   [5]    chr22    91-100      + |        32        10        65
   ...      ...       ...    ... .       ...       ...       ...
   [8]    chr22      1-10      - |        90        11        62
   [9]    chr22      1-10      - |        33        11        62
  [10]    chr22      1-10      - |        95        11        62
  [11]    chr22    91-100      - |        32        20        65
  [12]    chr22    91-100      - |        98        20        65
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths

## It does combine them, but the strand is lost in the process (scores from both strands end up in the same tile), and I could not find a way to avoid it!
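One possible strand-aware variant, as a sketch rather than a confirmed answer: the output above suggests that group_by_overlaps() ignores strand, whereas plyranges also provides directed joins such as join_overlap_inner_directed() that only match ranges on the same strand. The tile_id and mean_score columns below are names introduced for illustration only.

```r
library(GenomicRanges)
library(plyranges)

# Same toy data as in the post
df <- data.frame(seqnames = "chr22",
                 start    = c(1, 10, 100, 1, 10, 100),
                 end      = c(1, 10, 100, 1, 10, 100),
                 strand   = c("+", "+", "+", "-", "-", "-"),
                 score    = c(30, 33, 32, 90, 95, 98))
gr.unique <- makeGRangesFromDataFrame(df, keep.extra.columns = TRUE)

# 10 bp tiles per strand, flattened to a single GRanges
tiles <- unlist(tile(range(gr.unique), width = 10))

# Hypothetical helper column so each tile can be identified after the join
tiles$tile_id <- seq_along(tiles)

# Directed join: a tile only picks up peaks on its own strand
res <- tiles %>%
  join_overlap_inner_directed(gr.unique) %>%
  group_by(tile_id) %>%
  summarise(mean_score = mean(score))

# Write the per-tile means back onto the tiles; tiles without peaks stay NA
tiles$mean_score <- NA_real_
tiles$mean_score[res$tile_id] <- res$mean_score
```

With this, the "+" and "-" copies of the 1-10 tile should get different means, since each only sees the scores from its own strand.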

sessionInfo( )
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252
[4] LC_NUMERIC=C                   LC_TIME=French_France.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] plyranges_1.10.0                  stringr_1.4.0                     BSgenome.Hsapiens.UCSC.hg38_1.4.3
 [4] BSgenome_1.58.0                   rtracklayer_1.49.5                Biostrings_2.58.0                
 [7] XVector_0.30.0                    dplyr_1.0.5                       GenomicRanges_1.42.0             
[10] GenomeInfoDb_1.26.7               IRanges_2.24.1                    S4Vectors_0.28.1                 
[13] BiocGenerics_0.36.1              

loaded via a namespace (and not attached):
 [1] SummarizedExperiment_1.20.0 tidyselect_1.1.1            purrr_0.3.4                 lattice_0.20-41            
 [5] vctrs_0.3.7                 generics_0.1.0              expm_0.999-6                utf8_1.2.1                 
 [9] XML_3.99-0.6                rlang_0.4.11                e1071_1.7-6                 pillar_1.6.0               
[13] glue_1.4.2                  DBI_1.1.1                   BiocParallel_1.24.1         matrixStats_0.58.0         
[17] GenomeInfoDbData_1.2.4      rootSolve_1.8.2.1           lifecycle_1.0.0             zlibbioc_1.36.0            
[21] MatrixGenerics_1.2.1        mvtnorm_1.1-1               Biobase_2.50.0              lmom_2.8                   
[25] class_7.3-18                fansi_0.4.2                 Rcpp_1.0.6                  DelayedArray_0.16.3        
[29] Rsamtools_2.6.0             gld_2.6.2                   Exact_2.1                   stringi_1.5.3              
[33] grid_4.0.5                  tools_4.0.5                 bitops_1.0-7                magrittr_2.0.1             
[37] DescTools_0.99.41           RCurl_1.98-1.3              proxy_0.4-25                tibble_3.1.1               
[41] crayon_1.4.1                pkgconfig_2.0.3             MASS_7.3-53.1               ellipsis_0.3.1             
[45] Matrix_1.3-2                data.table_1.14.0           assertthat_0.2.1            rstudioapi_0.13            
[49] R6_2.5.0                    boot_1.3-27                 GenomicAlignments_1.26.0    compiler_4.0.5

I did find a way in this post: "summarize scores of GRanges into bins". It just felt fairly convoluted for such a basic operation, so I wonder whether better ways exist now?
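For comparison, here is a minimal sketch of the same per-tile mean using only GenomicRanges. It is not the method from the linked post; it relies on findOverlaps() with ignore.strand = FALSE and on tapply(), and the mean_score column is a name chosen for illustration.

```r
library(GenomicRanges)

# Same toy data as in the post
df <- data.frame(seqnames = "chr22",
                 start    = c(1, 10, 100, 1, 10, 100),
                 end      = c(1, 10, 100, 1, 10, 100),
                 strand   = c("+", "+", "+", "-", "-", "-"),
                 score    = c(30, 33, 32, 90, 95, 98))
gr.unique <- makeGRangesFromDataFrame(df, keep.extra.columns = TRUE)

# 10 bp tiles per strand, flattened to a single GRanges
tiles <- unlist(tile(range(gr.unique), width = 10))

# Strand-aware overlaps between tiles (query) and peaks (subject)
hits <- findOverlaps(tiles, gr.unique, ignore.strand = FALSE)

# Mean score of the peaks falling in each tile, keyed by tile index
mean_by_tile <- tapply(gr.unique$score[subjectHits(hits)],
                       queryHits(hits), mean)

# Store the means on the tiles; tiles without any peak stay NA
tiles$mean_score <- NA_real_
tiles$mean_score[as.integer(names(mean_by_tile))] <- as.vector(mean_by_tile)
```

Setting ignore.strand = TRUE instead would pool both strands into each tile, which appears to be what the grouped output above did.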

thanks a lot

plyranges GenomicRanges • 879 views