How to left_join a tidySummarizedExperiment with a GRanges object by seqnames and start (tidyverse-style)?
Kateřina • 0
Czechia

Hello,

I'm working with a ranged tidySummarizedExperiment and have a separate GRanges object containing metadata that I'd like to integrate. I want to left_join() the two, matching rows on seqnames and start, where the start values are identical in both objects.

Is there a tidyverse-native way to do this (similar to joining two GRanges objects with plyranges), ideally without converting both the tidySummarizedExperiment and the GRanges object to tibbles and back again? I'd love to stay within the tidy grammar and not break the abstraction if possible.

Calling left_join() doesn't work, even after converting the GRanges object to a tibble (probably because seqnames and start are view-only variables in the tidySummarizedExperiment?), but maybe I'm just missing something.


library(SummarizedExperiment)
library(GenomicRanges)
library(tidyomics)
library(dplyr)

gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(100, 200, 300), width = 50),
  strand = c("+", "-", "+")
)

assay_mat <- matrix(1:9, ncol = 3)
colnames(assay_mat) <- c("Sample1", "Sample2", "Sample3")

se <- SummarizedExperiment(
  assays = list(counts = assay_mat),
  rowRanges = gr
)

gr_annot <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges = IRanges(start = c(100, 300), width = 1),
  strand = c("+", "+"),
  gene_name = c("GeneA", "GeneB")
)

# Attempted join: this is the call that fails with the error shown below
gr_annot_tb <- se |> left_join(as_tibble(gr_annot))


Error in `join_function()`:
`by` must be supplied when `x` and `y` have no common variables.
Use `cross_join()` to perform a cross-join.
Run `rlang::last_trace()` to see where the error occurred.
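
For comparison, here is a minimal sketch of a fallback that does work on the example data, although it steps outside the tidy grammar and is not the tidyverse-native join asked about. It builds a "seqnames:start" key for both objects, looks up each row of the SummarizedExperiment in the annotation with match(), and writes the result into rowData(). The key format and the helper variable names (key_se, key_annot, idx) are illustrative choices, and the sketch assumes each seqnames/start pair occurs at most once in gr_annot, since match() only returns the first hit.

```r
library(SummarizedExperiment)
library(GenomicRanges)

# Rebuild the example objects from the question so the block runs on its own
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(100, 200, 300), width = 50),
  strand = c("+", "-", "+")
)
assay_mat <- matrix(1:9, ncol = 3)
colnames(assay_mat) <- c("Sample1", "Sample2", "Sample3")
se <- SummarizedExperiment(assays = list(counts = assay_mat), rowRanges = gr)

gr_annot <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges = IRanges(start = c(100, 300), width = 1),
  strand = c("+", "+"),
  gene_name = c("GeneA", "GeneB")
)

# Build a "seqnames:start" key for every range in both objects
key_se <- paste(as.character(seqnames(rowRanges(se))), start(rowRanges(se)), sep = ":")
key_annot <- paste(as.character(seqnames(gr_annot)), start(gr_annot), sep = ":")

# For each row of se, find the matching annotation row (NA when there is none);
# this keeps every row of se, which is the left-join behaviour
idx <- match(key_se, key_annot)

# Attach the annotation column to the feature metadata of se
rowData(se)$gene_name <- gr_annot$gene_name[idx]

rowData(se)$gene_name
```

On the example data, the second range (chr1, start 200) has no counterpart in gr_annot, so its gene_name is NA, while the dimensions of se are unchanged.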

R version 4.4.2 (2024-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.1.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Prague
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] nullranges_1.12.0               plyranges_1.26.0               
 [3] tidybulk_1.18.0                 tidyseurat_0.8.0               
 [5] SeuratObject_5.0.2              sp_2.2-0                       
 [7] tidySingleCellExperiment_1.16.0 SingleCellExperiment_1.28.1    
 [9] tidySummarizedExperiment_1.16.0 ttservice_0.4.1                
[11] ggplot2_3.5.1                   tidyr_1.3.1                    
[13] tidyomics_1.2.0                 dplyr_1.1.4                    
[15] SummarizedExperiment_1.36.0     Biobase_2.66.0                 
[17] GenomicRanges_1.58.0            GenomeInfoDb_1.42.3            
[19] IRanges_2.40.1                  S4Vectors_0.44.0               
[21] BiocGenerics_0.52.0             MatrixGenerics_1.18.1          
[23] matrixStats_1.5.0              

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3       rstudioapi_0.17.1        jsonlite_1.9.1          
  [4] magrittr_2.0.3           spatstat.utils_3.1-3     farver_2.1.2            
  [7] BiocIO_1.16.0            zlibbioc_1.52.0          vctrs_0.6.5             
 [10] ROCR_1.0-11              Rsamtools_2.22.0         spatstat.explore_3.3-4  
 [13] RCurl_1.98-1.16          htmltools_0.5.8.1        S4Arrays_1.6.0          
 [16] curl_6.2.1               SparseArray_1.6.2        sctransform_0.4.1       
 [19] parallelly_1.42.0        KernSmooth_2.23-26       htmlwidgets_1.6.4       
 [22] ica_1.0-3                plyr_1.8.9               plotly_4.10.4           
 [25] zoo_1.8-13               GenomicAlignments_1.42.0 igraph_2.1.4            
 [28] mime_0.13                lifecycle_1.0.4          pkgconfig_2.0.3         
 [31] Matrix_1.7-3             R6_2.6.1                 fastmap_1.2.0           
 [34] GenomeInfoDbData_1.2.13  fitdistrplus_1.2-2       future_1.34.0           
 [37] shiny_1.10.0             digest_0.6.37            colorspace_2.1-1        
 [40] patchwork_1.3.0          Seurat_5.2.1             tensor_1.5              
 [43] RSpectra_0.16-2          irlba_2.3.5.1            progressr_0.15.1        
 [46] fansi_1.0.6              spatstat.sparse_3.1-0    httr_1.4.7              
 [49] polyclip_1.10-7          abind_1.4-8              compiler_4.4.2          
 [52] withr_3.0.2              BiocParallel_1.40.0      fastDummies_1.7.5       
 [55] MASS_7.3-65              DelayedArray_0.32.0      rjson_0.2.23            
 [58] tools_4.4.2              lmtest_0.9-40            httpuv_1.6.15           
 [61] future.apply_1.11.3      goftest_1.2-3            glue_1.8.0              
 [64] InteractionSet_1.34.0    restfulr_0.0.15          nlme_3.1-167            
 [67] promises_1.3.2           grid_4.4.2               Rtsne_0.17              
 [70] cluster_2.1.8.1          reshape2_1.4.4           generics_0.1.3          
 [73] gtable_0.3.6             spatstat.data_3.1-6      tzdb_0.5.0              
 [76] preprocessCore_1.68.0    hms_1.1.3                data.table_1.17.0       
 [79] utf8_1.2.4               XVector_0.46.0           spatstat.geom_3.3-5     
 [82] RcppAnnoy_0.0.22         ggrepel_0.9.6            RANN_2.6.2              
 [85] pillar_1.10.1            stringr_1.5.1            spam_2.11-1             
 [88] RcppHNSW_0.6.0           later_1.4.1              splines_4.4.2           
 [91] lattice_0.22-6           rtracklayer_1.66.0       survival_3.8-3          
 [94] deldir_2.0-4             tidyselect_1.2.1         Biostrings_2.74.1       
 [97] miniUI_0.1.1.1           pbapply_1.7-2            gridExtra_2.3           
[100] scattermore_1.2          stringi_1.8.4            UCSC.utils_1.2.0        
[103] yaml_2.3.10              lazyeval_0.2.2           codetools_0.2-20        
[106] tibble_3.2.1             cli_3.6.4                uwot_0.2.3              
[109] xtable_1.8-4             reticulate_1.41.0.1      munsell_0.5.1           
[112] Rcpp_1.0.14              spatstat.random_3.3-2    globals_0.16.3          
[115] png_0.1-8                XML_3.99-0.18            spatstat.univar_3.1-2   
[118] parallel_4.4.2           ellipsis_0.3.2           readr_2.1.5             
[121] dotCall64_1.2            bitops_1.0-9             listenv_0.9.1           
[124] viridisLite_0.4.2        scales_1.3.0             ggridges_0.5.6          
[127] purrr_1.0.4              crayon_1.5.3             rlang_1.1.5             
[130] cowplot_1.1.3
GenomicRanges tidySummarizedExperiment tidyomics • 23 views