How to left_join a tidySummarizedExperiment with a GRanges object by seqnames and start (tidyverse-style)?
Kateřina • 0

Hello, I'm working with a ranged tidySummarizedExperiment and have a separate GRanges object containing metadata I'd like to integrate. I'd like to left_join() the two, matching on seqnames and start (the start values are identical in both objects). Is there a tidyverse-native way to perform this join, similar to joining two GRanges objects with plyranges, ideally without converting both the tidySummarizedExperiment and the GRanges object to tibbles and back again? I'd love to keep everything within the tidy grammar and not break the abstraction if possible. Calling left_join() doesn't seem to work, even after converting the GRanges object to a tibble (probably because seqnames and start are view-only variables?), but maybe I'm just missing something.


library(SummarizedExperiment)
library(GenomicRanges)
library(tidyomics)
library(dplyr)

gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(100, 200, 300), width = 50),
  strand = c("+", "-", "+")
)

assay_mat <- matrix(1:9, ncol = 3)
colnames(assay_mat) <- c("Sample1", "Sample2", "Sample3")

se <- SummarizedExperiment(
  assays = list(counts = assay_mat),
  rowRanges = gr
)

gr_annot <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges = IRanges(start = c(100, 300), width = 1),
  strand = c("+", "+"),
  gene_name = c("GeneA", "GeneB")
)

gr_annot_tb <- se |> left_join(as_tibble(gr_annot))


Error in `join_function()`:
`by` must be supplied when `x` and `y` have no common variables.
Use `cross_join()` to perform a cross-join.
Run `rlang::last_trace()` to see where the error occurred.

R version 4.4.2 (2024-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.1.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Prague
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] nullranges_1.12.0               plyranges_1.26.0               
 [3] tidybulk_1.18.0                 tidyseurat_0.8.0               
 [5] SeuratObject_5.0.2              sp_2.2-0                       
 [7] tidySingleCellExperiment_1.16.0 SingleCellExperiment_1.28.1    
 [9] tidySummarizedExperiment_1.16.0 ttservice_0.4.1                
[11] ggplot2_3.5.1                   tidyr_1.3.1                    
[13] tidyomics_1.2.0                 dplyr_1.1.4                    
[15] SummarizedExperiment_1.36.0     Biobase_2.66.0                 
[17] GenomicRanges_1.58.0            GenomeInfoDb_1.42.3            
[19] IRanges_2.40.1                  S4Vectors_0.44.0               
[21] BiocGenerics_0.52.0             MatrixGenerics_1.18.1          
[23] matrixStats_1.5.0              

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3       rstudioapi_0.17.1        jsonlite_1.9.1          
  [4] magrittr_2.0.3           spatstat.utils_3.1-3     farver_2.1.2            
  [7] BiocIO_1.16.0            zlibbioc_1.52.0          vctrs_0.6.5             
 [10] ROCR_1.0-11              Rsamtools_2.22.0         spatstat.explore_3.3-4  
 [13] RCurl_1.98-1.16          htmltools_0.5.8.1        S4Arrays_1.6.0          
 [16] curl_6.2.1               SparseArray_1.6.2        sctransform_0.4.1       
 [19] parallelly_1.42.0        KernSmooth_2.23-26       htmlwidgets_1.6.4       
 [22] ica_1.0-3                plyr_1.8.9               plotly_4.10.4           
 [25] zoo_1.8-13               GenomicAlignments_1.42.0 igraph_2.1.4            
 [28] mime_0.13                lifecycle_1.0.4          pkgconfig_2.0.3         
 [31] Matrix_1.7-3             R6_2.6.1                 fastmap_1.2.0           
 [34] GenomeInfoDbData_1.2.13  fitdistrplus_1.2-2       future_1.34.0           
 [37] shiny_1.10.0             digest_0.6.37            colorspace_2.1-1        
 [40] patchwork_1.3.0          Seurat_5.2.1             tensor_1.5              
 [43] RSpectra_0.16-2          irlba_2.3.5.1            progressr_0.15.1        
 [46] fansi_1.0.6              spatstat.sparse_3.1-0    httr_1.4.7              
 [49] polyclip_1.10-7          abind_1.4-8              compiler_4.4.2          
 [52] withr_3.0.2              BiocParallel_1.40.0      fastDummies_1.7.5       
 [55] MASS_7.3-65              DelayedArray_0.32.0      rjson_0.2.23            
 [58] tools_4.4.2              lmtest_0.9-40            httpuv_1.6.15           
 [61] future.apply_1.11.3      goftest_1.2-3            glue_1.8.0              
 [64] InteractionSet_1.34.0    restfulr_0.0.15          nlme_3.1-167            
 [67] promises_1.3.2           grid_4.4.2               Rtsne_0.17              
 [70] cluster_2.1.8.1          reshape2_1.4.4           generics_0.1.3          
 [73] gtable_0.3.6             spatstat.data_3.1-6      tzdb_0.5.0              
 [76] preprocessCore_1.68.0    hms_1.1.3                data.table_1.17.0       
 [79] utf8_1.2.4               XVector_0.46.0           spatstat.geom_3.3-5     
 [82] RcppAnnoy_0.0.22         ggrepel_0.9.6            RANN_2.6.2              
 [85] pillar_1.10.1            stringr_1.5.1            spam_2.11-1             
 [88] RcppHNSW_0.6.0           later_1.4.1              splines_4.4.2           
 [91] lattice_0.22-6           rtracklayer_1.66.0       survival_3.8-3          
 [94] deldir_2.0-4             tidyselect_1.2.1         Biostrings_2.74.1       
 [97] miniUI_0.1.1.1           pbapply_1.7-2            gridExtra_2.3           
[100] scattermore_1.2          stringi_1.8.4            UCSC.utils_1.2.0        
[103] yaml_2.3.10              lazyeval_0.2.2           codetools_0.2-20        
[106] tibble_3.2.1             cli_3.6.4                uwot_0.2.3              
[109] xtable_1.8-4             reticulate_1.41.0.1      munsell_0.5.1           
[112] Rcpp_1.0.14              spatstat.random_3.3-2    globals_0.16.3          
[115] png_0.1-8                XML_3.99-0.18            spatstat.univar_3.1-2   
[118] parallel_4.4.2           ellipsis_0.3.2           readr_2.1.5             
[121] dotCall64_1.2            bitops_1.0-9             listenv_0.9.1           
[124] viridisLite_0.4.2        scales_1.3.0             ggridges_0.5.6          
[127] purrr_1.0.4              crayon_1.5.3             rlang_1.1.5             
[130] cowplot_1.1.3
GenomicRanges tidySummarizedExperiment tidyomics

Have you figured out how to do this using tidy grammar? Using non-tidy grammar is quite simple, and perhaps you already know how to do that. But if not, let us know. I can't help with tidy grammar, but I can help with conventional methods.
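For reference, one conventional (non-tidy) route might look like the sketch below. It rebuilds the example objects from the question, forms a "seqnames:start" key for each object, and uses match() to copy gene_name into the rowData of the SummarizedExperiment. The names key_se, key_annot and idx are illustrative only, and this is a sketch under the assumption that seqnames plus start identifies each annotation row; it is not the tidy-grammar solution asked for.

```r
library(SummarizedExperiment)
library(GenomicRanges)

# Example objects, as in the question
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(100, 200, 300), width = 50),
  strand = c("+", "-", "+")
)
assay_mat <- matrix(1:9, ncol = 3)
colnames(assay_mat) <- c("Sample1", "Sample2", "Sample3")
se <- SummarizedExperiment(assays = list(counts = assay_mat), rowRanges = gr)

gr_annot <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges = IRanges(start = c(100, 300), width = 1),
  strand = c("+", "+"),
  gene_name = c("GeneA", "GeneB")
)

# Build a "seqnames:start" key for each object (illustrative helper names)
key_se <- paste(as.character(seqnames(rowRanges(se))),
                start(rowRanges(se)), sep = ":")
key_annot <- paste(as.character(seqnames(gr_annot)),
                   start(gr_annot), sep = ":")

# For each feature in se: index of the first matching annotation row, NA if none
idx <- match(key_se, key_annot)

# Left-join behaviour: every feature is kept, unmatched features get NA
rowData(se)$gene_name <- mcols(gr_annot)$gene_name[idx]
rowData(se)$gene_name
```

Note that this ignores strand and end, and that match() keeps only the first hit, so duplicated seqnames/start pairs in the annotation would need separate handling.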
