Hello, I'm working with a ranged tidySummarizedExperiment and I have a separate GRanges object containing metadata that I'd like to integrate. I'd like to perform a left_join() between the two, matching on seqnames and start, where the start values are identical between the two objects.

Is there a way to perform this join in a tidyverse-native way (similar to joining two GRanges objects with plyranges), ideally without converting both the tidySummarizedExperiment and the GRanges object to tibbles and back again? I'd love to keep everything within the tidy grammar and not break the abstraction if possible. Using left_join() doesn't seem to work, even after converting the GRanges object to a tibble (probably because seqnames and start are view-only variables?), but maybe I'm just missing something.
library(SummarizedExperiment)
library(GenomicRanges)
library(tidyomics)
library(dplyr)
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(100, 200, 300), width = 50),
  strand = c("+", "-", "+")
)
assay_mat <- matrix(1:9, ncol = 3)
colnames(assay_mat) <- c("Sample1", "Sample2", "Sample3")
se <- SummarizedExperiment(
  assays = list(counts = assay_mat),
  rowRanges = gr
)
gr_annot <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges = IRanges(start = c(100, 300), width = 1),
  strand = c("+", "+"),
  gene_name = c("GeneA", "GeneB")
)
gr_annot_tb <- se |> left_join(as_tibble(gr_annot))
Error in `join_function()`:
`by` must be supplied when `x` and `y` have no common variables.
Use `cross_join()` to perform a cross-join.
Run `rlang::last_trace()` to see where the error occurred.
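To make the expected result concrete, the outcome I'm after can be reproduced outside the tidy grammar with base Bioconductor accessors. This is only a minimal sketch of the intended result, not the tidy-native join asked about above. The names `key_se`, `key_annot`, and `idx` are made up for illustration, and the sketch assumes each seqnames/start pair occurs at most once in `gr_annot`.

```r
library(SummarizedExperiment)
library(GenomicRanges)

# Same toy objects as in the reprex above
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(100, 200, 300), width = 50),
  strand = c("+", "-", "+")
)
assay_mat <- matrix(1:9, ncol = 3)
colnames(assay_mat) <- c("Sample1", "Sample2", "Sample3")
se <- SummarizedExperiment(
  assays = list(counts = assay_mat),
  rowRanges = gr
)
gr_annot <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges = IRanges(start = c(100, 300), width = 1),
  strand = c("+", "+"),
  gene_name = c("GeneA", "GeneB")
)

# Build a "seqnames:start" key for each feature and each annotation range
# (hypothetical helper names, chosen for this sketch only)
key_se <- paste(as.character(seqnames(rowRanges(se))), start(rowRanges(se)), sep = ":")
key_annot <- paste(as.character(seqnames(gr_annot)), start(gr_annot), sep = ":")

# Left-join semantics: every feature is kept, unmatched features get NA
idx <- match(key_se, key_annot)
rowData(se)$gene_name <- gr_annot$gene_name[idx]

rowData(se)$gene_name
# "GeneA" NA "GeneB"
```

The feature at chr1:200 has no counterpart in `gr_annot`, so its `gene_name` is NA, which is the behaviour I would expect from a left join keyed on seqnames and start.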
R version 4.4.2 (2024-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.1.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Prague
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] nullranges_1.12.0 plyranges_1.26.0
[3] tidybulk_1.18.0 tidyseurat_0.8.0
[5] SeuratObject_5.0.2 sp_2.2-0
[7] tidySingleCellExperiment_1.16.0 SingleCellExperiment_1.28.1
[9] tidySummarizedExperiment_1.16.0 ttservice_0.4.1
[11] ggplot2_3.5.1 tidyr_1.3.1
[13] tidyomics_1.2.0 dplyr_1.1.4
[15] SummarizedExperiment_1.36.0 Biobase_2.66.0
[17] GenomicRanges_1.58.0 GenomeInfoDb_1.42.3
[19] IRanges_2.40.1 S4Vectors_0.44.0
[21] BiocGenerics_0.52.0 MatrixGenerics_1.18.1
[23] matrixStats_1.5.0
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 rstudioapi_0.17.1 jsonlite_1.9.1
[4] magrittr_2.0.3 spatstat.utils_3.1-3 farver_2.1.2
[7] BiocIO_1.16.0 zlibbioc_1.52.0 vctrs_0.6.5
[10] ROCR_1.0-11 Rsamtools_2.22.0 spatstat.explore_3.3-4
[13] RCurl_1.98-1.16 htmltools_0.5.8.1 S4Arrays_1.6.0
[16] curl_6.2.1 SparseArray_1.6.2 sctransform_0.4.1
[19] parallelly_1.42.0 KernSmooth_2.23-26 htmlwidgets_1.6.4
[22] ica_1.0-3 plyr_1.8.9 plotly_4.10.4
[25] zoo_1.8-13 GenomicAlignments_1.42.0 igraph_2.1.4
[28] mime_0.13 lifecycle_1.0.4 pkgconfig_2.0.3
[31] Matrix_1.7-3 R6_2.6.1 fastmap_1.2.0
[34] GenomeInfoDbData_1.2.13 fitdistrplus_1.2-2 future_1.34.0
[37] shiny_1.10.0 digest_0.6.37 colorspace_2.1-1
[40] patchwork_1.3.0 Seurat_5.2.1 tensor_1.5
[43] RSpectra_0.16-2 irlba_2.3.5.1 progressr_0.15.1
[46] fansi_1.0.6 spatstat.sparse_3.1-0 httr_1.4.7
[49] polyclip_1.10-7 abind_1.4-8 compiler_4.4.2
[52] withr_3.0.2 BiocParallel_1.40.0 fastDummies_1.7.5
[55] MASS_7.3-65 DelayedArray_0.32.0 rjson_0.2.23
[58] tools_4.4.2 lmtest_0.9-40 httpuv_1.6.15
[61] future.apply_1.11.3 goftest_1.2-3 glue_1.8.0
[64] InteractionSet_1.34.0 restfulr_0.0.15 nlme_3.1-167
[67] promises_1.3.2 grid_4.4.2 Rtsne_0.17
[70] cluster_2.1.8.1 reshape2_1.4.4 generics_0.1.3
[73] gtable_0.3.6 spatstat.data_3.1-6 tzdb_0.5.0
[76] preprocessCore_1.68.0 hms_1.1.3 data.table_1.17.0
[79] utf8_1.2.4 XVector_0.46.0 spatstat.geom_3.3-5
[82] RcppAnnoy_0.0.22 ggrepel_0.9.6 RANN_2.6.2
[85] pillar_1.10.1 stringr_1.5.1 spam_2.11-1
[88] RcppHNSW_0.6.0 later_1.4.1 splines_4.4.2
[91] lattice_0.22-6 rtracklayer_1.66.0 survival_3.8-3
[94] deldir_2.0-4 tidyselect_1.2.1 Biostrings_2.74.1
[97] miniUI_0.1.1.1 pbapply_1.7-2 gridExtra_2.3
[100] scattermore_1.2 stringi_1.8.4 UCSC.utils_1.2.0
[103] yaml_2.3.10 lazyeval_0.2.2 codetools_0.2-20
[106] tibble_3.2.1 cli_3.6.4 uwot_0.2.3
[109] xtable_1.8-4 reticulate_1.41.0.1 munsell_0.5.1
[112] Rcpp_1.0.14 spatstat.random_3.3-2 globals_0.16.3
[115] png_0.1-8 XML_3.99-0.18 spatstat.univar_3.1-2
[118] parallel_4.4.2 ellipsis_0.3.2 readr_2.1.5
[121] dotCall64_1.2 bitops_1.0-9 listenv_0.9.1
[124] viridisLite_0.4.2 scales_1.3.0 ggridges_0.5.6
[127] purrr_1.0.4 crayon_1.5.3 rlang_1.1.5
[130] cowplot_1.1.3