Hi,
I'm trying to perform a DTE test using Swish, as per the fishpond vignette, but I would like to collapse technical replicates for some samples using the collapseReplicates function from the DESeq2 package. First I read in the Salmon (v1.6.0) quantification data using the tximeta package, and then I try to run collapseReplicates on the resulting object. Unfortunately, I get an error message for which I cannot find a solution in the documentation or online forums (see code and output below). However, when I run summarizeToGene and convert to a DESeqDataSet, collapsing the technical replicates works without any problem. From the collapseReplicates documentation, it seems that it should also work with tximeta output. What course of action should I take to collapse the replicates? Or am I mistaken, and is it not possible unless the data are summarized to gene-level counts?
Thank you in advance.
> head(coldata, c(5,4))
               names ID group run
1  V300082504_L01_73 2T     T  73
2  V300082504_L01_75 3T     T  75
3  V300082504_L01_77 4T     T  77
4  V300082504_L01_78 4T     T  78
5 V300095289_L01_100 6T     T 100
> se = tximeta(coldata, type = 'salmon', )
importing quantifications
reading in files with read_tsv
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
found matching transcriptome:
[ GENCODE - Homo sapiens - release 38 ]
loading existing TxDb created: 2021-11-13 00:50:22
loading existing transcript ranges created: 2021-11-13 00:50:23
fetching genome info for GENCODE
> se
class: RangedSummarizedExperiment
dim: 236186 37
metadata(6): tximetaInfo quantInfo ... txomeInfo txdbInfo
assays(23): counts abundance ... infRep19 infRep20
rownames(236186): ENST00000456328.2 ENST00000450305.2 ... ENST00000387460.2 ENST00000387461.2
rowData names(3): tx_id gene_id tx_name
colnames(37): V300082504_L01_73 V300082504_L01_75 ... V300095289_L01_101 V300095289_L01_103
colData names(4): names ID group run
> se = collapseReplicates(object = se,
+ groupby = se$ID,
+ run = se$run,
+ renameCols = T)
Error in collapseReplicates(object = se, groupby = se$ID, run = se$run, :
sum(as.numeric(assay(object))) == sum(as.numeric(assay(collapsed))) is not TRUE
Session info:
> sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_Europe.utf8 LC_CTYPE=English_Europe.utf8 LC_MONETARY=English_Europe.utf8 LC_NUMERIC=C
[5] LC_TIME=English_Europe.utf8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] fishpond_2.4.1 GenomicFeatures_1.50.4 AnnotationDbi_1.60.0 DESeq2_1.38.3 SummarizedExperiment_1.28.0
[6] Biobase_2.58.0 MatrixGenerics_1.10.0 matrixStats_0.63.0 GenomicRanges_1.50.2 GenomeInfoDb_1.34.8
[11] IRanges_2.32.0 S4Vectors_0.36.1 BiocGenerics_0.44.0 tximeta_1.16.0 stringr_1.5.0
loaded via a namespace (and not attached):
[1] minqa_1.2.5 colorspace_2.0-3 rjson_0.2.21 ellipsis_0.3.2
[5] qvalue_2.30.0 XVector_0.38.0 rstudioapi_0.14 bit64_4.0.5
[9] interactiveDisplayBase_1.36.0 fansi_1.0.4 xml2_1.3.3 codetools_0.2-19
[13] splines_4.2.2 tximport_1.26.1 cachem_1.0.6 geneplotter_1.76.0
[17] jsonlite_1.8.4 nloptr_2.0.3 Rsamtools_2.14.0 annotate_1.76.0
[21] dbplyr_2.3.0 png_0.1-8 pheatmap_1.0.12 shiny_1.7.4
[25] readr_2.1.3 BiocManager_1.30.19 compiler_4.2.2 httr_1.4.4
[29] assertthat_0.2.1 Matrix_1.5-3 fastmap_1.1.0 lazyeval_0.2.2
[33] cli_3.6.0 later_1.3.0 htmltools_0.5.4 prettyunits_1.1.1
[37] tools_4.2.2 gtable_0.3.1 glue_1.6.2 GenomeInfoDbData_1.2.9
[41] reshape2_1.4.4 dplyr_1.1.0 rappdirs_0.3.3 Rcpp_1.0.10
[45] vctrs_0.5.2 Biostrings_2.66.0 nlme_3.1-162 rtracklayer_1.58.0
[49] lme4_1.1-31 mime_0.12 lifecycle_1.0.3 restfulr_0.0.15
[53] ensembldb_2.22.0 gtools_3.9.4 XML_3.99-0.13 AnnotationHub_3.6.0
[57] zlibbioc_1.44.0 MASS_7.3-58.2 scales_1.2.1 vroom_1.6.1
[61] hms_1.1.2 promises_1.2.0.1 ProtGenerics_1.30.0 parallel_4.2.2
[65] AnnotationFilter_1.22.0 RColorBrewer_1.1-3 SingleCellExperiment_1.20.0 yaml_2.3.7
[69] curl_5.0.0 memoise_2.0.1 ggplot2_3.4.0 biomaRt_2.54.0
[73] stringi_1.7.12 RSQLite_2.2.20 BiocVersion_3.16.0 BiocIO_1.8.0
[77] filelock_1.0.2 boot_1.3-28.1 BiocParallel_1.32.5 rlang_1.0.6
[81] pkgconfig_2.0.3 bitops_1.0-7 archive_1.1.5 lattice_0.20-45
[85] purrr_1.0.1 GenomicAlignments_1.34.0 bit_4.0.5 tidyselect_1.2.0
[89] plyr_1.8.8 magrittr_2.0.3 R6_2.5.1 generics_0.1.3
[93] DelayedArray_0.24.0 DBI_1.1.3 withr_2.5.0 pillar_1.8.1
[97] svMisc_1.2.3 abind_1.4-5 KEGGREST_1.38.0 RCurl_1.98-1.10
[101] tibble_3.1.8 crayon_1.5.2 utf8_1.2.3 BiocFileCache_2.6.0
[105] tzdb_0.3.0 progress_1.2.2 locfit_1.5-9.7 grid_4.2.2
[109] blob_1.2.3 digest_0.6.31 xtable_1.8-4 httpuv_1.6.8
[113] munsell_0.5.0
I completely forgot about the other assays I would need for Swish.
Thank you, Michael!
A note: I've now changed collapseReplicates() so that it outputs a loud warning when it is run on a DESeqDataSet (SummarizedExperiment) with more than just a "counts" assay, and furthermore it will drop the other assays. This is to warn the user that other assays require some other form of collapsing (it's not safe to assume they should be summed, e.g. gene lengths don't sum, nor do TPM).
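To illustrate the point about summing, here is a minimal base R sketch on toy data. It is not the DESeq2 implementation: `collapse_sum` is a hypothetical helper name, and the matrix, run names and grouping are made up for the example. It shows that count-scale values can be summed across technical replicates with the total preserved, whereas summing TPM columns gives a column that is no longer on the TPM scale.

```r
# Toy data (made up): 3 transcripts x 4 runs.
# run3 and run4 are technical replicates of the same sample, "4T".
counts <- matrix(c(10.5, 0, 3.2,
                    7.1, 2, 0,
                    4.4, 1, 6,
                    5.6, 3, 2),
                 nrow = 3,
                 dimnames = list(paste0("tx", 1:3), paste0("run", 1:4)))
groupby <- factor(c("2T", "3T", "4T", "4T"))

# Hypothetical helper (not part of DESeq2 or fishpond):
# sum the columns of a matrix within each level of `groupby`.
collapse_sum <- function(mat, groupby) {
  # rowsum() sums rows by group, so transpose going in and coming out
  t(rowsum(t(mat), group = groupby))
}

# Count-scale data: summing replicates preserves the overall total
collapsed_counts <- collapse_sum(counts, groupby)
stopifnot(isTRUE(all.equal(sum(counts), sum(collapsed_counts))))

# TPM: every original column sums to 1e6 by construction
tpm <- sweep(counts, 2, colSums(counts), "/") * 1e6

# Naively summing TPM across replicates breaks that property:
# the collapsed "4T" column now totals about 2e6 instead of 1e6
summed_tpm <- collapse_sum(tpm, groupby)
print(colSums(summed_tpm))
```

For a tximeta object, the same kind of helper could in principle be applied to each count-scale assay separately, but which assays are safe to sum, and how to combine the others (abundance, length), is a decision the thread leaves to the user.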