Hello,
I am trying to re-analyze public data from this repository with DESeq2: https://github.com/Ellen1101/OA-subtype/tree/master
The count matrix is in the "cartilage.count.201905113.zip" file, and the metadata is in the "cart_ann_20190513.txt" file. I used the code below to load the data. At the very first step, when I try to collapse technical replicates, I get a dimension error, as if one matrix were transposed the wrong way:
Error in `[[<-`(`*tmp*`, name, value = new("DESeqDataSet", design = ~Group, :
  58684 elements in value to replace 232 elements
What can I do to proceed?
Here is the code:
counts = read.table(file = "Yuan2020/cartilage.count.201905113.txt",
                    header = TRUE)
counts.matrix = counts %>%
  select(c(7:238))
rownames(counts.matrix) = counts$Geneid

metadata = read.table(file = "Yuan2020/cart_ann_20190513.txt",
                      header = TRUE,
                      na.strings = "N",
                      stringsAsFactors = TRUE
                      ) %>%
  mutate(runID = libID_cart02) %>%
  mutate(sampleID = as.factor(sampleID)) %>%
  select(-c("libID_cart02")) # simplify name

length(levels(metadata$sampleID)) # 217 samples for 232 runs
table(metadata$sampleID) # some samples have been run twice
table(metadata[,"runID"]) # OK, this is unique

### Load to DESeq
ddsFullCountTable <- DESeqDataSetFromMatrix(countData = counts.matrix,
                                            colData = metadata,
                                            design = ~ Group)
ddsFullCountTable$
# Collapse replicates
ddsCollapsed <- collapseReplicates(ddsFullCountTable,
                                   groupby = metadata$sampleID, run = metadata$runID)

sessionInfo()
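For context, what I expect the collapsing step to do is sum the run columns that share a sampleID, so that the 232 run columns become 217 sample columns while the gene rows stay unchanged. Here is a minimal sketch of that expectation in base R on toy data. The names `counts_toy` and `sample_id` are made up for illustration, and this is only my understanding of the operation, not the DESeq2 implementation:

```r
# Toy count matrix: 3 genes (rows) x 4 runs (columns)
counts_toy <- matrix(1:12, nrow = 3,
                     dimnames = list(paste0("gene", 1:3),
                                     paste0("run", 1:4)))

# Hypothetical grouping: run1 and run2 are technical replicates of sample A
sample_id <- factor(c("A", "A", "B", "C"))

# The grouping factor must have one entry per column (run)
stopifnot(ncol(counts_toy) == length(sample_id))

# Sum the columns within each sample: rowsum() sums rows by group,
# so transpose first, then transpose back to genes x samples
collapsed <- t(rowsum(t(counts_toy), group = sample_id))

dim(collapsed)  # genes x samples
```

If the real data behaved like this, the collapsed object would keep one row per gene and have one column per level of sampleID.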
And here is the session info:
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C LC_TIME=English_United States.utf8
time zone: Europe/Paris
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] DESeq2_1.40.2 tidyselect_1.2.1 dplyr_1.1.4 SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0
[6] Biobase_2.62.0 GenomicRanges_1.54.1 GenomeInfoDb_1.38.7 IRanges_2.36.0 S4Vectors_0.40.2
[11] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 matrixStats_1.2.0
loaded via a namespace (and not attached):
[1] gtable_0.3.4 xfun_0.42 ggplot2_3.5.0 htmlwidgets_1.6.4 rstatix_0.7.2 lattice_0.22-5
[7] vctrs_0.6.5 tools_4.3.2 bitops_1.0-7 generics_0.1.3 parallel_4.3.2 tibble_3.2.1
[13] fansi_1.0.6 pkgconfig_2.0.3 Matrix_1.6-5 data.table_1.15.2 lifecycle_1.0.4 GenomeInfoDbData_1.2.11
[19] compiler_4.3.2 munsell_0.5.0 codetools_0.2-19 carData_3.0-5 htmltools_0.5.7 RCurl_1.98-1.14
[25] yaml_2.3.8 lazyeval_0.2.2 plotly_4.10.4 pillar_1.9.0 car_3.1-2 ggpubr_0.6.0
[31] crayon_1.5.2 tidyr_1.3.1 BiocParallel_1.36.0 DelayedArray_0.28.0 sessioninfo_1.2.2 abind_1.4-5
[37] locfit_1.5-9.9 digest_0.6.35 purrr_1.0.2 fastmap_1.1.1 grid_4.3.2 SparseArray_1.2.4
[43] colorspace_2.1-0 cli_3.6.2 magrittr_2.0.3 S4Arrays_1.2.1 utf8_1.2.4 broom_1.0.5
[49] withr_3.0.0 scales_1.3.0 backports_1.4.1 rmarkdown_2.26 XVector_0.42.0 httr_1.4.7
[55] ggsignif_0.6.4 evaluate_0.23 knitr_1.45 viridisLite_0.4.2 rlang_1.1.3 Rcpp_1.0.12
[61] glue_1.7.0 pROC_1.18.5 pkgload_1.3.4 rstudioapi_0.15.0 jsonlite_1.8.8 R6_2.5.1
[67] plyr_1.8.9 zlibbioc_1.48.0
Thanks!