Hello, I am trying to run barcodeRanks for a matrix and I continuously get this error. I have tried:
mtx = readMM("/Solo.out/GeneFull/raw/matrix.mtx.gz")
rownames(mtx) = read.table("/Solo.out/GeneFull/raw/features.tsv.gz")$V2
colnames(mtx) = read.table("/Solo.out/GeneFull/raw/barcodes.tsv.gz")$V1
str(mtx)
Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:642] 366 214795 33 33 33 33 33 456 33 33 ...
..@ j : int [1:642] 0 2697 6910 7051 8743 11437 12341 12434 12717 19581 ...
..@ Dim : int [1:2] 233275 737280
..@ Dimnames:List of 2
.. ..$ : chr [1:233275] "ENST00000456328.2" "ENST00000619216.1" "ENST00000473358.1" "ENST00000469289.1" ...
.. ..$ : chr [1:737280] "AAACAACGAAAACATG" "AAACAACGAAAACTGA" "AAACAACGAAAAGCTA" "AAACAACGAAAAGGTT" ...
..@ x : num [1:642] 1 1 1 1 1 1 1 1 1 2 ...
..@ factors : list()
brOut = barcodeRanks(mtx)
Error in .local(m, ...) :
insufficient unique points for computing knee/inflection points
sessionInfo( )
R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Rocky Linux 8.8 (Green Obsidian)
Matrix products: default
BLAS/LAPACK: /g/easybuild/x86_64/Rocky/8/haswell/software/FlexiBLAS/3.2.1-GCC-12.2.0/lib64/libflexiblas.so.3.2
locale:
[1] LC_CTYPE=en_IE.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_IE.UTF-8 LC_COLLATE=en_IE.UTF-8
[5] LC_MONETARY=en_IE.UTF-8 LC_MESSAGES=en_IE.UTF-8
[7] LC_PAPER=en_IE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] patchwork_1.1.2 pbmcapply_1.5.1
[3] DropletUtils_1.18.1 SingleCellExperiment_1.20.1
[5] SummarizedExperiment_1.28.0 Biobase_2.58.0
[7] GenomicRanges_1.50.2 GenomeInfoDb_1.34.9
[9] IRanges_2.32.0 S4Vectors_0.36.2
[11] BiocGenerics_0.44.0 MatrixGenerics_1.10.0
[13] matrixStats_0.63.0 ggplot2_3.4.1
[15] Matrix_1.5-3
loaded via a namespace (and not attached):
[1] tidyselect_1.2.0 locfit_1.5-9.7
[3] beachmat_2.14.2 HDF5Array_1.26.0
[5] lattice_0.20-45 rhdf5_2.42.1
[7] colorspace_2.1-0 vctrs_0.6.0
[9] generics_0.1.3 utf8_1.2.3
[11] rlang_1.1.0 R.oo_1.25.0
[13] pillar_1.8.1 glue_1.6.2
[15] scuttle_1.8.4 withr_2.5.0
[17] R.utils_2.12.2 BiocParallel_1.32.6
[19] dqrng_0.3.0 GenomeInfoDbData_1.2.9
[21] lifecycle_1.0.3 zlibbioc_1.44.0
[23] munsell_0.5.0 gtable_0.3.1
[25] R.methodsS3_1.8.2 codetools_0.2-19
[27] fansi_1.0.4 Rcpp_1.0.10
[29] edgeR_3.40.2 scales_1.2.1
[31] limma_3.54.2 DelayedArray_0.24.0
[33] XVector_0.38.0 dplyr_1.1.0
[35] grid_4.2.2 cli_3.6.0
[37] tools_4.2.2 bitops_1.0-7
[39] rhdf5filters_1.10.1 magrittr_2.0.3
[41] RCurl_1.98-1.10 tibble_3.2.0
[43] pkgconfig_2.0.3 DelayedMatrixStats_1.20.0
[45] sparseMatrixStats_1.10.0 Rhdf5lib_1.20.0
[47] R6_2.5.1 compiler_4.2.2
Why do I get this error? Could you walk me through a solution?
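One way to see where the error might come from is to look at the per-barcode totals. The str() output above shows only 642 non-zero entries across 737280 barcodes, so almost every barcode has a total of 0 or 1. The sketch below uses a toy matrix (not the real data) to show the check. It assumes that barcodeRanks() only fits the knee/inflection on totals above its `lower` argument, and that the default is lower = 100; `toy`, `totals` and `n_unique_above` are illustrative names.

```r
library(Matrix)

# Toy stand-in for the real matrix: 1000 features x 5000 barcodes with only
# 40 non-zero contributions, mimicking the extreme sparsity seen in str(mtx).
set.seed(1)
toy <- sparseMatrix(
  i = sample(1000, 40, replace = TRUE),
  j = sample(5000, 40, replace = TRUE),
  x = rep(1, 40),
  dims = c(1000, 5000)
)

totals <- Matrix::colSums(toy)  # total counts per barcode
lower <- 100                    # assumed default of barcodeRanks(lower = )

# Number of distinct total values above `lower`; the knee/inflection
# fit needs several such values to work with.
n_unique_above <- length(unique(totals[totals > lower]))

cat("non-zero entries:", Matrix::nnzero(toy), "\n")
cat("largest barcode total:", max(totals), "\n")
cat("unique totals above lower:", n_unique_above, "\n")
```

Running the same three checks on the real `mtx` should show whether any barcode comes close to `lower`. If none does, the matrix itself is probably the problem rather than barcodeRanks().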
Before doing anything else, please try to read the data with DropletUtils::read10xCounts() and repeat the erroneous step. See if that helps. If not, we can work out how to debug it. I see transcripts rather than genes as rownames, why is that? Did you do tx-level quantification?
-Before doing anything else, please try to read the data with DropletUtils::read10xCounts() and repeat the erroneous step.
I get the same error.
-I see transcripts rather than genes as rownames, why is that?
No specific reason, but could be changed.
-Did you do tx-level quantification?
Quantification was performed with STARsolo --quantMode GeneCounts.
Thank you for the insights.
Running GeneCounts and seeing transcripts as row names indicates to me that something is wrong here. You should have a gene-by-sample matrix, not a transcript-by-sample matrix. I would start by resolving that mismatch.
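A quick way to confirm the mismatch is to classify the row identifiers by their Ensembl prefix (ENSG for genes, ENST for transcripts). The sketch below is only an illustration: `classify_ensembl_ids` is a hypothetical helper, the first two IDs are taken from the str() output above, and the third is an example gene ID added for contrast. In practice the vector would be rownames(mtx) or a column of features.tsv.gz.

```r
# Hypothetical helper: label each identifier by its Ensembl prefix.
classify_ensembl_ids <- function(ids) {
  ifelse(grepl("^ENSG", ids), "gene",
         ifelse(grepl("^ENST", ids), "transcript", "other"))
}

# Two IDs from the post plus one illustrative gene ID (an assumption,
# not part of the original data).
ids <- c("ENST00000456328.2", "ENST00000619216.1", "ENSG00000223972.5")

id_type <- classify_ensembl_ids(ids)
print(table(id_type))
```

If nearly all identifiers come back as "transcript", the features file was not produced by a gene-level quantification, and it is worth checking the annotation and index given to STARsolo before retrying barcodeRanks().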