Hi everyone,
I'm trying to run a DESeq2 analysis on expression data. The analysis works for some genes but fails for a few others.
In the code below, expression_gene is a matrix with genes as rownames, samples as colnames, and raw read counts (integers) as values. colData is a data frame with a single column, cases (TRUE or FALSE), and the analysed samples as rownames.
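library(DESeq2)
# cases is TRUE for samples listed in c_patients, FALSE otherwise
colData <- data.frame(cases = factor(colnames(expression_gene) %in% c_patients), row.names = colnames(expression_gene))
# The names in this vector are ignored: levels are renamed by position (first level -> "DELETED")
levels(colData$cases) <- c("FALSE" = "DELETED", "TRUE" = "DIPLOID")
dds <- DESeqDataSetFromMatrix(countData = expression_gene, colData = colData, design = ~ cases)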
And the error:
Error in DESeqDataSet(se, design = design, ignoreRank) : counts matrix should be numeric, currently it has mode : logical
Calls: DESeqDataSetFromMatrix -> DESeqDataSet
Execution halted.
For that gene (let's call it X) the error pops up, but for a bunch of other genes, whose counts matrices have exactly the same format, it runs without complaint. What I do not understand is why the counts matrix is reported as logical for these few genes, while for the rest it is taken as numeric.
Thanks in advance for your help.
sessionInfo( )
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=es_ES.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.0 stringr_1.4.0 dplyr_1.0.2 purrr_0.3.4
[5] readr_1.4.0 tidyr_1.1.2 tibble_3.0.4 ggplot2_3.3.2
[9] tidyverse_1.3.0 DESeq2_1.28.1 SummarizedExperiment_1.18.2 DelayedArray_0.14.1
[13] matrixStats_0.57.0 Biobase_2.48.0 GenomicRanges_1.40.0 GenomeInfoDb_1.24.2
[17] IRanges_2.22.2 S4Vectors_0.26.1 BiocGenerics_0.34.0
loaded via a namespace (and not attached):
[1] httr_1.4.2 bit64_4.0.5 jsonlite_1.7.1 splines_4.0.3 modelr_0.1.8
[6] assertthat_0.2.1 blob_1.2.1 GenomeInfoDbData_1.2.3 cellranger_1.1.0 pillar_1.4.6
[11] RSQLite_2.2.1 backports_1.1.10 lattice_0.20-41 glue_1.4.2 digest_0.6.25
[16] RColorBrewer_1.1-2 XVector_0.28.0 rvest_0.3.6 colorspace_1.4-1 Matrix_1.2-18
[21] XML_3.99-0.5 pkgconfig_2.0.3 broom_0.7.2 haven_2.3.1 genefilter_1.70.0
[26] zlibbioc_1.34.0 xtable_1.8-4 scales_1.1.1 BiocParallel_1.22.0 annotate_1.66.0
[31] generics_0.0.2 ellipsis_0.3.1 withr_2.3.0 cli_2.1.0 survival_3.2-7
[36] magrittr_1.5 crayon_1.3.4 readxl_1.3.1 memoise_1.1.0 fansi_0.4.1
[41] fs_1.5.0 xml2_1.3.2 tools_4.0.3 hms_0.5.3 lifecycle_0.2.0
[46] munsell_0.5.0 reprex_0.3.0 locfit_1.5-9.4 AnnotationDbi_1.50.3 compiler_4.0.3
[51] tinytex_0.26 rlang_0.4.8 grid_4.0.3 RCurl_1.98-1.2 rstudioapi_0.11
[56] bitops_1.0-6 gtable_0.3.0 DBI_1.1.0 R6_2.4.1 lubridate_1.7.9
[61] bit_4.0.4 stringi_1.5.3 Rcpp_1.0.5 vctrs_0.3.4 geneplotter_1.66.0
[66] dbplyr_1.4.4 tidyselect_1.1.0 xfun_0.18
What is the output of head(expression_gene)?
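It may also help to check the mode of the matrix directly. The sketch below is only a guess at one possible cause, not a diagnosis: in R, a matrix that contains nothing but NA has mode "logical", which would match the error message. toy_na and toy_ok are made-up stand-ins for expression_gene, not your data.

```r
# Hypothetical stand-in for a failing gene: every count is NA
toy_na <- matrix(NA, nrow = 1, ncol = 4,
                 dimnames = list("X", paste0("sample", 1:4)))

# Hypothetical stand-in for a working gene: integer counts
toy_ok <- matrix(c(10L, 0L, 25L, 3L), nrow = 1,
                 dimnames = dimnames(toy_na))

mode(toy_na)    # "logical": an all-NA matrix is not numeric
anyNA(toy_na)   # TRUE
mode(toy_ok)    # "numeric"
anyNA(toy_ok)   # FALSE

# The same checks on the real object would be:
# mode(expression_gene); anyNA(expression_gene); str(expression_gene)
```

If mode(expression_gene) comes back as "logical" for gene X, looking at how that row was read in or subset would be the next step.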