DESeq2 "Not full rank" and "less than full rank" error
@c38f6201
Last seen 1 day ago
Germany

Hi all, I would like to use DESeq2 to analyze bulk RNA-seq data and I have the following coldata:

group        donor   id
wt_ctrl      m1      wt_ctrl_m1
wt_treated   m1      wt_treated_m1
wt_ctrl      m2      wt_ctrl_m2
wt_treated   m2      wt_treated_m2
ko_treated   m3      ko_treated_m3
ko_ctrl      m3      ko_ctrl_m3
ko_ctrl      m4      ko_ctrl_m4
ko_treated   m4      ko_treated_m4
ko_ctrl      m5      ko_ctrl_m5
wt_treated   m6      wt_treated_m6
wt_ctrl      m6      wt_ctrl_m6
ko_treated   m5      ko_treated_m5
ko_ctrl      m7      ko_ctrl_m7
ko_treated   m7      ko_treated_m7
wt_treated   m8      wt_treated_m8
wt_ctrl      m8      wt_ctrl_m8
wt_treated   m9      wt_treated_m9
wt_ctrl      m9      wt_ctrl_m9
ko_treated   m10     ko_treated_m10
ko_ctrl      m10     ko_ctrl_m10

To run the DE analysis, I tried to create the dds object:

dds <- DESeqDataSet(gse1, design = ~donor + group)

but I got the following error:

Error in checkFullRank(modelMatrix) : 
  the model matrix is not full rank, so the model cannot be fit as specified.
  One or more variables or interaction terms in the design formula are linear
  combinations of the others and must be removed.

  Please read the vignette section 'Model matrix not full rank':

  vignette('DESeq2')

So I looked in the vignette for a workaround that would let me include both donor and group in my design:

dds <- DESeqDataSet(gse, design = ~ donor + donor:id + donor:group)

Then, when I tried to run DESeq(), I got the following error:

Error in designAndArgChecker(object, betaPrior) : 
  full model matrix is less than full rank

I am a bit confused about how to handle this design:

table(gse1$donor_animal,gse1$group)

       wt_ctrl  wt_treated  ko_ctrl  ko_treated
  m1         1           1        0           0
  m10        0           0        1           1
  m2         1           1        0           0
  m3         0           0        1           1
  m4         0           0        1           1
  m5         0           0        1           1
  m6         1           1        0           0
  m7         0           0        1           1
  m8         1           1        0           0
  m9         1           1        0           0

As the table shows, the two wt columns are identical to each other, and so are the two ko columns. Any ideas on how I should proceed? Thank you in advance!

sessionInfo( )
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Ventura 13.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tximeta_1.23.2              topGO_2.57.0               
 [3] SparseM_1.84-2              GO.db_3.19.1               
 [5] graph_1.83.0                clusterProfiler_4.13.3     
 [7] biomaRt_2.61.3              org.Mm.eg.db_3.19.1        
 [9] AnnotationDbi_1.67.0        DT_0.33                    
[11] GeneTonic_2.9.0             DESeq2_1.45.3              
[13] pcaExplorer_2.31.0          limma_3.61.9               
[15] ggplot2_3.5.1               knitr_1.48                 
[17] SingleCellExperiment_1.27.2 SummarizedExperiment_1.35.1
[19] Biobase_2.65.1              GenomicRanges_1.57.1       
[21] GenomeInfoDb_1.41.1         IRanges_2.39.2             
[23] S4Vectors_0.43.2            BiocGenerics_0.51.1        
[25] MatrixGenerics_1.17.0       matrixStats_1.4.1          

loaded via a namespace (and not attached):
  [1] ProtGenerics_1.37.1      fs_1.6.4                 bitops_1.0-8            
  [4] enrichplot_1.25.2        httr_1.4.7               webshot_0.5.5           
  [7] RColorBrewer_1.1-3       doParallel_1.0.17        Rgraphviz_2.49.0        
 [10] dynamicTreeCut_1.63-1    tippy_0.1.0              tools_4.4.1             
 [13] utf8_1.2.4               R6_2.5.1                 lazyeval_0.2.2          
 [16] GetoptLong_1.0.5         withr_3.0.1              prettyunits_1.2.0       
 [19] gridExtra_2.3            cli_3.6.3                TSP_1.2-4               
 [22] scatterpie_0.2.4         labeling_0.4.3           sass_0.4.9              
 [25] bs4Dash_2.3.4            genefilter_1.87.0        ggridges_0.5.6          
 [28] Rsamtools_2.21.1         yulab.utils_0.1.7        txdbmaker_1.1.1         
 [31] gson_0.1.0               DOSE_3.99.1              R.utils_2.12.3          
 [34] AnnotationForge_1.47.1   readxl_1.4.3             rstudioapi_0.16.0       
 [37] RSQLite_2.3.7            BiocIO_1.15.2            gridGraphics_0.5-1      
 [40] visNetwork_2.1.2         generics_0.1.3           GOstats_2.71.0          
 [43] shape_1.4.6.1            crosstalk_1.2.1          dplyr_1.1.4             
 [46] dendextend_1.17.1        Matrix_1.7-0             fansi_1.0.6             
 [49] abind_1.4-8              R.methodsS3_1.8.2        lifecycle_1.0.4         
 [52] yaml_2.3.10              qvalue_2.37.0            SparseArray_1.5.36      
 [55] BiocFileCache_2.13.0     grid_4.4.1               blob_1.2.4              
 [58] promises_1.3.0           crayon_1.5.3             shinydashboard_0.7.2    
 [61] miniUI_0.1.1.1           lattice_0.22-6           cowplot_1.1.3           
 [64] ComplexUpset_1.3.3       GenomicFeatures_1.57.0   annotate_1.83.0         
 [67] KEGGREST_1.45.1          pillar_1.9.0             ComplexHeatmap_2.21.0   
 [70] fgsea_1.31.0             rjson_0.2.23             codetools_0.2-20        
 [73] fastmatch_1.1-4          glue_1.7.0               ggfun_0.1.6             
 [76] data.table_1.16.0        treeio_1.29.1            vctrs_0.6.5             
 [79] png_0.1-8                cellranger_1.1.0         gtable_0.3.5            
 [82] assertthat_0.2.1         cachem_1.1.0             xfun_0.47               
 [85] S4Arrays_1.5.7           mime_0.12                tidygraph_1.3.1         
 [88] survival_3.7-0           pheatmap_1.0.12          seriation_1.5.6         
 [91] iterators_1.0.14         statmod_1.5.0            nlme_3.1-166            
 [94] Category_2.71.0          ggtree_3.13.1            bit64_4.5.2             
 [97] threejs_0.3.3            progress_1.2.3           filelock_1.0.3          
[100] bslib_0.8.0              colorspace_2.1-1         DBI_1.2.3               
[103] tidyselect_1.2.1         bit_4.5.0                compiler_4.4.1          
[106] curl_5.2.3               httr2_1.0.4              expm_1.0-0              
[109] xml2_1.3.6               DelayedArray_0.31.11     plotly_4.10.4           
[112] rtracklayer_1.65.0       shadowtext_0.1.4         colourpicker_1.3.0      
[115] scales_1.3.0             RBGL_1.81.0              NMF_0.28                
[118] rappdirs_0.3.3           stringr_1.5.1            digest_0.6.37           
[121] shinyBS_0.61.1           rmarkdown_2.28           ca_0.71.1               
[124] XVector_0.45.0           htmltools_0.5.8.1        pkgconfig_2.0.3         
[127] base64enc_0.1-3          ensembldb_2.29.1         dbplyr_2.5.0            
[130] fastmap_1.2.0            rlang_1.1.4              GlobalOptions_0.1.2     
[133] htmlwidgets_1.6.4        UCSC.utils_1.1.0         shiny_1.9.1             
[136] farver_2.1.2             jquerylib_0.1.4          jsonlite_1.8.9          
[139] BiocParallel_1.39.0      GOSemSim_2.31.2          R.oo_1.26.0             
[142] RCurl_1.98-1.16          magrittr_2.0.3           ggplotify_0.1.2         
[145] GenomeInfoDbData_1.2.12  patchwork_1.3.0          munsell_0.5.1           
[148] Rcpp_1.0.13              ape_5.8                  shinycssloaders_1.1.0   
[151] viridis_0.6.5            stringi_1.8.4            rintrojs_0.3.4          
[154] ggraph_2.2.1             zlibbioc_1.51.1          MASS_7.3-61             
[157] AnnotationHub_3.13.3     plyr_1.8.9               parallel_4.4.1          
[160] ggrepel_0.9.6            graphlayouts_1.2.0       Biostrings_2.73.1       
[163] splines_4.4.1            hms_1.1.3                circlize_0.4.16         
[166] locfit_1.5-9.10          igraph_2.0.3             rngtools_1.5.2          
[169] pkgload_1.4.0            reshape2_1.4.4           BiocVersion_3.20.0      
[172] XML_3.99-0.17            evaluate_1.0.0           BiocManager_1.30.25     
[175] foreach_1.5.2            tweenr_2.0.3             httpuv_1.6.15           
[178] backbone_2.1.4           tidyr_1.3.1              purrr_1.0.2             
[181] polyclip_1.10-7          heatmaply_1.5.0          clue_0.3-65             
[184] gridBase_0.4-7           ggforce_0.4.2            xtable_1.8-4            
[187] AnnotationFilter_1.29.0  restfulr_0.0.15          tidytree_0.4.6          
[190] later_1.3.2              viridisLite_0.4.2        tibble_3.2.1            
[193] aplot_0.2.3              GenomicAlignments_1.41.0 memoise_2.0.1           
[196] registry_0.5-1           tximport_1.33.0          cluster_2.1.6           
[199] shinyWidgets_0.8.7       shinyAce_0.4.2           GSEABase_1.67.0
Guido Hooiveld ★ 4.1k
@guido-hooiveld-2020
Last seen 9 hours ago
Wageningen University, Wageningen, the …

The vignette (https://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#model-matrix-not-full-rank) suggests nesting individuals within each group. To do that, the donor numbering has to restart within each group: you numbered the donors consecutively, but both groups (wt and ko) should each contain a donor labelled m1, m2, etc. See the example in the vignette and run that code to see how this works: a column ind.n is added to the sample table and is then used when specifying the model.
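
To see why nesting is needed, it may help to check the rank of the original design directly. The sketch below uses base R only and rebuilds the sample table from the question by hand; the names wt_donors, ko_donors and is_ko are just illustrative helpers, not anything from DESeq2.

```r
# Sample table rebuilt from the question (donor and group columns only).
wt_donors <- c("m1", "m2", "m6", "m8", "m9")
ko_donors <- c("m3", "m4", "m5", "m7", "m10")
coldata <- data.frame(
  donor = factor(rep(c(wt_donors, ko_donors), each = 2)),
  group = factor(paste(rep(c("wt", "ko"), each = 10),
                       rep(c("ctrl", "treated"), times = 10), sep = "_"))
)

# Design matrix for the original formula ~ donor + group.
mm <- model.matrix(~ donor + group, coldata)
ncol(mm)     # coefficients requested: 13
qr(mm)$rank  # coefficients that can be estimated: 12

# Each donor belongs to exactly one genotype, so the "ko" indicator is just
# the sum of the ko donors' indicator columns. The wt-vs-ko part of `group`
# therefore adds nothing beyond `donor`, which is the linear dependency
# that checkFullRank() complains about.
is_ko     <- as.numeric(grepl("^ko", coldata$group))
donor_ind <- model.matrix(~ 0 + donor, coldata)
all(is_ko == rowSums(donor_ind[, paste0("donor", ko_donors)]))
```

The same check (compare qr(mm)$rank with ncol(mm)) can be run on any candidate design before passing it to DESeq2.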

Also note that, following this example, you will need to 'split' the genotype and treatment information. In other words, the sample table should contain (at least) the columns genotype (wt/ko), treatment (ctrl/treated) and ind.n (m1, m2, ...).
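
A minimal sketch of what that could look like for this data set, assuming the group labels always have the form genotype_treatment and using the sample table rebuilt by hand from the question. The column names genotype, treatment and ind.n follow the vignette's pattern (grp, cnd, ind.n there); the DESeq2 calls at the end are left as comments because they need the actual count data (gse1 in the question).

```r
# Sample table rebuilt from the question (donor and group columns only).
wt_donors <- c("m1", "m2", "m6", "m8", "m9")
ko_donors <- c("m3", "m4", "m5", "m7", "m10")
coldata <- data.frame(
  donor = rep(c(wt_donors, ko_donors), each = 2),
  group = paste(rep(c("wt", "ko"), each = 10),
                rep(c("ctrl", "treated"), times = 10), sep = "_")
)

# Split "wt_ctrl" etc. into two factors, with wt and ctrl as reference levels.
coldata$genotype  <- factor(sub("_.*$", "", coldata$group),
                            levels = c("wt", "ko"))
coldata$treatment <- factor(sub("^.*_", "", coldata$group),
                            levels = c("ctrl", "treated"))

# Renumber the donors 1..5 within each genotype (the vignette's ind.n).
coldata$ind.n <- factor(ave(as.integer(factor(coldata$donor)),
                            coldata$genotype,
                            FUN = function(x) match(x, unique(x))))

# Nested design: donor effects within genotype, plus a treatment effect
# estimated separately for each genotype.
mm <- model.matrix(~ genotype + genotype:ind.n + genotype:treatment, coldata)
qr(mm)$rank == ncol(mm)                        # TRUE: full rank
grep("treatment", colnames(mm), value = TRUE)  # the two treatment effects

# With the real data, the same columns and formula would go into DESeq2
# (not run here; assumes coldata rows match the columns of gse1):
# gse1$genotype <- ...; gse1$treatment <- ...; gse1$ind.n <- ...
# dds <- DESeqDataSet(gse1, design = ~ genotype + genotype:ind.n + genotype:treatment)
# dds <- DESeq(dds)
# resultsNames(dds)  # check the exact coefficient names before calling results()
```

Here the two genotypes happen to have the same number of donors (five each), so no all-zero columns appear in the model matrix. For unbalanced groups the vignette describes removing such columns and passing a custom matrix instead.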
