Hello, I am performing different analysis of transposable elements deregulation with SQuIRE. SQuIRE performs the differential expression analysis with DESeq2, but it does not allow you to introduce any batch effect. According to this and as I am analyzing samples of many laboratories I have decided to perform the differential expression analysis with DESeq2 outside SQuIRE. This is the code (with the inputs modified by me) that I have extracted of the SQuIRE code. (https://github.com/wyang17/SQuIRE)
source("https://bioconductor.org/biocLite.R")
biocLite("DESeq2")
library("DESeq2")
library("BiocParallel")
register(MulticoreParam(4))
args<-commandArgs(TRUE)
print(args)
count_filename<-args[1] #count table
coldata_filename<-args[2] #sample description
strand<-args[3]
outfile <- args[4]
print(args)
#read in file
#count_filename="E:/Dropbox/Miguel/squire_call/squire_miguel_counttable.txt"
#coldata_filename="E:/Dropbox/Miguel/squire_call/squire_miguel_coldata.txt"
cts<- as.matrix(read.delim('LaRocca_locus_gene_TE_counttable.txt',header=TRUE, stringsAsFactors=FALSE,row.names="gene_id"))
coldata<-(read.delim('original',header=TRUE, stringsAsFactors=FALSE,row.names="sample"))
dds <- DESeqDataSetFromMatrix(countData = cts,
colData = coldata,
design = ~ condition)
dds$condition <- factor(dds$condition, levels = c("young","old"))
dds <- DESeq(dds)
res <- results(dds)
resLFC <- lfcShrink(dds, coef=2)
resOrdered <- res[order(res$pvalue),]
summary(res)
plotCounts(dds, gene=which.min(res$padj), intgroup="condition")
plotMA(res, ylim=c(-2,2))
TE_only<-resOrdered[grepl("\\|",rownames(resOrdered)),]
plotMA(TE_only, ylim=c(-2,2))
sessionInfo( )
This is the error that It reports when I perform the "dds = DESeq(dds) step
dds <- DESeq(dds)
Error in designAndArgChecker(object, betaPrior) :
full model matrix is less than full rank
On the other hand, this outputs make me think the error could be in the condition assignment.
> as.data.frame(colData(dds))
condition
SRR7093854 <NA>
SRR7093864 <NA>
SRR7093871 <NA>
SRR7093872 <NA>
SRR7093873 <NA>
SRR7093919 <NA>
SRR7093809 <NA>
SRR7093874 <NA>
SRR7093875 <NA>
SRR7093881 <NA>
SRR7093949 <NA>
SRR7093951 <NA>
But when I analyze the coldata object it seems to have been processed correctly:
> coldata
condition
SRR7093854 old
SRR7093864 old
SRR7093871 old
SRR7093872 old
SRR7093873 old
SRR7093919 old
SRR7093809 young
SRR7093874 young
SRR7093875 young
SRR7093881 young
SRR7093949 young
SRR7093951 young
This is the session info
> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=es_ES.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils
[7] datasets methods base
other attached packages:
[1] DESeq2_1.32.0 SummarizedExperiment_1.22.0
[3] Biobase_2.52.0 MatrixGenerics_1.4.0
[5] matrixStats_0.59.0 GenomicRanges_1.44.0
[7] GenomeInfoDb_1.28.1 IRanges_2.26.0
[9] S4Vectors_0.30.0 BiocGenerics_0.38.0
[11] readr_2.0.1
loaded via a namespace (and not attached):
[1] httr_1.4.2 vroom_1.5.5
[3] bit64_4.0.5 splines_4.1.1
[5] assertthat_0.2.1 blob_1.2.1
[7] GenomeInfoDbData_1.2.6 yaml_2.2.1
[9] pillar_1.6.1 RSQLite_2.2.7
[11] lattice_0.20-44 glue_1.4.2
[13] digest_0.6.27 RColorBrewer_1.1-2
[15] XVector_0.32.0 colorspace_2.0-2
[17] htmltools_0.5.1.1 Matrix_1.3-4
[19] XML_3.99-0.6 pkgconfig_2.0.3
[21] genefilter_1.74.0 zlibbioc_1.38.0
[23] purrr_0.3.4 xtable_1.8-4
[25] scales_1.1.1 tzdb_0.1.2
[27] BiocParallel_1.26.1 tibble_3.1.2
[29] annotate_1.70.0 KEGGREST_1.32.0
[31] generics_0.1.0 ggplot2_3.3.5
[33] ellipsis_0.3.2 withr_2.4.2
[35] cachem_1.0.5 cli_3.0.0
[37] survival_3.2-13 magrittr_2.0.1
[39] crayon_1.4.1 memoise_2.0.0
[41] evaluate_0.14 fansi_0.5.0
[43] tools_4.1.1 hms_1.1.0
[45] lifecycle_1.0.0 munsell_0.5.0
[47] locfit_1.5-9.4 DelayedArray_0.18.0
[49] AnnotationDbi_1.54.1 Biostrings_2.60.1
[51] compiler_4.1.1 rlang_0.4.11
[53] grid_4.1.1 RCurl_1.98-1.3
[55] rstudioapi_0.13 bitops_1.0-7
[57] rmarkdown_2.9 gtable_0.3.0
[59] DBI_1.1.1 R6_2.5.0
[61] knitr_1.33 dplyr_1.0.7
[63] fastmap_1.1.0 bit_4.0.4
[65] utf8_1.2.1 Rcpp_1.0.7
[67] vctrs_0.3.8 geneplotter_1.70.0
[69] png_0.1-7 tidyselect_1.1.1
[71] xfun_0.24
I would be very grateful to anyone who can help me, I tried to solve this problem in the SQuIRE forum but I did not have an answer so I hope I can get some solution in this one.
Hello ATpoint, thank you very much for your fast and helpful response, finally I corrected the error by changing the separator, instead of using \t with "_" it works better. Thank you very much! Have a nice day!