I am using DESeq2 and have a genes x counts matrix in which some of the columns are technical replicates. I want to use collapseReplicates, but the documentation leaves me confused about when to call it (e.g. after DESeqDataSetFromMatrix, or on the raw counts table).
The documentation (http://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html) says "DESeq2 provides a function collapseReplicates which can assist in combining the counts from technical replicates into single columns of the count matrix", which sounds like it operates on the raw counts matrix. However, the R documentation page makes it sound like it operates on a DESeqDataSet (https://www.rdocumentation.org/packages/DESeq2/versions/1.12.3/topics/collapseReplicates). Previous posts on this haven't really clarified which one it is.
I tried calling it on the genes x counts matrix without success, but it did work on the dds object created by DESeqDataSetFromMatrix (code below), so I'm fairly sure the DESeqDataSet is the intended input; the number of columns drops by the amount I expect given how many replicates I have.
However: (1) Why does this have to be done on a DESeqDataSet? (2) What exactly is happening on the back end? (3) Is it proper to continue with DESeq and the other downstream analysis (as I've started below) as is?
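Relevant code:
library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~ condition)
# collapse technical replicates: columns sharing the same dds$sample value are combined
ddsColl <- collapseReplicates(dds, dds$sample, renameCols = TRUE)
# perform differential expression analysis on the collapsed data set
dds <- DESeq(ddsColl)
# extract the results table from the fitted object
res <- results(dds)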
I'm posting because (1) I don't know much about the back end of DESeq2 and don't want to blindly apply functions to my data, and (2) I'm genuinely curious.
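For what it's worth, my working guess for (2) is that collapsing simply sums the counts of the columns that belong to the same sample. Here is a minimal base R sketch of that idea on made-up data. The names cts_toy and sample_id are hypothetical, and this is my assumption about the behaviour rather than the actual DESeq2 implementation:

```r
# Toy counts: 3 genes x 4 runs. Runs r1 and r2 are technical replicates of sample A.
# matrix() fills column by column, so each row of numbers below is one run.
cts_toy <- matrix(
  c(10L, 0L, 5L,   # r1
    12L, 1L, 4L,   # r2
     7L, 3L, 9L,   # r3
     8L, 2L, 6L),  # r4
  nrow = 3,
  dimnames = list(c("gene1", "gene2", "gene3"), c("r1", "r2", "r3", "r4"))
)

# Hypothetical grouping: which sample each run (column) belongs to.
sample_id <- factor(c("A", "A", "B", "C"))

# rowsum() sums rows by group, so transpose to runs x genes,
# sum the runs of each sample, then transpose back to genes x samples.
collapsed <- t(rowsum(t(cts_toy), group = sample_id))
collapsed
```

If that guess is right, the collapsed matrix should have one column per sample (3 here instead of 4), which matches the reduction in columns I see with collapseReplicates on the dds object.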