Hi All,
I am trying to collapse technical replicates in DESeq2. I have looked at the manual, but it is still not clear to me how to do it. I tried running it and got some errors, so I need to understand how it works properly.
I want to collapse A1 with A1.1, B1 with B1.1, C1 with C1.1, and D2 with D2.1.
dds <- DESeqDataSetFromMatrix(
countData = countdata,
colData = coldata,
design = ~ Subject + Treatment)
dds
> coldata
Subject Treatment Time
A1 1 35 1
A1.1 1 35 1
A2 2 35 1
A3 3 35 1
A4 4 35 1
A5 5 35 1
B1 1 25 1
B1.1 1 25 1
B2 2 25 1
B4 4 25 1
B5 5 25 1
C1 1 35 24
C1.1 1 35 24
C2 2 35 24
C3 3 35 24
C4 4 35 24
C5 5 35 24
D2 2 25 24
D2.1 2 25 24
D4 4 25 24
D5 5 25 24
>
dds$Subject <- factor(sample(paste0("Subject",rep(1:22, c(1,1,2,3,4,5,1,1,2,3,4,5,1,1,2,3,4,5,2,2,4,5)))))??
dds$run <- paste0("run",1:??)
ddsColl <- collapseReplicates(dds, dds$Subject, dds$run)
In the example from the manual, does paste0("run",1:12) mean that there are now 12 rows in the coldata?
## Collapse replicates in manual
dds <- makeExampleDESeqDataSet(m=12)
# make data with two technical replicates for three samples
dds$sample <- factor(sample(paste0("sample",rep(1:9, c(2,1,1,2,1,1,2,1,1)))))
dds$run <- paste0("run",1:12)
ddsColl <- collapseReplicates(dds, dds$sample, dds$run)
##
I would also like to know whether, after I collapse the replicates, I need to modify my target file and run DESeqDataSetFromMatrix again.
Thanks,
Catalina
> R.Version()
$platform
[1] "x86_64-apple-darwin10.8.0"
$arch
[1] "x86_64"
$os
[1] "darwin10.8.0"
$system
[1] "x86_64, darwin10.8.0"
$status
[1] ""
$major
[1] "3"
$minor
[1] "1.0"
$year
[1] "2014"
$month
[1] "04"
$day
[1] "10"
$`svn rev`
[1] "65387"
$language
[1] "R"
$version.string
[1] "R version 3.1.0 (2014-04-10)"
$nickname
[1] "Spring Dance"
Hi Michael,
when you define 'groupby' with dds$id, I don't understand where you tell it which samples to collapse. In my case, A1 with A1.1, B1 with B1.1, C1 with C1.1, and D2 with D2.1 are my technical replicates. Would I need to specify that?
Thanks
It collapses by the levels in the factor variable 'groupby'.
That is why the output has as many columns as levels in 'groupby'.
For example, if the original counts matrix has 5 columns, and groupby is A, A, A, B, C, then it adds the counts from columns 1-3 to produce a column "A", and the final count table will have columns A, B, C.
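The A, A, A, B, C case above can be reproduced in base R. This is only a sketch of the summation, not DESeq2's own code: the count values, gene names, and run names are made up for illustration, and rowsum() stands in for what collapseReplicates does to the count matrix.

```r
# Toy count matrix: 3 genes x 5 columns (values are made up for illustration)
counts <- matrix(1:15, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("run", 1:5)))
groupby <- factor(c("A", "A", "A", "B", "C"))

# Sum the columns that share a level of 'groupby'.
# rowsum() groups rows, so transpose, sum by group, and transpose back.
collapsed <- t(rowsum(t(counts), group = groupby))
collapsed
```

The result has one column per level of groupby (A, B, C), and column A holds the sum of the first three original columns.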
Thanks Michael, now I understand that I don't need to define which columns to collapse, but I do need to change my replicates to have the same ID.
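For the coldata posted above, one possible way to build such a shared ID is to strip the replicate suffix from the sample names. This is a sketch under two assumptions: that a technical replicate is always marked by a trailing ".1", and that the columns of dds are named like the rows of coldata. The names runs and id are just illustrative.

```r
# Sample names as they appear in the coldata posted above
runs <- c("A1", "A1.1", "A2", "A3", "A4", "A5",
          "B1", "B1.1", "B2", "B4", "B5",
          "C1", "C1.1", "C2", "C3", "C4", "C5",
          "D2", "D2.1", "D4", "D5")

# Assumption: a technical replicate is marked by a trailing ".1"
id <- factor(sub("\\.1$", "", runs))

# IDs that occur more than once, i.e. the ones that would be merged
merged <- names(which(table(id) > 1))
merged

# With the dds built earlier in the thread, the collapse step would then be
# (not run here, needs DESeq2 and the real count data):
# dds$id  <- id
# dds$run <- runs
# ddsColl <- collapseReplicates(dds, groupby = dds$id, run = dds$run)
```

Here 21 runs reduce to 17 IDs. As far as I understand ?collapseReplicates, the returned object already carries a colData with one row per ID (taken from the first run of each group), so it should not be necessary to edit the target file and call DESeqDataSetFromMatrix again, but it is worth checking colData(ddsColl) to confirm.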
In the example in ?collapseReplicates I couldn't work out which were the three samples, and that was confusing me.
# make data with two technical replicates for three samples
dds$sample <- factor(sample(paste0("sample",rep(1:9, c(2,1,1,2,1,1,2,1,1)))))
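Taking that line apart without the shuffling may make it clearer. The sketch below only uses base R; times and labels are illustrative names, not part of the manual example.

```r
# How many runs each of the 9 samples gets in the manual example
times <- c(2, 1, 1, 2, 1, 1, 2, 1, 1)

# rep() repeats each sample number that many times: 12 labels in total
labels <- paste0("sample", rep(1:9, times))
length(labels)

# Samples that appear twice are the ones with two technical replicates
dup <- names(which(table(labels) == 2))
dup
```

So the three samples are sample1, sample4 and sample7, and the 12 labels match the 12 runs named by paste0("run",1:12). The sample() call in the manual only shuffles which run receives which label.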