collapseReplicates in DESeq2
@catalina-aguilar-hurtado-6554

Hi All,

I am trying to collapse technical replicates in DESeq2. I have already looked at the manual, but it is still not clear to me how to do it. I tried running it but got some errors, so I need to understand how it works properly.

I want to collapse A1 with A1.1, B1 with B1.1, C1 with C1.1, and D2 with D2.1:

dds <- DESeqDataSetFromMatrix(
  countData = countdata,
  colData = coldata,
  design = ~ Subject + Treatment)
dds

> coldata
     Subject Treatment Time
A1         1        35    1
A1.1       1        35    1
A2         2        35    1
A3         3        35    1
A4         4        35    1
A5         5        35    1
B1         1        25    1
B1.1       1        25    1
B2         2        25    1
B4         4        25    1
B5         5        25    1
C1         1        35   24
C1.1       1        35   24
C2         2        35   24
C3         3        35   24
C4         4        35   24
C5         5        35   24
D2         2        25   24
D2.1       2        25   24
D4         4        25   24
D5         5        25   24
>

dds$Subject <- factor(sample(paste0("Subject",rep(1:22, c(1,1,2,3,4,5,1,1,2,3,4,5,1,1,2,3,4,5,2,2,4,5)))))??

dds$run <- paste0("run",1:??)

ddsColl <- collapseReplicates(dds, dds$Subject, dds$run)

In the example from the manual, does paste0("run",1:12) mean that there are now 12 rows in the colData?

## Collapse replicates in manual

dds <- makeExampleDESeqDataSet(m=12)

# make data with two technical replicates for three samples
dds$sample <- factor(sample(paste0("sample",rep(1:9, c(2,1,1,2,1,1,2,1,1)))))
dds$run <- paste0("run",1:12)

ddsColl <- collapseReplicates(dds, dds$sample, dds$run)

##

I would also like to know whether, after I collapse the replicates, I need to modify my target file and run DESeqDataSetFromMatrix again.

Thanks,

Catalina

> R.Version()
$platform
[1] "x86_64-apple-darwin10.8.0"

$arch
[1] "x86_64"

$os
[1] "darwin10.8.0"

$system
[1] "x86_64, darwin10.8.0"

$status
[1] ""

$major
[1] "3"

$minor
[1] "1.0"

$year
[1] "2014"

$month
[1] "04"

$day
[1] "10"

$`svn rev`
[1] "65387"

$language
[1] "R"

$version.string
[1] "R version 3.1.0 (2014-04-10)"

$nickname
[1] "Spring Dance"

@mikelove

If we look up the help:

?collapseReplicates

There is information about these arguments:

groupby:     a grouping factor, as long as the columns of object

run:     optional, the names of each unique column in object. if provided, a new column runsCollapsed will be added to the colData which pastes together the names of run

And also information about the result:

Value:     the object with as many columns as levels in groupby.

So, you should make a new column which uniquely identifies the libraries which were sequenced more than once (this is what we refer to as a technical replicate). It looks like this would be:

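# column names are case-sensitive: the colData uses Subject, Treatment, Time;
# the "_" separator keeps the combined labels unambiguous
dds$id <- factor(paste(dds$Subject, dds$Treatment, dds$Time, sep = "_"))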

Then provide dds$id to the 'groupby' argument.

You should not run a constructor function (like DESeqDataSetFrom*) after you've run collapseReplicates().
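
As a rough sketch of the steps above applied to the sample table from the question: the count matrix here is simulated (an assumption, standing in for the real countdata), the design columns are stored as factors, and the library names are reused as the 'run' argument. Only DESeqDataSetFromMatrix(), collapseReplicates(), counts() and colData() from DESeq2 are used.

```r
library(DESeq2)

# Sample table copied from the question; row names are the library names.
libs <- c("A1", "A1.1", "A2", "A3", "A4", "A5",
          "B1", "B1.1", "B2", "B4", "B5",
          "C1", "C1.1", "C2", "C3", "C4", "C5",
          "D2", "D2.1", "D4", "D5")
coldata <- data.frame(
  Subject   = factor(c(1, 1, 2, 3, 4, 5,  1, 1, 2, 4, 5,
                       1, 1, 2, 3, 4, 5,  2, 2, 4, 5)),
  Treatment = factor(rep(c(35, 25, 35, 25), times = c(6, 5, 6, 4))),
  Time      = factor(rep(c(1, 24), times = c(11, 10))),
  row.names = libs
)

# Made-up counts (assumption: a stand-in for the real count matrix).
set.seed(1)
countdata <- matrix(rpois(100 * nrow(coldata), lambda = 50),
                    nrow = 100,
                    dimnames = list(paste0("gene", 1:100), libs))

dds <- DESeqDataSetFromMatrix(countData = countdata,
                              colData = coldata,
                              design = ~ Subject + Treatment)

# Technical replicates share Subject, Treatment and Time, so they get the same id.
dds$id  <- factor(paste(dds$Subject, dds$Treatment, dds$Time, sep = "_"))
dds$run <- colnames(dds)

ddsColl <- collapseReplicates(dds, groupby = dds$id, run = dds$run)

ncol(dds)                        # number of libraries before collapsing
ncol(ddsColl)                    # one column per level of dds$id
colData(ddsColl)$runsCollapsed   # which libraries were summed into each column
```

The collapsed object can be passed on to the rest of the analysis as it is; the runsCollapsed column is a convenient way to confirm that only A1/A1.1, B1/B1.1, C1/C1.1 and D2/D2.1 were merged.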


Hi Michael,

When you define 'groupby' with dds$id, I don't understand where you specify which samples to collapse. In my case the technical replicates are A1 with A1.1, B1 with B1.1, C1 with C1.1, and D2 with D2.1. Would I need to specify that?

Thanks


It collapses by the levels in the factor variable 'groupby'.

That is why the output has as many columns as levels in 'groupby'. 

For example, if the original counts matrix has 5 columns, and groupby is A, A, A, B, C, then it adds the counts from columns 1-3 to produce a column "A", and the final count table will have columns A, B, C.
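
The five-column example can be reproduced in base R to see the summing behaviour. This is only an illustration of the arithmetic with made-up numbers and the base function rowsum(); it is not meant to show how collapseReplicates() is implemented internally.

```r
# Toy count matrix: 4 genes (rows) x 5 libraries (columns); values are made up.
counts <- matrix(1:20, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("run", 1:5)))

# One entry per column: columns 1-3 belong to "A", column 4 to "B", column 5 to "C".
groupby <- factor(c("A", "A", "A", "B", "C"))

# rowsum() sums rows by group, so transpose, sum per group, and transpose back.
collapsed <- t(rowsum(t(counts), group = groupby))

counts
collapsed   # columns A, B, C; column A is the sum of run1, run2 and run3
```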


Thanks Michael. Now I understand that I don't need to specify which columns to collapse; I need to give my technical replicates the same ID.

From the example in ?collapseReplicates I couldn't tell which were the three samples, and that was confusing me.

# make data with two technical replicates for three samples
dds$sample <- factor(sample(paste0("sample",rep(1:9, c(2,1,1,2,1,1,2,1,1)))))
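
One way to read that line is to build the labels without the shuffling done by sample(). The sketch below uses only base R; the variable names labels and tab are introduced here for illustration. The second argument of rep() gives how many runs each of the nine samples has, and the outer sample() call in the help example only randomises which of the 12 columns receives which label.

```r
# Labels before shuffling: sample numbers 1..9, each repeated 2 or 1 times.
labels <- paste0("sample", rep(1:9, c(2, 1, 1, 2, 1, 1, 2, 1, 1)))

length(labels)   # one label per run, matching paste0("run", 1:12)

# Count how many runs each sample has and pick those with two.
tab <- table(labels)
names(tab)[as.vector(tab) == 2]   # the three samples with technical replicates
```

So the 12 in paste0("run",1:12) is the number of columns before collapsing, and the collapsed object has one column per distinct sample label.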
