Question

DESeq2 stuck overnight: should I restart?

Raymond ▴ 20

@raymond-14020

Hi,

My DESeq2 run has been stuck for one night. Is this normal? My dataset contains 635 human samples, which is unusually large:

head(ddsTxi)
class: DESeqDataSet
dim: 6 635
metadata(1): version
assays(2): counts avgTxLength

dds <- DESeq(ddsTxi)
estimating size factors
Note: levels of factors in the design contain characters other than
letters, numbers, '_' and '.'. It is recommended (but not required) to use
only letters, numbers, and delimiters '_' or '.', as these are safe characters
for column names in R. [This is a message, not an warning or error]
using 'avgTxLength' from assays(dds), correcting for library size
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
Note: levels of factors in the design contain characters other than
letters, numbers, '_' and '.'. It is recommended (but not required) to use
only letters, numbers, and delimiters '_' or '.', as these are safe characters
for column names in R. [This is a message, not an warning or error]
final dispersion estimates
Note: levels of factors in the design contain characters other than
letters, numbers, '_' and '.'. It is recommended (but not required) to use
only letters, numbers, and delimiters '_' or '.', as these are safe characters
for column names in R. [This is a message, not an warning or error]

Then it got stuck. Should I wait, or should I run it again step by step? Or can I just stop it and run only the last step, nbinomWaldTest, on the current dds?

dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- nbinomWaldTest(dds)

Regards,
Raymond

deseq2 rnaseq • 1.5k views

updated 6.2 years ago by Michael Love 43k • written 6.2 years ago by Raymond ▴ 20

score 0 · Answer 1 · 2018-10-04

Michael Love 43k

@mikelove

What is your design? How many levels per variable?

Usually DESeq2 doesn't take so much time unless there are dozens of variables and hundreds of samples.

> dds <- makeExampleDESeqDataSet(n=100, m=600)
> system.time({ dds <- DESeq(dds, quiet=TRUE) })
   user  system elapsed
 37.552   4.407  42.106

Run time should scale linearly with the number of genes (the example above has 100 genes and 600 samples), so for 10,000 genes you'd expect about 70 minutes using a single core.
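The extrapolation can be checked with a few lines of arithmetic. This is only a sketch of the reasoning: it assumes perfectly linear scaling in the number of genes, and the variable names are made up for illustration.

```r
# Benchmark above: makeExampleDESeqDataSet(n=100, m=600),
# i.e. 100 genes and 600 samples, 42.106 s elapsed.
elapsed_sec  <- 42.106
bench_genes  <- 100
target_genes <- 10000

# Assumption: run time grows linearly with the number of genes.
est_minutes <- elapsed_sec * (target_genes / bench_genes) / 60
round(est_minutes)  # about 70 minutes on a single core
```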

If you use parallel=TRUE with 10 cores, this would probably take ~10 minutes.
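A minimal sketch of a parallel run, using simulated data rather than the data in this thread. The worker count of 2 and the choice of backend are assumptions for the example; pick whatever suits your machine.

```r
library(DESeq2)
library(BiocParallel)

# Simulated data: n is the number of genes, m the number of samples.
dds <- makeExampleDESeqDataSet(n = 1000, m = 24)

# Assumption: 2 workers. MulticoreParam uses forking, which is not
# available on Windows, so fall back to SnowParam there.
param <- if (.Platform$OS.type == "windows") {
  SnowParam(workers = 2)
} else {
  MulticoreParam(workers = 2)
}

# parallel=TRUE spreads the per-gene work over the workers.
dds <- DESeq(dds, parallel = TRUE, BPPARAM = param, quiet = TRUE)
res <- results(dds)
```

The speed-up is rarely proportional to the number of cores, because splitting the data and collecting the results adds overhead.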

You can filter out lowly expressed genes to save time, or switch to using limma-voom.
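One common way to filter is to keep only genes that reach a minimum count in a minimum number of samples. The sketch below uses a toy matrix, and both thresholds are illustrative assumptions, not recommended values for any particular dataset.

```r
# Toy count matrix: 4 genes x 6 samples (made-up numbers).
counts <- matrix(
  c( 0,  0,  1,  0,  0,  2,
    15, 22,  9, 30, 12, 18,
     0, 11,  0, 14,  0, 10,
     3,  4,  2,  5,  1,  0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:6))
)

# Assumed thresholds: at least 10 reads in at least 3 samples.
min_count   <- 10
min_samples <- 3

keep <- rowSums(counts >= min_count) >= min_samples
filtered <- counts[keep, , drop = FALSE]
rownames(filtered)  # "gene2" "gene3"

# With a DESeqDataSet the same idea would be applied before DESeq():
#   keep <- rowSums(counts(dds) >= min_count) >= min_samples
#   dds  <- dds[keep, ]
```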

In my lab we use limma-voom whenever we have hundreds of samples.
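For reference, a typical limma-voom run looks roughly like the sketch below. It uses simulated counts and a simple two-group design, both of which are assumptions for the example. Data imported with tximport need their own handling of transcript lengths, which is not shown here.

```r
library(limma)
library(edgeR)

# Simulated counts: 1000 genes x 12 samples (made-up data).
set.seed(1)
n_genes   <- 1000
n_samples <- 12
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = 100, size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)

# Assumed design: two groups of six samples each.
condition <- factor(rep(c("control", "treated"), each = 6))
design <- model.matrix(~ condition)

dge  <- DGEList(counts = counts)
keep <- filterByExpr(dge, design)        # drop lowly expressed genes
dge  <- dge[keep, , keep.lib.sizes = FALSE]
dge  <- calcNormFactors(dge)             # TMM normalization factors

v   <- voom(dge, design)                 # log-CPM with precision weights
fit <- eBayes(lmFit(v, design))
tt  <- topTable(fit, coef = "conditiontreated", number = Inf)
head(tt)
```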

6.2 years ago Michael Love 43k

My design is

design = ~ batch+genotype+sex+condition

where batch has 9 levels, genotype has 6 levels, sex has 2 levels, and condition has 4 levels. I did not include the PMI information here, which is a continuous variable.

I will try limma-voom then. Thanks, Michael!

6.2 years ago Raymond ▴ 20