Question

Using DESeq2 with few biological replicates

0

daniel.silvestre ▴ 100

@danielsilvestre-6769

Last seen 9.3 years ago

Brazil

I have a few TruSeq/HiScanSQ mRNA-seq runs from blood samples (no globin depletion) from subjects with different rare autosomal recessive disorders related to nucleotide-excision repair or junction resolution. Of course, we have few biological replicates of each condition (e.g. only two with Bloom syndrome). Although each disorder is associated with a specific gene, nobody has investigated their expression, so we're not even certain that we can group samples with the same alleged condition. Given this scenario, I want to use DESeq2 to find differentially expressed genes. The problem is twofold: few biological replicates in the case group, and possibly cases within the same condition with very different expression values. I know that in edgeR one could 'fake' some parameters. Is there a way of doing the same in DESeq2?

deseq2 rnaseq • 4.8k views

ADD COMMENT • link updated 10.1 years ago by Simon Anders ★ 3.8k • written 10.1 years ago by daniel.silvestre ▴ 100

0

How many samples total and how many groups? Were the samples run in batches? Analysis with very few replicates, even 2 per group, is still possible with DESeq2, and I'd recommend this rather than inserting guesses for parameters (which parameters are you referring to?)

ADD REPLY • link 10.1 years ago Michael Love 43k

0

For now, as a pilot, we have 3 healthy controls and 2 cases of Bloom's. The focal gene (BLM) is somewhat responsible for chromosomal integrity. The controls and one case were run on the same lane. The other case was run on a different lane/cell.

ADD REPLY • link 10.1 years ago daniel.silvestre ▴ 100

1

You can certainly do an exploratory analysis with design ~ condition or ~ batch + condition. With your 5 samples, the first leaves you with 3 residual degrees of freedom for estimating dispersion, and the second with 2. Having one of the two cases on a different lane than the controls is not ideal: you would typically want to split cases and controls evenly across the batches to have some hope of separating the batch effect from the condition effect.
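
To make the degrees-of-freedom count concrete, here is a minimal sketch in plain Python (not DESeq2 code) that builds the two design matrices for the 5 samples described above and subtracts the matrix rank from the sample count. The 0/1 coding of `condition` and `batch` and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def residual_df(design):
    """Residual degrees of freedom: number of samples minus design rank."""
    return len(design) - matrix_rank(design)


# Hypothetical coding of the pilot: 3 controls and case 1 on one lane,
# case 2 on another lane.
condition = [0, 0, 0, 1, 1]  # 0 = control, 1 = case
batch = [0, 0, 0, 0, 1]      # 0 = first lane, 1 = second lane

design_condition = [[1, c] for c in condition]                           # ~ condition
design_batch_condition = [[1, b, c] for b, c in zip(batch, condition)]   # ~ batch + condition

print(residual_df(design_condition))        # 3
print(residual_df(design_batch_condition))  # 2
```

Note that with this layout the batch column is 1 only for the second case, so in the ~ batch + condition design that sample's deviation is absorbed entirely by the batch term.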

ADD REPLY • link 10.1 years ago Michael Love 43k

1

I'm aware of such design issues. As usual, I came into the project well after the experimental runs. So, I'm trying to do my best and asking for some expert advice. I hope to convince the PIs to (re)design things before the next round of runs.

ADD REPLY • link 10.1 years ago daniel.silvestre ▴ 100

score 1 · Answer 1 · 2014-10-08

1

Simon Anders ★ 3.8k

@simon-anders-3855

Last seen 4.3 years ago

Zentrum für Molekularbiologie, Universi…

If you have a decently-sized control group of healthy individuals, you might get some results even with only two or three cases -- if the disease in question causes very drastic expression changes to at least a few genes, which stick out clearly from the range of values seen in healthy subjects, and stick out consistently in all cases in the same direction.

If, however, the expression differences between cases and controls are only moderate, i.e., no stronger than what you would expect to see between two different healthy subjects, then you will need dozens of cases per disease to say something.

Note that all this has little to do with whether you use DESeq2 or edgeR or whatever other tool, and how you use the tools. The question is simply whether there is anything in your data or not. Maybe explain a bit more why you cannot simply go ahead and do a straightforward case-control comparison for one of the diseases.

ADD COMMENT • link 10.1 years ago Simon Anders ★ 3.8k

0

What's the size of a decently-sized control group? I don't expect very drastic expression changes, as the focal gene doesn't regulate anything, AFAIK. But case subjects have a quite altered chromosomal architecture.

ADD REPLY • link 10.1 years ago daniel.silvestre ▴ 100

1

There is a Bioconductor package, RNASeqPower, where you can plug in the average count, sample size, true fold change, p-value threshold, etc., and get the probability of detecting differential expression:

http://www.bioconductor.org/packages/release/bioc/html/RNASeqPower.html
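
To give a feel for this kind of calculation, here is a rough sketch in plain Python. It is not the package's code: it uses a standard two-group normal approximation on the log scale, and whether that matches the package's internals exactly is an assumption. The function names and all input values (depth 20, CV 0.4, 2-fold change) are hypothetical and chosen only for illustration.

```python
from math import log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def samples_per_group(depth, cv, effect, alpha=0.05, power=0.8):
    """Approximate samples per group needed to detect a fold change.

    depth  : average read count for the gene
    cv     : biological coefficient of variation between subjects
    effect : true fold change (e.g. 2.0 for a doubling)
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    # Variance on the log scale: counting noise (1/depth) plus biological noise (cv^2).
    variance = 1 / depth + cv ** 2
    return 2 * (z_alpha + z_beta) ** 2 * variance / log(effect) ** 2


def detection_power(depth, cv, effect, n, alpha=0.05):
    """Approximate probability of detecting the fold change with n samples per group."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    variance = 1 / depth + cv ** 2
    z = sqrt(n * log(effect) ** 2 / (2 * variance)) - z_alpha
    return _STD_NORMAL.cdf(z)


# Hypothetical inputs: power for a 2-fold change at various group sizes.
for n in (2, 3, 5, 10, 20):
    print(n, round(detection_power(depth=20, cv=0.4, effect=2.0, n=n), 3))
```

In this approximation, shrinking the fold change toward 1 or raising the between-subject CV increases the required group size quickly, which is the same point made in the answer above about moderate effects needing many more cases.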

ADD REPLY • link 10.1 years ago Michael Love 43k