Question

Using DESeq2 to analyse multi-variate design resulting in testing the wrong parameter

0

Entering edit mode

Jonas • 0

@f91ff31c

Last seen 2.4 years ago

United Kingdom

Enter the body of text here Hi, I am analysing a RNA Seq dataset coming from 3 independent cell isolates (isolate1, isolate2, isolate3), each given 3 different treatments (control, drug1, drug2).

We are testing drug 1 against control in the first instance:

We also observed that there is some variation among isolates - i.e., batch effect

We load the data into DESeq2, the coldata looks like this:

             drug            isolate
Sample 1     control       isolate1
Sample 2     control       isolate2
Sample 3     control       isolate3
Sample 4     drug1        isolate1
Sample 5     drug1        isolate2
Sample 6     drug1        isolate3

Code should be placed in three backticks as shown below

configure = data.frame(drug=factor(c("control","control","control","drug1","drug1","drug1")), isolate=c("isolate1","isolate2","isolate3","isolate1","isolate2","isolate3"))
dds = DESeqDataSetFromMatrix(countData = readcount[-1],colData = configure,design = ~ drug + isolate)

Then I performed DEG calling later

dds <- DESeq(dds)
res <- results(dds)

However it showed this message: log2 fold change (MLE): isolate isolate1 vs isolate2
Wald test p-value: isolate isolate1 vs isolate2
DataFrame with 15793 rows and 6 columns

I suppose as the formula design ~ drug + isolate suggests, it is going to test the supposed major effet which is the drug, rather than the isolate, and in the same time account for the batch effect brought in by isolates. But it has tested the effect of isolate1 vs isolate 2 instead, even though there are only 2 samples per isolate.

Could you please help how it would happen and how should I correct it? Or should I rearrange the design to ~ isolate + drug.

I am also very curious, given the 3X3 nested design, do we have an option to include all 9 samples under one dds and specify with R which ones we would like to involve in the test each time? Thank you very much!

batch RNASeq BatchEffect DESeq2 • 1.3k views

ADD COMMENT • link updated 3.1 years ago by Steve Lianoglou ★ 13k • written 3.1 years ago by Jonas • 0

score 0 · Answer 1 · 2021-12-30

0

Entering edit mode

Michael Love 43k

@mikelove

Last seen 11 minutes ago

United States

See the vignette about how to extract multiple results tables. Running results() without any additional arguments will only return a single table — it can’t guess which table you want.

I’d recommend to work out your statistical design with a local statistician or someone familiar with linear models in R.