Question

Placing groups in one or in different DESeq2 objects

d93espinoza • 0

@d93espinoza-23410

Dear DESeq2 community,

I am currently working with RNA-seq data from pre- and post-treatment samples from 3 patients across 4 different cell types (A, B, C, D). The experimental design is shown here:

> my_metadata
   patient celltype_treatment
1       P1              A_pre
2       P2              A_pre
3       P3              A_pre
4       P1             A_post
5       P2             A_post
6       P3             A_post
7       P1              B_pre
8       P2              B_pre
9       P3              B_pre
10      P1             B_post
11      P2             B_post
12      P3             B_post
13      P1              C_pre
14      P2              C_pre
15      P3              C_pre
16      P1             C_post
17      P2             C_post
18      P3             C_post
19      P1              D_pre
20      P2              D_pre
21      P3              D_pre
22      P1             D_post
23      P2             D_post
24      P3             D_post

I am currently only interested in identifying differentially expressed genes for within-celltype comparisons (the effect of treatment on the gene expression within each celltype). That is, I am only interested in the A_pre vs. A_post, B_pre vs. B_post, C_pre vs. C_post, and D_pre vs. D_post comparisons (4 comparisons in total).

That being the case, is it best practice to build one DESeq2 model using all of these samples, or to build four separate DESeq2 models, one per cell type? I am aware of the vignette section titled "If I have multiple groups, should I run all together or split into pairs of groups?", which recommends running samples from all groups together and then specifying contrasts, but my understanding was that this approach applies when one wants to perform further comparisons, such as A_pre vs. B_pre, which I am not interested in.

I have tried both approaches. Using the 1-model approach for cell type A, I obtain 35 differentially expressed genes pre/post-treatment (padj <= 0.05). With the 4-model approach, I obtain 39 differentially expressed genes for cell type A pre/post-treatment, of which only 8 overlap with the results of the first approach.
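For reference, here is a minimal sketch of how the two approaches could be compared for cell type A. It is not my actual script: the counts are simulated with `makeExampleDESeqDataSet()` (so no real treatment effect is present and the gene lists will be near-empty), and the design `~ patient + celltype_treatment`, which treats patient as a blocking factor, is an assumption rather than something stated above. Object names such as `cts` and `sig_genes` are hypothetical.

```r
library(DESeq2)

set.seed(1)

# Sample table mirroring the design above: 3 patients x pre/post x 4 cell types.
# expand.grid varies the first variable fastest, which reproduces the row order.
coldata <- expand.grid(patient   = c("P1", "P2", "P3"),
                       treatment = c("pre", "post"),
                       celltype  = c("A", "B", "C", "D"),
                       stringsAsFactors = FALSE)
coldata$patient <- factor(coldata$patient)
coldata$celltype_treatment <- factor(paste(coldata$celltype,
                                           coldata$treatment, sep = "_"))

# Simulated counts (assumption: stand-in for the real count matrix).
cts <- counts(makeExampleDESeqDataSet(n = 1000, m = 24))
rownames(coldata) <- colnames(cts)

# Helper (hypothetical name): genes with padj <= 0.05, ignoring NA padj values.
sig_genes <- function(res) rownames(res)[which(res$padj <= 0.05)]

# Approach 1: one model on all 24 samples, then a contrast for cell type A.
dds_all <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                                  design = ~ patient + celltype_treatment)
dds_all <- DESeq(dds_all, quiet = TRUE)
res_all <- results(dds_all,
                   contrast = c("celltype_treatment", "A_post", "A_pre"),
                   alpha = 0.05)

# Approach 2: a separate model built only from the 6 cell type A samples.
keep      <- coldata$celltype == "A"
coldata_A <- droplevels(coldata[keep, ])
dds_A <- DESeqDataSetFromMatrix(countData = cts[, keep], colData = coldata_A,
                                design = ~ patient + celltype_treatment)
dds_A <- DESeq(dds_A, quiet = TRUE)
res_A <- results(dds_A,
                 contrast = c("celltype_treatment", "A_post", "A_pre"),
                 alpha = 0.05)

# Compare the two gene lists.
sig_all <- sig_genes(res_all)
sig_A   <- sig_genes(res_A)
length(sig_all)
length(sig_A)
length(intersect(sig_all, sig_A))
```

The two fits estimate size factors and per-gene dispersions from different sets of samples (24 versus 6), which is one reason the resulting gene lists need not agree.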

Thank you in advance!

deseq2 • 696 views

updated 4.8 years ago by Michael Love 43k • written 4.8 years ago by d93espinoza • 0

score 2 · Accepted Answer · 2020-04-23

Michael Love 43k

@mikelove

United States

Usually I prefer all samples in one dataset, as in the vignette. With small-sample-size experiments, the variance estimates will be more stable this way, so I tend to trust the results more.
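As a rough sketch of the all-in-one approach, something like the following could be used to fit one model and pull out the four within-cell-type comparisons. The counts here are simulated with `makeExampleDESeqDataSet()` purely so the example runs, the design `~ patient + celltype_treatment` (patient as a blocking factor) is an assumption about the experiment, and names like `cts` and `res_list` are hypothetical.

```r
library(DESeq2)

set.seed(1)

# Sample table matching the question: 3 patients x pre/post x 4 cell types.
coldata <- expand.grid(patient   = c("P1", "P2", "P3"),
                       treatment = c("pre", "post"),
                       celltype  = c("A", "B", "C", "D"),
                       stringsAsFactors = FALSE)
coldata$patient <- factor(coldata$patient)
coldata$celltype_treatment <- factor(paste(coldata$celltype,
                                           coldata$treatment, sep = "_"))

# Simulated counts (assumption: replace with the real count matrix).
cts <- counts(makeExampleDESeqDataSet(n = 1000, m = 24))
rownames(coldata) <- colnames(cts)

# One dataset and one fit: dispersions are estimated using all 24 samples.
dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                              design = ~ patient + celltype_treatment)
dds <- DESeq(dds, quiet = TRUE)

# One contrast per cell type: post (numerator) vs. pre (denominator).
celltypes <- c("A", "B", "C", "D")
res_list <- lapply(celltypes, function(ct) {
  results(dds,
          contrast = c("celltype_treatment",
                       paste0(ct, "_post"), paste0(ct, "_pre")),
          alpha = 0.05)
})
names(res_list) <- celltypes

# Summary of the cell type A comparison.
summary(res_list[["A"]])
```

Each element of `res_list` is a results table for one pre/post comparison, with log2 fold changes reported as post over pre.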