How to set up appropriate analysis using sample type and condition for DESeq2
newbio17 • 0
@newbio17-17851

Note: I do not have biological replicates. I have read the warning messages and am aware that an analysis without replicates probably will not yield any meaningful results. I want to make sure that I have the concept down for future reference.

What is the appropriate way of performing differential expression analysis when I have:

  • 6 samples from the same cell line
  • 3 sample types based on the phenotype (A, B, C)
  • 2 time points: 0 hours (C, control) and 48 hours (T, treatment)

In short, the 6 samples are as follows: cell-AC, cell-AT, cell-BC, cell-BT, cell-CC, cell-CT

Q1: If I want to perform differential expression analysis between just the sample types (A vs. B, A vs. C, or B vs. C), is it correct to first subset the DESeqDataSet so that it contains only the samples of the two types being compared, and then set the design to ~type?

Q2: If I want to do sample-to-sample comparisons (AC vs. BC, AT vs. BT, ...; all 15 possible comparisons), is it correct to first subset the DESeqDataSet such that we only include samples of interest (e.g. only AC and BC for AC vs. BC) and set the design to ~type+condition?

Edit: I think the second question can be answered with the answer to a previously asked question A: DESEq2 comparison with mulitple cell types under 2 conditions

Q3: It looks like for my samples, control samples cluster together and treatment samples cluster together. Is it okay to group the samples based on the condition to carry out differential expression analysis (AC & BC & CC vs. AT & BT & CT)? If so, would the design have to be set to ~type+condition or just ~condition? 

 

deseq2 differential gene expression design • 1.8k views
@mikelove

hi,

With your sample setup, you can use a design of ~phenotype + time. You can't fit any interactions without replicates, but you have enough replication to fit the model I listed: you have 6 samples and need to estimate only 4 coefficients (an intercept, two phenotype effects, and one time effect), which leaves 2 residual degrees of freedom.
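The coefficient count can be checked with base R alone. The sketch below builds a hypothetical sample table for the six samples in the question (the column names `phenotype` and `time` follow the design in this answer, and the row names are assumed) and compares the main-effects design with an interaction design.

```r
# Hypothetical sample table for the six samples described in the question.
coldata <- data.frame(
  phenotype = factor(c("A", "A", "B", "B", "C", "C")),
  time      = factor(c("0", "48", "0", "48", "0", "48")),
  row.names = c("cell-AC", "cell-AT", "cell-BC", "cell-BT", "cell-CC", "cell-CT")
)

# Main-effects design: intercept + 2 phenotype terms + 1 time term.
mm_main <- model.matrix(~ phenotype + time, data = coldata)
n_coef_main <- ncol(mm_main)
resid_df_main <- nrow(mm_main) - qr(mm_main)$rank

# Interaction design: adds 2 phenotype:time terms, so there is one
# coefficient per sample and nothing left to estimate variability from.
mm_int <- model.matrix(~ phenotype * time, data = coldata)
n_coef_int <- ncol(mm_int)
resid_df_int <- nrow(mm_int) - qr(mm_int)$rank

print(colnames(mm_main))
cat("main effects:", n_coef_main, "coefficients,", resid_df_main, "residual df\n")
cat("interaction: ", n_coef_int, "coefficients,", resid_df_int, "residual df\n")
```

The interaction design uses up all six samples, which is why it cannot be fit without replicates, while the main-effects design leaves residual degrees of freedom for dispersion estimation.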

1) You should not subset, but just use contrast=c("phenotype","C","A") etc in the results() function to make these comparisons. You have to use both 0 and 48 hour time points to do the comparison, again, because you don't have replicates for looking at each time point alone.

2) You cannot do comparisons within each time point, because each phenotype has only a single sample at each time point.

3) To test the treatment effect, you can use contrast=c("time","48","0"). I would not recommend using ~time alone, because you would not be controlling for phenotype differences (which could exist per gene without being visible in the PCA plot).
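Points 1 and 3 can be sketched as follows. This is a minimal illustration, not the poster's actual analysis: the counts are simulated with `rnbinom()`, and the gene names, sample names, and simulation parameters are all assumptions. It needs the DESeq2 package from Bioconductor.

```r
library(DESeq2)

set.seed(1)
n_genes <- 2000

# Hypothetical sample table; names mirror the question, not real data.
coldata <- data.frame(
  phenotype = factor(c("A", "A", "B", "B", "C", "C")),
  time      = factor(c("0", "48", "0", "48", "0", "48")),
  row.names = c("cell-AC", "cell-AT", "cell-BC", "cell-BT", "cell-CC", "cell-CT")
)

# Simulated counts (assumption): per-gene means, dispersion 0.1, no true effects.
mu <- 2^runif(n_genes, 5, 11)
counts <- matrix(
  rnbinom(n_genes * 6, mu = mu, size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), rownames(coldata))
)

# Keep all six samples and fit one main-effects model; do not subset.
dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ phenotype + time)
dds <- DESeq(dds)

# Phenotype comparisons, each adjusted for time.
res_B_vs_A <- results(dds, contrast = c("phenotype", "B", "A"))
res_C_vs_A <- results(dds, contrast = c("phenotype", "C", "A"))
res_C_vs_B <- results(dds, contrast = c("phenotype", "C", "B"))

# Treatment (48 h vs 0 h) effect, adjusted for phenotype.
res_time <- results(dds, contrast = c("time", "48", "0"))

print(resultsNames(dds))
summary(res_time)
```

Because the model is fit once on all samples, every contrast shares the same dispersion estimates, which is the reason for not subsetting the DESeqDataSet before each comparison.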

 


Thank you, Michael!

Is it true that newer releases of DESeq2 now throw an error when you try to run DESeq without replicates?


Yes. We carried over the option to run a no-replicate analysis from DESeq, but the results weren’t really meaningful, and I didn’t think it was appropriate to keep offering the option, so we deprecated it over a release cycle and then removed it. Without replicates you can compute the vst() and then make plots. This is much more reasonable than any kind of testing approach.

Note that you have sufficient replicates for a design with main effects here. 
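For the no-replicate case, the vst()-and-plot route might look like the sketch below. The counts are again simulated and all names are assumptions; `vst()` is called with `blind = TRUE` so that the transformation ignores the design. It needs the DESeq2 package from Bioconductor.

```r
library(DESeq2)

set.seed(1)
n_genes <- 2000  # vst() subsamples 1000 genes by default, so use more than that

# Hypothetical sample table and simulated counts (not real data).
coldata <- data.frame(
  phenotype = factor(c("A", "A", "B", "B", "C", "C")),
  time      = factor(c("0", "48", "0", "48", "0", "48")),
  row.names = c("cell-AC", "cell-AT", "cell-BC", "cell-BT", "cell-CC", "cell-CT")
)
mu <- 2^runif(n_genes, 5, 11)
counts <- matrix(
  rnbinom(n_genes * 6, mu = mu, size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), rownames(coldata))
)

dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ phenotype + time)

# Variance-stabilizing transformation; no testing is involved.
vsd <- vst(dds, blind = TRUE)

# Sample-to-sample distances on the transformed scale (15 pairs for 6 samples).
sample_dists <- dist(t(assay(vsd)))
print(round(as.matrix(sample_dists), 1))

# PCA coordinates; drop returnData = TRUE to draw the plot directly.
pca_data <- plotPCA(vsd, intgroup = c("phenotype", "time"), returnData = TRUE)
print(pca_data[, c("PC1", "PC2", "phenotype", "time")])
```

These outputs are descriptive only: they show how the samples relate to one another but provide no p-values.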


I see. It was my understanding that I would need more than one sample belonging to each group (in this example, 3 cell-AC, 3 cell-AT, etc... since 3 is what I believe is the minimum for cell lines).


Hi Michael,

For comparisons within each time point, could I use GFOLD, which doesn't require replicates, to perform the differential expression analysis? Would it be bad practice to report findings from two different tools?


GFOLD and taking the difference between vst()-transformed samples are similar approaches.
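A difference between two vst()-transformed samples could be computed roughly as follows. This sketch uses simulated counts and assumed sample names, and it only ranks genes by the size of the difference; it is not a test and does not reproduce what GFOLD computes. It needs the DESeq2 package from Bioconductor.

```r
library(DESeq2)

set.seed(1)
n_genes <- 2000

# Hypothetical sample table and simulated counts (not real data).
coldata <- data.frame(
  phenotype = factor(c("A", "A", "B", "B", "C", "C")),
  time      = factor(c("0", "48", "0", "48", "0", "48")),
  row.names = c("cell-AC", "cell-AT", "cell-BC", "cell-BT", "cell-CC", "cell-CT")
)
mu <- 2^runif(n_genes, 5, 11)
counts <- matrix(
  rnbinom(n_genes * 6, mu = mu, size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), rownames(coldata))
)
dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ phenotype + time)

# Transformed values are on an approximately log2 scale.
vsd_mat <- assay(vst(dds, blind = TRUE))

# Per-gene difference between two single samples at the 0 h time point.
diff_BC_vs_AC <- vsd_mat[, "cell-BC"] - vsd_mat[, "cell-AC"]

# Genes with the largest absolute difference (descriptive ranking only).
top_genes <- head(sort(abs(diff_BC_vs_AC), decreasing = TRUE), 10)
print(top_genes)
```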
