Hello,
I am using DESeq2 to perform LRT analysis of a multi-treatment experiment, like an ANOVA. I have 5 different conditions, each with 4 replicates, and would like to generate a single p-value that indicates difference among the conditions, but not specific to any two conditions (as in a t-test).
I have run it as I best understand from the other threads I have read and the manual, and I want to be sure I understand the output. The commands I entered and the outputs generated in R are pasted below.
I can't tell from the output if the p-value is for comparing all 5 conditions, or just 12dpd 0 vs 12dpd 4. When I run mcols(res)$description (see below) it says "LRT p-value: '~ group' vs '~ 1'" and "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0". Does this mean I am only generating the p-value based on differences between the two groups 12dpd 4 and 12dpd 0? or all of the groups? (12dpd 0 is the first and 12dpd 4 is the last in the colData set-up).
R version 3.2.0 (2015-04-16) -- "Full of Ingredients"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)
> countData<- read.table("/Users/claireriggs/R/12dpd_annot_shared_exp_051415.txt", header=TRUE, row.names=1)
> colData<- read.table ("/Users/claireriggs/Desktop/mRNA/DESeq2/12dpd_sampleinfo.txt", header=TRUE, row.names=1)
note: colData looks like this:
run | trmt | group | |
1 | t0_1 | t0 | 12dpd_0 |
2 | t0_2 | t0 | 12dpd_0 |
3 | t0_3 | t0 | 12dpd_0 |
4 | t0_4 | t0 | 12dpd_0 |
5 | 4hrs_anoxia_1 | 4hrs_anoxia | 12dpd_1 |
6 | 4hrs_anoxia_2 | 4hrs_anoxia | 12dpd_1 |
7 | 4hrs_anoxia_3 | 4hrs_anoxia | 12dpd_1 |
8 | 4hrs_anoxia_4 | 4hrs_anoxia | 12dpd_1 |
9 | 24hrs_anoxia_1 | 24hrs_anoxia | 12dpd_2 |
10 | 24hrs_anoxia_2 | 24hrs_anoxia | 12dpd_2 |
11 | 24hrs_anoxia_3 | 24hrs_anoxia | 12dpd_2 |
12 | 24hrs_anoxia_4 | 24hrs_anoxia | 12dpd_2 |
13 | 2hrs_recov_1 | 2hrs_recov | 12dpd_3 |
14 | 2hrs_recov_2 | 2hrs_recov | 12dpd_3 |
15 | 2hrs_recov_3 | 2hrs_recov | 12dpd_3 |
16 | 2hrs_recov_4 | 2hrs_recov | 12dpd_3 |
17 | 24hrs_recov_1 | 24hrs_recov | 12dpd_4 |
18 | 24hrs_recov_2 | 24hrs_recov | 12dpd_4 |
19 | 24hrs_recov_3 | 24hrs_recov | 12dpd_4 |
20 | 24hrs_recov_4 | 24hrs_recov | 12dpd_4 |
> dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design= ~trmt)
> design(dds) = ~ group
> dds = DESeq(dds, test = "LRT", reduced = ~ 1)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the
function: y = a/x + b, and a local regression fit was automatically substituted.
specify fitType='local' or 'mean' to avoid this message next time.
final dispersion estimates
fitting model and testing
> res <- results(dds)
> summary(res)
out of 52619 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 24794, 47%
LFC < 0 (down) : 22871, 43%
outliers [1] : 149, 0.28%
low counts [2] : 0, 0%
(mean count < 0.2)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
> mcols(res)$description
[1] "mean of normalized counts for all samples" "log2 fold change (MLE): group 12dpd 4 vs 12dpd 0"
[3] "standard error: group 12dpd 4 vs 12dpd 0" "LRT statistic: '~ group' vs '~ 1'"
[5] "LRT p-value: '~ group' vs '~ 1'" "BH adjusted p-values"
Got it! Thanks, Michael. And sorry I didn't find it on my own first.