Hello,
I have a similar experiment to this one: Design formula and Design matrix in DESeq
This one might be useful to compare with as well: C: DESEq2 comparison with mulitple cell types under 2 conditions
I have 2 groups for genotype (Mut vs. Ctr), and 4 time points, with 8 biological replicates each. As I am interested in the interactions, I ran my model as you suggested in the manual and in similar posts:
dge <- DESeqDataSetFromMatrix(countData = counts, colData = phenomodel1, design = ~group)
dgeWALD <- DESeq(dge)
where group is Ctr_time1, Ctr_time2, Ctr_time3, Ctr_time4, Mut_time1, Mut_time2, Mut_time3, Mut_time4.
This works great for extracting the genotype effect at each age (example of the genotype effect for age 1: Waldresults1 <- results(dgeWALD, contrast=c("group", "Mut_time1", "Ctr_time1"))).
In order to extract the overall effect of genotype (an average of the effect in time 1, 2, 3, and 4), I used the following:
Waldresultsgenotype <- results(dgeWALD, contrast=list(c("Mut_time1", "Mut_time2", "Mut_time3", "Mut_time4"), c("Ctr_time1", "Ctr_time2", "Ctr_time3", "Ctr_time4")), listValues=c(1,-1))
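A note on the weights in the call above: with listValues=c(1,-1) the four Mut coefficients are added and the four Ctr coefficients subtracted, so the reported log2 fold change is the sum of the four differences (four times their average). Weights of 1/4 and -1/4 would give the average instead. The sketch below uses only base R to build such a weight vector. It assumes a no-intercept design (~ 0 + group), which is not the design used in the post, and the sample layout is a hypothetical stand-in for the real colData.

```r
# Hypothetical group labels matching the post: 2 genotypes x 4 time points,
# 8 replicates each (the real sample table is not shown in the post).
levels_group <- c(paste0("Ctr_time", 1:4), paste0("Mut_time", 1:4))
group <- factor(rep(levels_group, each = 8), levels = levels_group)

# Assumption: a no-intercept design, so there is one coefficient per group.
mm <- model.matrix(~ 0 + group)
coef_names <- colnames(mm)  # "groupCtr_time1", ..., "groupMut_time4"

# Weights for (mean of the four Mut groups) - (mean of the four Ctr groups).
avg_contrast <- setNames(numeric(length(coef_names)), coef_names)
avg_contrast[grepl("Mut", coef_names)] <-  1 / 4
avg_contrast[grepl("Ctr", coef_names)] <- -1 / 4
print(avg_contrast)
```

If the dataset were built with that no-intercept design, a numeric vector like this (one entry per element of resultsNames) could be passed as the contrast argument of results(). Keeping the list-style call and changing listValues to c(1/4, -1/4) should be equivalent.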
Now I am also interested in the effect of genotype across aging, and in the aging effect in general, so I am trying to use the likelihood ratio test. However, after reading the manuals and searching for similar posts, I still can't understand how to run this. So here are my specific questions.
From my understanding, I should run something like this:
1) dgeLRT_genotype_aging <- DESeq(dge, test="LRT", full = ~???, reduced = ~age)
for the effect of genotype across aging (meaning I take out the effect of age and thus stay with the effect of genotype)
2) dgeLRT_aging <- DESeq(dge, test="LRT", full = ~???, reduced = ~genotype)
for the effect of age, independently of genotype (meaning I take out the effect of genotype and thus stay with the effect of aging)
But my full model was ~group (interaction!), so I really don't understand very well how I define the full model and the reduced model. In every case I see the LRT being used the full model is something like ~condition+genotype, but in this example one would be looking at the effect of genotype whilst controlling for the effect of condition, right? Which is not my interest.
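One way to look at the full versus reduced question, as a sketch rather than a definitive answer: a design with separate genotype and age factors plus their interaction has the same eight parameters as ~group, so it describes the same fit, but it lets terms be dropped for an LRT. The base R code below lists which coefficients each candidate reduced formula would remove, which are the ones the test evaluates jointly. The column names genotype and age and the sample layout are assumptions, not taken from the original colData.

```r
# Hypothetical sample table: 2 genotypes x 4 ages, 8 replicates per cell.
coldata <- expand.grid(
  replicate = 1:8,
  age       = factor(paste0("time", 1:4)),
  genotype  = factor(c("Ctr", "Mut"))
)

# Candidate formulas (names are illustrative only).
full           <- ~ genotype + age + genotype:age
no_interaction <- ~ genotype + age  # does the genotype effect change with age?
age_only       <- ~ age             # any genotype effect at any age?
genotype_only  <- ~ genotype        # any age effect in either genotype?

# Coefficients in the full model that are absent from the reduced model.
tested <- function(full, reduced, data) {
  setdiff(colnames(model.matrix(full, data)),
          colnames(model.matrix(reduced, data)))
}

print(tested(full, no_interaction, coldata))
print(tested(full, age_only, coldata))
print(tested(full, genotype_only, coldata))

# With DESeq2, the full formula would be the design of the dataset and the
# reduced one is passed to DESeq(), e.g.
#   DESeq(dds, test = "LRT", reduced = ~ age)
# where dds is assumed to have been built with design = full.
```

Note that an LRT p-value here answers "is any of the dropped coefficients non-zero", so the log2 fold change column reported alongside it refers to a single coefficient and not to the whole test.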
Therefore, my question is how exactly should I define (for the LRT):
a) DESeqDataSetFromMatrix (colData and design)
b) DESeq
c) DESeq LRT (full and reduced)
If you know of any similar posts to similar studies I would really appreciate if you could help me find them.
Thank you in advance!
As requested (in another post), this is my colData() for the design above:
> colData(dge)
DataFrame with 63 rows and 1 column
       group
    <factor>
A17   A_WT6m
A18   E_TG6m
A19   B_WT8m
A20   F_TG8m
A21  C_WT10m
...      ...
J20  G_TG10m
J21  D_WT12m
J22  H_TG12m
J23   A_WT6m
J24   E_TG6m
# where A17, A18, ..., are the IDs for each sample, and A_WT6m, B_WT8m, C_WT10m, D_WT12m, E_TG6m, F_TG8m, G_TG10m, H_TG12m are the groups (I 'merged' being "WT" or "TG" with being "6m", "8m", "10m", or "12m"; they are named A-H for practical purposes when using R).
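Since the merged labels encode both factors, they can be split back into separate genotype and age columns, which is what a design with an interaction term would need. A minimal base R sketch follows, assuming the labels always follow the pattern shown above (letter prefix, underscore, genotype, age in months). The column names genotype and age are illustrative.

```r
# The eight group labels listed in the post.
group <- c("A_WT6m", "E_TG6m", "B_WT8m", "F_TG8m",
           "C_WT10m", "G_TG10m", "D_WT12m", "H_TG12m")

# Drop the "A_".."H_" prefix, then separate genotype from age.
label    <- sub("^[A-H]_", "", group)
genotype <- factor(sub("[0-9]+m$", "", label), levels = c("WT", "TG"))
age      <- factor(sub("^(WT|TG)", "", label),
                   levels = c("6m", "8m", "10m", "12m"))

coldata <- data.frame(group = factor(group), genotype = genotype, age = age)
print(coldata)

# Sanity check: every genotype/age combination should appear.
print(table(coldata$genotype, coldata$age))
```

Setting the factor levels explicitly makes WT and 6m the reference levels, so coefficients would read as TG versus WT and later ages versus 6m.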
Hi,
I would appreciate any help. I am having an error with
here is my code
The output I am getting is "Error in DESeqDataSet(se, design = design, ignoreRank) : all variables in design formula must be columns in colData"
The readCountcolData is:
   sampleList genotype timePoint
1      WT_0_1       WT         0
2      WT_0_2       WT         0
3      WT_0_3       WT         0
4      KO_0_1       KO         0
5      KO_0_2       KO         0
6      KO_0_3       KO         0
7      OV_0_1       OV         0
8      OV_0_2       OV         0
9      OV_0_3       OV         0
10    WT_12_1       WT        12
11    WT_12_2       WT        12
12    WT_12_3       WT        12
13    KO_12_1       KO        12
14    KO_12_2       KO        12
15    KO_12_3       KO        12
16    OV_12_1       OV        12
17    OV_12_2       OV        12
18    OV_12_3       OV        12
I figured it out. My sample list, i.e. readCountcolData, needs to have group as a column.
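For anyone hitting the same error, here is a small base R sketch of the fix. It rebuilds the table above, adds a group column, and checks that every variable in the design formula exists in the sample table. The original code was not posted, so the design ~ group is an assumption inferred from the fix described.

```r
# Rebuild the sample table shown above.
genotype  <- rep(rep(c("WT", "KO", "OV"), each = 3), times = 2)
timePoint <- rep(c(0, 12), each = 9)
readCountcolData <- data.frame(
  sampleList = paste(genotype, timePoint, rep(1:3, times = 6), sep = "_"),
  genotype   = genotype,
  timePoint  = timePoint
)

# Assumed design formula (the original code is not shown in the post).
design_formula <- ~ group

# Before the fix: "group" is named in the design but is not a column.
print(setdiff(all.vars(design_formula), colnames(readCountcolData)))

# The fix: add group by combining genotype and time point.
readCountcolData$group <- factor(
  paste(readCountcolData$genotype, readCountcolData$timePoint, sep = "_")
)

# After the fix: nothing is missing.
missing_vars <- setdiff(all.vars(design_formula), colnames(readCountcolData))
print(missing_vars)
```

Running the setdiff check before calling DESeqDataSetFromMatrix is a quick way to catch this kind of mismatch.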