I would like to estimate differential expression (DE) of a gene between case and control. However, the experiment was run in two batches by two teams (A and B), so I would like to account for the team as well.
The data format is something like the table below, and I have made sure that "condition" and "team" are factors.
gene_in_sample | condition | team |
4.1 | control | team A |
3.1 | control | team B |
4.8 | case | team A |
3.8 | case | team B |
In reality there are 12 controls and 12 cases, and each group is split evenly between team A and team B.
diagdds = phyloseq_to_deseq2(mydata, ~ condition + team)
diagdds = DESeq(diagdds, test = "Wald", fitType = "parametric")
# the contrast names levels of "condition" (case vs. control), not the teams
res = results(diagdds, cooksCutoff = FALSE, contrast = c("condition", "case", "control"))
sigtab.deseq = res[which(res$padj < 0.05), ]
I wonder if the design should be something like design = ~ condition + team.
I am not sure whether this is right: I can see a clear separation between cases and controls in my PCA, so I expected to see some significant taxa, but after adding "+ condition" nothing is significant.
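To make the layout concrete, here is a minimal base-R sketch of the sample table described above and the model matrix that an additive design would produce. It assumes the 12/12 split is exactly 6 samples per team within each condition; the object names (coldata, balance, design) are placeholders, and no real counts or DESeq2 calls are involved.

```r
# Hypothetical sample table mirroring the layout described above:
# 12 controls and 12 cases, each assumed to be split 6/6 between team A and B.
coldata <- data.frame(
  condition = factor(rep(c("control", "case"), each = 12),
                     levels = c("control", "case")),  # control is the reference level
  team = factor(rep(rep(c("A", "B"), each = 6), times = 2))
)

# Cross-tabulate to check that the design is balanced.
balance <- table(coldata$condition, coldata$team)
print(balance)

# Model matrix for an additive design: team is adjusted for, and the
# condition coefficient represents the case-vs-control effect.
design <- model.matrix(~ team + condition, data = coldata)
print(colnames(design))
```

With this layout every condition/team cell holds the same number of samples, so the two factors are not confounded and both coefficients can be estimated.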
I appreciate your comments,
Can you edit your question to say how many samples there are in each group/team, and also add column headings to your table?
Does it look better?