Hello community,
I'd like to run a differential expression analysis to obtain the genes that respond to treatment within each cell line. In other words, I want to determine the effect of condition per cell line and extract the DEGs for that effect.
Overview
Experimental Design: This is a two-factor design with the factors cell line (A vs B) and condition (control vs treatment).
Research Question: Does the treatment have a different effect in each cell line?
Goal: Compare the effect of treatment between the cell lines (A vs B).
Experimental Design:
sample cell_line condition
A_Ctr_1 A control
A_Ctr_2 A control
A_Ctr_3 A control
A_Met_1 A treatment
A_Met_2 A treatment
A_Met_3 A treatment
B_Ctr_1 B control
B_Ctr_2 B control
B_Ctr_3 B control
B_Met_1 B treatment
B_Met_2 B treatment
B_Met_3 B treatment
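For reference, here is a minimal sketch of how a sample table like this could be built in R so that "A" and "control" are the reference levels. The parsing of the sample names is an assumption based on the naming pattern above, and `samples` is a hypothetical helper variable; only `expDesign` and `geneCountsMat` appear in my actual code.

```r
# Sample names follow the pattern <cell_line>_<Ctr|Met>_<replicate>
samples <- c(paste0("A_Ctr_", 1:3), paste0("A_Met_", 1:3),
             paste0("B_Ctr_", 1:3), paste0("B_Met_", 1:3))

# Explicit factor levels: the first level is the reference in the model
expDesign <- data.frame(
  cell_line = factor(substr(samples, 1, 1), levels = c("A", "B")),
  condition = factor(ifelse(grepl("_Ctr_", samples), "control", "treatment"),
                     levels = c("control", "treatment")),
  row.names = samples
)

# Three replicates in each of the four cell_line x condition groups
print(table(expDesign$cell_line, expDesign$condition))

# With a real count matrix, its columns should be in the same order:
# stopifnot(all(colnames(geneCountsMat) == rownames(expDesign)))
```

Setting the levels here makes a later `relevel()` call unnecessary as long as A and control are the intended baselines.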
DESeq2 Analysis
# Check the model matrix for the experimental design (same formula as the DESeq2 design)
print(model.matrix(~ cell_line + condition + cell_line:condition, expDesign))
# Construct the DESeq2 object (two design factors plus their interaction)
dds <- DESeqDataSetFromMatrix(countData = geneCountsMat,
colData = expDesign,
design = ~ cell_line + condition + cell_line:condition)
# Filter out low-count genes: keep rows with at least 10 reads in total across all samples
dds <- dds[ rowSums(counts(dds)) >= 10, ]
#select the reference level for comparing cell lines (set the factor level)
#dds$cell_line <- relevel(dds$cell_line, ref = "A")
message("Running DESeq")
# Estimate size factors and dispersions, then fit the model and run the Wald tests
dds <- DESeq(dds)
# List the available coefficients (intercept, the two main effects, and the interaction term)
resultsNames(dds)
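To make the coefficients easier to read, here is a small base-R sketch of the model matrix for this design. It assumes A and control are the reference levels and rebuilds the sample table so that it runs on its own; `mm` is just an illustrative name.

```r
# Sample table mirroring the one above; A and control are the reference levels
expDesign <- data.frame(
  cell_line = factor(rep(c("A", "B"), each = 6), levels = c("A", "B")),
  condition = factor(rep(rep(c("control", "treatment"), each = 3), times = 2),
                     levels = c("control", "treatment"))
)

mm <- model.matrix(~ cell_line + condition + cell_line:condition, expDesign)
print(colnames(mm))
# "(Intercept)" "cell_lineB" "conditiontreatment" "cell_lineB:conditiontreatment"

# One row per group shows which coefficients contribute to each group mean
print(unique(mm))
```

Read this way, `conditiontreatment` is the treatment effect in cell line A, and the treatment effect in cell line B is `conditiontreatment` plus the interaction term. With these levels, `resultsNames(dds)` should list the same terms under DESeq2's own naming: `Intercept`, `cell_line_B_vs_A`, `condition_treatment_vs_control` and `cell_lineB.conditiontreatment`.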
Questions:
- Is this design sufficient, and how can I obtain the DEGs per cell line? Would this be done with contrasts in results()?
- Do I need to relevel the baseline per cell line in this case?
- Should I instead use the interaction term to obtain genes per cell line?
- Is vst normalization also necessary here?
There is a section in the DESeq2 vignette that covers interactions and, depending on the use case, how one can combine the factors into a single grouping factor covering all combinations to make interpretation simpler. vst is not part of DE testing; it is for downstream tasks such as PCA and visualization (see the vignette).
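As a rough sketch of what that could look like for this design, the code below runs both routes on simulated counts. The count matrix is made up purely so the example runs (the real `geneCountsMat` would take its place), the names `res_A`, `res_B`, `res_int`, `dds_grp` and `vsd` are illustrative, and A and control are assumed to be the reference levels.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for the real data: 2000 genes x 12 samples
samples <- c(paste0("A_Ctr_", 1:3), paste0("A_Met_", 1:3),
             paste0("B_Ctr_", 1:3), paste0("B_Met_", 1:3))
expDesign <- data.frame(
  cell_line = factor(rep(c("A", "B"), each = 6), levels = c("A", "B")),
  condition = factor(rep(rep(c("control", "treatment"), each = 3), times = 2),
                     levels = c("control", "treatment")),
  row.names = samples
)
gene_means <- 2^runif(2000, min = 4, max = 10)
geneCountsMat <- matrix(rnbinom(2000 * 12, mu = rep(gene_means, 12), size = 5),
                        nrow = 2000,
                        dimnames = list(paste0("gene", 1:2000), samples))

# Route 1: interaction design, as in the question
dds <- DESeqDataSetFromMatrix(countData = geneCountsMat, colData = expDesign,
                              design = ~ cell_line + condition + cell_line:condition)
dds <- DESeq(dds)

# Treatment effect in cell line A (reference level): the condition main effect
res_A <- results(dds, contrast = c("condition", "treatment", "control"))
# Treatment effect in cell line B: main effect plus the interaction term
res_B <- results(dds, contrast = list(c("condition_treatment_vs_control",
                                        "cell_lineB.conditiontreatment")))
# Does the treatment effect differ between B and A? The interaction term alone
res_int <- results(dds, name = "cell_lineB.conditiontreatment")

# Route 2: one combined factor, then plain pairwise contrasts
dds_grp <- dds
dds_grp$group <- factor(paste0(dds_grp$cell_line, "_", dds_grp$condition))
design(dds_grp) <- ~ group
dds_grp <- DESeq(dds_grp)
res_A_grp <- results(dds_grp, contrast = c("group", "A_treatment", "A_control"))
res_B_grp <- results(dds_grp, contrast = c("group", "B_treatment", "B_control"))

# vst is only for visualization (e.g. PCA), not an input to the DE test
vsd <- vst(dds, blind = FALSE)
```

With either route no releveling is needed to get the per-cell-line results. The interaction design additionally gives a direct test of whether the treatment effect differs between the cell lines (`res_int`), which the grouped design would need a custom contrast for.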