Hello everyone,
I have a question about analyzing the genes that are differentially expressed in three mutants: single KO A, single KO B, and double KO C, where C combines mutant A and mutant B.
I am interested in comparing the effect of combining mutants A and B with their individual effects.
I consider the effect additive if the double mutant equals the sum of the two single mutants (δ = C - A - B = 0), and I consider it an interaction if δ ≠ 0.
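Spelled out on the log2 scale, and assuming that A, B and C in the definition stand for each mutant's log2 fold change relative to CTRL (an assumption, since the scale is not stated), δ can be written as below. The symbols LFC and μ are notation introduced for this sketch, with μ_g the mean log2 expression of a gene in group g.

```latex
\begin{aligned}
\delta &= \mathrm{LFC}_{C} - \mathrm{LFC}_{A} - \mathrm{LFC}_{B} \\
       &= (\mu_{C} - \mu_{\mathrm{CTRL}}) - (\mu_{A} - \mu_{\mathrm{CTRL}}) - (\mu_{B} - \mu_{\mathrm{CTRL}}) \\
       &= \mu_{C} - \mu_{A} - \mu_{B} + \mu_{\mathrm{CTRL}}
\end{aligned}
```

Under this reading, additivity on the log2 scale corresponds to multiplicative effects on the count scale.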
I used the count table for all genes, normalized it with DESeq2, and then used a contrast to determine δ.
This is my code:
library(DESeq2)
# Count matrix: genes in rows, samples in columns
data <- read.table("total_counts.txt", header = TRUE, row.names = 1)
# Level names must match the reference level passed to relevel() below
groups <- factor(rep(c("CTRL", "mutantA", "mutantB", "mutantC"), each = 3))
sampleInfo <- data.frame(groups, row.names = colnames(data))
dds <- DESeqDataSetFromMatrix(countData = data, colData = sampleInfo, design = ~ groups)
dds$groups <- relevel(dds$groups, "CTRL")
colData(dds)
DataFrame with 12 rows and 1 column
            groups
          <factor>
siCtrl_1      CTRL
siCtrl_2      CTRL
siCtrl_3      CTRL
mutantA_1  mutantA
mutantA_2  mutantA
...            ...
mutantB_2  mutantB
mutantB_3  mutantB
mutantC_1  mutantC
mutantC_2  mutantC
mutantC_3  mutantC
design(dds) <- ~ 1 + groups
dds <- DESeq(dds)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
resultsNames(dds)
[1] "Intercept" "groups_mutantA_vs_CTRL" "groups_mutantB_vs_CTRL" "groups_mutantC_vs_CTRL"
mod_mat <- model.matrix(design(dds), colData(dds))
mod_mat
          (Intercept) groupsmutantA groupsmutantB groupsmutantC
siCtrl_1            1             0             0             0
siCtrl_2            1             0             0             0
siCtrl_3            1             0             0             0
mutantA_1           1             1             0             0
mutantA_2           1             1             0             0
mutantA_3           1             1             0             0
mutantB_1           1             0             1             0
mutantB_2           1             0             1             0
mutantB_3           1             0             1             0
mutantC_1           1             0             0             1
mutantC_2           1             0             0             1
mutantC_3           1             0             0             1
attr(,"assign")
[1] 0 1 1 1
attr(,"contrasts")
attr(,"contrasts")$groups
[1] "contr.treatment"
A <- colMeans(mod_mat[dds$groups == "mutantA", ])
B <- colMeans(mod_mat[dds$groups == "mutantB", ])
C <- colMeans(mod_mat[dds$groups == "mutantC", ])
CTRL <- colMeans(mod_mat[dds$groups == "CTRL", ])
res <- results(dds, contrast = C - A - B)
res
log2 fold change (MLE): -1,-1,-1,+1
Wald test p-value: -1,-1,-1,+1
DataFrame with 58721 rows and 6 columns
                    baseMean log2FoldChange     lfcSE        stat    pvalue      padj
                   <numeric>      <numeric> <numeric>   <numeric> <numeric> <numeric>
ENSG00000000003.14  1235.661      -10.28328 0.0878839   -117.0099         0         0
ENSG00000000005.5      0.000             NA        NA          NA        NA        NA
ENSG00000000419.12  1625.580      -10.70786 0.0909223   -117.7694         0         0
ENSG00000000457.13   509.194       -8.85948 0.1080919    -81.9625         0         0
ENSG00000000460.16  1054.786       -9.75485 0.0815451   -119.6252         0         0
...                      ...            ...       ...         ...       ...       ...
ENSG00000285990.1    1.28962       -1.13755 1.8312013   -0.621204  0.534465  0.628871
ENSG00000285991.1  692.47452       -9.63571 0.0863505 -111.588309  0.000000  0.000000
ENSG00000285992.1    0.00000             NA        NA          NA        NA        NA
ENSG00000285993.1    0.00000             NA        NA          NA        NA        NA
ENSG00000285994.1    2.82204       -1.18911 1.4336056   -0.829452  0.406849  0.502778
sessionInfo( )
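A minimal base-R sketch of the contrast arithmetic used above, on a toy design that assumes the same four groups with three replicates each (no DESeq2 needed). It shows which coefficient weights `C - A - B` produces and, for comparison, the weights obtained when the CTRL row is added back so that each mutant is measured relative to control. The name `delta_vec` is introduced here for illustration only.

```r
# Toy design mirroring the post: 4 groups x 3 replicates (assumed layout).
groups <- factor(rep(c("CTRL", "mutantA", "mutantB", "mutantC"), each = 3),
                 levels = c("CTRL", "mutantA", "mutantB", "mutantC"))
mod_mat <- model.matrix(~ 1 + groups)

# The mean design row of a group gives the coefficient weights
# that make up that group's fitted log2 mean.
CTRL <- colMeans(mod_mat[groups == "CTRL", ])
A    <- colMeans(mod_mat[groups == "mutantA", ])
B    <- colMeans(mod_mat[groups == "mutantB", ])
C    <- colMeans(mod_mat[groups == "mutantC", ])

# Vector as written in the post.
posted <- C - A - B

# (C - CTRL) - (A - CTRL) - (B - CTRL), i.e. every mutant relative to control.
delta_vec <- C - A - B + CTRL

print(rbind(posted, delta_vec))
```

The first vector is the `-1,-1,-1,+1` shown in the results header, so it carries a weight of -1 on the intercept; the second has a weight of 0 on the intercept and only combines the three mutant-vs-CTRL coefficients. If `dds` is fitted as in the post, a numeric vector like `delta_vec` can be passed as `results(dds, contrast = delta_vec)`, in which case the reported log2FoldChange is the estimate of δ on the log2 scale.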
What I don't understand is the relationship between the log2FC and δ. How can I determine δ from the log2FC? If I want to identify only the genes with an interaction effect, or only those with an additive effect, should I use the full count table for all genes, or should I first remove all genes that show no significant difference in mutant A vs mutant C, mutant B vs mutant C, or CTRL vs mutant C?
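To make the second question concrete, here is a minimal sketch of the split on a made-up results table. The gene names, values and the 0.05 cutoff are all hypothetical, and the columns only mimic the `log2FoldChange` and `padj` columns of a DESeq2 results table.

```r
# Made-up results table for illustration only; not taken from the data in the post.
res_df <- data.frame(
  gene           = c("geneA", "geneB", "geneC", "geneD"),
  log2FoldChange = c(1.8, 0.1, -2.2, NA),
  padj           = c(0.001, 0.70, 0.03, NA),
  stringsAsFactors = FALSE
)
alpha <- 0.05  # hypothetical FDR cutoff

# Drop genes without an adjusted p-value (e.g. zero counts).
tested <- res_df[!is.na(res_df$padj), ]

# Genes with evidence that delta differs from 0.
interaction_genes <- tested$gene[tested$padj < alpha]

# Genes for which delta = 0 was not rejected.
no_evidence_genes <- tested$gene[tested$padj >= alpha]

print(interaction_genes)
print(no_evidence_genes)
```

A gene in the second group is not thereby shown to be additive; the test only failed to reject δ = 0 for it.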
Thanks in advance!
Sophie