Hello,
I want to analyse some mouse data with DESeq2. We have a total of 24 samples and three factors. Factor one is the genotype (6x wildtype, 6x knockout 1, 6x knockout 2, 6x knockout 3), factor two is the cell line (cellA and cellB), and factor three is the sex (male and female).
Samples | Genotyp | Cell | Sex |
---|---|---|---|
1 | wt | cellA | m |
7 | wt | cellA | f |
5 | wt | cellA | m |
2 | wt | cellB | m |
8 | wt | cellB | f |
6 | wt | cellB | m |
9 | ko1 | cellA | m |
11 | ko1 | cellA | m |
13 | ko1 | cellA | m |
10 | ko1 | cellB | m |
12 | ko1 | cellB | m |
14 | ko1 | cellB | m |
17 | ko2 | cellA | m |
21 | ko2 | cellA | m |
23 | ko2 | cellA | f |
18 | ko2 | cellB | m |
22 | ko2 | cellB | m |
24 | ko2 | cellB | f |
25 | ko3 | cellA | f |
29 | ko3 | cellA | m |
31 | ko3 | cellA | m |
26 | ko3 | cellB | f |
30 | ko3 | cellB | m |
32 | ko3 | cellB | m |
Here is the PCA
We want to compare the subgroups to each other, for example wt-cellA vs wt-cellB and ko1-cellA vs ko1-cellB, but also wt-cellA vs ko1-cellA. Therefore we wanted to combine both factors into a single factor rather than use interaction terms.
dds$group <- factor(paste0(dds$Genotyp, dds$Cell))
Then we realised that 6 of the samples didn't cluster very well. Further investigation showed that these are the female mice, which is why we want to choose the following design:
~ Sex + group
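For illustration, here is a minimal sketch of how this design could be set up and how the pairwise comparisons could be pulled out with `results()`. The counts come from `makeExampleDESeqDataSet()` and are simulated placeholders, not our real data; only the sample annotation follows the table above. The object names `res_cell` and `res_geno` are made up for this example.

```r
library(DESeq2)

# Placeholder counts only: simulated data standing in for the real count matrix
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 24)

# Sample annotation in the same order as the table above
dds$Genotyp <- rep(c("wt", "ko1", "ko2", "ko3"), each = 6)
dds$Cell    <- rep(rep(c("cellA", "cellB"), each = 3), times = 4)
dds$Sex     <- factor(c("m", "f", "m", "m", "f", "m",   # wt
                        rep("m", 6),                    # ko1 (no females)
                        "m", "m", "f", "m", "m", "f",   # ko2
                        "f", "m", "m", "f", "m", "m"))  # ko3

# Combine genotype and cell into one factor, then adjust for sex
dds$group   <- factor(paste0(dds$Genotyp, dds$Cell))
design(dds) <- ~ Sex + group

dds <- DESeq(dds)

# contrast = c(factor, numerator, denominator)
res_cell <- results(dds, contrast = c("group", "wtcellA", "wtcellB"))
res_geno <- results(dds, contrast = c("group", "ko1cellA", "wtcellA"))
```

With the group levels built by `paste0()`, any pair of the eight subgroups can be compared by changing the two level names in `contrast`.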
Is this the correct way? We would still be able to do all the comparisons and would account for sex. Or would it be better to remove the female samples, since they affect 6 out of 8 subgroups (ko1 of cellA and cellB don't have female mice)? The within-group variance of these two groups, which don't have female mice, should be different from that of the others. If we leave the female mice in, could this lead to a different set of DEGs compared to removing the 6 samples?
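As a quick sanity check on the sex imbalance, the following base R sketch tabulates sex per subgroup and tests whether the design matrix for `~ Sex + group` is full rank. It uses only the annotation from the table above; the names `coldata`, `sex_by_group`, `mm` and `full_rank` are just for this example.

```r
# Sample annotation in the same order as the table above
coldata <- data.frame(
  Sample  = c(1, 7, 5, 2, 8, 6, 9, 11, 13, 10, 12, 14,
              17, 21, 23, 18, 22, 24, 25, 29, 31, 26, 30, 32),
  Genotyp = rep(c("wt", "ko1", "ko2", "ko3"), each = 6),
  Cell    = rep(rep(c("cellA", "cellB"), each = 3), times = 4),
  Sex     = factor(c("m", "f", "m", "m", "f", "m",   # wt
                     rep("m", 6),                    # ko1 (no females)
                     "m", "m", "f", "m", "m", "f",   # ko2
                     "f", "m", "m", "f", "m", "m"))  # ko3
)
coldata$group <- factor(paste0(coldata$Genotyp, coldata$Cell))

# How many females and males are in each subgroup?
sex_by_group <- table(coldata$group, coldata$Sex)
print(sex_by_group)

# Is the sex effect estimable alongside the eight groups?
mm <- model.matrix(~ Sex + group, data = coldata)
full_rank <- qr(mm)$rank == ncol(mm)
print(full_rank)
```

Because sex varies within six of the eight subgroups, the sex coefficient is not a linear combination of the group indicators, so the design should be estimable even though the two ko1 subgroups contain only males.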
Thanks for your feedback
Best Mathias
How can you have the same cell line from both a female and a male mouse? I thought a cell line was a culture of cells derived from a single common ancestor.
Hey Ryan,
Sorry, this was my mistake; I was not precise. It's not a cell line, it's two different mouse lines. One is our standard mouse line, the other is infected with a disease.