I have an RNA-seq experiment with the following design:
cell line | media | treatment |
cellline1 | medium1 | ctrl |
cellline1 | medium1 | ctrl |
cellline1 | medium1 | ctrl |
cellline1 | medium1 | treated |
cellline1 | medium1 | treated |
cellline1 | medium1 | treated |
cellline1 | medium2 | ctrl |
cellline1 | medium2 | ctrl |
cellline1 | medium2 | ctrl |
cellline1 | medium2 | treated |
cellline1 | medium2 | treated |
cellline1 | medium2 | treated |
cellline2 | medium1 | ctrl |
cellline2 | medium1 | ctrl |
cellline2 | medium1 | ctrl |
cellline2 | medium1 | treated |
cellline2 | medium1 | treated |
cellline2 | medium1 | treated |
cellline2 | medium2 | ctrl |
cellline2 | medium2 | ctrl |
cellline2 | medium2 | ctrl |
cellline2 | medium2 | treated |
cellline2 | medium2 | treated |
cellline2 | medium2 | treated |
To find genes that respond more strongly to treatment in medium2 of cellline1 and of cellline2, I analyzed the two cell lines separately with the following design formulas:
cell1 = media+treatment+media:treatment
cell2 = media+treatment+media:treatment
Q1. How do I extract genes that respond more (more up- or down-regulated), less (less up- or down-regulated), or in the opposite direction (up in medium2 but down in medium1) in medium2 vs. medium1 within each cell line?
I also want to obtain genes that respond more (more up- or down-regulated) in medium2 (treated) of cellline2 vs. medium2 of cellline1. I used the following design:
I combined cell line and treatment into one factor (cell_treat), e.g., cellline1_ctrl, cellline1_treated:
cell2medium2 = media+cell_treat+media:cell_treat
With this formula, I get genes that are not even significant at a nominal p-value threshold (p < 0.05) in the pairwise comparisons.
Q2. What am I doing wrong in this design?
Please help me improve my designs!
Thanks
Gavin - thank you for your reply.
Q2: Sorry for not being clear about question 2. I want to obtain genes that respond more (more up- or more down-regulated) in [medium2 (treated/control) vs. medium1 (treated/control) of cellline2] vs. [medium2 (treated/control) vs. medium1 (treated/control) of cellline1], i.e.
Cellline2[medium2(treated/control samples) VS. medium1(treated/control samples)] VS. Cellline1[medium2(treated/control samples) VS. medium1(treated/control samples)]
Here "(treated/control)" means the comparison of treated samples vs. control samples.
Regarding your suggestion (b): performing a pairwise comparison by combining the M, C and T terms is very interesting. However, how do I get the direction of the expression change (whether the gene is up in cellline2 or in cellline1)?
Please see below the code to recreate coldata:
coldata = data.frame(
  row.names = c('Control1.1', 'Control1.2', 'Control1.3', 'Treated1.1', 'Treated1.2', 'Treated1.3',
                'Control2.1', 'Control2.2', 'Control2.3', 'Treated2.1', 'Treated2.2', 'Treated2.3',
                'Control3.1', 'Control3.2', 'Control3.3', 'Treated3.1', 'Treated3.2', 'Treated3.3',
                'Control4.1', 'Control4.2', 'Control4.3', 'Treated4.1', 'Treated4.2', 'Treated4.3'),
  # 12 samples per cell line
  cellline = factor(rep(c("cellline1", "cellline2"), each = 12)),
  # media changes every 6 samples (3 control + 3 treated), repeated for both cell lines
  media = factor(rep(rep(c("medium1", "medium2"), each = 6), times = 2)),
  # treatment changes every 3 samples, repeated for all 4 cell line/media blocks
  treatment = factor(rep(rep(c("control", "treated"), each = 3), times = 4)))
Thank you for your time!
OK, it looks like neither of my guesses was correct then, as your expanded question refers to both cell lines, both media, and both conditions. It's a third-order effect, so the design in my approach 'A' is correct, but you test the final, third-order coefficient - the one with an M-word, a C-word and a T-word in the resultsNames.
Interpreting this is tricky. The first-order treatment effect is treated/control. The second-order MT effect lets us determine the ratio of treatment effects between the two media (a difference on the log2 scale), for each cell line; the treatment effects in media1 and media2 are given by the T expression and the T+MT expression, respectively. A positive number for MT could mean that treatment has a more positive effect in media2, a less negative effect in media2, or a negative effect in media1 and a positive effect in media2.
So when we get to your required third-order effect, which is effectively the difference of the second-order MT effects between the two cell lines, there are combinatorially many more possible interpretations. You can either work out the interpretation long-hand, or 'cluster' your results by the signs of the individual coefficients provided by the fit - the different signatures (e.g. +++++-) will correspond to different qualitative behaviours.
Gavin - Thank you for taking the time to explain in detail. I appreciate your time and help.
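To make the coefficient bookkeeping concrete, here is a minimal base-R sketch (no DESeq2 required) of the three-way design. It assumes the full-factorial formula `~ cellline * media * treatment`, which is my reading of approach 'A'; the log2 group means are made-up numbers for one hypothetical gene, and `classify()` is an illustrative helper, not part of any package.

```r
# Rebuild the 2 x 2 x 2 layout from the thread (3 replicates per group).
# expand.grid varies the first argument fastest, matching the sample order above.
coldata <- expand.grid(
  replicate = 1:3,
  treatment = c("control", "treated"),
  media     = c("medium1", "medium2"),
  cellline  = c("cellline1", "cellline2")
)

# Full three-way design: main effects, all two-way terms, and the C:M:T term.
design <- model.matrix(~ cellline * media * treatment, data = coldata)

# Hypothetical log2 expression of one gene, one value per group (made-up data):
# order = cellline1 (m1 ctrl, m1 trt, m2 ctrl, m2 trt), then cellline2.
group_means <- c(5.0, 6.0, 5.5, 7.5,
                 4.0, 4.5, 4.2, 7.7)
y <- rep(group_means, each = 3)

# Noise-free least-squares fit, so the coefficients are exact cell-mean contrasts.
b <- lm.fit(design, y)$coefficients

b_T   <- b[["treatmenttreated"]]
b_CT  <- b[["celllinecellline2:treatmenttreated"]]
b_MT  <- b[["mediamedium2:treatmenttreated"]]
b_CMT <- b[["celllinecellline2:mediamedium2:treatmenttreated"]]

# Treatment log2 fold change (treated - control) in each cell line / medium.
lfc <- c(
  c1_m1 = b_T,
  c1_m2 = b_T + b_MT,
  c2_m1 = b_T + b_CT,
  c2_m2 = b_T + b_CT + b_MT + b_CMT
)

# Second-order effect: how much the treatment LFC changes from medium1 to medium2.
media_effect <- c(
  cellline1 = lfc[["c1_m2"]] - lfc[["c1_m1"]],  # equals b_MT
  cellline2 = lfc[["c2_m2"]] - lfc[["c2_m1"]]   # equals b_MT + b_CMT
)

# Third-order effect: difference of the two second-order effects, equals b_CMT.
third_order <- media_effect[["cellline2"]] - media_effect[["cellline1"]]

# Sign signature over the seven non-intercept coefficients (rounded to absorb
# floating-point noise), for grouping genes by qualitative behaviour.
signature <- paste(c("-", "0", "+")[sign(round(b[-1], 8)) + 2], collapse = "")

# Illustrative helper: compare a treatment LFC against a reference LFC.
classify <- function(ref, cmp) {
  if (sign(ref) * sign(cmp) < 0) "opposite"
  else if (abs(cmp) > abs(ref)) "stronger"
  else "weaker"
}

print(round(lfc, 3))
print(c(third_order = third_order, b_CMT = b_CMT))
print(signature)
print(classify(lfc[["c1_m1"]], lfc[["c1_m2"]]))
```

With DESeq2, the analogous step would be to fit the same formula and test the entry of `resultsNames(dds)` that contains the cell line, media and treatment terms together, via `results(dds, name = ...)`. The sign of that log2 fold change gives the direction: positive means the medium2-vs-medium1 change in treatment response is larger in cellline2 than in cellline1 (assuming cellline1, medium1 and control are the reference levels). The per-group `lfc` values then tell you which of the qualitative cases (stronger, weaker, opposite) you are looking at.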