EDIT: Sorry, I had completely left out that for each of the groups there are 3 biological replicates!
Hello,
I am not from a bioinformatics background, so apologies in advance if any of my questions have obvious answers or demonstrate a lack of understanding. I have been tasked with the analysis of RNAseq data with 4 experimental groups, with two factors (Treatment and knockdown).
Samples | Treatment | Knockdown |
S1, 2, 3 | Control | Control |
S4, 5, 6 | Control | Knockdown |
S7, 8, 9 | Treat | Control |
S10, 11, 12 | Treat | Knockdown |
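For reference, the layout in the table could be written as a sample sheet in R along these lines. This is only a sketch: the data frame name `targets`, the column names, and the combined `Group` factor with its level names are assumptions for illustration, not something taken from the actual data.

```r
# Sample sheet for the 12 libraries (4 groups x 3 replicates).
# `targets` and its column names are hypothetical.
targets <- data.frame(
  Sample    = paste0("S", 1:12),
  Treatment = factor(rep(c("Control", "Treat"), each = 6),
                     levels = c("Control", "Treat")),
  Knockdown = factor(rep(rep(c("Control", "Knockdown"), each = 3), times = 2),
                     levels = c("Control", "Knockdown"))
)

# One combined factor with four levels, convenient for a one-way layout.
targets$Group <- factor(
  paste(targets$Treatment, targets$Knockdown, sep = "."),
  levels = c("Control.Control", "Control.Knockdown",
             "Treat.Control", "Treat.Knockdown")
)

# Cross-tabulate the two factors to confirm three replicates per cell.
print(table(targets$Treatment, targets$Knockdown))
```

Listing the control level first in `levels` makes it the reference level in any design matrix built from these factors.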
I am familiar enough with edgeR to perform basic pairwise comparisons between these samples using an additive model. However, I have been told I need to perform a two-way ANOVA, with the goal of finding the combinatorial effect of the treatment and knockdown.
Questions regarding ANOVA:
- I have read through the edgeR manual, and I understand the "ANOVA-like" method is not the same as a typical ANOVA test on normally distributed data. However, is there a way to specify whether it is one-way or two-way? I have not been able to find any clarification on whether this needs to be set.
- Additionally, I understand that the test will detect any differences between the 4 biological groups, but lacks a post-hoc analysis to find the specific differences between groups. Is there any better way to determine the significance between pairs of groups after performing the ANOVA analysis, or am I left with using pairwise comparisons?
- The owner of the data discussed the analysis with a third person. From what was discussed, apparently if the design is set so that the baseline is the Control+Control group and the final group is Treatment+Knockdown, the ANOVA-like test will provide statistics that represent the combinatorial effect. From what I've read this does not seem correct to me, and I believe it is due to miscommunication, but would anyone be able to offer input as to whether this can be done?
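To make the questions above concrete, here is a minimal sketch in base R of two points: the choice of baseline only re-labels the coefficients of a one-way design, and specific comparisons (including an interaction) can be written as contrasts on a cell-means design. The group labels `CC`, `CK`, `TC`, `TK` and the contrast names are hypothetical. The edgeR part is guarded and only runs if edgeR is installed and a `DGEList` called `set_d` with estimated dispersions already exists.

```r
# Four groups x three replicates; level names are hypothetical shorthand
# (first letter = Treatment, second letter = Knockdown).
Group <- factor(rep(c("CC", "CK", "TC", "TK"), each = 3),
                levels = c("CC", "CK", "TC", "TK"))

# Intercept parametrisation: columns 2:4 are CK, TC, TK versus the CC baseline.
design.base <- model.matrix(~ Group)

# The same model with TK as the baseline instead.
design.relev <- model.matrix(~ relevel(Group, ref = "TK"))

# Both designs span the same 4-dimensional column space, so dropping
# columns 2:4 compares the same pair of models whichever baseline is used.
rank.joint <- qr(cbind(design.base, design.relev))$rank
print(rank.joint)

# Cell-means parametrisation: one coefficient per group.
design.means <- model.matrix(~ 0 + Group)
colnames(design.means) <- levels(Group)

# Specific questions expressed as contrasts of the four group means.
contrast.matrix <- cbind(
  KD.in.Control    = c(-1,  1,  0, 0),  # CK - CC
  KD.in.Treat      = c( 0,  0, -1, 1),  # TK - TC
  Treat.in.Control = c(-1,  0,  1, 0),  # TC - CC
  Interaction      = c( 1, -1, -1, 1)   # (TK - TC) - (CK - CC)
)
rownames(contrast.matrix) <- colnames(design.means)
print(contrast.matrix)

# Guarded: runs only with edgeR installed and an existing `set_d`.
if (requireNamespace("edgeR", quietly = TRUE) && exists("set_d")) {
  fit <- edgeR::glmFit(set_d, design.means)
  lrt.interaction <- edgeR::glmLRT(fit,
                                   contrast = contrast.matrix[, "Interaction"])
  print(edgeR::topTags(lrt.interaction))
}
```

Under these assumptions, the ANOVA-like test on `coef=2:4` of the intercept design asks whether any group differs from any other, regardless of which group is the baseline, whereas the `Interaction` contrast asks specifically whether the knockdown effect differs between treated and control samples.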
Regarding the aim of combinatorial effect:
From my understanding, the best option to display a combinatorial effect would be to perform an ANOVA-like analysis and, on top of this, fit an interaction design. Would this be the correct approach? Further to this, would a nested interaction or a full interaction formula be more appropriate? (I ask this because I only get 150 genes using the full interaction.)
Nested interaction:
    design.NestedInteraction <- model.matrix(~Treatment+Treatment:Knockdown)
    # '(Intercept)' 'Knockdown' 'ControlKnockdown:Treatment' 'Knockdown:Treatment'
    fit.NestedInteraction <- glmFit(set_d, design.NestedInteraction)
    lrt.NestedInteraction <- glmLRT(fit.NestedInteraction, coef=3:4)
Full interaction:
    design.FullInteraction <- model.matrix(~Treatment+siRNA+Treatment:siRNA)
    # '(Intercept)' 'Treatment' 'Knockdown' 'Treatment:Knockdown'
    fit.FullInteraction <- glmFit(set_d, design.FullInteraction)
    lrt.FullInteraction <- glmLRT(fit.FullInteraction, coef=4)
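As a rough check on how the two formulas relate, the sketch below builds both design matrices in base R and works out how the coefficients of one map onto the other. It assumes both factors have their control level as the reference and uses `Knockdown` as the second factor name in both designs (the full design above calls it `siRNA`). The object name `coef.map` is hypothetical.

```r
# Factors for the 12 samples, control level first (assumed reference levels).
Treatment <- factor(rep(c("Control", "Treat"), each = 6),
                    levels = c("Control", "Treat"))
Knockdown <- factor(rep(rep(c("Control", "Knockdown"), each = 3), times = 2),
                    levels = c("Control", "Knockdown"))

design.full   <- model.matrix(~ Treatment + Knockdown + Treatment:Knockdown)
design.nested <- model.matrix(~ Treatment + Treatment:Knockdown)

print(colnames(design.full))
print(colnames(design.nested))

# Find the matrix M with design.nested = design.full %*% M.
# The coefficients then satisfy beta.full = M %*% beta.nested.
coef.map <- round(qr.solve(design.full, design.nested))
print(coef.map)

# Largest discrepancy between the two sides; should be numerically zero,
# i.e. both designs describe the same fitted model.
print(max(abs(design.full %*% coef.map - design.nested)))
```

Read row by row, `coef.map` says that the two designs are reparametrisations of the same four-group model: the last two nested coefficients are the knockdown effect within control samples and within treated samples, and the full-model interaction coefficient is the difference between them. So `coef=3:4` on the nested design tests for any knockdown effect in either treatment level (2 degrees of freedom), while `coef=4` on the full design tests only whether the knockdown effect differs between treatment levels (1 degree of freedom). The two tests answer different questions, so different gene counts are to be expected.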
My understanding of interactions may be poor, and I may be misunderstanding something in this analysis. If this is completely the wrong approach I would greatly appreciate any advice anyone could offer!
Thank you all for the help!
Hi! Thank you for the response. I made an error in the original post, in that I forgot to mention for each biological group there are 3 replicates (A very key factor... Sorry!). The replicates should make this approach to the analysis viable, is that correct?
You're absolutely right that the one-way layout seems much easier and clearer to understand. Thank you for the help!
Yes, that now looks fine.