Hello,
I'm new to RNA-seq analysis and would like some guidance on the data analysis.
My experimental design has two variables, "Concentration" and "Addition". I have 8 samples in total.
Chemical X was added at 4 concentrations, given as 0, 25, 50, 75 and 100, under "Concentration".
Two values, Y and N, stand for with and without nutrients under "Additional nutrients".
I want to find out whether, when additional nutrients are added at the 4 different concentrations of the chemical, any overlapping/common genes are up- or down-regulated.
design = ~ Additional nutrients + Concentration
design = ~ Additional nutrients + Concentration + Additional nutrients:Concentration
design = ~ Additional nutrients + Additional nutrients:Concentration
Which one is correct?
Code should be placed in three backticks as shown below
sessionInfo()
why not just
And then look at the various coefficients. I think which approach is "correct" is not a theoretical question, but depends on your biological question (and, perhaps, limitations of data quality or experimental design).
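To make the "limitations of experimental design" point concrete, here is a minimal sketch that counts how many residual degrees of freedom each candidate design would leave. It is plain Python used only for the arithmetic and is not part of any RNA-seq package. It assumes 8 samples arranged as 4 concentration levels x 2 nutrient levels with no replicates, which is only my reading of "8 samples" and may not match the real layout. The level indices and the names `additive`, `interaction`, `nested` and `continuous` are hypothetical labels.

```python
from fractions import Fraction
from itertools import product


def matrix_rank(rows):
    """Rank of a matrix via Gauss-Jordan elimination on exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Look for a usable pivot at or below the current rank.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# ASSUMPTION: 4 concentration levels x 2 nutrient levels, one sample per cell.
levels = [0, 1, 2, 3]   # hypothetical level indices; level 0 is the reference
nutrients = [0, 1]      # 0 = N, 1 = Y
samples = list(product(nutrients, levels))


def dummies(level):
    """Indicator columns for the non-reference concentration levels."""
    return [1 if level == k else 0 for k in levels[1:]]


def additive(n, c):      # ~ nutrients + concentration (as a factor)
    return [1, n] + dummies(c)


def interaction(n, c):   # ~ nutrients + concentration + nutrients:concentration
    return additive(n, c) + [n * d for d in dummies(c)]


def nested(n, c):        # ~ nutrients + nutrients:concentration
    return [1, n] + [(1 - n) * d for d in dummies(c)] + [n * d for d in dummies(c)]


def continuous(n, c):    # concentration as a number, with an interaction term
    return [1, n, c, n * c]


designs = {"additive": additive, "interaction": interaction,
           "nested": nested, "continuous": continuous}

# Residual degrees of freedom = samples minus independent coefficients.
residual_df = {name: len(samples) - matrix_rank([f(n, c) for n, c in samples])
               for name, f in designs.items()}

for name, df in residual_df.items():
    print(f"{name:12s} residual df = {df}")
```

Under that assumed layout, the two designs with a factor-by-factor interaction use up all 8 samples and leave nothing for estimating variability, so without replicates only the additive design or a numeric concentration term leaves room to estimate it. With replicates the picture changes, and the choice goes back to the biological question.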
Also, note that the effect of many drugs follows a sigmoid dose-response curve. Using concentration as a continuous variable in a linear model may be OK for detecting differentially expressed genes if the concentrations do not deviate much from where the sigmoid 'happens'. However, sometimes the high concentrations are already so toxic, or cause so many additional side effects, that they add more noise and confusion than information. In such cases, focusing on the subset of 'good', physiologically relevant concentrations may be more fruitful.
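As a rough illustration of the sigmoid point, the sketch below compares how even the response steps are between neighbouring, equally spaced concentrations. A linear term assumes equal steps, so a ratio near 1 means a straight line is a reasonable approximation. The logistic curve, its midpoint `EC50` and its `SLOPE` are made-up values for illustration only, not estimates for chemical X, and the `near_midpoint` concentrations are hypothetical.

```python
import math

EC50 = 50.0    # hypothetical midpoint of the dose-response curve
SLOPE = 0.1    # hypothetical steepness


def logistic_response(conc):
    """Hypothetical sigmoid response between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-SLOPE * (conc - EC50)))


def step_ratio(concs):
    """Largest / smallest change in response between neighbouring concentrations."""
    responses = [logistic_response(c) for c in concs]
    steps = [abs(b - a) for a, b in zip(responses, responses[1:])]
    return max(steps) / min(steps)


near_midpoint = [40, 45, 50, 55, 60]   # hypothetical doses close to the midpoint
full_range = [0, 25, 50, 75, 100]      # the values listed in the question

print("near midpoint:", round(step_ratio(near_midpoint), 2))
print("full range:   ", round(step_ratio(full_range), 2))
```

With these made-up parameters the steps are nearly equal close to the midpoint and very uneven over the full range, which is the situation where treating concentration as a factor, or dropping the extreme doses, may be the safer choice.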