Hi everybody,
I am not sure whether this is a stupid or unnecessary question, but it has been bothering me for a while now, ever since I learned how to perform differential expression (DE) analysis with tools like limma/edgeR/DESeq.
We all know that mainstream DE analysis is built on the (generalized) linear model framework, and that this framework is flexible enough to adjust for the effects of other covariates such as age and gender. In classic linear model analysis, we can evaluate how well a model represents the data by looking at R-squared and similar measures, and we can compare candidate models to choose the "best" one for our data.
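To make concrete what I mean by model comparison in the classic setting, here is a minimal sketch for a single gene. It is not how limma/edgeR/DESeq work internally. The data are simulated, and the names (`group`, `age`, `fit_ols`, `adjusted_r2`, `nested_f_stat`) are just ones I made up for illustration. It fits an ordinary least-squares model with and without age and compares the two fits by adjusted R-squared and a nested-model F statistic.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least-squares fit; returns (residual sum of squares, R^2)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    return rss, 1.0 - rss / tss

def adjusted_r2(r2, n, p):
    """Adjusted R^2; p counts all design-matrix columns, intercept included."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p)

def nested_f_stat(rss_small, rss_big, n, p_small, p_big):
    """F statistic for testing the extra columns of the bigger nested model."""
    return ((rss_small - rss_big) / (p_big - p_small)) / (rss_big / (n - p_big))

# Simulated log-expression of one hypothetical gene (assumed values, not real data).
rng = np.random.default_rng(0)
n = 24
group = np.repeat([0.0, 1.0], n // 2)      # e.g. control vs. treated
age = rng.uniform(30, 70, size=n)          # covariate we may or may not include
y = 5.0 + 1.0 * group + 0.05 * age + rng.normal(0.0, 0.5, size=n)

X_small = np.column_stack([np.ones(n), group])        # ~ group
X_big = np.column_stack([np.ones(n), group, age])     # ~ group + age

rss_small, r2_small = fit_ols(X_small, y)
rss_big, r2_big = fit_ols(X_big, y)
f_stat = nested_f_stat(rss_small, rss_big, n, 2, 3)

print("adjusted R^2 (~ group):      ", adjusted_r2(r2_small, n, 2))
print("adjusted R^2 (~ group + age):", adjusted_r2(r2_big, n, 3))
print("F statistic for adding age:  ", f_stat)
```

For one response variable this kind of comparison is straightforward; my question is about what the equivalent is when there are thousands of genes.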
However, I haven't seen much discussion of this in the context of DE analysis. No matter which model we choose, we end up with a logFC and a p-value for each gene. Tutorials always tell us that we can include age or gender in the linear model, but they say little about what happens when covariates are included or excluded. Does adding or dropping covariates influence the results, and how can I quantify that influence so that I can decide which model is the "best" for my data?
Any suggestions or comments would be appreciated. Thanks in advance.