Hi all,
I am using the limma t-test on RNA-seq data to compare the expression profiles between 2 conditions. I get different results when using log-expression values than when using z-score normalised log-expression values.
Do you know why that could be, and how scaling could affect a t-test? Should I use the scaled or unscaled log-expression data?
Thank you so much for your help!
Thank you for your answer! How about using gene signatures? For example, taking all genes related to a certain pathway, averaging their expression, and then repeating this for multiple pathways. For that, the expression has to be scaled so that the individual genes are comparable. Would this also cause an issue?
limma provides the roast function for analysing pathways, and the expression values do not need to be scaled. Simply averaging expression for each pathway seems too simple to me. If you do choose to analyse data in that way, then yes it does complicate the variance modelling.
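To illustrate why scaling can matter for the variance modelling, here is a minimal sketch in plain Python, not limma itself. It assumes that "z-score normalised" means each gene is centred and scaled across all samples. An ordinary t-statistic is unchanged by that kind of per-gene scaling, but a moderated t-statistic that shrinks each gene's variance towards a value shared across genes is not, because the scaling changes how the gene-wise variances compare with one another. The prior degrees of freedom `d0`, the use of the mean variance as the prior, the helper names and the toy data are all assumptions made for illustration; limma estimates its prior from the data in a more careful way.

```python
import math
from statistics import mean, variance


def pooled_var(a, b):
    """Pooled within-group variance for two groups of samples."""
    na, nb = len(a), len(b)
    return ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)


def t_stats(genes, d0=4.0):
    """Return (ordinary, moderated) t-statistics, one per gene.

    genes is a list of (group_a, group_b) pairs of log-expression values.
    d0 is a hypothetical prior degrees of freedom. The prior variance is
    taken as the mean gene-wise variance, a crude stand-in for a proper
    empirical Bayes estimate.
    """
    s2 = [pooled_var(a, b) for a, b in genes]
    s0_sq = mean(s2)
    ordinary, moderated = [], []
    for (a, b), v in zip(genes, s2):
        d = len(a) + len(b) - 2
        diff = mean(b) - mean(a)
        se_factor = math.sqrt(1 / len(a) + 1 / len(b))
        # Shrink the gene-wise variance towards the shared prior variance.
        shrunk = (d0 * s0_sq + d * v) / (d0 + d)
        ordinary.append(diff / (math.sqrt(v) * se_factor))
        moderated.append(diff / (math.sqrt(shrunk) * se_factor))
    return ordinary, moderated


def zscore_gene(a, b):
    """Centre and scale one gene across all samples from both groups."""
    vals = a + b
    m, sd = mean(vals), math.sqrt(variance(vals))
    return [(x - m) / sd for x in a], [(x - m) / sd for x in b]


# Toy log-expression values for three genes, three samples per condition.
genes = [
    ([5.0, 5.2, 4.9], [6.0, 6.1, 5.8]),  # low variance, clear difference
    ([2.0, 4.0, 3.0], [6.0, 8.0, 7.0]),  # high variance, large difference
    ([7.0, 7.1, 7.3], [7.2, 7.0, 7.4]),  # essentially no difference
]
scaled = [zscore_gene(a, b) for a, b in genes]

ord_raw, mod_raw = t_stats(genes)
ord_z, mod_z = t_stats(scaled)

for i in range(len(genes)):
    print(f"gene {i + 1}: ordinary {ord_raw[i]:.3f} vs {ord_z[i]:.3f}, "
          f"moderated {mod_raw[i]:.3f} vs {mod_z[i]:.3f}")
```

In this sketch the ordinary statistics agree before and after scaling, while the moderated ones do not, which is consistent with seeing different results from scaled and unscaled data.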
Sorry, my previous comment was not well put. What I meant by using pathways (or rather, instead of pathways) is that the genes whose expression is to be averaged are co-expressed, or we have reason to believe so. So you would end up with a matrix of gene signatures (e.g. for pathways, or biological processes), rather than a matrix of gene expressions. I think the setting is almost the same, but could I ask if you would still recommend roast() for that?
I already understood your question and, yes, I recommend roast for analysing co-regulated sets of genes.
If you want to ask any more questions about gene signatures, then please post a new question. I have answered your original question.
Alright, thank you very much!