How does normalisation affect the outcome of the limma moderated t-test (using the eBayes() function)?
Regeroka • 0
@regeroka-20875

Hi all,

I am using the limma t-test on RNA-seq data to compare the expression profiles between 2 conditions. I get different results when I use log-expression values than when I use z-score normalised log-expression values.

Do you know why that could be, and how scaling could affect a t-test? Should I use the scaled or unscaled log-expression data?

Thank you so much for your help!

limma • 1.9k views
@gordon-smyth
WEHI, Melbourne, Australia

Yes, you should input unscaled log-expression values to limma, not z-scores. One of the major purposes of limma is to gain statistical power by modelling the variances. If you divide out the variances, then you prevent limma from knowing what the true variances are. Another disadvantage of z-scores is that they bias the log-fold-change estimates.
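
To make the two points above concrete, here is a minimal sketch in plain Python (not limma itself, and not taken from the answer). The two genes and their values are hypothetical: both have the same log-fold change between conditions but very different noise. The helper names `zscore` and `summarise` are made up for this illustration.

```python
import math
import statistics

# Hypothetical log2-expression values: 3 samples per condition for two genes.
# Both genes have the same log-fold change (2) but different noise levels.
genes = {
    "low_var":  ([5.0, 5.1, 4.9], [7.0, 7.1, 6.9]),
    "high_var": ([5.0, 7.0, 3.0], [7.0, 9.0, 5.0]),
}

def zscore(values):
    """Scale one gene to mean 0 and standard deviation 1 across all samples."""
    centre = statistics.mean(values)
    spread = statistics.stdev(values)
    return [(v - centre) / spread for v in values]

def summarise(group_a, group_b):
    """Return (log-fold change, pooled residual variance, ordinary t-statistic)."""
    n_a, n_b = len(group_a), len(group_b)
    lfc = statistics.mean(group_b) - statistics.mean(group_a)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    t_stat = lfc / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return lfc, pooled_var, t_stat

results = {}
for name, (group_a, group_b) in genes.items():
    # z-score the gene across all six samples, then split back into groups
    scaled = zscore(group_a + group_b)
    scaled_a, scaled_b = scaled[:len(group_a)], scaled[len(group_a):]
    results[name] = {
        "raw": summarise(group_a, group_b),
        "zscore": summarise(scaled_a, scaled_b),
    }
    for scale, (lfc, var, t_stat) in results[name].items():
        print(f"{name:8s} {scale:6s} lfc={lfc:.2f} resid_var={var:.3f} t={t_stat:.2f}")
```

On the raw scale both genes show a log-fold change of 2, while after z-scoring the estimates shrink to roughly 1.8 and 0.95, so the fold changes are no longer comparable between genes. The ordinary per-gene t-statistic is the same on both scales, because it is invariant to rescaling a single gene. The moderated t-statistic from eBayes() is not invariant in this way: it borrows information from the residual variances of all genes, and z-scoring has replaced those variances with values that no longer reflect the real noise levels. That is one way to see why the two inputs give different limma results.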

Thank you for your answer! How about using gene signatures? For example, taking all genes related to a certain pathway, averaging their expression, and then repeating this for multiple pathways. For that, the expression has to be scaled so that the individual genes are comparable. Would this also cause an issue?


limma provides the roast function for analysing pathways, and the expression values do not need to be scaled.

Simply averaging expression for each pathway seems too simple to me. If you do choose to analyse data in that way, then yes, it does complicate the variance modelling.


Sorry, my previous comment was not well put. What I meant, rather than pathways specifically, is that the genes whose expression is to be averaged are co-expressed, or we have reason to believe so. So you would end up with a matrix of gene signatures (e.g. for pathways or biological processes) rather than a matrix of gene expression values. I think the setting is almost the same, but could I ask if you would still recommend roast() for that?


I already understood your question and, yes, I recommend roast for analysing co-regulated sets of genes.

If you want to ask any more questions about gene signatures, then please post a new question. I have answered your original question.


Alright, thank you very much!
