Question

How does limma work with continuous variables?

adR ▴ 40

In one article, the authors used limma to regress the expression of each gene in their mRNA data on the expression of a specific protein. They report the genes associated with that protein together with a log2FC. Here are my questions:

How does limma handle a continuous variable in the model.matrix (let's say LKB1 expression)?
How is the logFC calculated for each gene associated with LKB1 expression, given that LKB1 itself is a continuous variable? Or is it just the estimated coefficient?
How should it be interpreted, given that I have such data and get output like the one below?

[Image of the output, not available]

Thank you!!

GordonSmyth limmaGUI lmFit voom • 838 views

ADD COMMENT • link updated 8 months ago by Gordon Smyth 52k • written 8 months ago by adR ▴ 40

limma fits linear models (as it says on the can!) and the logFC is just the fitted coefficient. It is exactly the same as for edgeR, which you asked about 4 years ago: "Design edger with one or more continues variables"

What measure of "LKB1 expression" are you entering into the limma linear model? Is it log2CPM or something else? The interpretation of the coefficient obviously depends on what you are entering into the linear model.

ADD REPLY • link 8 months ago Gordon Smyth 52k

LKB1 expression is the log2 of protein abundance, and the gene expression profile is in TPM.

ADD REPLY • link 8 months ago adR ▴ 40

limma is designed to analyse log-expression values. If the data is RNA-seq, then either logCPM or raw read counts input to voom are preferable. I have said before on this forum that I do not consider TPM to be a suitable measure of expression for differential expression analyses. You have attached the "voom" tag to your question. I hope that doesn't mean you are inputting TPM to voom, because you absolutely should not do that.

Anyway, if LKB1 is log2 protein abundance and y is log2-expression, then the logFC coefficient in the limma output is the log2-fold-change in the response gene's expression for each doubling of LKB1 abundance (as James MacDonald has already said below).
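
To make the "per doubling" reading concrete, here is a small arithmetic sketch. It is plain Python, not limma (which is an R package), and the function name and the example coefficients are made up for illustration. It assumes both the response and LKB1 are on the log2 scale, so a coefficient b means expression is multiplied by 2^b whenever LKB1 abundance doubles.

```python
import math


def fold_change_for_lkb1_ratio(log_fc, lkb1_ratio=2.0):
    """Fold change in gene expression implied by a slope `log_fc`
    (log2 expression units per log2 unit of LKB1) when LKB1 abundance
    is multiplied by `lkb1_ratio`. Hypothetical helper, not a limma function."""
    return 2.0 ** (log_fc * math.log2(lkb1_ratio))


# Illustrative coefficients only; they do not come from any real data set.
per_doubling = fold_change_for_lkb1_ratio(0.5)           # LKB1 doubles
per_fourfold = fold_change_for_lkb1_ratio(0.5, 4.0)      # LKB1 goes up 4-fold
negative_slope = fold_change_for_lkb1_ratio(-1.0)        # expression halves

print(per_doubling, per_fourfold, negative_slope)
```

So a logFC of 0.5 corresponds to roughly a 1.41-fold increase in expression per doubling of LKB1, and a logFC of -1 to a halving.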

ADD REPLY • link 8 months ago Gordon Smyth 52k

score 2 · Answer 1 · 2024-07-30

The interpretation of the coefficients for a linear model fit using limma is no different from 'regular' linear modeling using lm. If you include a continuous variable, then limma fits a conventional linear regression and returns the estimate for that coefficient. Since it's a continuous variable, you interpret it as you would if using lm: it's the slope of the line, and represents the change in expression (on the log scale) for every unit change in LKB1. If you have log2-transformed LKB1, then one unit is a doubling, so the slope represents the log change in expression of the gene for every doubling of LKB1.
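
As a rough illustration of "the logFC is the slope", the sketch below fits an ordinary least-squares line per gene with an intercept and one continuous covariate. It is plain Python rather than limma or lm, the gene names and numbers are made-up toy values, and it ignores observation weights and the empirical Bayes variance moderation, which affect the test statistics rather than the coefficient itself.

```python
from statistics import mean


def ols_fit(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Toy data, invented for illustration: six samples, LKB1 on the log2 scale,
# and log2 expression for two hypothetical genes.
log2_lkb1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
log2_expr = {
    "geneA": [5.5, 6.0, 6.5, 7.0, 7.5, 8.0],   # rises with LKB1
    "geneB": [7.0, 6.0, 5.0, 4.0, 3.0, 2.0],   # falls with LKB1
}

# One regression per gene; the slope plays the role of the reported "logFC".
slopes = {gene: ols_fit(log2_lkb1, y)[1] for gene, y in log2_expr.items()}

for gene, slope in slopes.items():
    print(gene, "slope per unit (doubling) of log2 LKB1:", slope)
```

With these toy values the slope for geneA is 0.5 and for geneB is -1: log2 expression goes up by 0.5 or down by 1 for each doubling of LKB1.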