Question

Behaviour of weights in limma

Paul Harrison ▴ 100

@paul-harrison-5740

Last seen 5 weeks ago

Australia/Melbourne/Monash University B…

Hello,

I have some data from a variant of RNA-seq which I am hoping to do some moderated t-test differential testing on with limma. In this data, many of the reads have sequenced through into the poly(A) tail, and we believe this gives us information about changes in poly(A) tail length.

For each gene and sample, we can calculate an average observed tail length. It seems easy enough to calculate a standard error for this average as well. In some cases we have few reads and the standard error is high; in others we have quite a lot of reads and the standard error is low.

What I'm hoping is that this can be translated into weights that can be fed to limma to make it behave correctly. Do weights have some specific meaning in terms of measurement variance?

And how does this interact with moderation between genes? For example, could including highly noisy measurements from some genes detract from the significance of other genes where the measurement is more precise?

regards,
Paul Harrison
Victorian Bioinformatics Consortium / Monash University
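For concreteness, the per-gene, per-sample quantities described above could be computed along the following lines. This is only a minimal Python sketch: the function name `mean_and_se` and the tail-length values are hypothetical, and the standard error here reflects read-sampling error within a single sample only.

```python
import math
import statistics


def mean_and_se(tail_lengths):
    """Return (mean, standard error of the mean) of the observed tail
    lengths for one gene in one sample."""
    n = len(tail_lengths)
    if n < 2:
        raise ValueError("need at least two reads to estimate a standard error")
    mean = statistics.fmean(tail_lengths)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    se = statistics.stdev(tail_lengths) / math.sqrt(n)
    return mean, se


# Hypothetical observed poly(A) tail lengths, one value per read.
few_reads = [30, 52, 41, 68]
many_reads = few_reads * 25  # same spread, 25 times as many reads

mean_few, se_few = mean_and_se(few_reads)
mean_many, se_many = mean_and_se(many_reads)

print(f"few reads:  mean={mean_few:.2f}, SE={se_few:.2f}")
print(f"many reads: mean={mean_many:.2f}, SE={se_many:.2f}")
```

With the same spread of tail lengths, the observation backed by more reads gets the smaller standard error, which is the behaviour described above.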

limma limma • 1.3k views

updated 10.6 years ago by Gordon Smyth 52k • written 10.6 years ago by Paul Harrison ▴ 100

score 0 · Answer 1 · 2014-06-05

Dear Paul,

> Date: Wed, 4 Jun 2014 17:30:59 +1000
> From: Paul Harrison <paul.harrison at monash.edu>
> To: Bioconductor mailing list <bioconductor at r-project.org>
> Subject: [BioC] Behaviour of weights in limma
>
> Hello,
>
> I have some data from a variant of RNA-seq which I am hoping to do some
> moderated t-test differential testing on with limma. In this data, many
> of the reads have sequenced through into the poly(A) tail, and we
> believe this gives us information about changes in poly(A) tail length.
>
> For each gene and sample, we can calculate an average observed tail
> length. It seems easy enough to calculate a standard error for this
> average as well.

I don't think that you can actually calculate a meaningful standard error. The total error depends on both biological and technical components. You can predict how the measurement error depends on the number of reads, but you don't know what proportion of the total error the measurement error makes up.

> In some cases we have few reads and the standard error is high, in
> others we have quite a lot of reads and the standard error is low.
>
> What I'm hoping is that this can be translated into weights that can be
> fed to limma to make it behave correctly. Do weights have some specific
> meaning in terms of measurement variance?

They have a specific meaning, but it is in terms of total variance, not measurement variance. The meaning of weights in limma is the same as for any linear modelling or regression procedure, which is that the total variance is assumed to be inversely proportional to the weight.

> And how does this interact with moderation between genes,

Intimately.

> for example could including highly noisy measurements from some genes
> detract from the significance of other genes where the measurement is
> more precise?

Yes.

Could you not simply use voom or edgeR, both of which already do what you seem to be asking, which is to take the number of reads into account when estimating variability and assessing DE?

Best wishes
Gordon

> regards,
> Paul Harrison
>
> Victorian Bioinformatics Consortium / Monash University
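
To illustrate the distinction drawn above between total variance and measurement variance, here is a small Python sketch (it is not limma code). It assumes, purely for illustration, that the total variance of an observed mean tail length is a biological variance plus a per-read variance divided by the number of reads. The function names and all numbers are hypothetical.

```python
def naive_weight(n_reads, read_var):
    """Weight built from measurement error only: 1 / SE^2 = n_reads / read_var."""
    return n_reads / read_var


def total_variance_weight(n_reads, read_var, bio_var):
    """Weight inversely proportional to the assumed total variance:
    biological variance plus read-sampling variance of the mean."""
    return 1.0 / (bio_var + read_var / n_reads)


def weighted_mean(values, weights):
    """Weighted least squares estimate of a common mean."""
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)


# Hypothetical settings: variance of a single read's tail length and
# sample-to-sample biological variance of the true mean tail length.
READ_VAR = 400.0
BIO_VAR = 4.0

low_n, high_n = 10, 1000  # reads behind two observations of the same gene

naive_ratio = naive_weight(high_n, READ_VAR) / naive_weight(low_n, READ_VAR)
total_ratio = (total_variance_weight(high_n, READ_VAR, BIO_VAR)
               / total_variance_weight(low_n, READ_VAR, BIO_VAR))

# The naive weights say the deep observation counts about 100 times as much;
# once biological variance is included the ratio drops to about 10.
print(f"weight ratio, measurement variance only: {naive_ratio:.1f}")
print(f"weight ratio, total variance:            {total_ratio:.1f}")

# Effect on a weighted mean of two observed mean tail lengths.
observed = [50.0, 60.0]
naive_w = [naive_weight(n, READ_VAR) for n in (low_n, high_n)]
total_w = [total_variance_weight(n, READ_VAR, BIO_VAR) for n in (low_n, high_n)]
print(f"weighted mean, naive weights: {weighted_mean(observed, naive_w):.2f}")
print(f"weighted mean, total weights: {weighted_mean(observed, total_w):.2f}")
```

Under these assumptions, weights derived from the standard error alone overstate how much more the high-coverage observation should count. Since the biological variance is not known in advance, the proportion of total error due to measurement cannot be fixed from read counts alone, which is consistent with the suggestion above to use a method that estimates variability from the data.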