Question

extracting cpm expression from limma fit

map2085 ▴ 40

@map2085-9227

Last seen 7.1 years ago

United States

I would like to extract mean log2cpm and log2cpm standard deviation per condition after eBayes:

e.g.

.....

design <- model.matrix( ~0 + conditions + other.variables )

......

fit <- eBayes(fit)

If length(conditions) > 2, then fit$coefficients contains directly interpretable mean log2cpm per condition.

However, if length(conditions) == 2, then how do I get the average log2cpm from fit? In this case, "coefficients" gives the fold change, and Amean is the "average" of the log2cpm expression for the two conditions. But how do I obtain the log2cpm per gene per condition?

As for the standard deviation of log2cpm expression per gene, per condition, is the following correct?

fit$stdev.unscaled / fit$sigma

Thank you,

Matthew

limma cpm • 2.1k views

ADD COMMENT • link updated 8.9 years ago by José Luis Lavín ▴ 10 • written 9.2 years ago by map2085 ▴ 40

score 1 · Answer 1 · 2016-03-07

I presume you've processed your count data with voom. However, your example doesn't make much sense. If the length of the conditions factor is two, that means you only have two samples. It is impossible to fit a linear model to this setup with lmFit if the two samples belong to different conditions, as you won't have any residual degrees of freedom. Thus, the only possibility is that the two samples belong to the same condition, in which case the coefficient just represents the average log-expression in that condition, with no log-fold changes involved. More generally, if you're using an intercept-free model, then the coefficients related to condition should represent the average expression in each condition (with caveats depending on the other.variables).
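To illustrate the point about parameterisation, here is a minimal sketch on simulated data (the matrix y, the factor group and the level names A and B are made up for the example, and there are no other.variables in the design). With an intercept-free design the coefficients are the per-condition means; with an intercept, the mean of the second condition has to be rebuilt as intercept plus log-fold change.

```r
library(limma)

set.seed(1)
# Hypothetical data: 5 genes x 6 samples of log2-CPM-like values
group <- factor(rep(c("A", "B"), each = 3))
y <- matrix(rnorm(30, mean = 5), nrow = 5,
            dimnames = list(paste0("gene", 1:5), paste0("s", 1:6)))

# Intercept-free (group-means) design: one coefficient per condition
design0 <- model.matrix(~0 + group)
fit0 <- lmFit(y, design0)
head(fit0$coefficients)   # columns groupA and groupB hold the condition means

# Design with an intercept: "(Intercept)" is the mean of A,
# "groupB" is the log-fold change of B over A
design1 <- model.matrix(~group)
fit1 <- lmFit(y, design1)
meanA <- fit1$coefficients[, "(Intercept)"]
meanB <- meanA + fit1$coefficients[, "groupB"]
```

Both parameterisations describe the same fitted model, so meanA and meanB agree with the columns of fit0$coefficients. Once extra covariates are added to the design, the condition coefficients are no longer plain sample means, which is the caveat mentioned above.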

As for the other question, the standard errors of the coefficients (including the average expression within each condition, depending on how you've set up your model) should be:

fit$stdev.unscaled * fit$sigma

... if I recall correctly. To use limma's moderation, replace fit$sigma with sqrt(fit$s2.post).
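As a rough sketch of both versions, assuming simulated data and an intercept-free design (y, group and the se.* names are invented for the example):

```r
library(limma)

set.seed(2)
# Hypothetical data: 100 genes x 8 samples, two conditions of 4 samples each
group <- factor(rep(c("A", "B"), each = 4))
y <- matrix(rnorm(100 * 8, mean = 5), nrow = 100)

design <- model.matrix(~0 + group)
fit <- eBayes(lmFit(y, design))

# Ordinary standard errors: residual SD times the unscaled SD of each coefficient.
# fit$sigma has one value per gene and is recycled down the rows of the matrix.
se.ordinary <- fit$stdev.unscaled * fit$sigma

# Moderated standard errors: use the shrunken (posterior) variance instead
se.moderated <- fit$stdev.unscaled * sqrt(fit$s2.post)
```

One way to check the moderated version is that fit$coefficients / se.moderated reproduces the moderated t-statistics stored in fit$t.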

score 0 · Answer 2 · 2016-03-17

map2085 ▴ 40

Dear Aaron,

Pardon my poor explanation above. I meant: there are two conditions, i.e. nlevels(conditions) = 2. Each has multiple samples, of course.

With regard to the standard deviation, I thought that:

standard error = fit$stdev.unscaled * fit$sigma

but I am looking for the standard deviation. In the typical case, SD = SE * sqrt(N). But in our complicated scenario (different library sizes, normalization methods, etc.) I suspect that deriving the standard deviation is much more involved...

Also, fit$s2.post is only available after a call to eBayes. But the object returned by eBayes has contrast (i.e. log2 fold change) coefficients, and thus the s2.post pertains to the fold change. However, I am looking for the standard deviation of expression in each condition (not of the contrast).

Thank you,

ADD COMMENT • link 9.1 years ago map2085 ▴ 40

Yes, I was referring to the standard error (this has been fixed above). This would seem to be the more relevant metric - my rule of thumb is that standard errors refer to the variability of your coefficient estimates, while standard deviations refer to the variability of your observations. I'm not sure whether the concept of the standard deviation of an estimate makes any sense. The variability of the coefficient estimate will naturally drop as you get more information (i.e., larger N, hence smaller SE) to obtain that estimate, while the variability of the observations won't change regardless of how many observations you collect.

For your second point, you need to make sure that the coefficients in your design matrix represent the average expression in each condition. If so, then the computed standard errors will represent those of average expression estimates. Either way, the shrunken variances are agnostic to the parameterisation of the design matrix so it shouldn't matter in that respect.
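The distinction between the two quantities can be sketched on simulated data. This assumes an unweighted fit with an intercept-free design and 12 samples per condition; y, group and the level names g1 and g2 are made up. In that setting the unscaled standard deviation of each coefficient is 1/sqrt(n), so SE * sqrt(n) simply gives back fit$sigma, the residual standard deviation pooled across conditions.

```r
library(limma)

set.seed(3)
# Hypothetical data: 50 genes x 24 samples, 12 samples per condition
group <- factor(rep(c("g1", "g2"), each = 12))
y <- matrix(rnorm(50 * 24, mean = 8, sd = 0.2), nrow = 50)

design <- model.matrix(~0 + group)
fit <- lmFit(y, design)

n <- colSums(design)                      # number of samples per condition
se <- fit$stdev.unscaled * fit$sigma      # SE of each condition mean

# Rescaling each column by sqrt(n) recovers the pooled residual SD
pooled.sd <- sweep(se, 2, sqrt(n), "*")

# Variability of the observations within each condition, computed directly
sd.g1 <- apply(y[, group == "g1"], 1, sd)
sd.g2 <- apply(y[, group == "g2"], 1, sd)
```

Note that fit$sigma is a single pooled value per gene, whereas sd.g1 and sd.g2 are separate per-condition estimates. With precision weights (e.g. from voom) the unscaled standard deviations are no longer 1/sqrt(n), so this simple rescaling would not apply as written.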

ADD REPLY • link 9.1 years ago Aaron Lun ★ 28k

score 0 · Answer 3 · 2016-06-20

Dear limma team,

First of all, I want to thank you for this wonderful package. I have a question on the subject of this post: calculating standard errors from my fit object. When I use the command given in the post, I get:

> std.error = fit$stdev.unscaled * fit$sigma
> std.error
numeric(0)
> fit$stdev.unscaled
NULL
> fit$sigma
NULL

I would be grateful if an experienced limma user could help me figure out what the problem is.

Here is the structure of my fit object:

> str(fit)

List of 1
 $ :Formal class 'MArrayLM' [package "limma"] with 1 slot
  .. ..@ .Data:List of 12
  .. .. ..$ : num [1:19568, 1:2] 7.93 9.38 7.46 7.81 10.88 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:19568] "ILMN_2417611" "ILMN_2896528" "ILMN_2721178" "ILMN_3033922" ...
  .. .. .. .. ..$ : chr [1:2] "g1" "g2"
  .. .. ..$ : int 2
  .. .. ..$ : NULL
  .. .. ..$ :List of 5
  .. .. .. ..$ qr   : num [1:24, 1:2] -3.464 0.289 0.289 0.289 0.289 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. ..$ : chr [1:2] "g1" "g2"
  .. .. .. ..$ qraux: num [1:2] 1.29 1
  .. .. .. ..$ pivot: int [1:2] 1 2
  .. .. .. ..$ tol  : num 1e-07
  .. .. .. ..$ rank : int 2
  .. .. .. ..- attr(*, "class")= chr "qr"
  .. .. ..$ : int [1:19568] 22 22 22 22 22 22 22 22 22 22 ...
  .. .. ..$ : Named num [1:19568] 0.213 0.156 0.16 0.26 0.135 ...
  .. .. .. ..- attr(*, "names")= chr [1:19568] "ILMN_2417611" "ILMN_2896528" "ILMN_2721178" "ILMN_3033922" ...
  .. .. ..$ : num [1:2, 1:2] 0.0833 0 0 0.0833
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:2] "g1" "g2"
  .. .. .. .. ..$ : chr [1:2] "g1" "g2"
  .. .. ..$ : num [1:19568, 1:2] 0.289 0.289 0.289 0.289 0.289 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:19568] "ILMN_2417611" "ILMN_2896528" "ILMN_2721178" "ILMN_3033922" ...
  .. .. .. .. ..$ : chr [1:2] "g1" "g2"
  .. .. ..$ : int [1:2] 1 2
  .. .. ..$ : Named num [1:19568] 7.85 9.39 7.46 7.89 10.86 ...
  .. .. .. ..- attr(*, "names")= chr [1:19568] "ILMN_2417611" "ILMN_2896528" "ILMN_2721178" "ILMN_3033922" ...
  .. .. ..$ : chr "ls"
  .. .. ..$ : num [1:24, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:2] "g1" "g2"

Please accept my apologies if this can be considered a very naive question...

Best wishes

JL