Question

How is the mean-variance plot calculated?

0

Entering edit mode

neverstop • 0

@neverstop-7227

Last seen 8.8 years ago

I'm trying to understand how the mean-variance plot is calculated with the meanVarPlot() of EDASeq package. I tried to reproduce it manually with the code below but the result is very different.

Given the expression matrix (row=genes, columns=samples), I used the logarithm to transform the counts before computing mean and variance. Since there are zero counts, I considered the maximum between 0 and the log of the counts. After that I calculated the row means and the row variances (the mean and variance of each gene across all samples).

Here's the code I used:

rm(list=ls())
library(yeastRNASeq)
library(EDASeq)
data(geneLevelData)
head(geneLevelData)
Data=newSeqExpressionSet(as.matrix(geneLevelData))
meanVarPlot(Data,log=T)
geneLevelDataM=as.matrix(geneLevelData)

means=apply(pmax(log(geneLevelDataM),0),1,mean)
variances=apply(pmax(log(geneLevelDataM),0),1,var)
plot(means,variances)

means=apply(pmax(log2(geneLevelDataM),0),1,mean)
variances=apply(pmax(log2(geneLevelDataM),0),1,var)
plot(means,variances)

edaseq • 3.4k views

ADD COMMENT • link updated 8.8 years ago by Wolfgang Huber ★ 13k • written 8.8 years ago by neverstop • 0

score 2 · Accepted Answer · 2016-05-16

2

Entering edit mode

Wolfgang Huber ★ 13k

@wolfgang-huber-3550

Last seen 3 days ago

EMBL European Molecular Biology Laborat…

Have a look at the source, e.g. https://github.com/Bioconductor-mirror/EDASeq/blob/master/R/methods-SeqExpressionSet.R (search for "meanVarPlot").

Other ways for accessing the source of a Bioconductor package are: download the .tar.gz from the package homepage and unpack; via the subversion server; via the R command line, you can list each function's body (it's a bit wordy for S4 methods).

It looks like the function does something else than what its manual page says, regarding the log argument.

ADD COMMENT • link 8.8 years ago Wolfgang Huber ★ 13k

0

Entering edit mode

The function does indeed do something else than what the manual page says, i.e., it takes the log after computing mean and variance.

Thank you both for spotting this incongruence! I will fix it in devel.

Just to add to Wolfgang's answer re: source code, another way, within R, is to type:

getMethod("meanVarPlot", "SeqExpressionSet")

ADD REPLY • link 8.8 years ago davide risso ▴ 980