In edgeR, the log-CPM values are calculated as:
log2(t( (t(x)+prior.count.scaled) / lib.size ))
However, before that step the library size is offset as:
lib.size <- lib.size+2*prior.count.scaled
I do not understand the factor of 2. The prior is added to every count, so I would have expected the library size to be increased by the number of genes times the prior, e.g.:
lib.size <- lib.size+(numberOfGenes)*prior.count.scaled
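To make the comparison concrete, here is a minimal sketch that applies both offsets to a made-up count matrix. The toy counts, prior.count = 0.5, and the names lib.two, lib.genes, logcpm.two and logcpm.genes are my own illustration. I am also assuming that the prior is scaled by relative library size and that the library size is converted to millions, since neither step appears in the lines quoted above.

```r
# Toy counts: 3 genes (rows) x 2 samples (columns). Values are made up.
x <- matrix(c(0, 10, 90,
              5, 20, 175),
            nrow = 3,
            dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
numberOfGenes <- nrow(x)
lib.size <- colSums(x)
prior.count <- 0.5  # illustrative value, not necessarily the package default

# Assumption: the prior is scaled in proportion to each library's size.
prior.count.scaled <- lib.size / mean(lib.size) * prior.count

# Offset as quoted above: twice the scaled prior.
lib.two <- lib.size + 2 * prior.count.scaled
# Offset I would have expected: number of genes times the scaled prior.
lib.genes <- lib.size + numberOfGenes * prior.count.scaled

# Assumption: library sizes are expressed in millions before dividing.
logcpm.two   <- log2(t((t(x) + prior.count.scaled) / (1e-6 * lib.two)))
logcpm.genes <- log2(t((t(x) + prior.count.scaled) / (1e-6 * lib.genes)))

# Back-transform to proportions and sum over genes within each sample.
colSums(2^logcpm.two / 1e6)
colSums(2^logcpm.genes / 1e6)
```

With the numberOfGenes offset the adjusted proportions sum to exactly 1 within each sample. With the factor of 2 they sum to slightly more than 1, which is the part I cannot explain.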
I noticed that voom does something similar:
t(log2(t(counts + 0.5)/(lib.size + 1) * 1e+06))
where the library size offset is again 1, which is twice the prior of 0.5.
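As a check that the two expressions really have the same form, the sketch below evaluates the voom line on a made-up count matrix and compares it with the edgeR-style expression using an unscaled prior of 0.5. The toy counts and the names voom.logcpm and edger.style are hypothetical, and the conversion of the library size to millions in the second expression is my assumption.

```r
# Toy counts: 3 genes (rows) x 2 samples (columns). Values are made up.
counts <- matrix(c(0, 10, 90,
                   5, 20, 175),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
lib.size <- colSums(counts)

# The voom expression exactly as quoted above.
voom.logcpm <- t(log2(t(counts + 0.5) / (lib.size + 1) * 1e+06))

# The edgeR-style expression with an unscaled prior of 0.5, so that the
# library size offset is 2 * 0.5 = 1 (library size expressed in millions).
prior <- 0.5
edger.style <- log2(t((t(counts) + prior) / (1e-6 * (lib.size + 2 * prior))))

# TRUE if the two expressions agree up to floating-point tolerance.
all.equal(voom.logcpm, edger.style)
```

So in both cases the denominator is increased by exactly twice whatever is added to each count, regardless of the number of genes.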
Can anyone help me out here?
Best, Mikael Christensen