Hi, I am analyzing RNA-seq data and want to add a cqn offset (GC-content normalization) to a DESeq2 object. I followed the DESeq2 vignette but have a question about the geometric-mean standardization of the cqn offset.
normFactors <- exp(cqn.out$glm.offset)
summary(rowMeans(normFactors))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.05063 0.47634 0.63402 0.76283 1.11039 1.45960
normFactors1 <- normFactors / exp(rowMeans(log(normFactors)))
summary(rowMeans(normFactors1))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.009   1.822   2.309   2.437   3.022   6.235
As I understand it, the purpose of the geometric-mean standardization is to keep the mean of the normalized counts for a gene close to the mean of the unnormalized counts. But in this case the row means of normFactors are already fairly close to 1 without the standardization, whereas after it they are all above 1 (median about 2.3). Could you give any suggestions? Do I need the standardization in this case?
Also, could you explain the motivation for the geometric-mean standardization in more detail? Why do we need to make the mean of the normalized counts for a gene close to the mean of the unnormalized counts?
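To make the question concrete, here is a minimal base-R sketch of the standardization step on made-up numbers (not the actual cqn output). The toy matrices and the names `geoMeans`, `counts`, and `rowGeoMeans1` are only illustrative assumptions; it does not call cqn or DESeq2.

```r
# Toy normalization factors (made-up values): 3 genes x 4 samples
set.seed(1)
normFactors <- matrix(exp(rnorm(12, mean = -0.5, sd = 0.4)), nrow = 3,
                      dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))

# Row-wise (per-gene) geometric mean of the factors
geoMeans <- exp(rowMeans(log(normFactors)))

# Standardized factors: each row is divided by its own geometric mean
normFactors1 <- normFactors / geoMeans

# After standardization every row has geometric mean 1
rowGeoMeans1 <- exp(rowMeans(log(normFactors1)))
print(rowGeoMeans1)

# The arithmetic row means are then >= 1 (AM-GM inequality),
# so a summary of rowMeans() cannot be centered below 1
print(rowMeans(normFactors1))

# Toy counts (hypothetical): per-gene means of raw vs. normalized counts
counts <- matrix(rpois(12, lambda = 100), nrow = 3,
                 dimnames = dimnames(normFactors))
print(cbind(raw            = rowMeans(counts),
            unstandardized = rowMeans(counts / normFactors),
            standardized   = rowMeans(counts / normFactors1)))
```

In this sketch the standardization is checked through the row-wise geometric mean rather than `rowMeans()`, since the geometric mean is the quantity that the division sets to 1.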