Hi, I'm doing differential gene expression analysis with DESeq2. The program reports an adjusted p-value (padj) and a log2 fold change between the two groups. I would like to know exactly how the log2 fold change is calculated, because when I compute the average of each group and take the log2 of their ratio, I get a completely different value.
Thank you
The coefficients of a GLM are estimated iteratively, so there is no closed-form solution that you can use. It's not like conventional linear regression, where you could calculate the coefficients by hand.
In a multi-group analysis, where each group is assigned its own coefficient (or a contrast of coefficients), the estimated log2 fold change will be very close to what you compute from the arithmetic averages of the scaled counts (raw counts divided by each sample's size factor), especially for higher-count genes. It will not match exactly, though.
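For reference, here is a minimal sketch of the "by hand" calculation that the GLM estimate should be close to. It is not DESeq2's actual estimator: DESeq2 fits a negative binomial GLM, and depending on version and settings the reported fold change may also be shrunken. The function names `size_factors` and `naive_log2fc` and the toy counts are made up for illustration, and the sketch assumes at least one gene has nonzero counts in every sample.

```python
import math

def size_factors(counts):
    """Median-of-ratios size factors, one per sample (column).

    counts: list of genes, each a list of raw counts per sample.
    """
    n_samples = len(counts[0])
    # Only genes with all-positive counts have a defined log geometric mean.
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each usable gene to its geometric mean, for sample j.
        log_ratios = sorted(math.log(row[j]) - lgm
                            for row, lgm in zip(usable, log_geo_means))
        mid = len(log_ratios) // 2
        if len(log_ratios) % 2:
            median = log_ratios[mid]
        else:
            median = 0.5 * (log_ratios[mid - 1] + log_ratios[mid])
        factors.append(math.exp(median))
    return factors

def naive_log2fc(counts, groups, numerator, denominator):
    """log2 of the ratio of group means of scaled counts, per gene."""
    sf = size_factors(counts)
    result = []
    for row in counts:
        scaled = [c / s for c, s in zip(row, sf)]
        num = [x for x, g in zip(scaled, groups) if g == numerator]
        den = [x for x, g in zip(scaled, groups) if g == denominator]
        mean_num = sum(num) / len(num)
        mean_den = sum(den) / len(den)
        if mean_num == 0 or mean_den == 0:
            result.append(float("nan"))  # ratio undefined when a group mean is zero
        else:
            result.append(math.log2(mean_num / mean_den))
    return result

# Toy data (hypothetical): 3 genes x 4 samples; samples 2 and 4 are sequenced twice as deep.
counts = [
    [10, 20, 20, 40],
    [100, 200, 100, 200],
    [50, 100, 50, 100],
]
groups = ["A", "A", "B", "B"]
print(naive_log2fc(counts, groups, numerator="B", denominator="A"))
```

Note that the average is taken over scaled counts, not raw counts, and that the log2 is taken after averaging. Averaging raw counts, or averaging log-transformed values, are two common reasons for a hand calculation to come out very different.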
If DESeq is doing the calculation you believe it is, you should not get a completely different value. It should be pretty close.
If the calculations could be easily done in Excel, people would do that, and not use DESeq. People use DESeq because the best way is not doable in Excel.