Hi all,
Like many others here, I apologize for asking what is probably a basic question, and I appreciate any clarification in advance. As far as I know, calcNormFactors() produces two columns of information: lib.size and norm.factors. Multiplying these two columns together gives an effective library size. However, I don't understand how the normalization factor is calculated. Could you please explain it in a simple way, as I'm basically a biologist?
From what I have read, I understand that TMM_count = raw_counts / (lib.size * norm.factors). Could you please tell me what the difference is between TMM_count and FPKM values in terms of normalization by library size?
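To make the arithmetic in the question concrete, here is a minimal sketch in plain Python. It is not edgeR code: the function names are hypothetical and the count, library size, normalization factor, and gene length are made-up numbers. It assumes the classic FPKM definition (count per million mapped fragments per kilobase of gene length, using the raw library size) and scales the TMM-based value to counts per million.

```python
def effective_library_size(lib_size, norm_factor):
    """Effective library size: raw library size times the TMM factor."""
    return lib_size * norm_factor


def tmm_cpm(count, lib_size, norm_factor):
    """Counts per million, using the effective (TMM-adjusted) library size."""
    return count / effective_library_size(lib_size, norm_factor) * 1e6


def fpkm(count, lib_size, gene_length_bp):
    """Classic FPKM: scale by library size in millions, then by gene length in kb.

    Assumption: the raw library size is used here, with no TMM factor.
    """
    return count / (lib_size / 1e6) / (gene_length_bp / 1e3)


# Made-up example values for a single gene in a single library.
count = 500
lib_size = 20_000_000
norm_factor = 0.8
gene_length_bp = 2_000

tmm_value = tmm_cpm(count, lib_size, norm_factor)
fpkm_value = fpkm(count, lib_size, gene_length_bp)
print(f"TMM-normalized CPM: {tmm_value:.2f}")
print(f"FPKM:               {fpkm_value:.2f}")
```

In this sketch the two values differ in two ways: the TMM-based value adjusts the library size by the normalization factor, and FPKM additionally divides by gene length.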
Thank you
Thank you very much, James. That's very helpful. I'm sorry to ask one more question: in paragraph 2 you wrote "Dividing the samples by the library size accounts". By "samples", do you mean the mapped read count for each gene in a given library?
Thanks
Exactly. You divide the counts for each gene by the library size (in millions, because you don't want to be dealing with normalized counts of like 0.000002 or whatever).
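As a small illustration of why the library size is expressed in millions, here is a sketch in plain Python. The gene names, counts, and library size are made up, and counts_per_million is a hypothetical helper, not an edgeR function.

```python
def counts_per_million(gene_counts, lib_size):
    """Divide each gene's count by the library size expressed in millions."""
    lib_size_millions = lib_size / 1e6
    return {gene: count / lib_size_millions for gene, count in gene_counts.items()}


# Made-up counts for two genes in one library of 20 million mapped reads.
gene_counts = {"geneA": 40, "geneB": 10}
lib_size = 20_000_000

# Dividing by the raw library size gives tiny, hard-to-read fractions.
raw_fractions = {gene: count / lib_size for gene, count in gene_counts.items()}

# Dividing by the library size in millions gives values on a readable scale.
cpm = counts_per_million(gene_counts, lib_size)

for gene in gene_counts:
    print(gene, raw_fractions[gene], cpm[gene])
```

The relative differences between genes are the same either way; only the scale changes.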
Thanks a lot for your great help.