I wish to understand the difference between GeneRatio in dotplot() and the enrichment score of GSEA in clusterProfiler.
Hey,
I presume that you mean 'GeneRatio'? The GeneRatio in clusterProfiler::dotplot()
is calculated as: count / setSize
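For a GSEA result, 'count' is the number of genes from the gene-set that fall in the core enrichment (leading-edge) subset, while 'setSize' is the number of genes in the gene-set that are present in your ranked list. For an over-representation result (e.g., from enrichGO()), GeneRatio is instead the number of your input genes annotated to the term divided by the number of input genes annotated to any term.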
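As a toy illustration of the ratio (in Python rather than R, purely to show the arithmetic): the sketch below assumes that 'count' is taken from a "/"-separated core_enrichment string, as found in a GSEA result table. The function name gene_ratio and the example genes are made up for illustration and are not part of clusterProfiler.

```python
def gene_ratio(core_enrichment: str, set_size: int) -> float:
    """Toy GeneRatio: count / setSize.

    Assumption: `core_enrichment` is a "/"-separated string of gene IDs
    and `set_size` is the size of the gene-set used in the analysis.
    """
    if set_size <= 0:
        raise ValueError("set_size must be positive")
    # Count the genes listed in the core enrichment string.
    count = len([g for g in core_enrichment.split("/") if g])
    return count / set_size


# Hypothetical gene-set of 12 genes, 3 of which are core enrichment genes.
ratio = gene_ratio("TP53/MDM2/CDKN1A", 12)
print(ratio)
```

So GeneRatio is just a proportion between 0 and 1; it says nothing about where in the ranked list the genes sit.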
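The Enrichment Score of GSEA is quite different: it is not a proportion but a running-sum statistic computed over the whole ranked gene list. The calculation is elaborated in the published manuscript (Subramanian et al., 2005, PNAS): Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles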
Step 1: Calculation of an Enrichment Score. We calculate an enrichment score (ES) that reflects the degree to which a set S is overrepresented at the extremes (top or bottom) of the entire ranked list L. The score is calculated by walking down the list L, increasing a running-sum statistic when we encounter a gene in S and decreasing it when we encounter genes not in S. The magnitude of the increment depends on the correlation of the gene with the phenotype. The enrichment score is the maximum deviation from zero encountered in the random walk; it corresponds to a weighted Kolmogorov–Smirnov-like statistic (ref. 7 and Fig. 1B).
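The random walk described above can be sketched in a few lines. This is a minimal Python illustration, not the code used by clusterProfiler or the GSEA software. It assumes the genes are already sorted by their ranking metric, that the increment for a gene in S is its absolute score raised to a power p (normalised over the set), and that each gene not in S decrements the sum by an equal amount. The function name, the parameter p and the toy data are assumptions for illustration.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Sketch of a weighted Kolmogorov-Smirnov-like running sum.

    ranked_genes: gene IDs sorted by ranking metric (descending).
    scores:       ranking metric for each gene, in the same order.
    gene_set:     set of gene IDs (the set S).
    p:            weight exponent (assumed here; p=0 gives equal steps).
    """
    in_set = [g in gene_set for g in ranked_genes]
    # Weight of each hit depends on the gene's correlation with the phenotype.
    hit_weights = [abs(s) ** p if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running = 0.0
    es = 0.0
    for weight, hit in zip(hit_weights, in_set):
        # Step up on a gene in S, step down on a gene not in S.
        running += weight / total_hit if hit else -1.0 / n_miss
        # Keep the maximum deviation from zero, with its sign.
        if abs(running) > abs(es):
            es = running
    return es


# Toy ranked list: a set at the top gives a positive ES,
# a set at the bottom gives a negative ES.
genes = ["A", "B", "C", "D", "E", "F"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
print(enrichment_score(genes, metric, {"A", "B"}))
print(enrichment_score(genes, metric, {"E", "F"}))
```

Unlike GeneRatio, the result depends on the positions of the set's genes in the ranked list, and its sign tells you at which end of the list the set is concentrated.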
Kevin