Dear Gordon,
First of all: thanks a lot for the limma package. It's absolutely great, and I am very grateful to you for having created such a useful piece of software.
What I wanted to ask you is this: I recently came across Adi Tarca's publication, which compares the performance of different gene set analysis methods on 42 microarray datasets. Each dataset contains a comparison of healthy versus disease samples, and the resulting differential expression is used to perform a pathway analysis on KEGG.
The comparison shows that camera is:
- the best method in terms of specificity (are all gene sets that are called significant truly significant?),
- but suboptimal in terms of prioritization (are the most relevant gene sets ranked at the top?),
- and suboptimal in terms of sensitivity (are the truly relevant gene sets called significant?).
I was wondering whether these results are outdated, however. In recent interactions on BioC support (C: Do the outputs of Limma's competitive gene set methods (camera, romer) require a), you acknowledged that the original camera method specifically penalizes those sets that are most likely to be biologically relevant, but you also offered a way to overcome this problem: using camera() with a preset inter-gene correlation of 0.05 (your initial suggestion) or 0.01 (your later suggestion). I guess that if camera had been run with these settings, the results of the comparison by Tarca et al. (2013) would have been much more favourable. Could you comment on this?
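To make clear why I expect the preset value to matter, here is a minimal Python sketch of my understanding; it is not limma's actual code. It assumes that camera inflates the variance of the set statistic by a factor of 1 + (m - 1) * rho for a set of m genes with mean inter-gene correlation rho. The function name camera_vif is hypothetical and only used for illustration.

```python
def camera_vif(set_size: int, inter_gene_cor: float) -> float:
    """Assumed variance inflation factor 1 + (m - 1) * rho for a gene set.

    set_size is the number of genes m in the set; inter_gene_cor is the
    (preset or estimated) mean inter-gene correlation rho.
    """
    if set_size < 1:
        raise ValueError("set_size must be at least 1")
    return 1.0 + (set_size - 1) * inter_gene_cor


# Compare the two preset correlations mentioned above for a few set sizes.
for m in (21, 101, 201):
    for rho in (0.01, 0.05):
        print(f"m={m} rho={rho} VIF={camera_vif(m, rho):.2f}")
```

Under this assumption, a set of 101 genes gets an inflation factor of 6 with a preset correlation of 0.05 but only 2 with 0.01, so the later suggestion would penalize large sets much less.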
Also, Tarca et al. did not include the other two limma methods for gene set testing (romer, mroast). Do you have any idea how they would compare in terms of prioritization, specificity, and sensitivity?
Thanks!