Pathway overrepresentation with KEGGprofile and SPIA
Hi,

Given a selection of gene IDs, pathway overrepresentation can be tested with either:

1. the find_enriched_pathway function (KEGGprofile package), or
2. the spia function (SPIA package), where pNDE gives the overrepresentation p-value.

Both functions work very well. However, starting from the same selection of gene IDs, I get quite different results: when I compare the two top-ten lists, they have only one pathway ID in common. I expected similar results, with small differences at most.

A. If I look inside the functions, the p-value computation seems to be the same.

KEGGprofile:

    pvalue[x] <- phyper(kegg_result_length[x], keggpathway2gene_length[x],
        length(unique(unlist(keggpathway2gene))) - kegg_result_length[x],
        length(unique(unlist(kegg_result))), lower.tail = F)

SPIA:

    ph[i] <- phyper(q = noMy - 1, m = pSize[i], n = length(all) - pSize[i],
        k = length(de), lower.tail = FALSE)

Hence, the p-value computation seems to be the same.

B. Is it really the same? Not quite: the reference set against which overrepresentation is computed differs.

KEGGprofile: the reference is based on keggpathway2gene, i.e. the genes annotated to KEGG pathways.
SPIA: the reference is based on "all", i.e. all the IDs present on the chip.

In my case (Illumina HT6 v2), the chip is considered whole-genome ("pangenomic"), so I would expect the two references to be the same.

My question: in your opinion, why is there such a MAJOR difference between the two methods? Currently I report both results, but I need to justify the difference. If the authors of these methods (or anyone else) could give me some explanation, or tell me where I am wrong, I would appreciate it!
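To make point B concrete, here is a minimal sketch with made-up counts (not my data) showing how the size of the reference alone changes the hypergeometric p-value. The helper ora_pvalue and every number in it are hypothetical. It uses the SPIA-style arguments for both cases so that only the reference changes. The size of the selection is held fixed for simplicity, although in the KEGGprofile call quoted above it appears to be the number of selected genes found in any pathway rather than the whole selection.

```r
# Toy over-representation test; all counts below are hypothetical.
# Returns P(X >= hits) when n_selected genes are drawn from a reference of
# universe_size genes, pathway_size of which belong to the pathway.
ora_pvalue <- function(hits, pathway_size, universe_size, n_selected) {
  phyper(q = hits - 1, m = pathway_size, n = universe_size - pathway_size,
         k = n_selected, lower.tail = FALSE)
}

hits         <- 12    # selected genes that fall in the pathway
pathway_size <- 100   # genes annotated to the pathway
n_selected   <- 300   # size of the gene selection

# Reference = genes annotated to at least one KEGG pathway (assumed size)
p_kegg_ref <- ora_pvalue(hits, pathway_size, universe_size = 5000, n_selected)

# Reference = all genes on the chip (assumed size)
p_chip_ref <- ora_pvalue(hits, pathway_size, universe_size = 20000, n_selected)

print(c(kegg_reference = p_kegg_ref, chip_reference = p_chip_ref))
```

With the same 12 hits, the larger reference makes the pathway look much more enriched, so the ranking of pathways can change even if the formula is otherwise identical.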
Greg
Montréal

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=French_Canada.1252  LC_CTYPE=French_Canada.1252
[3] LC_MONETARY=French_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=French_Canada.1252

attached base packages:
[1] grid      tcltk     stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] xtable_1.7-0              GO.db_2.8.0               NbClust_1.2
 [4] gplots_2.11.0             MASS_7.3-17               KernSmooth_2.23-7
 [7] caTools_1.13              bitops_1.0-4.1            gdata_2.12.0
[10] gtools_2.7.0              pls_2.3-0                 Mfuzz_2.16.0
[13] DynDoc_1.36.0             widgetTools_1.36.0        e1071_1.6-1
[16] class_7.3-3               KEGGprofile_1.0.0         KEGG.db_2.8.0
[19] TeachingDemos_2.8         png_0.1-4                 SPIA_2.8.0
[22] KEGGgraph_1.14.0          XML_3.95-0.1              annotate_1.36.0
[25] GOstats_2.24.0            graph_1.36.0              Category_2.24.0
[28] lumiHumanAll.db_1.18.0    org.Hs.eg.db_2.8.0        lumiHumanIDMapping_1.10.0
[31] RSQLite_0.11.2            DBI_0.2-5                 AnnotationDbi_1.20.0
[34] limma_3.14.0              lumi_2.10.0               nleqslv_1.9.4
[37] Biobase_2.18.0            BiocGenerics_0.4.0

loaded via a namespace (and not attached):
 [1] affy_1.36.0           affyio_1.26.0         AnnotationForge_1.0.0
 [4] BiocInstaller_1.8.3   colorspace_1.1-1      genefilter_1.40.0
 [7] GSEABase_1.20.0       IRanges_1.6.17        lattice_0.20-10
[10] Matrix_1.0-6          methylumi_2.4.0       mgcv_1.7-13
[13] nlme_3.1-103          preprocessCore_1.20.0 RBGL_1.34.0
[16] splines_2.15.0        stats4_2.15.0         survival_2.36-12
[19] tkWidgets_1.36.0      tools_2.15.0          zlibbioc_1.4.0
