Question

Pathway overrepresentation with KEGGprofile and SPIA


gregory voisin ▴ 430

@gregory-voisin-945

Last seen 10.3 years ago

Canada

Hi,

Given a selection of gene IDs, there are at least two ways to test for pathway overrepresentation:

1. the find_enriched_pathway function (KEGGprofile package), or
2. the spia function (SPIA package), where pNDE gives the overrepresentation p-value.

Both functions work very well. However, starting from the same selection of gene IDs, I get quite different results: when I compare the two top-ten lists, only one pathway ID is shared. I expected similar results, with perhaps a few differences.

A. Looking inside each function, the p-value computation appears to be the same.

KEGGprofile:

    pvalue[x] <- phyper(kegg_result_length[x], keggpathway2gene_length[x],
                        length(unique(unlist(keggpathway2gene))) - kegg_result_length[x],
                        length(unique(unlist(kegg_result))), lower.tail = F)

SPIA:

    ph[i] <- phyper(q = noMy - 1, m = pSize[i], n = length(all) - pSize[i],
                    k = length(de), lower.tail = FALSE)

Hence, the p-values seem to be computed in the same way.

B. On closer inspection, this is not quite true: the reference set used to compute the overrepresentation differs. In KEGGprofile, the reference is based on keggpathway2gene. In SPIA, the reference is "all", that is, all IDs present on the chip. In my case (Illumina HT6 v2), the chip is considered pangenomic, so the reference should be the same in both cases.

My question: in your opinion, why is there such a MAJOR difference between the two methods? For now I report both results, but I need to justify the difference. If the authors of these methods (or anyone else) could give me an explanation, or tell me where I am wrong, I would appreciate it!
Greg
Montréal

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=French_Canada.1252  LC_CTYPE=French_Canada.1252
[3] LC_MONETARY=French_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=French_Canada.1252

attached base packages:
[1] grid      tcltk     stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] xtable_1.7-0           GO.db_2.8.0            NbClust_1.2
 [4] gplots_2.11.0          MASS_7.3-17            KernSmooth_2.23-7
 [7] caTools_1.13           bitops_1.0-4.1         gdata_2.12.0
[10] gtools_2.7.0           pls_2.3-0              Mfuzz_2.16.0
[13] DynDoc_1.36.0          widgetTools_1.36.0     e1071_1.6-1
[16] class_7.3-3            KEGGprofile_1.0.0      KEGG.db_2.8.0
[19] TeachingDemos_2.8      png_0.1-4              SPIA_2.8.0
[22] KEGGgraph_1.14.0       XML_3.95-0.1           annotate_1.36.0
[25] GOstats_2.24.0         graph_1.36.0           Category_2.24.0
[28] lumiHumanAll.db_1.18.0 org.Hs.eg.db_2.8.0     lumiHumanIDMapping_1.10.0
[31] RSQLite_0.11.2         DBI_0.2-5              AnnotationDbi_1.20.0
[34] limma_3.14.0           lumi_2.10.0            nleqslv_1.9.4
[37] Biobase_2.18.0         BiocGenerics_0.4.0

loaded via a namespace (and not attached):
 [1] affy_1.36.0           affyio_1.26.0         AnnotationForge_1.0.0
 [4] BiocInstaller_1.8.3   colorspace_1.1-1      genefilter_1.40.0
 [7] GSEABase_1.20.0       IRanges_1.6.17        lattice_0.20-10
[10] Matrix_1.0-6          methylumi_2.4.0       mgcv_1.7-13
[13] nlme_3.1-103          preprocessCore_1.20.0 RBGL_1.34.0
[16] splines_2.15.0        stats4_2.15.0         survival_2.36-12
[19] tkWidgets_1.36.0      tools_2.15.0          zlibbioc_1.4.0
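To make the comparison in points A and B concrete, the two phyper calls quoted in the question can be run side by side on toy numbers. This is only a minimal sketch: every count below is made up for illustration, the variable names are hypothetical, and it assumes that kegg_result holds the selected genes found in each pathway (so that length(unique(unlist(kegg_result))) is the number of selected genes annotated to any KEGG pathway). Read this way, the two calls differ in the universe size, in the number of draws, and in whether 1 is subtracted from the observed overlap.

```r
# Hypothetical counts, for illustration only (none come from the post).
n_chip    <- 20000  # all IDs on the array (SPIA's `all`)
n_kegg    <- 5000   # genes annotated to at least one KEGG pathway
path_size <- 100    # genes in the pathway being tested
n_de      <- 500    # selected genes on the array (SPIA's `de`)
n_de_kegg <- 200    # selected genes that map to some KEGG pathway (assumed meaning)
overlap   <- 10     # selected genes that fall in the pathway

# SPIA-style call as quoted: P(X >= overlap), universe = whole array.
p_spia <- phyper(q = overlap - 1, m = path_size, n = n_chip - path_size,
                 k = n_de, lower.tail = FALSE)

# KEGGprofile-style call as quoted: P(X > overlap), universe = KEGG genes.
# Note that the third argument subtracts the overlap, not the pathway size.
p_keggprofile <- phyper(overlap, path_size, n_kegg - overlap,
                        n_de_kegg, lower.tail = FALSE)

# Isolate the effect of the "- 1": same universe, only the inequality changes.
p_ge <- phyper(overlap - 1, path_size, n_chip - path_size, n_de, lower.tail = FALSE)
p_gt <- phyper(overlap,     path_size, n_chip - path_size, n_de, lower.tail = FALSE)

print(c(spia = p_spia, keggprofile = p_keggprofile, ge = p_ge, gt = p_gt))
```

With lower.tail = FALSE, phyper(q, ...) returns P(X > q), so passing overlap - 1 gives the probability of an overlap at least as large as the one observed, while passing overlap excludes the observed value itself. Differences of this kind can reorder a ranked pathway list, but whether they fully account for the discrepancy seen here would need to be checked against the real data.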

SPIA SPIA • 1.1k views

12.3 years ago gregory voisin ▴ 430