Hi,
Given a selection of gene IDs, to find over-represented pathways
we can use either: 1. the find_enriched_pathway function (KEGGprofile
package) or 2. spia (SPIA package), where pNDE gives the
over-representation p-value. Both functions work very well. However,
for the same selection of gene IDs, I get quite different results.
If I compare the two top-ten lists, only one pathway ID is shared. I
expected similar results with only minor differences!
A. If I look inside the functions at how the p-value is computed, the
calculation seems the same:
KEGGprofile:
pvalue[x] <- phyper(kegg_result_length[x], keggpathway2gene_length[x],
                    length(unique(unlist(keggpathway2gene))) - kegg_result_length[x],
                    length(unique(unlist(kegg_result))), lower.tail = F)
And SPIA:
ph[i] <- phyper(q = noMy - 1, m = pSize[i], n = length(all) - pSize[i],
                k = length(de), lower.tail = FALSE)
Hence, the p-value computation seems the same.
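A quick way to check this is to evaluate both calls on the same toy counts. The sketch below mirrors the argument layout of the two snippets exactly as quoted above; the counts N, m, k and q are made up for illustration and do not come from my data. Reading the snippets literally, the first passes q and N - q while the second passes q - 1 and N - m. I have not checked whether the surrounding package code compensates for this.

```r
# Toy counts, purely illustrative (not from real data)
N <- 5000   # genes in the reference set
m <- 100    # genes annotated to the pathway
k <- 200    # selected genes
q <- 10     # selected genes that fall in the pathway

# Argument layout as in the KEGGprofile snippet quoted above:
# q is passed as-is and the third argument is N - q
p_keggprofile_style <- phyper(q, m, N - q, k, lower.tail = FALSE)

# Argument layout as in the SPIA snippet quoted above:
# q - 1 is passed and the third argument is N - m
p_spia_style <- phyper(q - 1, m, N - m, k, lower.tail = FALSE)

# With lower.tail = FALSE, phyper(q, ...) returns P(X > q),
# so phyper(q - 1, ...) returns P(X >= q)
print(c(keggprofile_style = p_keggprofile_style,
        spia_style = p_spia_style))
```

If the two printed values differ on toy counts like these, part of the disagreement between the rankings could come from the calls themselves and not only from the reference set.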
B. The p-value computation seems the same, but not quite: the reference
set used to compute the over-representation differs.
KEGGprofile:
the reference is based on keggpathway2gene
And SPIA:
the reference is based on "all", i.e. all the IDs present on the chip.
In my case (Illumina HT6 v2), the chip is considered pangenomic.
Hence, the reference should be the same in this case.
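To get a feel for how much the reference set alone can matter, here is a small sketch with made-up counts: the same pathway and the same overlap, tested once against a smaller reference (for example, only genes annotated to at least one pathway) and once against a larger one (for example, every gene on the chip). The helper ora_pvalue and all the sizes are hypothetical and are not taken from either package.

```r
# Over-representation p-value P(X >= q) for one pathway.
# q: selected genes in the pathway, m: pathway size,
# N: size of the reference set, k: selected genes inside the reference set
ora_pvalue <- function(q, m, N, k) {
  phyper(q - 1, m, N - m, k, lower.tail = FALSE)
}

# Hypothetical pathway: 100 genes, 10 of them selected
q <- 10
m <- 100

# Reference = genes annotated to at least one pathway (hypothetical sizes)
p_annotated <- ora_pvalue(q, m, N = 5000, k = 200)

# Reference = all genes on the chip (hypothetical sizes)
p_chip <- ora_pvalue(q, m, N = 20000, k = 600)

print(c(annotated_reference = p_annotated, chip_reference = p_chip))
```

With these toy sizes the same overlap gets a different p-value under each reference, so the pathway ranking can change even when the test itself is identical.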
My question:
In your opinion,
why is there such a MAJOR difference between these two methods?
For now I report both results, but I need to justify the
difference.
If the authors of these methods (or others) could give me some
explanations, or explain to me where I am wrong, I would appreciate it!
Greg Montréal
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=French_Canada.1252 LC_CTYPE=French_Canada.1252
[3] LC_MONETARY=French_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=French_Canada.1252
attached base packages:
[1] grid      tcltk     stats     graphics  grDevices utils     datasets  methods
[9] base
other attached packages:
[1] xtable_1.7-0 GO.db_2.8.0 NbClust_1.2
[4] gplots_2.11.0 MASS_7.3-17 KernSmooth_2.23-7
[7] caTools_1.13 bitops_1.0-4.1 gdata_2.12.0
[10] gtools_2.7.0 pls_2.3-0 Mfuzz_2.16.0
[13] DynDoc_1.36.0 widgetTools_1.36.0 e1071_1.6-1
[16] class_7.3-3 KEGGprofile_1.0.0 KEGG.db_2.8.0
[19] TeachingDemos_2.8 png_0.1-4 SPIA_2.8.0
[22] KEGGgraph_1.14.0 XML_3.95-0.1 annotate_1.36.0
[25] GOstats_2.24.0 graph_1.36.0 Category_2.24.0
[28] lumiHumanAll.db_1.18.0 org.Hs.eg.db_2.8.0 lumiHumanIDMapping_1.10.0
[31] RSQLite_0.11.2 DBI_0.2-5 AnnotationDbi_1.20.0
[34] limma_3.14.0 lumi_2.10.0 nleqslv_1.9.4
[37] Biobase_2.18.0 BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] affy_1.36.0 affyio_1.26.0 AnnotationForge_1.0.0
[4] BiocInstaller_1.8.3 colorspace_1.1-1 genefilter_1.40.0
[7] GSEABase_1.20.0 IRanges_1.6.17 lattice_0.20-10
[10] Matrix_1.0-6 methylumi_2.4.0 mgcv_1.7-13
[13] nlme_3.1-103 preprocessCore_1.20.0 RBGL_1.34.0
[16] splines_2.15.0 stats4_2.15.0 survival_2.36-12
[19] tkWidgets_1.36.0 tools_2.15.0 zlibbioc_1.4.0