Massimo Pinto
Greetings all,
Having first searched the GMane archives, I believe the following
question is appropriate. After selecting my 'entrezUniverse', I ran a
hypergeometric test, as implemented in the functions provided by
GOstats, and obtained a readable, hyperlinked report listing the
ontology nodes that appear to be significantly implicated, along with
their p values, odds ratios, the number of significantly regulated
genes that fall in each listed node, etc.
The report is not exactly short, and I am looking for criteria to
proceed with the interpretation of the results. Specifically, I am
trying to hunt for the most 'interesting' implicated ontology nodes
and, to this end, a marker may be useful. Assuming this line of
thinking is appropriate and focusing on the first few lines of the
report:
> GO.df.CM3.ctr1.2.3
      GOBPID       Pvalue OddsRatio   ExpCount Count Size Term
1 GO:0040011 9.322848e-05  2.558205 11.8928490    26  145 locomotion
2 GO:0002376 2.337660e-04  1.887324 28.2147590    47  344 immune system process
3 GO:0007165 2.821193e-04  1.541496 82.4297464   110 1005 signal transduction
4 GO:0006954 2.840421e-04  2.892962  7.3817683    18   90 inflammatory response
5 GO:0051272 4.985200e-04  6.638731  1.5583733     7   19 positive regulation of cell motion
6 GO:0007154 5.866973e-04  1.493138 88.4992004   115 1079 cell communication
[...]
I wonder whether the right marker for my hunt is the p value or the
odds ratio, which would rank my list differently. In addition, the
ontology nodes containing the largest number of genes (Size, above)
may be too broad in scope to reveal a biological process that is
specifically implicated in my experiment. By the same token, ontology
nodes with too few genes may not provide convincing evidence of their
implication.
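
To make the two rankings and the size concern concrete, here is a
minimal sketch in base R. It re-enters by hand the six rows shown
above, so it does not need GOstats. The size bounds (min.size and
max.size) are hypothetical values that I picked only for illustration;
they are not recommended cut-offs.

```r
# The six report rows shown above, re-entered by hand so that the
# sketch is self-contained.
go <- data.frame(
  GOBPID    = c("GO:0040011", "GO:0002376", "GO:0007165",
                "GO:0006954", "GO:0051272", "GO:0007154"),
  Pvalue    = c(9.322848e-05, 2.337660e-04, 2.821193e-04,
                2.840421e-04, 4.985200e-04, 5.866973e-04),
  OddsRatio = c(2.558205, 1.887324, 1.541496,
                2.892962, 6.638731, 1.493138),
  ExpCount  = c(11.8928490, 28.2147590, 82.4297464,
                7.3817683, 1.5583733, 88.4992004),
  Count     = c(26, 47, 110, 18, 7, 115),
  Size      = c(145, 344, 1005, 90, 19, 1079),
  Term      = c("locomotion", "immune system process",
                "signal transduction", "inflammatory response",
                "positive regulation of cell motion",
                "cell communication"),
  stringsAsFactors = FALSE
)

# Ranking 1: smallest p value first (the order of the report itself).
by.p <- go[order(go$Pvalue), ]

# Ranking 2: largest odds ratio first.
by.or <- go[order(go$OddsRatio, decreasing = TRUE), ]

# Hypothetical size bounds (assumptions, chosen for illustration only):
# drop nodes that are very small or very broad, then rank what is left
# by odds ratio.
min.size <- 20
max.size <- 500
kept <- go[go$Size >= min.size & go$Size <= max.size, ]
kept.by.or <- kept[order(kept$OddsRatio, decreasing = TRUE), ]

print(by.p[, c("GOBPID", "Pvalue", "Term")])
print(by.or[, c("GOBPID", "OddsRatio", "Size", "Term")])
print(kept.by.or[, c("GOBPID", "OddsRatio", "Size", "Term")])
```

With these six rows, the odds-ratio ranking puts GO:0051272 (Size 19)
first, whereas the p value ranking puts GO:0040011 first. With the
bounds above, GO:0051272 and the two nodes with more than 1000 genes
are dropped before ranking.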
In short, what is the suggested strategy to proceed?
Thank you very much in advance to all of you who will read this post.
Yours
Massimo