Hi - I have some relatively small gene lists (around 10-20 significant genes, padj < 0.05) and tried goseq to look for over-represented GO terms and KEGG pathways. I also ran the 'sampling' method as a negative control, but this gave very similar results to the real test (similar p-values and terms):
> head(GO.samp.MF.LowvNon)  # this is the sampling method control
       category over_represented_pvalue under_represented_pvalue numDEInCat numInCat
1769 GO:0016362             0.001998002                        1          1        2
2627 GO:0034711             0.001998002                        1          1        3
1377 GO:0008466             0.003996004                        1          1        1
2009 GO:0017002             0.003996004                        1          1        7
2762 GO:0038023             0.003996004                        1          4      579
3172 GO:0048185             0.005994006                        1          1       11
                                        term ontology
1769      activin receptor activity, type II       MF
2627                         inhibin binding       MF
1377 glycogenin glucosyltransferase activity       MF
2009     activin-activated receptor activity       MF
2762             signaling receptor activity       MF
3172                         activin binding       MF

> head(GO.MF.LowvNon)  # this is the real test
       category over_represented_pvalue under_represented_pvalue numDEInCat numInCat
1377 GO:0008466             0.001197658                1.0000000          1        1
1769 GO:0016362             0.002340668                0.9999987          1        2
2627 GO:0034711             0.003514714                0.9999962          1        3
18   GO:0000155             0.003516708                0.9999962          1        3
2762 GO:0038023             0.003856110                0.9996154          4      579
730  GO:0004673             0.004728336                0.9999922          1        4
                                        term ontology
1377 glycogenin glucosyltransferase activity       MF
1769      activin receptor activity, type II       MF
2627                         inhibin binding       MF
18       phosphorelay sensor kinase activity       MF
2762             signaling receptor activity       MF
730        protein histidine kinase activity       MF
My question is this: is this likely to be due to putting too few genes into the analysis?
I think my code is OK, as I've done this before with larger lists and got some good p-values for the real test, while the sampling p-values were close to 1.
Cheers for any insight.
matt
...I think I'd got the wrong end of the stick with this: method="Sampling" just means the null distribution is estimated by random sampling instead of with the Wallenius approximation, so it is an alternative way of running the same test. For some reason I thought it was a background analysis or negative control to compare the real thing to...
...it must be getting late...
matt
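To illustrate why the two runs agree, here is a toy sketch in base R. It is not goseq's implementation: it ignores length bias, so the analytic null is a plain hypergeometric rather than Wallenius, and the universe size, DE list size and number of resamples are made-up assumptions. Only the category counts (4 of 579, as for GO:0038023) come from the output above. The analytic and the sampled p-value both estimate the same quantity, so they should come out similar.

```r
set.seed(1)

# Assumed values, not taken from the post
n_genes <- 20000   # size of the gene universe
n_de    <- 15      # number of DE genes (the post says roughly 10-20)
repcnt  <- 2000    # number of random resamples

# Taken from the GO:0038023 row in the output above
n_in_cat <- 579    # genes annotated to the category
n_de_cat <- 4      # DE genes annotated to the category

# Analytic null: P(X >= n_de_cat) under the hypergeometric distribution
p_analytic <- phyper(n_de_cat - 1, n_in_cat, n_genes - n_in_cat, n_de,
                     lower.tail = FALSE)

# Sampling null: draw random "DE" sets of the same size and count
# how many of their genes fall in the category
in_cat <- c(rep(TRUE, n_in_cat), rep(FALSE, n_genes - n_in_cat))
hits <- replicate(repcnt, sum(in_cat[sample.int(n_genes, n_de)]))

# Empirical p-value with the usual +1 correction (an assumed convention here)
p_sampling <- (sum(hits >= n_de_cat) + 1) / (repcnt + 1)

print(c(analytic = p_analytic, sampling = p_sampling))
```

With an estimate of this form the sampled p-value can only take values k / (repcnt + 1), so it is coarser than the analytic one. The sampling p-values in the output above are very close to 2/1001, 4/1001 and 6/1001, which would be consistent with about 1000 resamples, though that is an inference rather than something stated in the post.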