Joern Toedling
Dear all,
I would appreciate any suggestions on the following issue. I have
noticed a major inconsistency between new and older topGO results.
For the older ones, topGO used the "GO" package, while it uses "GO.db"
for the new results. I can't figure out whether it is a problem with
topGO only or whether there are some serious inconsistencies between
GO and GO.db.
Here is the source code I used:
library("topGO")
## load list of genes of interest
load("brainOnlyGenes.RData")
## load general gene-to-GO mapping and universe of genes to use in
## analysis:
load("mm9gene2GO.RData")
load("arrayGenesWithGO.RData")
## then the function to call topGO and to return a nice result table:
sigGOTable <- function(selGenes, GOgenes=arrayGenesWithGO,
                       gene2GO=mm9.gene2GO[arrayGenesWithGO],
                       ontology="BP", maxP=0.001)
{
  ## factor with levels 0/1 marking which universe genes are selected
  inGenes <- factor(as.integer(GOgenes %in% selGenes))
  names(inGenes) <- GOgenes
  GOdata <- new("topGOdata", ontology=ontology, allGenes=inGenes,
                annot=annFUN.gene2GO, gene2GO=gene2GO)
  myTestStat <- new("elimCount", testStatistic=GOFisherTest,
                    name="Fisher test", cutOff=maxP)
  mySigGroups <- getSigGroups(GOdata, myTestStat)
  sTab <- GenTable(GOdata, mySigGroups,
                   topNodes=length(usedGO(GOdata)))
  ## the last column holds the p-values; give it a fixed name
  names(sTab)[length(sTab)] <- "p.value"
  return(subset(sTab, as.numeric(p.value) < maxP))
}
## call it:
(brainRes <- sigGOTable(brainOnlyGenes))
# with topGO_1.4.0 using GO_2.0.1
# this is:
#        GO.ID                  Term Annotated Significant Expected p.value
# 1 GO:0007268 synaptic transmission       136          44    24.46 3.0e-05
# 2 GO:0007610              behavior       180          54    32.38 4.4e-05
# 3 GO:0007409          axonogenesis       119          38    21.41 0.00014
# 4 GO:0006887            exocytosis        40          17     7.20 0.00026
# 5 GO:0007420     brain development       136          40    24.46 0.00066
# which kind of makes sense as an annotation for a list of
# interesting genes when investigating brain cells
## now unfortunately, using the same gene list, universe and
## gene-to-GO mapping, and the same function as above
## with topGO_1.9.0 using GO.db_2.2.0, the result is:
#        GO.ID                                Term Annotated Significant Expected p.value
# 1 GO:0007268    mitochondrial genome maintenance       137          44    24.65 3.7e-05
# 2 GO:0007610                        reproduction       180          54    32.39 4.4e-05
# 3 GO:0007409          single strand break repair       119          38    21.41 0.00014
# 4 GO:0006887     regulation of DNA recombination        40          17     7.20 0.00026
# 5 GO:0007420 regulation of mitotic recombination       136          40    24.47 0.00066
# which is obviously very, very different
Does anyone have an educated guess as to what is going on? Could it
be a bug in topGO? Or is the information in GO.db really different
from that in GO, and in that case which one is the right one?
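One check that might help separate the two possibilities is to look up the same GO IDs directly in GO.db, bypassing topGO. The following is only a minimal sketch: it assumes GO.db is installed and that `mget()` on `GOTERM` and the `Term()` accessor behave as in current AnnotationDbi releases, which is untested against GO.db_2.2.0. The `ids` vector simply repeats the five GO IDs from the tables above.

```r
## GO IDs taken from the two result tables above
ids <- c("GO:0007268", "GO:0007610", "GO:0007409",
         "GO:0006887", "GO:0007420")

## Only attempt the lookup if GO.db is available (assumption: installed
## from Bioconductor; attaching it also attaches AnnotationDbi)
if (requireNamespace("GO.db", quietly = TRUE)) {
  library("GO.db")
  ## fetch the GOTerms objects for each ID; unknown IDs come back as NA
  goObjs <- mget(ids, GOTERM, ifnotfound = NA)
  ## extract the term name, guarding against IDs missing from GO.db
  terms <- sapply(goObjs, function(x) {
    if (is(x, "GOTerms")) Term(x) else NA_character_
  })
  print(data.frame(GO.ID = ids, Term = unname(terms)))
}
```

If the terms printed here resemble the older table, the mismatch would presumably arise where topGO attaches term names to IDs rather than in GO.db itself.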
Best regards,
Joern