There are two things going on here. First, the Gene Ontology (GO) is set up as a directed acyclic graph, which simply means that we start at the highest, most general level (e.g., Biological Process), and then keep splitting into more specific terms. By default, since all of the lower GO terms are more specific than the higher GO terms, all the genes that are appended to a given GO term are also appended to all of the ancestor terms, all the way up to Biological Process. So as a (fake) example, if you have a term 'Protein phosphorylation' and a term 'Phosphorylation of Map Kinases', then all of the genes that are appended to the latter are also appended to the former, because Protein Phosphorylation is a superset of the lower term.
That's a roundabout way of getting to my point, which is that the GO keytype only gives you the genes that are appended to a particular GO term, not its descendant terms. For that you need the GOALL keytype.
Second, you are starting with Ensembl IDs, but are using the org.Hs.eg.db package (which is based on NCBI annotations) to do the mappings. By default that means you are mapping Ensembl IDs to Entrez Gene IDs, which is not a trivial process. Well, it's a trivial thing to do, but NCBI and EBI/EMBL are two different groups that are trying to do the same thing; there are assumptions and decisions made by each group that tend to cause their annotations to diverge more than you might think.
ANYWAY, why does goseq say there are 922 genes when you think there should only be 139? First, let's map the GO term to all Entrez Gene IDs:
> z <- select(org.Hs.eg.db, "GO:0000278", "ENTREZID", "GOALL")
'select()' returned 1:many mapping between keys and columns
> head(z)
GOALL EVIDENCEALL ONTOLOGYALL ENTREZID
1 GO:0000278 IEA BP 25
2 GO:0000278 TAS BP 25
3 GO:0000278 IMP BP 90
4 GO:0000278 IDA BP 91
5 GO:0000278 IDA BP 199
6 GO:0000278 IDA BP 207
> dim(z)
[1] 1340 4
So 1340 Entrez Gene IDs per GO term. But look at the first two rows! That's the same Gene ID twice!
> length(unique(z[,4]))
[1] 1056
So that's closer. But not quite. Let's map to Ensembl IDs
> zz <- select(org.Hs.eg.db, unique(as.character(z$ENTREZID)), "ENSEMBL")
'select()' returned 1:many mapping between keys and columns
> head(zz)
ENTREZID ENSEMBL
1 25 ENSG00000097007
2 90 ENSG00000115170
3 91 ENSG00000135503
4 199 ENSG00000204472
5 199 ENSG00000237727
6 199 ENSG00000206428
> length(unique(zz[,2]))
[1] 1180
Huh. Still not 922. But you only measured a certain set of genes!
> sum(unique(zz$ENSEMBL) %in% names(genes))
[1] 922
Et voila! There are 922 Ensembl genes that are in your set of genes, that map to that GO term or any of its descendants, that also map to any Entrez Gene ID.
Just beautiful.I didn't know there's a GOALL keytype.