I am trying to get all genes associated with a GO term using biomaRt, but I cannot reproduce the gene lists reported by other services.
Example
Take a small term with 5 genes: GO:0018103 in AmiGO or QuickGO.
Prepare marts
library(biomaRt)

# GRCh37 archive
ensembl37 = useEnsembl("ensembl", dataset = "hsapiens_gene_ensembl",
                       host = "https://grch37.ensembl.org")
# Current release (GRCh38)
ensembl38 = useEnsembl("ensembl", dataset = "hsapiens_gene_ensembl",
                       host = "https://ensembl.org")
Find genes
I find 1 gene using https://grch37.ensembl.org
# assign the result so that gene.data actually holds the query output
gene.data = getBM(attributes = c('hgnc_symbol'),
                  filters = 'go_parent_term', values = 'GO:0018103', mart = ensembl37, verbose = FALSE); gene.data
> hgnc_symbol
> 1 DPM3
I get an error when using https://ensembl.org
# same query against the current release
gene.data = getBM(attributes = c('hgnc_symbol'),
                  filters = 'go_parent_term', values = 'GO:0018103', mart = ensembl38, verbose = FALSE); gene.data
> NULL
> Error in .processResults(postRes, mart = mart, sep = sep, fullXmlQuery = fullXmlQuery, :
> The query to the BioMart webservice returned an invalid result.
> The number of columns in the result table does not equal the number of attributes in the query.
> Please report this on the support site at http://support.bioconductor.org
My guesses:
- ensembl37 simply does not have the other genes (or maybe it only includes annotations with certain evidence codes)
- in ensembl38 something changed in the query syntax. I went through the posts I could find via Google, Bioconductor, and Biostars, but can't seem to find the solution.
Ideally I would get all 5 genes with the latest annotation, but results from ensembl37 would also be fine.
I appreciate your help! Thanks, A
PS.
> R version 4.0.2 (2020-06-22)
> package.version("biomaRt")
[1] "2.45.9"
But then again, you should upgrade.
Thanks James!
For grch38 I can reproduce your call.
The problem was that I specified the host argument.
For grch37 I still get only 1 gene.
So it may be a version problem? I'm not sure which package would be out of date. I am cautious about updating: I once spent some frustrating days figuring out that a third-level dependency had changed a default argument, so I kept getting different results without a single warning...
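For anyone landing here later, this is a minimal sketch of the setup without a hand-written host. It assumes the installed biomaRt accepts the GRCh argument of useEnsembl() for the GRCh37 archive; connect_marts() and query_go() are hypothetical helper names used only for illustration, and whether GRCh37 then returns more than one gene is not something this sketch can promise.

```r
# Sketch only: needs the biomaRt package and network access when the helpers are called.

# Hypothetical helper: connect to both assemblies without passing `host`.
connect_marts <- function(dataset = "hsapiens_gene_ensembl") {
  list(
    # Current assembly (GRCh38): rely on the default Ensembl host.
    grch38 = biomaRt::useEnsembl(biomart = "ensembl", dataset = dataset),
    # GRCh37 archive: selected through the GRCh argument instead of a URL.
    grch37 = biomaRt::useEnsembl(biomart = "ensembl", dataset = dataset, GRCh = 37)
  )
}

# Hypothetical helper: the query from the question, returning the result table.
query_go <- function(mart, go_id = "GO:0018103") {
  biomaRt::getBM(attributes = "hgnc_symbol",
                 filters = "go_parent_term",
                 values = go_id,
                 mart = mart)
}

# Usage (not run here, because it contacts the Ensembl servers):
# marts <- connect_marts()
# query_go(marts$grch38)
# query_go(marts$grch37)
```

Keeping the query in one function makes it easier to confirm that both marts are asked exactly the same thing.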
Yeah, I think it's a version thing.
This shouldn't be a biomaRt version thing, so that's not good! It's supposed to just be an interface to the Ensembl server, and if you're querying https://grch37.ensembl.org both times, it should at least be receiving the same data back. I'll take a look at what might have happened here, because I don't remember intentionally introducing any changes that would manifest like this.
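To compare results across machines, it helps to post the exact versions next to the query output. A minimal sketch, assuming only base R and utils; report_versions() is a hypothetical helper name, not part of biomaRt:

```r
# Hypothetical helper: collect the R version and the installed version of each package.
report_versions <- function(pkgs = c("biomaRt")) {
  installed <- vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_  # package is not installed
    }
  }, character(1))
  list(r = R.version.string, packages = installed)
}

print(report_versions())
```

Running sessionInfo() after loading biomaRt gives the same information in more detail, including the versions of attached dependencies.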