biomaRt scan error for latest assembly when specifying a version number
mullpaul • 0
@mullpaul-15059
Rockefeller University

I am running biomaRt version 2.35.11 and I get a scan error when I try to access the most current version of a dataset using a version number (version 91). Removing the version accesses the latest dataset, but is it possible to indicate the latest version so that running the code in the future will always access this dataset?

> mart_obj <- useEnsembl(biomart="ensembl", dataset='mmusculus_gene_ensembl', version=91)
> mart_obj@host
[1] "http://dec2017.archive.ensembl.org:80/biomart/martservice"
> df <- getBM(attributes = c('ensembl_transcript_id', 'ensembl_gene_id', 'external_gene_name', 'chromosome_name', 'description'),
+             mart       = mart_obj)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
  line 1 did not have 3 elements

It looks like http://dec2017.archive.ensembl.org redirects you to http://www.ensembl.org, which I'm pretty sure was not the behaviour before the latest release. I've definitely recommended using the archive URL for the current release for exactly the reason you state.

This is annoying, as I recently turned off following redirection in biomaRt so that queries end up where they're requested to be sent, but this breaks your query. I'll have a look at the code and see if there's an obvious workaround, but the most straightforward solution would be for Ensembl to let you use the dec2017 URL directly, rather than forcing the redirection.


Thanks a lot Mike! Not the end of the world if I just have to wait for 91 to get a true archive page, but agreed that this is pretty annoying.

Mike Smith ★ 6.6k
@mike-smith
EMBL Heidelberg

This should now be fixed in biomaRt version 2.35.12.  If you either provide the current version number to useEnsembl(), or the equivalent URL to useMart(), this will be detected and adjusted appropriately. e.g.

> library(biomaRt)
> 
> mart_obj <- useEnsembl(biomart="ensembl", dataset='mmusculus_gene_ensembl', version=91)
Note: requested host was redirected from
http://dec2017.archive.ensembl.org to http://www.ensembl.org:80/biomart/martservice
This often occurs when connecting to the archive URL for the current Ensembl release
You can check the current version number using listEnsemblArchives()
> 
> df <- getBM(attributes = c('ensembl_transcript_id', 'ensembl_gene_id'),
+      mart       = mart_obj)
>
> dim(df)
[1] 135075      2

This uses the list of archives present at https://www.ensembl.org/info/website/archives/index.html to determine the current release. When the next Ensembl release comes out biomaRt will no longer redirect your URL and will stick with the specified archive version - so your results should stay stable over time.
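If you want to see which release biomaRt currently treats as the latest, and what the archive host for a given version is, something along these lines may help. It is only a sketch: it assumes the table returned by listEnsemblArchives() has `version` and `url` columns (the layout may differ between biomaRt versions), that the archive for release 91 is still being served, and that your biomaRt version accepts the archive URL as the `host` argument. It needs a network connection to run.

```r
library(biomaRt)

# Table of Ensembl archives that biomaRt reads from the Ensembl website.
archives <- listEnsemblArchives()
print(head(archives))

# Release to pin, as in the question above.
pinned_version <- 91

# Assumption: the table has `version` and `url` columns.
archive_url <- archives$url[archives$version == as.character(pinned_version)]
stopifnot(length(archive_url) == 1)

# Option 1: pin by version number.
mart_by_version <- useEnsembl(biomart = "ensembl",
                              dataset = "mmusculus_gene_ensembl",
                              version = pinned_version)

# Option 2: pin by the equivalent archive URL.
mart_by_host <- useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                        dataset = "mmusculus_gene_ensembl",
                        host    = archive_url)

df <- getBM(attributes = c("ensembl_transcript_id", "ensembl_gene_id"),
            mart       = mart_by_version)
dim(df)
```

Either mart object should point at the same release, so a script that hard-codes the version (or the archive URL) keeps querying that release after newer ones come out.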

Let me know if this throws up any issues.


thanks Mike, this works perfectly now!
