Dear all
The new Ensembl marts for release 88 are now live on www.ensembl.org.
If you are using biomaRt, you can change your host to access our most recent data:
ensembl_mart_88 <- useEnsembl(biomart = "ensembl")
Change affecting all marts: the species dropdown now displays the species common name instead of the species Latin name
Ensembl Genes 88
Region filter performance improvement
Renamed the internal and display names of some filters and attributes
Renamed attribute "% GC content" to "Gene % GC content"
Mouse Genes 88
Region filter performance improvement
Ensembl Variation 88
Region filter performance improvement
Ensembl Regulation 88
Region filter performance improvement
Vega 68
You can find the complete list of the changes at http://www.ensembl.org/info/website/news.html
Important: Please note that this release we have moved to a new system for generating and populating the filters and attributes of the Ensembl marts. As a result, the marts are now more consistent between the vertebrate division and the other Ensembl divisions. The following filter and attribute names have changed, and this will affect scripts that use the biomaRt package.
Please find the full list below or on our FTP site: ftp://ftp.ensembl.org/pub/release-88/release_88_biomart_changes.txt
1) External references
entrezgene_transcript_name -> entrezgene_trans_name
hgnc_transcript_name -> hgnc_trans_name
rfam_transcript_name -> rfam_trans_name
mirbase_transcript_name -> mirbase_trans_name
uniprot_genename -> uniprot_gn
uniprot_sptrembl -> uniprotsptrembl
uniprot_swissprot -> uniprotswissprot
go_id -> go
goslim_goa_accession -> goslim_goa
zfin_transcript_name -> zfin_id_trans_name
wormbase_gene_seq_name -> wormbase_gseqname
genome_rnai -> genomernai
go_to_gene_id -> go_to_gene
mgi_transcript_name -> mgi_trans_name
clone_based_ensembl_gene_name -> clone_based_ensembl_gene
clone_based_ensembl_transcript_name -> clone_based_ensembl_transcript
clone_based_vega_gene_name -> clone_based_vega_gene
clone_based_vega_transcript_name -> clone_based_vega_transcript
vgnc_genename -> vgnc
rgd_transcript_name -> rgd_trans_name
xenbase_transcript_name -> xenbase_trans_name
zfin_id -> zfin_id_id
2) Microarray probes/probesets
efg_agilent_* -> agilent_* (e.g. efg_agilent_012795 -> agilent_012795)
efg_nimblegen_gpl8673 -> nimblegen_gpl8673
efg_slri_gpl3518 -> slri_gpl3518
efg_ucsf_gpl9450 -> ucsf_gpl9450
efg_wustl_wustl_c_elegans -> wustl_wustl_c_elegans
leiden_leiden2 -> spaink_lab_leiden_leiden2
leiden_leiden3 -> spaink_lab_leiden_leiden3
codelink -> codelink_codelink
illumina_human_methylation_27 -> illumina_humanmethylation27
illumina_human_methylation_450 -> illumina_humanmethylation450
affy_xtropicalis -> affy_x_tropicalis
3) Protein domains and protein features
low_complexity -> seg
low_complexity_end -> seg_end
low_complexity_start -> seg_start
profile -> pfscan
profile_end -> pfscan_end
profile_start -> pfscan_start
signal_domain -> signalp
signal_domain_end -> signalp_end
signal_domain_start -> signalp_start
transmembrane_domain -> tmhmm
transmembrane_domain_end -> tmhmm_end
transmembrane_domain_start -> tmhmm_start
prosite -> scanprosite
prosite_end -> scanprosite_end
prosite_start -> scanprosite_start
4) Other
percentage_gc_content -> percentage_gene_gc_content
so_parent_name -> so_mini_parent_name
Please make sure to update your scripts and commands.
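One possible way to carry the renames above into an existing R script is a small lookup table. The sketch below is only an illustration: update_mart_names() and renamed_88 are hypothetical names, not part of biomaRt, and the table holds just a handful of the renames listed above, so it would need extending with the entries your own scripts use.

```r
# Illustrative subset of the release 88 renames (old name = new name),
# copied from the list above. Extend it with the names your scripts use.
renamed_88 <- c(
  uniprot_swissprot     = "uniprotswissprot",
  uniprot_sptrembl      = "uniprotsptrembl",
  go_id                 = "go",
  hgnc_transcript_name  = "hgnc_trans_name",
  percentage_gc_content = "percentage_gene_gc_content",
  transmembrane_domain  = "tmhmm"
)

# update_mart_names() is a hypothetical helper, not a biomaRt function.
# It returns `x` with every old name replaced by its release 88 name;
# names that are not in the lookup table are returned unchanged.
update_mart_names <- function(x, renames = renamed_88) {
  hit <- x %in% names(renames)
  x[hit] <- unname(renames[x[hit]])
  x
}

old_names <- c("ensembl_gene_id", "go_id", "uniprot_swissprot")
new_names <- update_mart_names(old_names)
print(new_names)
```

The translated vector can then be used wherever the script previously passed the old filter or attribute names. Keeping the renames in one table means the full list from the FTP file can be dropped in later without touching the rest of the script.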
Kind Regards,
Thomas
Dear Thomas,
I was checking the new release and noticed that the EntrezGene ID is no longer offered as an option in the FILTERS -> GENE section: there is only EntrezGene transcript name ID(s).
Looking at the XML I found that NCBI gene ID(s) has the same filter name as the old EntrezGene ID (i.e. entrezgene). I then tried to convert some Entrez IDs that were working until yesterday with the previous version, but I don't get any results back.
In particular, I am selecting:
DATABASE: Ensembl Genes 88
DATASET: Human genes (GRCh38.p10)
FILTERS: NCBI gene ID(s) - VALUES: 10418, 10669, 10777
ATTRIBUTES: HGNC symbol, AFFY HG U133 Plus 2 probe
Am I doing anything wrong? Or is there any problem with the new release?
Kind Regards,
Alex
Dear Thomas,
I just ran a few tests and I think I found the problem. The filter is not matching on the NCBI ID(s) but on the HGNC symbol instead.
The example provided for the NCBI ID(s) filter on the mart page is in fact an HGNC symbol, A1BG. The Gene ID for this symbol should be 1. I tried other HGNC symbols and the filter does return results when used that way.
Any chance this will be corrected?
Kind regards,
Alex