I regularly use AnnotationHub to retrieve and query Ensembl-based gene annotations (ensembldb). This works fine for e.g. human and mouse, but I would now like to obtain the information made available through the Ensembl Plants database, specifically for Arabidopsis ( http://plants.ensembl.org/Arabidopsis_thaliana/Info/Index ).
Question: Is such an EnsDb available in AnnotationHub? I searched for it but could not find one.
While it is possible to create EnsDb databases for Ensemblgenomes as well (including plants, fungi etc.), I have not done this on a regular basis and was also hesitant to add these to AnnotationHub because I was not sure how many users there would be for them.
Just let me know which species and which Ensembl/Ensemblgenomes version you need and I will create the EnsDb for you.
Thanks for your offer! As far as I am concerned only an EnsDb for the latest genome info for Arabidopsis would do for now. (EnsemblPlants, release 41, Sept 2018, here).
Also, I don't know if it's helpful, but there is a recent OrgDb added to AnnotationHub for Arabidopsis matching the taxonomy ID on the reference page you listed.
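For anyone wanting to check what AnnotationHub currently offers for a species, a minimal sketch of such a search could look like the following. It only assumes that the AnnotationHub package is installed and that an internet connection is available; the search terms are just examples, and the result may well be empty for EnsDb resources of plant species.

```r
library(AnnotationHub)

# Connect to the hub; the metadata is downloaded and cached on first use.
ah <- AnnotationHub()

# Search the resource metadata for EnsDb and OrgDb records for Arabidopsis.
ensdbs <- query(ah, c("EnsDb", "Arabidopsis thaliana"))
orgdbs <- query(ah, c("OrgDb", "Arabidopsis thaliana"))

# Print the matching records (possibly none).
ensdbs
orgdbs

# If at least one OrgDb record was found, download and load the first one.
if (length(orgdbs) > 0) {
    org_db <- orgdbs[[1]]
}
```

The same pattern should work for other species by changing the second search term.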
Hi, I am having a similar problem. How do I access the rice data (Oryza sativa Japonica Group) made available through the Ensembl Plants database? Sorry if this is extremely obvious, I am new to this. Thank you in advance for any help you can provide.
Sincerely, Cameron
I create EnsDb annotation resources for all species that are part of the Ensembl core databases; these are then available through AnnotationHub (see also Lori's reply). I don't create them by default for the Ensembl plants, fungi, etc. databases.
It would however not be a big problem for me to create them on demand - just let me know which species and Ensembl release you need (unless the resources already available in AnnotationHub - see Lori's reply - are already sufficient).
I've created the EnsDb (for Ensembl release 106, which corresponds to ensemblgenomes release 53). You can download the file from here. The file is called EnsDb.Mtruncatula.v106.sqlite. You can simply load this database using the EnsDb function.
Thank you, this helped a lot! If it's not too much trouble, would it be possible to get the same for "Rhizophagus irregularis DAOM 181602=DAOM 197198 (ASM43914v3)"?
I had a look at the Ensemblgenomes site for this fungus (here), but could not find the actual MySQL database that contains the gene, protein etc. annotations. Without that I cannot create the EnsDb.
In fact, for fungi, these are all the available databases - could you maybe have a look through them to see if you can identify the one containing annotations for that species? I'm not familiar with the fungal genus and species collections...
Hi Johannes,
Thanks for trying... I could not find it there in those databases. It is in the list of genomes in the parent directory, but I don't see that file in any of the sql databases.
Many thanks for your help. I want it for the latest genome assembly release, which is PN40024.v4. If that is not possible, the previous version (v3) would be perfectly fine.
I've created an EnsDb for Ensembl version 107 (which has PN40024.v4). You can download the sqlite file (EnsDb.Vvinifera.v107.sqlite) from here. To use this database:
> library(ensembldb)
> edb <- EnsDb("EnsDb.Vvinifera.v107.sqlite")
> edb
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.7
|Creation time: Mon Jul 25 08:22:47 2022
|ensembl_version: 107
|ensembl_host: localhost
|Organism: Vitis vinifera
|taxonomy_id: 29760
|genome_build: PN40024.v4
|DBSCHEMAVERSION: 2.2
| No. of genes: 35134.
| No. of transcripts: 41097.
|Protein data available.
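Once the database is loaded, annotations can be pulled out with the usual ensembldb accessors. The lines below are a small sketch of that, assuming the EnsDb.Vvinifera.v107.sqlite file from above is in the working directory; the biotype used for filtering is only an example and may not be the one you need.

```r
library(ensembldb)

# Load the downloaded SQLite file.
edb <- EnsDb("EnsDb.Vvinifera.v107.sqlite")

# List the tables and columns available in the database.
listColumns(edb)

# All genes as a GRanges object.
gns <- genes(edb)

# Only protein-coding transcripts, using an annotation filter.
txs <- transcripts(edb, filter = GeneBiotypeFilter("protein_coding"))

# Genes on the first sequence (chromosome/scaffold) listed in the database.
first_seq <- seqlevels(edb)[1]
gns_first <- genes(edb, filter = SeqNameFilter(first_seq))
```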
Would it be possible to get EnsDbs created for Trichoderma reesei (GCA_000167675.2) and Aspergillus oryzae (ASM18445v3), if it's not too much trouble?
I was wondering, is it possible to add data to the EnsDbs afterwards or would you have to start from the beginning with the files? We have some Gene Ontology data for the organisms generated with Blast2GO which would be a nice addition and I was wondering if I could somehow add that.
Sorry, but there is no option to add additional (external) data to the EnsDb databases - they are built from Ensembl annotations and by design contain only these annotations. Depending on the need or use case, a workaround could maybe also be to extract annotations from the EnsDb as a GRanges object and then add additional information to that?
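To illustrate the workaround, here is a rough sketch of extracting the genes as a GRanges object and attaching GO terms as an extra metadata column. The input file blast2go_annotations.tsv and its column names gene_id and go_id are purely hypothetical (an assumed export format, not something Blast2GO is known to produce as-is), and the sketch assumes the gene identifiers in that file match the Ensembl gene IDs in the EnsDb.

```r
library(ensembldb)

# Load one of the EnsDb files mentioned in this thread.
edb <- EnsDb("EnsDb.Aoryzae.v111.sqlite")

# Extract the gene annotations as a GRanges object.
gns <- genes(edb)

# Hypothetical export: tab-separated, columns "gene_id" and "go_id",
# one row per gene/GO term pair.
go <- read.delim("blast2go_annotations.tsv", stringsAsFactors = FALSE)

# Collapse to one ";"-separated string of GO IDs per gene.
go_per_gene <- vapply(split(go$go_id, go$gene_id), paste,
                      character(1), collapse = ";")

# Attach as an additional metadata column; genes without GO terms get NA.
gns$go_ids <- unname(go_per_gene[gns$gene_id])
```

The EnsDb itself stays untouched; the extra information lives only in the GRanges object, which could be saved with saveRDS() for later reuse.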
For which Ensembl release would you need the data? The most recent is 112, but if you used a different version before, it would be good to know which release you need. Actually, even better than the Ensembl release would be the Ensemblgenomes version, since the two use different version numbers...
First, thank you so much for providing these annotations for the community. We often feel a bit left out in the plant community ^^
I have been trying to find a good Arabidopsis thaliana annotation for single-cell ATAC-seq analysis, and downloaded the Arabidopsis annotation you created 4.5 years ago. Unfortunately, R displays the error "Annotation must be a GRanges object".
I tried to convert it using different programs, but did not succeed.
Would you have a solution for that?
You did not show the code that resulted in that error, so we have to guess what you did. That said, the R package Signac contains the function GetGRangesFromEnsDb() (link). Maybe that is worth looking at? Or did you already do so and refer to it as 'different program'?
Note that Signac is not a Bioconductor package, and therefore questions on Signac are best asked on its own website. Yet, if the function works fine on another EnsDb, then it is (indeed) related to the Arabidopsis EnsDb.
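As a rough sketch of what that could look like (not tested on the Arabidopsis EnsDb): the file name below is a placeholder for the downloaded sqlite file, and obj stands for a hypothetical Seurat object with a chromatin assay. Setting standard.chromosomes = FALSE is an assumption on my side, made to avoid dropping plant chromosomes that may not be recognised as "standard"; the default biotypes may also need adjusting for plant annotations.

```r
library(Signac)
library(ensembldb)

# Placeholder file name: replace with the downloaded Arabidopsis EnsDb file.
edb <- EnsDb("path/to/arabidopsis_ensdb.sqlite")

# Convert the EnsDb to the GRanges format that Signac expects.
# To my knowledge this function also needs the biovizBase package installed.
annotation <- GetGRangesFromEnsDb(ensdb = edb, standard.chromosomes = FALSE)

# The sequence names must match those used in the fragment/peak data.
GenomeInfoDb::seqlevels(annotation)

# Assign to the (hypothetical) Seurat object:
# Annotation(obj) <- annotation
```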
Thanks for your fast reply. The error was, as said in my question, "Annotation must be a GRanges object", and it occurred when I tried to use the annotation for my dataset.
I had tried makeGRangesFromEnsDb() without success, but I had missed GetGRangesFromEnsDb(). It works with Johannes' annotation.
I successfully used your previous EnsDb annotation for Arabidopsis thaliana, but many new genes are not annotated in that version.
Would it be possible to have a new EnsDb annotation for Arabidopsis thaliana? The newest gene annotation is called Arabidopsis_thaliana.TAIR10.55 on TAIR.
Would you also be able to provide a tutorial on how to do it, so people stop bothering you? I could not find anything online that works.
There is information in the ensembldb vignette on how to build an EnsDb directly from the Ensembl MySQL database(s) - but it's not straightforward to get the Ensembl Perl API and the required Perl version installed properly.
I've created the EnsDb for Arabidopsis thaliana (genome build TAIR10) for Ensembl release 110 (the current version). You can download the sqlite file from here. Please let me know if that was not the version you were looking for.
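For those who want to try building an EnsDb themselves without the Perl API, ensembldb also provides ensDbFromGtf(), which builds the database from a GTF file. This is a different route from the MySQL-based one used for the files in this thread, and as far as I know the resulting databases lack some content (e.g. protein annotations). A minimal sketch, assuming a GTF file downloaded from Ensembl Plants; the local file name and the version number are illustrative assumptions:

```r
library(ensembldb)

# Assumed local copy of a GTF file from the Ensembl Plants FTP site.
gtf_file <- "Arabidopsis_thaliana.TAIR10.55.gtf.gz"

# Build the SQLite database; organism, genome build and version are given
# explicitly here rather than relying on them being parsed from the file name.
db_file <- ensDbFromGtf(gtf = gtf_file,
                        organism = "Arabidopsis_thaliana",
                        genomeVersion = "TAIR10",
                        version = 55)

# ensDbFromGtf() returns the name of the created SQLite file.
edb <- EnsDb(db_file)
edb
```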
Currently I am working on Oryza sativa indica and am not able to find the Ensembl annotation database for it. Is it possible to create an EnsDb for Oryza sativa indica? This is the URL for the genome: https://plants.ensembl.org/Oryza_indica/Info/Index
Lori, do you think it might be useful to also add EnsDbs for all species in Ensemblgenomes to AnnotationHub (possibly starting with "only" plants)?
Let's further this discussion off the support site
I've generated the EnsDb. You can get the file from here: https://www.dropbox.com/sh/wglt28zlfzhjubs/AADzGqJ0zydKRmdqbOsH_Ru5a?dl=0
After unzipping, you can simply load the sqlite file with
edb <- EnsDb(<sqlite-file>)
Thanks! Meanwhile downloaded the file and everything is working fine.
Hi Johannes,
Could I request the same EnsDb creation for Medicago truncatula?
Thank you very much,
Karen
Hi Johannes, Thanks for trying... I could not find it there in those databases. It is in the list of genomes in the parent directory, but I don't see that file in any of the sql databases.
https://fungi.ensembl.org/Rhizophagus_irregularis_daom_181602_daom_197198_gca_002897155/Info/Index
Also sorry for the very delayed response :)
Dear Johannes, can you tell me how I can access an EnsDb annotation for Vitis vinifera?
Many thanks,
António
Dear Antonio,
I can create an EnsDb for Vitis vinifera for you - could you please tell me from which Ensembl (or Ensemblgenomes) release you want to have it?
thanks, jo
Many thanks,
works fine,
Best regards,
António
Hi Emmi,
the two EnsDbs are now also available in this folder (EnsDb.Aoryzae.v111.sqlite and EnsDb.Treesei.v111.sqlite).
Best, jo
Thank you! This helps a lot.
Emmi
Thank you for your reply. Extracting annotations and adding more information could work in some cases, so I will try that.
Dear Johannes,
Would it be possible to create the EnsDb for Triticum aestivum (IWGSC)?
Thank you very much,
Daniele
Hi,
I'm using Ensembl Plants release 59 with the IWGSC RefSeq v1.1 gene annotation.
Thanks a lot for replying so quickly!
You can get the EnsDb EnsDb.Taestivum.v112.sqlite (for Ensembl 112 / ensemblgenomes 59) here.
Thank you so much!
It looks like there are three that could be of interest and could be used.