Dear all,
I've added ensembldb databases for all species of Ensembl release 98 to AnnotationHub. These databases now also provide the G-C nucleotide content for each transcript in the metadata column "gc_content" (see below).
cheers, jo
> library(AnnotationHub)
> query(AnnotationHub(), "EnsDb", "v98")
snapshotDate(): 2019-10-29
AnnotationHub with 1321 records
# snapshotDate(): 2019-10-29
# $dataprovider: Ensembl
# $species: Homo sapiens, Ailuropoda melanoleuca, Anolis carolinensis, Astya...
# $rdataclass: EnsDb
# additional mcols(): taxonomyid, genome, description,
# coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
# rdatapath, sourceurl, sourcetype
# retrieve records with, e.g., 'object[["AH53185"]]'
AH53185 | Ensembl 87 EnsDb for Anolis Carolinensis
AH53186 | Ensembl 87 EnsDb for Ailuropoda Melanoleuca
... ...
AH75116 | Ensembl 98 EnsDb for Xenopus tropicalis
AH75117 | Ensembl 98 EnsDb for Zonotrichia albicollis
> edb <- AnnotationHub()[["AH75117"]]
snapshotDate(): 2019-10-29
downloading 1 resources
retrieving 1 resource
|======================================================================| 100%
loading from cache
> transcripts(edb)
GRanges object with 30624 ranges and 9 metadata columns:
seqnames ranges strand | tx_id
<Rle> <IRanges> <Rle> | <character>
ENSZALT00000030003 ARWJ01023036.1 21700-155713 + | ENSZALT00000030003
ENSZALT00000030004 ARWJ01023036.1 21700-155713 + | ENSZALT00000030004
... ... ... ... . ...
ENSZALT00000002807 KB915622.1 13173-14579 - | ENSZALT00000002807
ENSZALT00000002808 KB915622.1 13173-14579 - | ENSZALT00000002808
tx_biotype tx_cds_seq_start tx_cds_seq_end
<character> <integer> <integer>
ENSZALT00000030003 protein_coding 21700 155677
ENSZALT00000030004 protein_coding 21700 155207
... ... ... ...
ENSZALT00000002807 lncRNA <NA> <NA>
ENSZALT00000002808 lncRNA <NA> <NA>
gene_id tx_support_level tx_id_version
<character> <integer> <character>
ENSZALT00000030003 ENSZALG00000017908 <NA> ENSZALT00000030003.1
ENSZALT00000030004 ENSZALG00000017908 <NA> ENSZALT00000030004.1
... ... ... ...
ENSZALT00000002807 ENSZALG00000001837 <NA> ENSZALT00000002807.1
ENSZALT00000002808 ENSZALG00000001837 <NA> ENSZALT00000002808.1
gc_content tx_name
<numeric> <character>
ENSZALT00000030003 42.803825519011 ENSZALT00000030003
ENSZALT00000030004 42.7888065056207 ENSZALT00000030004
... ... ...
ENSZALT00000002807 51.9956850053937 ENSZALT00000002807
ENSZALT00000002808 52.7659574468085 ENSZALT00000002808
seqinfo: 1729 sequences from Zonotrichia_albicollis-1.0.1 genome
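
As a follow-up, here is a minimal sketch of how the new "gc_content" column might be used once the database is loaded. It assumes the same record ID as in the session above (AH75117) and a working network connection. The per-biotype median is only an illustration, not something the databases compute for you.

```r
library(AnnotationHub)
library(ensembldb)

# Load the Ensembl 98 EnsDb for Zonotrichia albicollis (ID from the listing above).
edb <- AnnotationHub()[["AH75117"]]

# Fetch only the columns of interest as a plain data.frame.
tx <- transcripts(edb,
                  columns = c("tx_id", "tx_biotype", "gc_content"),
                  return.type = "data.frame")

# Illustration: median GC content (in percent) per transcript biotype.
gc_by_biotype <- tapply(tx$gc_content, tx$tx_biotype, median, na.rm = TRUE)
head(sort(gc_by_biotype, decreasing = TRUE))

# Restrict the query to protein-coding transcripts with a filter expression.
pc <- transcripts(edb,
                  columns = c("tx_id", "gc_content"),
                  filter = ~ tx_biotype == "protein_coding")
summary(pc$gc_content)
```

Requesting only the needed columns keeps the query fast. With return.type = "data.frame" the result can go straight into base R summaries, while the default GRanges return type keeps the genomic coordinates.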