I've been looking at mouse transcripts from Ensembl, e.g.:
> edb <- ah[["AH89211"]]
> txps <- transcripts(edb)
> txps[82575,]
GRanges object with 1 range and 9 metadata columns:
                     seqnames              ranges strand |              tx_id
                        <Rle>           <IRanges>  <Rle> |        <character>
  ENSMUST00000029812        3 135584655-135691546      - | ENSMUST00000029812
                         tx_biotype tx_cds_seq_start tx_cds_seq_end
                        <character>        <integer>      <integer>
  ENSMUST00000029812 protein_coding        135585355      135667785
                                gene_id tx_support_level         tx_id_version
                            <character>        <integer>           <character>
  ENSMUST00000029812 ENSMUSG00000028163                1 ENSMUST00000029812.13
                     gc_content            tx_name
                      <numeric>        <character>
  ENSMUST00000029812    53.0969 ENSMUST00000029812
  -------
  seqinfo: 118 sequences from GRCm38 genome
Is it possible to obtain the transcript_name from the GTF, i.e. the one made up of the gene symbol, a dash, and a number (e.g. Nfkb1-201 in the GTF record below)? Ensembl seems to prioritize this name in its gene viewer, so it is convenient to use for cross-referencing.
3 havana transcript 135584655 135691546 . - . gene_id "ENSMUSG00000028163"; gene_version "17";
transcript_id "ENSMUST00000029812"; transcript_version "13"; gene_name "Nfkb1"; gene_source "ensembl_havana";
gene_biotype "protein_coding"; havana_gene "OTTMUSG00000016668"; havana_gene_version "5"; transcript_name "Nfkb1-201";
transcript_source "havana"; transcript_biotype "protein_coding"; tag "CCDS"; ccds_id "CCDS17858";
havana_transcript "OTTMUST00000040338"; havana_transcript_version "1"; tag "basic"; transcript_support_level "1";
I looked over the vignette and man pages, but I may have missed something.
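In the meantime, one way to recover the name directly from the GTF is to parse the attribute column. The following is only a minimal base-R sketch: `gtf_attr` is a hypothetical helper (not part of ensembldb or any package), and the attribute string is an abbreviated copy of the record quoted above. For a whole file, rtracklayer::import() should expose the same attributes as metadata columns, which is probably the more robust route.

```r
# Attribute column (9th field) of the GTF record quoted above, abbreviated.
attrs <- paste0(
  'gene_id "ENSMUSG00000028163"; gene_version "17"; ',
  'transcript_id "ENSMUST00000029812"; transcript_version "13"; ',
  'gene_name "Nfkb1"; transcript_name "Nfkb1-201"; ',
  'transcript_biotype "protein_coding";'
)

# Hypothetical helper: return the quoted value stored under `key` in each
# GTF attribute string, or NA if the key is absent.
gtf_attr <- function(x, key) {
  # Anchor on the string start or a preceding "; " so that, for example,
  # "gene_id" cannot match inside a longer key name.
  pattern <- paste0('(^|; *)', key, ' "([^"]*)"')
  hits <- regmatches(x, regexec(pattern, x))
  vapply(
    hits,
    function(hit) if (length(hit) == 3L) hit[3L] else NA_character_,
    character(1)
  )
}

tx_id   <- gtf_attr(attrs, "transcript_id")
tx_name <- gtf_attr(attrs, "transcript_name")

# Lookup vector from Ensembl transcript ID to transcript name.
tx_map <- setNames(tx_name, tx_id)
print(tx_map)
```

The resulting named vector can be indexed with the tx_id values of the GRanges object to attach the transcript names.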
This is great! Thanks, Johannes. This will really help when connecting results to gene models from the Ensembl genome browser.
Oops it looks like my request may have caused some issues in release and devel. tximeta builds with error because this resource cannot load (edit: I've dealt with the error for now just by commenting out the part that would pull the resource from Ahub).
My session info:
Thanks Mike for reporting. I've opened an issue in ensembldb and will have a look at it.
For what it's worth, the same problem (package build error) applies to my package satuRn, with the exact same error message as flagged by Michael Love above (https://master.bioconductor.org/checkResults/3.14/bioc-LATEST/satuRn/nebbiolo2-buildsrc.html).
Thanks for reporting. I submitted the fix yesterday - so I hope all is fine again after the next build round.
Thanks for fixing Johannes!
I am still having this issue with my installation of tximeta, even though I have updated. I am very new to this, so I am not sure what "next build round" means or when I should check to see if this has been resolved.
This is resolved for me with 2.18.2 from Bioconductor:
You will have to wait until his git commit propagates to the Bioconductor builds that you download with BiocManager::install(). Usually the commits are taken up at 5pm US Eastern and show up on the website in the afternoon of the following day. You can track this by watching for a version > 2.18.1 here: https://bioconductor.org/packages/release/bioc/html/ensembldb.html
and also here: https://bioconductor.org/packages/release/bioc/news/ensembldb/NEWS
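To check locally whether the fixed version has arrived, something along these lines should work. It is only a sketch: `has_fix` is a hypothetical helper, and the 2.18.1 threshold is taken from the version numbers mentioned in this thread.

```r
# Last version that still carried the bug, per this thread.
last_broken <- package_version("2.18.1")

# Hypothetical helper: TRUE once the given version is newer than 2.18.1.
has_fix <- function(installed) {
  package_version(as.character(installed)) > last_broken
}

# Only query the installed version if ensembldb is actually available.
if (requireNamespace("ensembldb", quietly = TRUE)) {
  installed <- utils::packageVersion("ensembldb")
  status <- if (has_fix(installed)) "fix installed" else "still waiting for the next build"
  message("ensembldb ", as.character(installed), ": ", status)
}
```

If the check still reports the old version after the next build, rerunning BiocManager::install("ensembldb") should pull the update once it is published.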
Ah I see, Thank you!
Thx Johannes. FYI this also caused problems to the OSCA book:
Hopefully this will clear out on the next book build report tomorrow.
H.
Hi Herve,
it seems the second book was fixed. The first one still has an error, but that does not seem to be related to ensembldb or the recent changes. But please let me know if it is in fact due to ensembldb and I'll investigate/fix.
cheers, jo
Good! Seems to be a different error. Hard to tell at first sight if this new error is still related to ensembldb. Since it took more than 1 hour for R CMD build OSCA.multisample to reach the point of failure, troubleshooting this is probably not going to be easy. I pass ;-)
H.
Also fixed for satuRn, thank you!