Hi,
Apologies for the naive post but I'm wondering if there is a way of exporting information I have on a gene acquired from ensembldb to a GenBank file?
I have the following code, where I fetch the sequence and annotation of the Actin beta gene:
library(EnsDb.Hsapiens.v86)
library(dplyr)
Hs_edb <- EnsDb.Hsapiens.v86
Hs_dna <- getGenomeTwoBitFile(Hs_edb)
ACTB_db <- Hs_edb %>%
  ensembldb::filter(filter = GeneNameFilter("ACTB")) %>%
  ensembldb::filter(filter = ~tx_biotype == "protein_coding")
ACTB_gene <- genes(ACTB_db)
ACTB_seq <- getSeq(Hs_dna, ACTB_gene)
This gives me a GRanges object with the features and a DNAStringSet object with the sequence. What I would like is to export the sequence and features into a .gb or .gbk file. The reason I'd like to do this is that many of my colleagues use different forms of software to view sequence information and are not skilled with R. The only commonly interchangeable format that most pieces of software are able to parse is GenBank.
Thank you.
Hi James,
Thank you for replying. Actin beta was just used as a representative example. What I am actually dealing with are molecules that are not in NCBI, e.g. chimeric molecules from transgenic mice. I've found a GitHub package (gschofl/biofiles), but it only writes out records that have been parsed from .gbk files.
There's nothing that I can find in the Bioconductor corpus. You can search here. The biofiles package might be useful, but you would have to instantiate a gbRecord object and then a gbFeatureTable and a seqinfo object and jam them into the gbRecord. Sounds like fun!
Or you could just roll yer own that just uses writeLines or cat to output a text file that is similar enough that your colleagues can read it in.
Thank you for the code search link James. Yes, I had high hopes for the biofiles idea, but became progressively less enthusiastic the more I went down the gbRecord rabbit hole. Hence the impulse to ask the community if anything else existed.
writeLines it is then. Cheers.
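For anyone landing here later, the writeLines route suggested above could look roughly like the sketch below. It uses base R only. The function name write_genbank, its arguments, and the column names of the features data frame (type, start, end, strand, label) are all made up for illustration and are not part of any package. It assumes feature coordinates are 1-based and relative to the exported sequence, so genomic coordinates taken from a GRanges would need to be shifted first. The LOCUS line follows the commonly documented column layout, but the output has not been checked against any particular viewer, and some tools may want extra header fields.

```r
# Hypothetical helper: write one sequence plus simple features as a
# GenBank-like flat file. A sketch, not a validated GenBank writer.
write_genbank <- function(dna, features, locus, file, definition = ".") {
  dna <- tolower(as.character(dna))
  n <- nchar(dna)
  stopifnot(length(dna) == 1, n > 0)

  # Header. The month abbreviation from %b depends on the current locale.
  date <- toupper(format(Sys.Date(), "%d-%b-%Y"))
  header <- c(
    sprintf("%-12s%-16s %11d bp    %-6s  %-8s %s %s",
            "LOCUS", locus, n, "DNA", "linear", "UNK", date),
    sprintf("%-12s%s", "DEFINITION", definition),
    sprintf("%-21s%s", "FEATURES", "Location/Qualifiers")
  )

  # Feature table: minus-strand features are wrapped in complement().
  start <- as.integer(features$start)
  end <- as.integer(features$end)
  loc <- ifelse(as.character(features$strand) == "-",
                sprintf("complement(%d..%d)", start, end),
                sprintf("%d..%d", start, end))
  # rbind() + as.vector() interleaves each feature line with its qualifier.
  feat <- as.vector(rbind(
    sprintf("     %-16s%s", as.character(features$type), loc),
    sprintf("%21s/label=\"%s\"", "", as.character(features$label))
  ))

  # Sequence block: 60 bases per line, in space-separated groups of 10.
  starts <- seq(1L, n, by = 60L)
  chunks <- substring(dna, starts, pmin(starts + 59L, n))
  blocks <- vapply(chunks, function(ch) {
    s <- seq(1L, nchar(ch), by = 10L)
    paste(substring(ch, s, pmin(s + 9L, nchar(ch))), collapse = " ")
  }, character(1), USE.NAMES = FALSE)
  origin <- c("ORIGIN", sprintf("%9d %s", starts, blocks), "//")

  writeLines(c(header, feat, origin), file)
  invisible(file)
}

# Toy usage with made-up data (not ACTB).
features <- data.frame(
  type = c("gene", "CDS"),
  start = c(1L, 11L),
  end = c(70L, 40L),
  strand = c("+", "-"),
  label = c("demo_gene", "demo_cds"),
  stringsAsFactors = FALSE
)
dna <- paste(rep("ACGTACGTAC", 7), collapse = "")
out <- tempfile(fileext = ".gb")
write_genbank(dna, features, locus = "DEMO", file = out)
```

With the objects from the original post, the sequence could be passed as as.character(ACTB_seq[[1]]), and the feature table would have to be built by hand from the GRanges after converting its ranges to sequence-relative positions.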