According to the GenomicFeatures annotation and this post it was a conscious decision to not include gene_name in the TxDb objects from the GenomicFeatures package.
Could the decision not to include gene names be revisited?
I think we can all agree that, in the end, the majority of end users would like gene names associated with their analysis, as these names are what supply the link to biological knowledge for most people. Given this, and given that one of the main advantages of Bioconductor is that it makes data integration easy and seamless, the decision to omit gene names seems a bit out of character.
Apart from this, I also think it is a bit of a shame given the recent push towards automating away many of the low-level data handling hurdles in bioinformatics via packages such as tximeta.
Lastly, it seems that the TxDb-like objects from the ensembldb package do contain gene names, so adding them to the regular TxDb would further streamline Bioconductor.
Looking forward to hearing your thoughts.
Cheers
Kristoffer
I do know about this approach (it is also mentioned in the post I link to above). Unfortunately it will not generalize, because 1) an OrgDb package does not exist for all species, 2) it requires user input, which is what I (and others) would like to automate away, and 3) you will often run into problems with different transcriptome assembly versions (e.g. new genes in less well-studied species).
Lastly, it seems strange not to use gene_name from e.g. GTF files, since the information is there (and is actually imported but not used).
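To illustrate that the information really is sitting in the file, here is a minimal sketch (in Python, purely for illustration; this is not how GenomicFeatures parses GTF files) that collects a gene_id to gene_name mapping from the attribute column of GTF records. The function name and the example records are made up, and it assumes the usual `key "value";` attribute layout.

```python
import re

# Matches key "value" pairs in the GTF attribute column (column 9).
# Unquoted attributes (e.g. numeric ones) are ignored by this sketch.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_names_from_gtf_lines(lines):
    """Return a dict mapping gene_id -> gene_name for records carrying both."""
    mapping = {}
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id")
        gene_name = attrs.get("gene_name")
        if gene_id and gene_name:
            # Keep the first name seen for each gene.
            mapping.setdefault(gene_id, gene_name)
    return mapping


# Hypothetical records: the second one has no gene_name attribute.
example = [
    "#!genome-build hypothetical\n",
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "abc1";\n',
    'chr1\tsrc\texon\t300\t400\t.\t-\t.\tgene_id "G2"; transcript_id "T2";\n',
]
print(gene_names_from_gtf_lines(example))
```

Genes without a gene_name attribute are simply left out of the mapping, so nothing breaks for annotations that lack names.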
I strongly support the request by @kvittingseerup. I work a lot with non-model organisms for which OrgDb packages are not available, so including the gene names would make my life much easier.