I am relatively new to Bioconductor and am struggling to find the genome coordinates of a few genes, for instance "APC". I believe I managed to obtain the transcripts associated with the gene:
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(Homo.sapiens)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
Then I used the select method to get the gene ID for "APC", and then the TXNAME values for the gene model:
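# Look up the Entrez gene ID for the symbol "APC"
geneid <- select(Homo.sapiens, keys="APC", columns=c("SYMBOL","ENTREZID"),
                 keytype="SYMBOL")[['ENTREZID']]
# Look up the transcript names that belong to that gene
txids <- select(txdb, keys=geneid, columns="TXNAME", keytype="GENEID")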
This results in five transcripts. How can I get from these to the coding sequence / open reading frame coordinates?
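One possible way to go from the transcript names to CDS coordinates is sketched below. It is only a sketch and not necessarily the approach suggested in this thread: it assumes that cdsBy() from GenomicFeatures with use.names=TRUE returns a GRangesList named by TXNAME. The variable names cds_by_tx, apc_cds and apc_cds_span are made up for illustration. Transcripts without an annotated CDS do not appear in the cdsBy() result, so the list is filtered by name rather than indexed directly.

```r
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(Homo.sapiens)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# Same lookups as in the question: symbol -> Entrez ID -> transcript names
geneid <- select(Homo.sapiens, keys="APC", columns=c("SYMBOL","ENTREZID"),
                 keytype="SYMBOL")[['ENTREZID']]
txids <- select(txdb, keys=geneid, columns="TXNAME", keytype="GENEID")

# CDS exons grouped by transcript, with list elements named by TXNAME
cds_by_tx <- cdsBy(txdb, by="tx", use.names=TRUE)

# Keep only the transcripts found for the gene (hypothetical variable names);
# non-coding transcripts are simply absent from cds_by_tx
apc_cds <- cds_by_tx[names(cds_by_tx) %in% txids$TXNAME]

# One range per transcript spanning the first to the last CDS exon
apc_cds_span <- range(apc_cds)
```

Here apc_cds holds the individual CDS exon ranges for each transcript, while apc_cds_span collapses each transcript to a single range covering its whole coding region.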
Yes, a more streamlined approach like the one you suggest would be nice. Based on your suggestion, I ended up doing:
Thx.