mauede@alice.it
I forgot to specify that I am dealing only with human genes.
I used the ENSGxxxxx identifier to retrieve some attributes that I hoped
would uniquely identify the gene.
> gene.map <- getBM(attributes=c("hgnc_symbol","external_gene_id","refseq_dna"),
+                   filters="ensembl_gene_id", values="ENSG00000206557",
+                   mart=hmart)
> show(gene.map)
As long as every human gene is uniquely identified by its
"hgnc_symbol", I am fine.
Why should I use the other identifier you mention, ENSTxxxx?
My goal is to get the 3' UTR sequences of experimentally
validated target genes.
Given the species "Human" and a miRNA identifier "hsa-miR-yyy", the
TarBase interface returns a
list of all genes (ENSGxxxxxx) that have been experimentally tested.
I pass each ENSGxxxxxx identifier to getSequence (a biomaRt function)
to get the 3' UTR sequence.
I was surprised to find multiple 3' UTR sequences associated with the same
ENSGxxxxxx.
Maybe each transcript is identified by a unique ENSTxxxx identifier...
TRUE/FALSE?
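A minimal sketch of how the multiple 3' UTRs per gene could be handled, assuming one sequence is returned per transcript. The helper names fetch_3utr and longest_utr_per_gene and the column name utr3 are hypothetical, 'mart' is assumed to come from useMart("ensembl", dataset = "hsapiens_gene_ensembl"), and keeping the longest UTR per gene is only one possible convention, not something this thread prescribes.

```r
# Sketch only: fetch_3utr() and longest_utr_per_gene() are hypothetical helpers.
# 'mart' is assumed to be created with
#   biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")

fetch_3utr <- function(gene_id, mart) {
  # List the transcripts (ENST IDs) that belong to the gene (ENSG ID)
  tx <- biomaRt::getBM(
    attributes = c("ensembl_gene_id", "ensembl_transcript_id"),
    filters = "ensembl_gene_id", values = gene_id, mart = mart
  )
  # Query by transcript so each sequence is labelled with its ENST ID
  utr <- biomaRt::getSequence(
    id = tx$ensembl_transcript_id, type = "ensembl_transcript_id",
    seqType = "3utr", mart = mart
  )
  # Attach the sequence to its transcript; 'utr3' is a name chosen here
  idx <- match(tx$ensembl_transcript_id, utr$ensembl_transcript_id)
  tx$utr3 <- utr[["3utr"]][idx]
  tx
}

longest_utr_per_gene <- function(utr) {
  # Drop transcripts without a 3' UTR; the "Sequence unavailable"
  # placeholder is assumed to be what the server returns in that case
  keep <- !is.na(utr$utr3) & utr$utr3 != "Sequence unavailable"
  utr <- utr[keep, , drop = FALSE]
  # Sort by gene, then by decreasing UTR length
  utr <- utr[order(utr$ensembl_gene_id, -nchar(utr$utr3)), , drop = FALSE]
  # Keep the first (longest) row for each gene
  utr[!duplicated(utr$ensembl_gene_id), , drop = FALSE]
}
```

With network access, something like longest_utr_per_gene(fetch_3utr("ENSG00000206557", hmart)) would then give at most one row for that gene.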
Thank you.
Regards,
Maura
-----Original Message-----
From: Simon Anders [mailto:anders@ebi.ac.uk]
Sent: Sun 12/07/2009 23:14
To: mauede@alice.it
Cc: Bioconductor List
Subject: Re: [BioC] is there an identifier that uniquely identifies a
gene all over the many databases ?
Hi Maura
mauede@alice.it wrote:
> By trial-and-error it seems the attribute "hgnc_symbol" yields a
> unique gene identifier ... but I am not quite sure.
> Instead a variable number of "refseq_dna" values are listed for
> the same "hgnc_symbol".
HGNC is the Human Genome Organisation's Gene Nomenclature Committee.
Their gene symbols are in fact unique (that is the whole point of
HGNC)
but not every gene has an HGNC symbol yet. See
http://www.genenames.org/
for more information.
> In short, given the "ensembl_gene_id" (ENSGxxxxxxxxxxx), is it
> possible to get the gene identifier for which this is a transcript ?
First of all, ENSGxxxxx IDs are for human genes. Human transcripts get
ENSTxxxx identifiers (with a "T" instead of a "G"). Each Ensembl gene
can have several Ensembl transcripts, listing all the known splice
variants. Play a bit with the Ensembl web site to see examples.
To get the HGNC symbol for an Ensembl gene ID, an easy way is to use
biomaRt. Ask again if you are not familiar with it.
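A minimal sketch of such a biomaRt lookup, assuming the human dataset is named "hsapiens_gene_ensembl" and that network access is available. The helper names map_genes and symbols_per_gene are hypothetical, and attribute names may differ between Ensembl releases.

```r
# Sketch only: map_genes() and symbols_per_gene() are hypothetical helpers.

map_genes <- function(gene_ids, mart) {
  # One row per (gene, transcript) pair, with the HGNC symbol if there is one
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "ensembl_transcript_id", "hgnc_symbol"),
    filters = "ensembl_gene_id", values = gene_ids, mart = mart
  )
}

symbols_per_gene <- function(gene.map) {
  # Count distinct non-empty HGNC symbols for each Ensembl gene ID;
  # 0 means the gene has no HGNC symbol in the result
  by_gene <- split(gene.map$hgnc_symbol, gene.map$ensembl_gene_id)
  vapply(by_gene,
         function(s) length(unique(s[!is.na(s) & nzchar(s)])),
         integer(1))
}

# Usage (requires network access; the dataset name is an assumption):
# hmart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# gene.map <- map_genes("ENSG00000206557", hmart)
# symbols_per_gene(gene.map)
```

A count above 1 for any gene would indicate that the symbol is not a one-to-one key for that ID in the returned table, and a count of 0 marks a gene without an HGNC symbol.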
Simon
+---
| Dr. Simon Anders, Dipl. Phys.
| European Bioinformatics Institute (EMBL-EBI)
| Hinxton, Cambridgeshire, UK
| office phone +44-1223-492680, mobile phone +44-7505-841692
| preferred (permanent) e-mail: sanders@fs.tum.de