org.Hs.eg.db does not retrieve all existing records
@antonvilas-7285
Last seen 9.2 years ago
Spain

Hi,

when I use org.Hs.eg.db to retrieve gene names with keytype = "ENSEMBL", I get incomplete lists. For example:

> query
 [1] "ENSG00000201291" "ENSG00000206687" "ENSG00000212402" "ENSG00000231747" "ENSG00000239935" "ENSG00000241482"
 [7] "ENSG00000243053" "ENSG00000243588" "ENSG00000251811" "ENSG00000254713"

> GENEINFO <- select(org.Hs.eg.db, keys=query, columns=c("ENSEMBL","SYMBOL","GENENAME", "UNIGENE"), keytype="ENSEMBL")

results in

> GENEINFO
           ENSEMBL   SYMBOL                           GENENAME   UNIGENE
1  ENSG00000201291     <NA>                               <NA>      <NA>
2  ENSG00000206687     <NA>                               <NA>      <NA>
3  ENSG00000212402 SNORA74B small nucleolar RNA, H/ACA box 74B Hs.692720
4  ENSG00000231747     <NA>                               <NA>      <NA>
5  ENSG00000239935     <NA>                               <NA>      <NA>
6  ENSG00000241482     <NA>                               <NA>      <NA>
7  ENSG00000243053     <NA>                               <NA>      <NA>
8  ENSG00000243588     <NA>                               <NA>      <NA>
9  ENSG00000251811     <NA>                               <NA>      <NA>
10 ENSG00000254713     <NA>                               <NA>      <NA>

However, I can easily check online that ENSG00000201291 has

Approved symbol: RNU1-34P

Approved name: RNA, U1 small nuclear 34, pseudogene

(http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=HGNC:48376)

Why am I not getting these records with the org.Hs.eg.db package?

best,

Antón

@james-w-macdonald-5106
Last seen 3 hours ago
United States

It's because the org.Hs.eg.db package is based on Entrez Gene IDs rather than Ensembl IDs, so an Ensembl ID with no mapped Entrez Gene ID comes back as NA. If you want to annotate starting from Ensembl IDs, you are likely better off using biomaRt instead:

> library(biomaRt)
> query <- c("ENSG00000201291", "ENSG00000206687", "ENSG00000212402", "ENSG00000231747", "ENSG00000239935", "ENSG00000241482",
+            "ENSG00000243053", "ENSG00000243588", "ENSG00000251811", "ENSG00000254713")
> mart <- useMart("ensembl","hsapiens_gene_ensembl")

> getBM(c("ensembl_gene_id","hgnc_symbol","external_gene_name","unigene"), "ensembl_gene_id", query, mart)
  ensembl_gene_id hgnc_symbol external_gene_name   unigene
1 ENSG00000201291    RNU1-34P           RNU1-34P          
2 ENSG00000206687   RNU1-109P          RNU1-109P          
3 ENSG00000212402    SNORA74B           SNORA74B Hs.692720
4 ENSG00000231747                     AC079922.2          
5 ENSG00000243053    RPL31P58           RPL31P58          
6 ENSG00000251811                          Y_RNA          
7 ENSG00000254713  HNRNPA1P72         HNRNPA1P72    
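
Note that the getBM() result above has only 7 rows for 10 query IDs: getBM() leaves out IDs it has no record for, whereas select() keeps them as NA rows. A minimal base R sketch for listing the IDs that went missing follows. The `returned` vector is copied by hand from the output above, and the names `returned` and `missing_ids` are only illustrative; in a live session you would take the `ensembl_gene_id` column of the getBM() result instead.

```r
# The ten Ensembl gene IDs from the original question
query <- c("ENSG00000201291", "ENSG00000206687", "ENSG00000212402",
           "ENSG00000231747", "ENSG00000239935", "ENSG00000241482",
           "ENSG00000243053", "ENSG00000243588", "ENSG00000251811",
           "ENSG00000254713")

# IDs present in the getBM() output shown above (copied by hand here;
# assumption: in practice this would be res$ensembl_gene_id for a
# getBM() result stored in a variable named res)
returned <- c("ENSG00000201291", "ENSG00000206687", "ENSG00000212402",
              "ENSG00000231747", "ENSG00000243053", "ENSG00000251811",
              "ENSG00000254713")

# Query IDs with no row in the result, in query order
missing_ids <- setdiff(query, returned)
print(missing_ids)
```

For the output above this lists ENSG00000239935, ENSG00000241482 and ENSG00000243588, which are worth looking up separately.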

@antonvilas-7285
Last seen 9.2 years ago
Spain

It works! Thanks for your answer.
