Cannot retrieve Annotations for a GENEID from Homo.sapiens
@moiz-bootwalla-5215

I have a GENEID (100505503) that I obtained from TxDb.Hsapiens.UCSC.hg19.knownGene. When I try to retrieve its SYMBOL using Homo.sapiens, I get the following error:

select(Homo.sapiens, keys="100505503", columns=columns(Homo.sapiens), keytype="GENEID")
Error in .testForValidKeys(x, keys, keytype) : 
  None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.

Why does this happen? If there is no such ENTREZID, where did the TxDb package get the ID from? The same key works when I query the TxDb object directly:

select(txdb, keys="100505503", columns=columns(txdb), keytype="GENEID")

      GENEID  CDSID CDSNAME CDSCHROM CDSSTRAND CDSSTART   CDSEND EXONID EXONNAME EXONCHROM EXONSTRAND EXONSTART  EXONEND
1  100505503 165377    <NA>    chr15         - 82824834 82824836 202855     <NA>     chr15          -  82824834 82824865
2  100505503 165376    <NA>    chr15         - 82824389 82824540 202854     <NA>     chr15          -  82824389 82824540
3  100505503 165375    <NA>    chr15         - 82823288 82823393 202853     <NA>     chr15          -  82823288 82823393
4  100505503 165374    <NA>    chr15         - 82822714 82822779 202852     <NA>     chr15          -  82822714 82822779
5  100505503 165373    <NA>    chr15         - 82821209 82821289 202851     <NA>     chr15          -  82821161 82821289
6  100505503     NA    <NA>     <NA>      <NA>       NA       NA 202856     <NA>     chr15          -  82906033 82906047
7  100505503     NA    <NA>     <NA>      <NA>       NA       NA 202854     <NA>     chr15          -  82824389 82824540
8  100505503     NA    <NA>     <NA>      <NA>       NA       NA 202853     <NA>     chr15          -  82823288 82823393
9  100505503     NA    <NA>     <NA>      <NA>       NA       NA 202852     <NA>     chr15          -  82822714 82822779
10 100505503     NA    <NA>     <NA>      <NA>       NA       NA 202851     <NA>     chr15          -  82821161 82821289
11 100505503 165404    <NA>    chr15         - 83209177 83209179 202916     <NA>     chr15          -  83209177 83209208
12 100505503 165403    <NA>    chr15         - 83208732 83208883 202915     <NA>     chr15          -  83208732 83208883
13 100505503 165402    <NA>    chr15         - 83207631 83207736 202914     <NA>     chr15          -  83207631 83207736
14 100505503 165401    <NA>    chr15         - 83207057 83207122 202913     <NA>     chr15          -  83207057 83207122
15 100505503 165400    <NA>    chr15         - 83205552 83205632 202912     <NA>     chr15          -  83205504 83205632
    TXID EXONRANK     TXNAME TXCHROM TXSTRAND  TXSTART    TXEND
1  56497        1 uc002bhr.1   chr15        - 82821161 82824865
2  56497        2 uc002bhr.1   chr15        - 82821161 82824865
3  56497        3 uc002bhr.1   chr15        - 82821161 82824865
4  56497        4 uc002bhr.1   chr15        - 82821161 82824865
5  56497        5 uc002bhr.1   chr15        - 82821161 82824865
6  56498        1 uc021ssu.2   chr15        - 82821161 82906047
7  56498        2 uc021ssu.2   chr15        - 82821161 82906047
8  56498        3 uc021ssu.2   chr15        - 82821161 82906047
9  56498        4 uc021ssu.2   chr15        - 82821161 82906047
10 56498        5 uc021ssu.2   chr15        - 82821161 82906047
11 56518        1 uc002bio.1   chr15        - 83205504 83209208
12 56518        2 uc002bio.1   chr15        - 83205504 83209208
13 56518        3 uc002bio.1   chr15        - 83205504 83209208
14 56518        4 uc002bio.1   chr15        - 83205504 83209208
15 56518        5 uc002bio.1   chr15        - 83205504 83209208

Kindly help me understand this discrepancy.

Thanks,

Moiz

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-suse-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Homo.sapiens_1.1.2                      org.Hs.eg.db_3.0.0                      GO.db_3.0.0                            
 [4] RSQLite_1.0.0                           DBI_0.3.1                               OrganismDbi_1.8.0                      
 [7] TxDb.Hsapiens.UCSC.hg19.knownGene_3.0.0 GenomicFeatures_1.18.2                  AnnotationDbi_1.28.1                   
[10] Biobase_2.26.0                          GenomicRanges_1.18.1                    GenomeInfoDb_1.2.2                     
[13] IRanges_2.0.0                           S4Vectors_0.4.0                         BiocGenerics_0.12.0                    
[16] data.table_1.9.4                        BiocInstaller_1.16.0                   

loaded via a namespace (and not attached):
 [1] base64enc_0.1-2         BatchJobs_1.4           BBmisc_1.7              BiocParallel_1.0.0      biomaRt_2.22.0         
 [6] Biostrings_2.34.0       bitops_1.0-6            brew_1.0-6              checkmate_1.5.0         chron_2.3-45           
[11] codetools_0.2-8         digest_0.6.4            fail_1.2                foreach_1.4.2           GenomicAlignments_1.2.0
[16] graph_1.44.0            iterators_1.0.7         plyr_1.8.1              RBGL_1.42.0             Rcpp_0.11.3            
[21] RCurl_1.95-4.3          reshape2_1.4            Rsamtools_1.18.1        rtracklayer_1.26.1      sendmailR_1.2-1        
[26] stringr_0.6.2           tools_3.1.1             XML_3.98-1.1            XVector_0.6.0           zlibbioc_1.12.0
annotation annotationdbi homo.sapiens
@james-w-macdonald-5106

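I think UCSC is behind the times for that gene. If you search the NCBI Gene database for the Entrez Gene ID you show, you get a message saying it has been replaced with Entrez Gene ID 6218. That would explain the discrepancy: the TxDb package still carries the old ID from UCSC, while the ENTREZID lookup that Homo.sapiens performs (per your error message) presumably only knows the current one.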

http://www.ncbi.nlm.nih.gov/gene/100505503
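
If you run into this often, one possible workaround is to swap retired IDs for their replacements before querying. The following is only a sketch under assumptions: `retired_ids` and `replace_retired` are hypothetical names I made up, the table has to be maintained by hand, and its single entry is the replacement NCBI reports for the ID above. The second ID passed in is arbitrary and only shows that unlisted IDs pass through unchanged. The lookup at the end uses org.Hs.eg.db directly and only runs if that package is installed.

```r
# Hypothetical, hand-maintained table: retired Entrez Gene ID -> current ID.
# The single entry is the replacement NCBI reports for the ID in this thread.
retired_ids <- c("100505503" = "6218")

# Hypothetical helper: replace retired IDs, leave all other IDs untouched.
replace_retired <- function(ids, retired = retired_ids) {
  ids <- as.character(ids)
  hit <- ids %in% names(retired)
  ids[hit] <- unname(retired[ids[hit]])
  ids
}

current <- replace_retired(c("100505503", "1"))

# Only query the annotation package if it is available.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  db <- org.Hs.eg.db::org.Hs.eg.db
  # keys() lists the IDs that select() accepts for this keytype,
  # so filtering first avoids the "None of the keys entered are valid" error.
  valid <- current[current %in% AnnotationDbi::keys(db, keytype = "ENTREZID")]
  if (length(valid) > 0) {
    print(AnnotationDbi::select(db, keys = valid,
                                columns = "SYMBOL", keytype = "ENTREZID"))
  }
}
```

Any ID that is neither in the table nor a valid ENTREZID key is silently dropped by the filter, so you may want to report those separately.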


Thank you James. Makes sense now.

Best,

Moiz
