GenomicFeatures::makeTxDbFromUCSC failing with an error: identical(current_classes, .UCSC_TXCOL2CLASS) is not TRUE
@mikhail-dozmorov-23744

Hi,
The GenomicFeatures::makeTxDbFromUCSC function fails with:

> library(GenomicFeatures)
> hg19.refseq.db <- makeTxDbFromUCSC(genome="hg19", table="refGene")
Download the refGene table ... Error in .fetch_UCSC_txtable(genome(session), tablename, transcript_ids = transcript_ids) : 
  identical(current_classes, .UCSC_TXCOL2CLASS) is not TRUE
OK

The same error breaks the package build on all platforms, e.g. here. It is reproducible on multiple computers. Any suggestions on how to solve it?

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] GenomicFeatures_1.46.1 AnnotationDbi_1.56.2   Biobase_2.54.0         GenomicRanges_1.46.1  
[5] GenomeInfoDb_1.30.0    IRanges_2.28.0         S4Vectors_0.32.3       BiocGenerics_0.40.0   

loaded via a namespace (and not attached):
 [1] MatrixGenerics_1.6.0        httr_1.4.2                  bit64_4.0.5                
 [4] assertthat_0.2.1            BiocFileCache_2.2.0         blob_1.2.2                 
 [7] GenomeInfoDbData_1.2.7      Rsamtools_2.10.0            yaml_2.2.1                 
[10] progress_1.2.2              pillar_1.6.4                RSQLite_2.2.8              
[13] lattice_0.20-45             glue_1.5.0                  digest_0.6.28              
[16] XVector_0.34.0              htmltools_0.5.2             Matrix_1.3-4               
[19] XML_3.99-0.8                pkgconfig_2.0.3             biomaRt_2.50.1             
[22] zlibbioc_1.40.0             purrr_0.3.4                 BiocParallel_1.28.2        
[25] tibble_3.1.6                KEGGREST_1.34.0             generics_0.1.1             
[28] ellipsis_0.3.2              cachem_1.0.6                SummarizedExperiment_1.24.0
[31] magrittr_2.0.1              crayon_1.4.2                memoise_2.0.1              
[34] evaluate_0.14               fansi_0.5.0                 xml2_1.3.2                 
[37] tools_4.1.0                 RMariaDB_1.2.1              prettyunits_1.1.1          
[40] hms_1.1.1                   BiocIO_1.4.0                lifecycle_1.0.1            
[43] matrixStats_0.61.0          stringr_1.4.0               DelayedArray_0.20.0        
[46] Biostrings_2.62.0           compiler_4.1.0              rlang_0.4.12               
[49] grid_4.1.0                  RCurl_1.98-1.5              rstudioapi_0.13            
[52] rjson_0.2.20                rappdirs_0.3.3              bitops_1.0-7               
[55] rmarkdown_2.11              restfulr_0.0.13             DBI_1.1.1                  
[58] curl_4.3.2                  R6_2.5.1                    GenomicAlignments_1.30.0   
[61] lubridate_1.8.0             knitr_1.36                  dplyr_1.0.7                
[64] rtracklayer_1.54.0          fastmap_1.1.0               bit_4.0.4                  
[67] utf8_1.2.2                  filelock_1.0.2              stringi_1.7.5              
[70] parallel_4.1.0              Rcpp_1.0.7                  vctrs_0.3.8                
[73] png_0.1-7                   dbplyr_2.1.1                tidyselect_1.1.1           
[76] xfun_0.28

Thanks,
Mikhail

@james-w-macdonald-5106

Hi Mikhail,

Thanks for pointing that out. It's due to an internal change in RMariaDB and/or DBI that returns a blob instead of a list, so a check was failing. I've fixed it now:

> z <- makeTxDbFromUCSC("hg19","refGene")
Download the refGene table ... OK
Download the hgFixed.refLink table ... OK
Extract the 'transcripts' data frame ... OK
Extract the 'splicings' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .extract_cds_locs_from_UCSC_txtable(ucsc_txtable) :
  UCSC data anomaly in 455 transcript(s): the cds cumulative length is
  not a multiple of 3 for transcripts ‘NM_012133’ ‘NM_001289413’
  ‘NM_001322468’ ‘NM_001290061’ ‘NM_152503’ ‘NM_001005914’ ‘NM_012324’
  ‘NM_012314’ ‘NM_001330736’ ‘NM_000594’ ‘NM_001291317’ ‘NM_001291316’
  ‘NM_016098’ ‘NM_001184961’ ‘NM_138324’ ‘NM_138323’ ‘NM_019105’
  ‘NM_001322371’ ‘NM_001291784’ ‘NM_001318484’ ‘NM_001290052’
  ‘NM_001288952’ ‘NM_032470’ ‘NM_001037675’ ‘NM_001277816’
  ‘NM_001363372’ ‘NM_001363371’ ‘NM_001017915’ ‘NM_001286474’
  ‘NM_001177519’ ‘NM_001010909’ ‘NM_001135649’ ‘NM_019105’ ‘NM_025256’
  ‘NM_001318045’ ‘NM_001374269’ ‘NM_001309242’ ‘NM_032454’ ‘NM_032454’
  ‘NM_032454’ ‘NM_023035’ ‘NM_001010855’ ‘NM_001077350’ ‘NM_138319’
  ‘NM_001013742’ ‘NM_001348230’ ‘NM_001291986’ ‘NM_001359231’
  ‘NM_0013 [... truncated]
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /share/apps/MKL/mkl-2019.3/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] GenomicFeatures_1.46.3 AnnotationDbi_1.56.2   Biobase_2.54.0        
[4] GenomicRanges_1.46.1   GenomeInfoDb_1.30.0    IRanges_2.28.0        
[7] S4Vectors_0.32.3       BiocGenerics_0.40.0   

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.7                  lubridate_1.8.0            
 [3] lattice_0.20-45             prettyunits_1.1.1          
 [5] png_0.1-7                   Rsamtools_2.10.0           
 [7] Biostrings_2.62.0           assertthat_0.2.1           
 [9] digest_0.6.29               utf8_1.2.2                 
[11] BiocFileCache_2.2.0         R6_2.5.1                   
[13] RSQLite_2.2.9               httr_1.4.2                 
[15] pillar_1.6.4                zlibbioc_1.40.0            
[17] rlang_0.4.12                progress_1.2.2             
[19] curl_4.3.2                  blob_1.2.2                 
[21] Matrix_1.3-4                BiocParallel_1.28.3        
[23] stringr_1.4.0               RCurl_1.98-1.5             
[25] bit_4.0.4                   biomaRt_2.50.1             
[27] DelayedArray_0.20.0         RMariaDB_1.2.1             
[29] compiler_4.1.2              rtracklayer_1.54.0         
[31] pkgconfig_2.0.3             tidyselect_1.1.1           
[33] KEGGREST_1.34.0             SummarizedExperiment_1.24.0
[35] tibble_3.1.6                GenomeInfoDbData_1.2.7     
[37] matrixStats_0.61.0          XML_3.99-0.8               
[39] fansi_0.5.0                 crayon_1.4.2               
[41] dplyr_1.0.7                 dbplyr_2.1.1               
[43] GenomicAlignments_1.30.0    bitops_1.0-7               
[45] rappdirs_0.3.3              grid_4.1.2                 
[47] lifecycle_1.0.1             DBI_1.1.2                  
[49] magrittr_2.0.1              stringi_1.7.6              
[51] cachem_1.0.6                XVector_0.34.0             
[53] xml2_1.3.3                  ellipsis_0.3.2             
[55] filelock_1.0.2              generics_0.1.1             
[57] vctrs_0.3.8                 rjson_0.2.20               
[59] restfulr_0.0.13             tools_4.1.2                
[61] bit64_4.0.5                 glue_1.6.0                 
[63] purrr_0.3.4                 MatrixGenerics_1.6.0       
[65] hms_1.1.1                   parallel_4.1.2             
[67] fastmap_1.1.0               yaml_2.2.1                 
[69] memoise_2.0.1               BiocIO_1.4.0               
>

The fix is in both release and devel, and will propagate within the next day or so.
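
For anyone wondering why a changed return type trips a check like identical(current_classes, .UCSC_TXCOL2CLASS), here is a minimal base-R sketch. It is not the package's actual code: expected_classes, the column names, the made-up transcript ID and the helper column_classes() are all hypothetical stand-ins, and the "blob" class is simulated with a class attribute so that the blob package is not needed.

```r
# Hypothetical expected column classes (a stand-in for the package's
# internal .UCSC_TXCOL2CLASS, whose real contents are not shown here).
expected_classes <- c(name = "character", txStart = "integer", exonStarts = "list")

# Columns as an older database driver might have returned them
# (made-up values, for illustration only).
old_cols <- list(
  name       = "NM_000001",
  txStart    = 100L,
  exonStarts = list(charToRaw("100,200,"))
)

# Same data, but the binary column now carries the class "blob".
# Simulated with a class attribute; the real blob class vector is longer.
new_cols <- old_cols
class(new_cols$exonStarts) <- "blob"

# First class of every column, as a named character vector.
column_classes <- function(cols) {
  vapply(cols, function(x) class(x)[1], character(1))
}

# A strict comparison passes for the old layout and fails for the new one.
strict_ok_old <- identical(column_classes(old_cols), expected_classes)
strict_ok_new <- identical(column_classes(new_cols), expected_classes)

# A more tolerant check accepts either representation of the binary column.
tolerant_ok_new <- is.list(new_cols$exonStarts) ||
  inherits(new_cols$exonStarts, "blob")

print(c(strict_ok_old = strict_ok_old,
        strict_ok_new = strict_ok_new,
        tolerant_ok_new = tolerant_ok_new))
```

The point is only that identical() compares the whole class vector exactly, so a single column changing class is enough to make stopifnot() fail even though the data are otherwise usable.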

Hello, I see that you say this error has been fixed. I'm still getting the error. I'm running version 1.44.2, which I see is lower than your 1.46.3 reported here; however, when I run BiocManager::install('GenomicFeatures') it informs me that my version is the same as the current version. Is there a way for me to update to 1.46.3?

Thanks

ETA: I looked at the Bioconductor page and saw that the current version of GenomicFeatures is 1.48. I tried upgrading my version of BiocManager before installing but got the same message. Here is my session info:

R version 4.1.3 (2022-03-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils    
[7] datasets  methods   base     

other attached packages:
[1] GenomicFeatures_1.44.2 AnnotationDbi_1.54.1  
[3] Biobase_2.52.0         GenomicRanges_1.44.0  
[5] GenomeInfoDb_1.28.0    IRanges_2.26.0        
[7] S4Vectors_0.30.0       BiocGenerics_0.38.0   

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6                  lattice_0.20-45            
 [3] prettyunits_1.1.1           png_0.1-7                  
 [5] Rsamtools_2.8.0             Biostrings_2.60.1          
 [7] assertthat_0.2.1            digest_0.6.27              
 [9] utf8_1.2.1                  BiocFileCache_2.0.0        
[11] R6_2.5.0                    RSQLite_2.2.7              
[13] httr_1.4.2                  pillar_1.6.1               
[15] zlibbioc_1.38.0             rlang_0.4.11               
[17] progress_1.2.2              curl_4.3.2                 
[19] rstudioapi_0.13             blob_1.2.1                 
[21] Matrix_1.4-1                BiocParallel_1.26.0        
[23] stringr_1.4.0               RCurl_1.98-1.3             
[25] bit_4.0.4                   biomaRt_2.48.3             
[27] DelayedArray_0.18.0         compiler_4.1.3             
[29] rtracklayer_1.52.1          pkgconfig_2.0.3            
[31] SummarizedExperiment_1.22.0 tidyselect_1.1.1           
[33] KEGGREST_1.32.0             tibble_3.1.2               
[35] GenomeInfoDbData_1.2.6      matrixStats_0.59.0         
[37] XML_3.99-0.6                fansi_0.5.0                
[39] crayon_1.4.1                dplyr_1.0.7                
[41] dbplyr_2.1.1                GenomicAlignments_1.28.0   
[43] bitops_1.0-7                rappdirs_0.3.3             
[45] grid_4.1.3                  lifecycle_1.0.0            
[47] DBI_1.1.1                   magrittr_2.0.1             
[49] stringi_1.6.2               cachem_1.0.5               
[51] XVector_0.32.0              xml2_1.3.2                 
[53] ellipsis_0.3.2              filelock_1.0.2             
[55] generics_0.1.0              vctrs_0.3.8                
[57] rjson_0.2.21                restfulr_0.0.13            
[59] tools_4.1.3                 bit64_4.0.5                
[61] glue_1.4.2                  purrr_0.3.4                
[63] MatrixGenerics_1.4.0        hms_1.1.0                  
[65] fastmap_1.1.0               yaml_2.2.1                 
[67] BiocManager_1.30.17         memoise_2.0.0              
[69] BiocIO_1.2.0

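You are using Bioc version 3.13, which is two releases behind the current release. That is why BiocManager::install() reports your GenomicFeatures as current: 1.44.2 is the latest version available for Bioc 3.13. You cannot move to packages from a newer Bioconductor release without updating R, as Bioconductor versions are tightly bound to the R version. You need to install R-4.2.0 and Bioc 3.15 to get the current (fixed) version of GenomicFeatures.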
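
A rough sketch of how one might check this from the R prompt. The helper predates_fix() is hypothetical, and treating 1.46.3 as the first fixed version on the 3.14 release branch is an assumption based only on the sessionInfo() shown working earlier in this thread. The BiocManager calls are left as comments because they need the package installed and network access.

```r
# Version shown working earlier in this thread; assumed here to be the
# first fixed version on that release branch.
fixed_version <- package_version("1.46.3")

# Hypothetical helper: TRUE when an installed version is older than the fix.
predates_fix <- function(installed) {
  package_version(installed) < fixed_version
}

old_is_affected <- predates_fix("1.44.2")  # version from the question
new_is_affected <- predates_fix("1.48.0")  # a Bioc 3.15 series version

print(R.version.string)                    # R version of this session
print(c(old_is_affected = old_is_affected,
        new_is_affected = new_is_affected))

# To inspect a live installation, run these manually:
#   packageVersion("GenomicFeatures")
#   BiocManager::version()   # Bioconductor release in use
#   BiocManager::valid()     # reports out-of-date or too-new packages
#
# After installing R 4.2.0, in the new R session:
#   install.packages("BiocManager")
#   BiocManager::install(version = "3.15")
#   BiocManager::install("GenomicFeatures")
```

Comparing with package_version() rather than plain strings matters, since string comparison would not order multi-digit version components correctly.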
