AnnotationHub fails when downloading a dataset
vitay.dora • 0

I am a beginner in R, currently taking the Coursera course "Bioconductor for Genomic Data Science", and I have run into a problem I cannot solve. When I try to download a dataset through the AnnotationHub package, I get the following error message:

> ahub = AnnotationHub()
> ahub = subset(ahub, species == "Homo sapiens")
> qhs = query(ahub, c("H3K4me3", "Gm12878"))                  
> gr1 = qhs[[2]]
downloading from ‘’
retrieving 1 resource
  |=======================================================| 100%

Error: failed to load resource
  name: AH27075
  title: wgEncodeUwHistoneGm12878H3k4me3StdHotspotsRep1.broadPeak.gz
  reason: scan() expected 'a real', got '%A'
In addition: Warning message:
In read.table(con, colClasses = bedClasses, as.is = TRUE, na.strings = ".",  :
  line 2 appears to contain embedded nulls

If I download the dataset manually in a browser and then import it, there are no problems. Has anyone had this problem before, or does anyone have tips on how to solve it?

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    
attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] rtracklayer_1.36.4   GenomicRanges_1.28.5 GenomeInfoDb_1.12.2  IRanges_2.10.3       S4Vectors_0.14.4     AnnotationHub_2.8.2  BiocGenerics_0.22.0 
loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12                  BiocInstaller_1.26.1          compiler_3.4.1                XVector_0.16.0                bitops_1.0-6                 
 [6] tools_3.4.1                   zlibbioc_1.22.0               digest_0.6.12                 bit_1.1-12                    lattice_0.20-35              
[11] RSQLite_2.0                   memoise_1.1.0                 tibble_1.3.4                  pkgconfig_2.0.1               rlang_0.1.2                  
[16] Matrix_1.2-11                 DelayedArray_0.2.7            shiny_1.0.5                   DBI_0.7                       curl_2.8.1                   
[21] yaml_2.1.14                   GenomeInfoDbData_0.99.0       httr_1.3.1                    Biostrings_2.44.2             grid_3.4.1                   
[26] bit64_0.9-7                   Biobase_2.36.2                R6_2.2.2                      AnnotationDbi_1.38.2          BiocParallel_1.10.1          
[31] XML_3.98-1.9                  blob_1.1.0                    matrixStats_0.52.2            GenomicAlignments_1.12.2      Rsamtools_1.28.0             
[36] htmltools_0.3.6               SummarizedExperiment_1.6.3    mime_0.5                      interactiveDisplayBase_1.14.0 xtable_1.8-2                 
[41] httpuv_1.3.5                  RCurl_1.95-4.8               
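For reference, the manual route mentioned in the question can be sketched as below. This is only an illustration, assuming the file was saved locally from the browser: the helper name `import_broad_peak` and the file path are hypothetical, and the three extra columns follow the ENCODE broadPeak layout (BED6 plus signalValue, pValue, qValue).

```r
library(rtracklayer)

# Extra columns that the ENCODE broadPeak format adds to BED6.
broad_peak_cols <- c(signalValue = "numeric",
                     pValue      = "numeric",
                     qValue      = "numeric")

# Hypothetical helper: read a locally saved broadPeak file as a GRanges object.
import_broad_peak <- function(path) {
  import(path, format = "BED", extraCols = broad_peak_cols)
}

# Usage (the path is a placeholder for wherever the browser saved the file):
# gr1 <- import_broad_peak("path/to/wgEncodeUwHistoneGm12878H3k4me3StdHotspotsRep1.broadPeak.gz")
```

This works around the failing download but does not explain it, so the package check suggested in the answers is still worth doing.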


annotationhub download_error • 1.0k views

Maybe try again?

> hub[["AH27075"]]
downloading from ''
retrieving 1 resource
  |======================================================================| 100%

Attaching package: 'Biostrings'

The following object is masked from 'package:base':


GRanges object with 62051 ranges and 5 metadata columns:
          seqnames                 ranges strand |        name     score
             <Rle>              <IRanges>  <Rle> | <character> <numeric>
      [1]     chr1       [712885, 713666]      * |        <NA>         0
      [2]     chr1       [713880, 714754]      * |        <NA>         0
      [3]     chr1       [715073, 715760]      * |        <NA>         0
      [4]     chr1       [716034, 716236]      * |        <NA>         0
      [5]     chr1       [760336, 760655]      * |        <NA>         0
      ...      ...                    ...    ... .         ...       ...
  [62047]     chrX [154562076, 154562278]      * |        <NA>         0
  [62048]     chrX [154562552, 154562881]      * |        <NA>         0
  [62049]     chrX [154563374, 154563472]      * |        <NA>         0
  [62050]     chrX [154835272, 154835496]      * |        <NA>         0
  [62051]     chrX [154840640, 154842963]      * |        <NA>         0
          signalValue    pValue    qValue
            <numeric> <numeric> <numeric>
      [1]    57.50350 112.76800        -1
      [2]    68.21070 153.83200        -1
      [3]    38.71380  56.10910        -1
      [4]     8.98999   7.23979        -1
      [5]     3.25641   1.97914        -1
      ...         ...       ...       ...
  [62047]     5.76076   4.41978        -1
  [62048]     3.67832   2.44700        -1
  [62049]     3.24540   2.06143        -1
  [62050]     2.89763   1.82720        -1
  [62051]    86.64430 170.20100        -1
  seqinfo: 93 sequences (1 circular) from hg19 genome
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)

Matrix products: default

[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.45.1                  
 [3] Biostrings_2.45.2                 XVector_0.17.0                   
 [5] rtracklayer_1.37.2                GenomicRanges_1.29.6             
 [7] GenomeInfoDb_1.13.4               IRanges_2.11.7                   
 [9] S4Vectors_0.15.5                  AnnotationHub_2.9.5              
[11] BiocGenerics_0.23.0              

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.11                  BiocInstaller_1.26.1         
 [3] compiler_3.4.1                bitops_1.0-6                 
 [5] tools_3.4.1                   zlibbioc_1.23.0              
 [7] digest_0.6.12                 bit_1.1-12                   
 [9] lattice_0.20-35               RSQLite_2.0                  
[11] memoise_1.1.0                 tibble_1.3.3                 
[13] pkgconfig_2.0.1               rlang_0.1.1                  
[15] Matrix_1.2-10                 DelayedArray_0.3.16          
[17] shiny_1.0.3                   DBI_0.7                      
[19] curl_2.7                      yaml_2.1.14                  
[21] GenomeInfoDbData_0.99.1       httr_1.2.1                   
[23] grid_3.4.1                    bit64_0.9-7                  
[25] Biobase_2.37.2                R6_2.2.2                     
[27] AnnotationDbi_1.39.1          BiocParallel_1.11.4          
[29] XML_3.98-1.9                  blob_1.1.0                   
[31] matrixStats_0.52.2            GenomicAlignments_1.13.2     
[33] Rsamtools_1.29.0              htmltools_0.3.6              
[35] SummarizedExperiment_1.7.5    mime_0.5                     
[37] interactiveDisplayBase_1.15.0 xtable_1.8-2                 
[39] httpuv_1.3.5                  RCurl_1.95-4.8               

Actually, now that I look at your package versions, you need to do



and then do what it tells you to do, which will probably be

biocLite(ask = FALSE)
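The first command in the answer above did not survive in this copy of the thread, so the following is only a guess at the kind of check intended. It assumes the BiocInstaller package that accompanied R 3.4 / Bioconductor 3.5 and its `biocValid()` function, which reports installed packages that are out of date or too new for the release in use.

```r
library(BiocInstaller)

# biocValid() stops with an error when it finds problem packages,
# so the message is captured here instead of aborting the session.
status <- tryCatch(biocValid(), error = function(e) conditionMessage(e))
print(status)

# Update whatever was reported, without prompting for each package.
biocLite(ask = FALSE)
```

On current Bioconductor releases BiocInstaller has been replaced by BiocManager, where the corresponding calls are `BiocManager::valid()` and `BiocManager::install(ask = FALSE)`.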

