Hi, I am working with some HiChIP data, starting from HiC-Pro outputs. I am running into an error when running HiCDCPlus(gi_list) - see below. I have tried loading the HiC data via the allValidPairs file as well as using the matrix + bed file combination.
Thanks in advance for any help!
# include your problematic code here with any corresponding output
> gi_list_hicpro <- add_hicpro_matrix_counts(
+ gi_list,
+ paste0(hic_mat_dir,"DMSO_rep1_chr_500000_abs.bed"),
+ paste0(hic_mat_dir,"DMSO_rep1_500000.matrix"),
+ add_inter = FALSE)
> gi_list_expand1D<-expand_1D_features(gi_list_hicpro)
> set.seed(1010) #HiC-DC downsamples rows for modeling
> gi_list_hicdcplus<-HiCDCPlus(gi_list_expand1D) #HiCDCPlus_parallel runs in parallel across ncores
Chromosome chr1 complete.
Chromosome chr10 complete.
Error in splineDesign(Aknots, x, ord) :
length of 'derivs' is larger than length of 'x'
> gi_list_hicdcplus<-HiCDCPlus(gi_list_expand1D, distance_type="log") #HiCDCPlus_parallel runs in parallel across ncores
Chromosome chr1 complete.
Chromosome chr10 complete.
Error in glm.fit(x = numeric(0), y = numeric(0), weights = NULL, start = NULL, :
object 'fit' not found
# please also include the results of running the following in an R session
> sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /n/app/openblas/0.2.19/lib/libopenblas_core2p-r0.2.19.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] BSgenome.Hsapiens.UCSC.hg19_1.4.3 BSgenome_1.58.0
[3] rtracklayer_1.50.0 Biostrings_2.58.0
[5] XVector_0.30.0 GenomicRanges_1.42.0
[7] GenomeInfoDb_1.26.7 IRanges_2.24.1
[9] S4Vectors_0.28.1 BiocGenerics_0.36.1
[11] HiCDCPlus_0.99.14
loaded via a namespace (and not attached):
[1] ProtGenerics_1.22.0 bitops_1.0-7
[3] matrixStats_0.60.0 bit64_4.0.5
[5] RColorBrewer_1.1-2 progress_1.2.2
[7] httr_1.4.2 InteractionSet_1.18.1
[9] backports_1.2.1 tools_4.0.1
[11] utf8_1.2.2 R6_2.5.0
[13] rpart_4.1-15 lazyeval_0.2.2
[15] Hmisc_4.5-0 DBI_1.1.1
[17] Gviz_1.34.1 colorspace_2.0-2
[19] nnet_7.3-16 gridExtra_2.3
[21] tidyselect_1.1.1 prettyunits_1.1.1
[23] bit_4.0.4 curl_4.3.2
[25] compiler_4.0.1 Biobase_2.50.0
[27] htmlTable_2.2.1 xml2_1.3.2
[29] DelayedArray_0.16.3 checkmate_2.0.0
[31] scales_1.1.1 askpass_1.1
[33] rappdirs_0.3.3 stringr_1.4.0
[35] digest_0.6.27 Rsamtools_2.6.0
[37] foreign_0.8-80 R.utils_2.10.1
[39] dichromat_2.0-0 htmltools_0.5.1.1
[41] base64enc_0.1-3 jpeg_0.1-9
[43] pkgconfig_2.0.3 MatrixGenerics_1.2.1
[45] ensembldb_2.14.1 dbplyr_2.1.1
[47] fastmap_1.1.0 htmlwidgets_1.5.3
[49] rlang_0.4.11 rstudioapi_0.13
[51] RSQLite_2.2.7 generics_0.1.0
[53] BiocParallel_1.24.1 dplyr_1.0.7
[55] R.oo_1.24.0 VariantAnnotation_1.34.0
[57] RCurl_1.98-1.3 magrittr_2.0.1
[59] GenomeInfoDbData_1.2.4 Formula_1.2-4
[61] Matrix_1.3-4 Rcpp_1.0.7
[63] munsell_0.5.0 fansi_0.5.0
[65] lifecycle_1.0.0 R.methodsS3_1.8.1
[67] stringi_1.6.2 MASS_7.3-54
[69] SummarizedExperiment_1.20.0 zlibbioc_1.36.0
[71] BiocFileCache_1.14.0 grid_4.0.1
[73] blob_1.2.2 crayon_1.4.1
[75] lattice_0.20-41 splines_4.0.1
[77] GenomicFeatures_1.42.3 hms_1.1.0
[79] knitr_1.33 pillar_1.6.2
[81] igraph_1.2.6 biomaRt_2.46.3
[83] XML_3.99-0.6 glue_1.4.2
[85] biovizBase_1.38.0 latticeExtra_0.6-29
[87] data.table_1.14.0 vctrs_0.3.8
[89] png_0.1-7 gtable_0.3.0
[91] openssl_1.4.4 purrr_0.3.4
[93] tidyr_1.1.3 assertthat_0.2.1
[95] cachem_1.0.5 ggplot2_3.3.5
[97] xfun_0.24 AnnotationFilter_1.14.0
[99] survival_3.2-11 tibble_3.1.2
[101] GenomicAlignments_1.26.0 AnnotationDbi_1.52.0
[103] memoise_2.0.0 cluster_2.1.2
[105] ellipsis_0.3.2 GenomicInteractions_1.24.0
I think the df and splineknotting options are yielding distance spline points that are outside the distance range available in the data. I would recommend trying a lower df and/or splineknotting="count-based".
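A rough sketch of how that retry could look, along with a quick per-chromosome check. The second error in the question shows glm.fit receiving x = numeric(0), which suggests that the chromosome after chr10 may have had no rows left to fit, so it seems worth counting rows first. Assumptions here: summarise_rows is a hypothetical helper and not part of HiCDCPlus; the metadata columns are assumed to be named counts and D; Dmax = 2e6 is what I believe the default is; and df = 4 is just an arbitrary lower value to try.

```r
# Hypothetical helper (not part of HiCDCPlus): for one chromosome, count the
# bin pairs that fall inside the modelled distance range, and how many of
# those carry at least one read.
summarise_rows <- function(counts, D, Dmin = 0, Dmax = 2e6) {
  in_range <- D >= Dmin & D <= Dmax
  c(in_range = sum(in_range), nonzero = sum(in_range & counts > 0))
}

# Toy check with made-up numbers (not real data)
toy <- summarise_rows(counts = c(0, 3, 1, 0), D = c(0, 5e5, 1e6, 3e6))
print(toy)

# Applied to the object from the question, if it exists in the session.
# Assumes mcols named 'counts' and 'D' on each chromosome's GInteractions.
if (exists("gi_list_expand1D")) {
  per_chr <- t(sapply(gi_list_expand1D, function(gi) {
    m <- S4Vectors::mcols(gi)
    summarise_rows(m$counts, m$D)
  }))
  print(per_chr)  # chromosomes with very few or zero rows are suspects

  # Retry with fewer spline degrees of freedom and count-based knots
  set.seed(1010)
  gi_list_hicdcplus <- HiCDCPlus::HiCDCPlus(
    gi_list_expand1D,
    df = 4,                          # arbitrary lower value, adjust as needed
    splineknotting = "count-based"
  )
}
```

If some chromosomes show zero or only a handful of rows, restricting the run to the well-covered chromosomes through the chrs argument may be another thing to try.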