I have been attempting to run the dmrcate function from the package DMRcate, following the vignette in the Miss Methyl documentation for identifying regions of differential methylation.
It runs through to the Y choromosome, then falls over with and error message stating that it needs at least two non-NA values to interpolate. Which sounds reasonable. I don't know where/how to track down the source of this error though.
> DMRs <- dmrcate(myAnnotation, lambda=100, C=2, min.cpgs=2)
Fitting chr1...
Fitting chr2...
Fitting chr3...
Fitting chr4...
Fitting chr5...
Fitting chr6...
Fitting chr7...
Fitting chr8...
Fitting chr9...
Fitting chr10...
Fitting chr11...
Fitting chr12...
Fitting chr13...
Fitting chr14...
Fitting chr15...
Fitting chr16...
Fitting chr17...
Fitting chr18...
Fitting chr19...
Fitting chr20...
Fitting chr21...
Fitting chr22...
Fitting chrX...
Fitting chrY...
Error in approx(x = x, y = i, xout = xout, rule = 2) :
need at least two non-NA values to interpolater
I'm storing MValues from EPIC arrays in an anndata object. It reads it in fine, and I subset out samples with AML and TALL. I transpose it. Using the same design and data set, I can run limma successfully. When I run DMRcate, I get the above error. The anndataobject holding the MValues has been filtered, I've removed cross hybridizing probes, probes associated with common snps, probes with any NA, and probes associated with the x/Y/M chromosomes.
I have used the following code to get to this point.
library("reticulate")
library("missMethyl")
library("anndata")
library("limma")
library("DMRcate")
ad <- read_h5ad("/data/projects/classifiers/data/methylation/methylationHG38MValues.h5ad")
ad = ad[ad$obs$diagnosis == "AML" | ad$obs$diagnosis =="TALL"]
data = t(ad$X)
diagnosisLevels = unique(ad$obs$diagnosis)
group <- factor(ad$obs$diagnosis,levels=diagnosisLevels)
design <- model.matrix(~group)
myAnnotation <- cpg.annotate(object = data, datatype = "array", what = "M",
arraytype = c("EPIC"),
analysis.type = "differential", design = design,
coef = 2)
DMRs <- dmrcate(myAnnotation, lambda=100, C=2)
I thought it might have been the filtering of the X/Y/M probes, but dmrcate gets past the X chromosome fine, so I am assuming it's not that. Not sure what I should be looking at to identify the problem, any suggestions would be appreciated.
session info is as follows:
sessionInfo( )
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3; LAPACK version 3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Australia/Sydney
tzcode source: system (glibc)
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] DMRcate_2.14.1
[2] limma_3.56.2
[3] anndata_0.7.5.6
[4] missMethyl_1.34.0
[5] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0
[6] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1
[7] minfi_1.46.0
[8] bumphunter_1.42.0
[9] locfit_1.5-9.8
[10] iterators_1.0.14
[11] foreach_1.5.2
[12] Biostrings_2.68.1
[13] XVector_0.40.0
[14] SummarizedExperiment_1.30.2
[15] Biobase_2.60.0
[16] MatrixGenerics_1.12.3
[17] matrixStats_1.0.0
[18] GenomicRanges_1.52.0
[19] GenomeInfoDb_1.36.1
[20] IRanges_2.34.1
[21] S4Vectors_0.38.1
[22] BiocGenerics_0.46.0
[23] reticulate_1.30
loaded via a namespace (and not attached):
[1] later_1.3.1 splines_4.3.1
[3] BiocIO_1.10.0 bitops_1.0-7
[5] filelock_1.0.2 R.oo_1.25.0
[7] tibble_3.2.1 preprocessCore_1.62.1
[9] XML_3.99-0.14 rpart_4.1.19
[11] lifecycle_1.0.3 edgeR_3.42.4
[13] rprojroot_2.0.3 ensembldb_2.24.0
[15] lattice_0.21-8 MASS_7.3-60
[17] base64_2.0.1 scrime_1.3.5
[19] backports_1.4.1 magrittr_2.0.3
[21] rmarkdown_2.23 Hmisc_5.1-0
[23] yaml_2.3.7 httpuv_1.6.11
[25] doRNG_1.8.6 askpass_1.1
[27] Gviz_1.44.0 DBI_1.1.3
[29] RColorBrewer_1.1-3 abind_1.4-5
[31] zlibbioc_1.46.0 quadprog_1.5-8
[33] R.utils_2.12.2 purrr_1.0.1
[35] AnnotationFilter_1.24.0 biovizBase_1.48.0
[37] RCurl_1.98-1.12 nnet_7.3-19
[39] VariantAnnotation_1.46.0 rappdirs_0.3.3
[41] GenomeInfoDbData_1.2.10 genefilter_1.82.1
[43] annotate_1.78.0 permute_0.9-7
[45] DelayedMatrixStats_1.22.1 codetools_0.2-19
[47] DelayedArray_0.26.7 xml2_1.3.5
[49] tidyselect_1.2.0 beanplot_1.3.1
[51] BiocFileCache_2.8.0 base64enc_0.1-3
[53] illuminaio_0.42.0 GenomicAlignments_1.36.0
[55] jsonlite_1.8.7 multtest_2.56.0
[57] ellipsis_0.3.2 Formula_1.2-5
[59] survival_3.5-5 tools_4.3.1
[61] progress_1.2.2 Rcpp_1.0.11
[63] glue_1.6.2 gridExtra_2.3
[65] xfun_0.40 here_1.0.1
[67] dplyr_1.1.2 HDF5Array_1.28.1
[69] withr_2.5.0 BiocManager_1.30.22
[71] fastmap_1.1.1 latticeExtra_0.6-30
[73] rhdf5filters_1.12.1 fansi_1.0.4
[75] openssl_2.1.0 digest_0.6.33
[77] mime_0.12 R6_2.5.1
[79] colorspace_2.1-0 gtools_3.9.4
[81] jpeg_0.1-10 dichromat_2.0-0.1
[83] biomaRt_2.56.1 RSQLite_2.3.1
[85] R.methodsS3_1.8.2 utf8_1.2.3
[87] tidyr_1.3.0 generics_0.1.3
[89] data.table_1.14.8 rtracklayer_1.60.0
[91] prettyunits_1.1.1 httr_1.4.6
[93] htmlwidgets_1.6.2 S4Arrays_1.0.5
[95] pkgconfig_2.0.3 gtable_0.3.3
[97] blob_1.2.4 siggenes_1.74.0
[99] htmltools_0.5.5 ProtGenerics_1.32.0
[101] scales_1.2.1 png_0.1-8
[103] rstudioapi_0.15.0 knitr_1.43
[105] tzdb_0.4.0 rjson_0.2.21
[107] checkmate_2.2.0 nlme_3.1-162
[109] curl_5.0.1 org.Hs.eg.db_3.17.0
[111] cachem_1.0.8 rhdf5_2.44.0
[113] stringr_1.5.0 BiocVersion_3.17.1
[115] foreign_0.8-82 AnnotationDbi_1.62.2
[117] restfulr_0.0.15 GEOquery_2.68.0
[119] pillar_1.9.0 grid_4.3.1
[121] reshape_0.8.9 vctrs_0.6.3
[123] promises_1.2.0.1 dbplyr_2.3.3
[125] xtable_1.8-4 cluster_2.1.4
[127] htmlTable_2.4.1 bsseq_1.36.0
[129] evaluate_0.21 readr_2.1.4
[131] GenomicFeatures_1.52.1 cli_3.6.1
[133] compiler_4.3.1 Rsamtools_2.16.0
[135] rlang_1.1.1 crayon_1.5.2
[137] rngtools_1.5.2 nor1mix_1.3-0
[139] interp_1.1-4 mclust_6.0.0
[141] plyr_1.8.8 stringi_1.7.12
[143] deldir_1.0-9 BiocParallel_1.34.2
[145] assertthat_0.2.1 munsell_0.5.0
[147] lazyeval_0.2.2 DSS_2.48.0
[149] Matrix_1.6-0 ExperimentHub_2.8.1
[151] BSgenome_1.68.0 hms_1.1.3
[153] sparseMatrixStats_1.12.2 bit64_4.0.5
[155] ggplot2_3.4.2 Rhdf5lib_1.22.0
[157] shiny_1.7.4.1 KEGGREST_1.40.0
[159] statmod_1.5.0 interactiveDisplayBase_1.38.0
[161] AnnotationHub_3.8.0 memoise_2.0.1
[163] bit_4.0.5
```