dmrcate needs at least two non-NA values to interpolate
1
0
Entering edit mode
b.curran • 0
@bcurran-6988
Last seen 15 months ago
New Zealand

I have been attempting to run the dmrcate function from the package DMRcate, following the vignette in the Miss Methyl documentation for identifying regions of differential methylation.

It runs through to the Y choromosome, then falls over with and error message stating that it needs at least two non-NA values to interpolate. Which sounds reasonable. I don't know where/how to track down the source of this error though.


> DMRs <- dmrcate(myAnnotation, lambda=100, C=2, min.cpgs=2)
Fitting chr1...
Fitting chr2...
Fitting chr3...
Fitting chr4...
Fitting chr5...
Fitting chr6...
Fitting chr7...
Fitting chr8...
Fitting chr9...
Fitting chr10...
Fitting chr11...
Fitting chr12...
Fitting chr13...
Fitting chr14...
Fitting chr15...
Fitting chr16...
Fitting chr17...
Fitting chr18...
Fitting chr19...
Fitting chr20...
Fitting chr21...
Fitting chr22...
Fitting chrX...
Fitting chrY...
Error in approx(x = x, y = i, xout = xout, rule = 2) : 
  need at least two non-NA values to interpolater

I'm storing MValues from EPIC arrays in an anndata object. It reads it in fine, and I subset out samples with AML and TALL. I transpose it. Using the same design and data set, I can run limma successfully. When I run DMRcate, I get the above error. The anndataobject holding the MValues has been filtered, I've removed cross hybridizing probes, probes associated with common snps, probes with any NA, and probes associated with the x/Y/M chromosomes.

I have used the following code to get to this point.

library("reticulate")
library("missMethyl")
library("anndata")
library("limma")
library("DMRcate")


ad <- read_h5ad("/data/projects/classifiers/data/methylation/methylationHG38MValues.h5ad")
ad = ad[ad$obs$diagnosis == "AML" | ad$obs$diagnosis =="TALL"]
data = t(ad$X)

diagnosisLevels = unique(ad$obs$diagnosis)
group <- factor(ad$obs$diagnosis,levels=diagnosisLevels)
design <- model.matrix(~group)


myAnnotation <- cpg.annotate(object = data, datatype = "array", what = "M", 
                             arraytype = c("EPIC"), 
                             analysis.type = "differential", design = design, 
                             coef = 2)

DMRs <- dmrcate(myAnnotation, lambda=100, C=2)

I thought it might have been the filtering of the X/Y/M probes, but dmrcate gets past the X chromosome fine, so I am assuming it's not that. Not sure what I should be looking at to identify the problem, any suggestions would be appreciated.

session info is as follows:

sessionInfo( )
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3;  LAPACK version 3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Australia/Sydney
tzcode source: system (glibc)

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] DMRcate_2.14.1                                     
 [2] limma_3.56.2                                       
 [3] anndata_0.7.5.6                                    
 [4] missMethyl_1.34.0                                  
 [5] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0
 [6] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1 
 [7] minfi_1.46.0                                       
 [8] bumphunter_1.42.0                                  
 [9] locfit_1.5-9.8                                     
[10] iterators_1.0.14                                   
[11] foreach_1.5.2                                      
[12] Biostrings_2.68.1                                  
[13] XVector_0.40.0                                     
[14] SummarizedExperiment_1.30.2                        
[15] Biobase_2.60.0                                     
[16] MatrixGenerics_1.12.3                              
[17] matrixStats_1.0.0                                  
[18] GenomicRanges_1.52.0                               
[19] GenomeInfoDb_1.36.1                                
[20] IRanges_2.34.1                                     
[21] S4Vectors_0.38.1                                   
[22] BiocGenerics_0.46.0                                
[23] reticulate_1.30                                    

loaded via a namespace (and not attached):
  [1] later_1.3.1                   splines_4.3.1                
  [3] BiocIO_1.10.0                 bitops_1.0-7                 
  [5] filelock_1.0.2                R.oo_1.25.0                  
  [7] tibble_3.2.1                  preprocessCore_1.62.1        
  [9] XML_3.99-0.14                 rpart_4.1.19                 
 [11] lifecycle_1.0.3               edgeR_3.42.4                 
 [13] rprojroot_2.0.3               ensembldb_2.24.0             
 [15] lattice_0.21-8                MASS_7.3-60                  
 [17] base64_2.0.1                  scrime_1.3.5                 
 [19] backports_1.4.1               magrittr_2.0.3               
 [21] rmarkdown_2.23                Hmisc_5.1-0                  
 [23] yaml_2.3.7                    httpuv_1.6.11                
 [25] doRNG_1.8.6                   askpass_1.1                  
 [27] Gviz_1.44.0                   DBI_1.1.3                    
 [29] RColorBrewer_1.1-3            abind_1.4-5                  
 [31] zlibbioc_1.46.0               quadprog_1.5-8               
 [33] R.utils_2.12.2                purrr_1.0.1                  
 [35] AnnotationFilter_1.24.0       biovizBase_1.48.0            
 [37] RCurl_1.98-1.12               nnet_7.3-19                  
 [39] VariantAnnotation_1.46.0      rappdirs_0.3.3               
 [41] GenomeInfoDbData_1.2.10       genefilter_1.82.1            
 [43] annotate_1.78.0               permute_0.9-7                
 [45] DelayedMatrixStats_1.22.1     codetools_0.2-19             
 [47] DelayedArray_0.26.7           xml2_1.3.5                   
 [49] tidyselect_1.2.0              beanplot_1.3.1               
 [51] BiocFileCache_2.8.0           base64enc_0.1-3              
 [53] illuminaio_0.42.0             GenomicAlignments_1.36.0     
 [55] jsonlite_1.8.7                multtest_2.56.0              
 [57] ellipsis_0.3.2                Formula_1.2-5                
 [59] survival_3.5-5                tools_4.3.1                  
 [61] progress_1.2.2                Rcpp_1.0.11                  
 [63] glue_1.6.2                    gridExtra_2.3                
 [65] xfun_0.40                     here_1.0.1                   
 [67] dplyr_1.1.2                   HDF5Array_1.28.1             
 [69] withr_2.5.0                   BiocManager_1.30.22          
 [71] fastmap_1.1.1                 latticeExtra_0.6-30          
 [73] rhdf5filters_1.12.1           fansi_1.0.4                  
 [75] openssl_2.1.0                 digest_0.6.33                
 [77] mime_0.12                     R6_2.5.1                     
 [79] colorspace_2.1-0              gtools_3.9.4                 
 [81] jpeg_0.1-10                   dichromat_2.0-0.1            
 [83] biomaRt_2.56.1                RSQLite_2.3.1                
 [85] R.methodsS3_1.8.2             utf8_1.2.3                   
 [87] tidyr_1.3.0                   generics_0.1.3               
 [89] data.table_1.14.8             rtracklayer_1.60.0           
 [91] prettyunits_1.1.1             httr_1.4.6                   
 [93] htmlwidgets_1.6.2             S4Arrays_1.0.5               
 [95] pkgconfig_2.0.3               gtable_0.3.3                 
 [97] blob_1.2.4                    siggenes_1.74.0              
 [99] htmltools_0.5.5               ProtGenerics_1.32.0          
[101] scales_1.2.1                  png_0.1-8                    
[103] rstudioapi_0.15.0             knitr_1.43                   
[105] tzdb_0.4.0                    rjson_0.2.21                 
[107] checkmate_2.2.0               nlme_3.1-162                 
[109] curl_5.0.1                    org.Hs.eg.db_3.17.0          
[111] cachem_1.0.8                  rhdf5_2.44.0                 
[113] stringr_1.5.0                 BiocVersion_3.17.1           
[115] foreign_0.8-82                AnnotationDbi_1.62.2         
[117] restfulr_0.0.15               GEOquery_2.68.0              
[119] pillar_1.9.0                  grid_4.3.1                   
[121] reshape_0.8.9                 vctrs_0.6.3                  
[123] promises_1.2.0.1              dbplyr_2.3.3                 
[125] xtable_1.8-4                  cluster_2.1.4                
[127] htmlTable_2.4.1               bsseq_1.36.0                 
[129] evaluate_0.21                 readr_2.1.4                  
[131] GenomicFeatures_1.52.1        cli_3.6.1                    
[133] compiler_4.3.1                Rsamtools_2.16.0             
[135] rlang_1.1.1                   crayon_1.5.2                 
[137] rngtools_1.5.2                nor1mix_1.3-0                
[139] interp_1.1-4                  mclust_6.0.0                 
[141] plyr_1.8.8                    stringi_1.7.12               
[143] deldir_1.0-9                  BiocParallel_1.34.2          
[145] assertthat_0.2.1              munsell_0.5.0                
[147] lazyeval_0.2.2                DSS_2.48.0                   
[149] Matrix_1.6-0                  ExperimentHub_2.8.1          
[151] BSgenome_1.68.0               hms_1.1.3                    
[153] sparseMatrixStats_1.12.2      bit64_4.0.5                  
[155] ggplot2_3.4.2                 Rhdf5lib_1.22.0              
[157] shiny_1.7.4.1                 KEGGREST_1.40.0              
[159] statmod_1.5.0                 interactiveDisplayBase_1.38.0
[161] AnnotationHub_3.8.0           memoise_2.0.1                
[163] bit_4.0.5

```

miss DMRcate missMethyl • 745 views
ADD COMMENT
0
Entering edit mode
Tim Peters ▴ 200
@tim-peters-7579
Last seen 3 months ago
Australia

Hello,

If dmrcate() is telling you "Fitting chrX..." and "Fitting chrY..." then somehow these probes have been retained in your data object.

Try the following:

data.noXY <- rmSNPandCH(data, rmXY=TRUE)
myAnnotation <- cpg.annotate(object = data.noXY, datatype = "array", what = "M", 
                             arraytype = c("EPIC"), 
                             analysis.type = "differential", design = design, 
                             coef = 2)

DMRs <- dmrcate(myAnnotation, lambda=100, C=2)

Cheers, Tim

ADD COMMENT

Login before adding your answer.

Traffic: 441 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6