Question

How to get methylation data in count data format from "Infinium MethylationEPIC BeadChips (Illumina)" idat files

Jojo • 0

@edf159b6

Last seen 3.2 years ago

Germany

I have to analyse BeadChip Methylation Data. Since I don't have replicates in my experiment, I'm thinking of using the package 'DSS' for the analysis. This package takes data in count format for each CG position: chromosome number, genomic coordinate, total number of reads, and number of reads showing methylation, like:

chr     pos     N       X
chr18   3014904 26      2
chr18   3031032 33      12
chr18   3031044 33      13
chr18   3031065 48      24

I could read the Illumina .idat files using the library 'illuminaio', which gives this result.

> library(illuminaio)
> idat <- readIDAT("205715840012_R01C01_Grn.idat")

> names(idat)
 [1] "fileSize"      "versionNumber" "nFields"       "fields"        "nSNPsRead"     "Quants"        "MidBlock"     
 [8] "RedGreen"      "Barcode"       "ChipType"      "RunInfo"       "Unknowns"     

> idat$Quants[1:5,]
        Mean  SD NBeads
1600101 8827 870     20
1600111 2972 355     16
1600115 2550 484     16
1600123 1266 221     12
1600131  180  94     19

Now I do not know how to convert this information into the 'count data' format shown above, with chr, pos, N, and X. Any help would be appreciated.

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_India.1252  LC_CTYPE=English_India.1252    LC_MONETARY=English_India.1252 LC_NUMERIC=C                  
[5] LC_TIME=English_India.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] IlluminaDataTestFiles_1.30.0 illuminaio_0.34.0            DSS_2.40.0                   bsseq_1.28.0                
 [5] SummarizedExperiment_1.22.0  MatrixGenerics_1.4.3         matrixStats_0.61.0           GenomicRanges_1.44.0        
 [9] GenomeInfoDb_1.28.4          IRanges_2.26.0               S4Vectors_0.30.2             BiocParallel_1.26.2         
[13] Biobase_2.52.0               BiocGenerics_0.38.0         

loaded via a namespace (and not attached):
 [1] base64_2.0                Rcpp_1.0.8                locfit_1.5-9.4            lattice_0.20-44           Rsamtools_2.8.0          
 [6] Biostrings_2.60.2         gtools_3.9.2              digest_0.6.29             R6_2.5.1                  evaluate_0.14            
[11] sparseMatrixStats_1.4.2   zlibbioc_1.38.0           rlang_1.0.1               rstudioapi_0.13           data.table_1.14.2        
[16] jquerylib_0.1.4           R.utils_2.11.0            R.oo_1.24.0               Matrix_1.4-0              rmarkdown_2.11           
[21] splines_4.1.0             stringr_1.4.0             RCurl_1.98-1.5            munsell_0.5.0             DelayedArray_0.18.0      
[26] HDF5Array_1.20.0          compiler_4.1.0            rtracklayer_1.52.1        xfun_0.29                 askpass_1.1              
[31] htmltools_0.5.2           openssl_1.4.6             GenomeInfoDbData_1.2.6    XML_3.99-0.8              permute_0.9-7            
[36] crayon_1.4.2              GenomicAlignments_1.28.0  bitops_1.0-7              rhdf5filters_1.4.0        R.methodsS3_1.8.1        
[41] grid_4.1.0                jsonlite_1.7.3            lifecycle_1.0.1           magrittr_2.0.2            scales_1.1.1             
[46] stringi_1.7.6             cli_3.1.1                 XVector_0.32.0            limma_3.48.3              bslib_0.3.1              
[51] DelayedMatrixStats_1.14.3 Rhdf5lib_1.14.2           rjson_0.2.21              restfulr_0.0.13           tools_4.1.0              
[56] BSgenome_1.60.0           fastmap_1.1.0             yaml_2.2.2                colorspace_2.0-2          rhdf5_2.36.0             
[61] BiocManager_1.30.16       knitr_1.37                sass_0.4.0                BiocIO_1.2.0

methylationArrayAnalysis illuminaio beadarray DSS bsseqData • 1.7k views

ADD COMMENT • link 3.2 years ago Jojo • 0

I don't think DSS is designed for array data; it is meant for analysing BS-seq data.

ADD REPLY • link 3.2 years ago Basti ▴ 780

Yes, I read that, but I thought it might be possible to convert the information to BS-seq format somehow.

ADD REPLY • link 3.2 years ago Jojo • 0

Methylation arrays measure fluorescence intensities rather than sequencing reads, so I don't think converting them to counts makes sense. There are plenty of packages designed specifically for methylation array analysis; minfi and ChAMP are among the most popular.
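
To illustrate the point, here is a minimal base-R sketch of how array data are usually summarised: each probe yields a methylated and an unmethylated intensity, which are combined into a beta value (a proportion) or an M value (a log ratio), not into read counts N and X. The probe names and intensities are invented, and `beta_value()` and `m_value()` are hypothetical helpers written for this example. With real IDAT files the intensities would typically come from something like minfi's `getMeth()` and `getUnmeth()` after reading the files with `read.metharray.exp()`.

```r
# Toy methylated and unmethylated fluorescence intensities for three probes.
# The probe names and numbers are invented for illustration only.
meth   <- c(cg_a = 8827, cg_b = 1266, cg_c = 180)
unmeth <- c(cg_a = 2972, cg_b = 2550, cg_c = 9100)

# Beta value: share of the total signal that comes from the methylated channel.
# offset = 100 is the constant conventionally added to stabilise low intensities.
beta_value <- function(meth, unmeth, offset = 100) {
  meth / (meth + unmeth + offset)
}

# M value: log2 ratio of the two intensities.
# alpha = 1 is an assumed small constant that avoids taking the log of zero.
m_value <- function(meth, unmeth, alpha = 1) {
  log2((meth + alpha) / (unmeth + alpha))
}

beta <- beta_value(meth, unmeth)
mval <- m_value(meth, unmeth)

# One row per probe: continuous summaries, with no read depth involved.
result <- data.frame(probe = names(meth), beta = beta, M = mval)
print(result)
```

Because the intensities are continuous signals on an arbitrary scale, treating them as the total and methylated read counts that DSS expects would misrepresent the sampling variance the count model relies on.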

ADD REPLY • link 3.2 years ago Basti ▴ 780

I have been using ChAMP for such analyses, but as far as I know, ChAMP doesn't offer a way to handle a 'no-replicate' design. That is why I tried to move to DSS. But I understand your point. Thanks!

ADD REPLY • link 3.2 years ago Jojo • 0