No output from read.bismark with bismark cov.files
0
0
Entering edit mode
ja569116 • 0
@08ccc285
Last seen 6 months ago
United States

Hi, I generated some input data with Bismark, with the following code:

bismark_methylation_extractor -p --scaffolds --bedGraph --buffer_size 40G --genome_folder /home/juaguila/Bismark/ --cytosine_report --parallel 10 ${base}.deduplicated.bam

I want to identify differentially methylated region with dmrseq, and I tested a few files if they could be read by bsseq properly.

    files <- c("V00001.deduplicated.bismark.cov.gz", "V00750.deduplicated.bismark.cov.gz",
    "V00796.deduplicated.bismark.cov.gz")
    samples <- c("V00001", "V00750", "V00796")

    bismarkBSseq <- read.bismark(files = files,
    rmZeroCov = TRUE,
    strandCollapse = TRUE,
    verbose = TRUE,
    nThread = 10L)
    bismarkBSseq
    saveRDS(bismarkBSseq, file = "dmrseq_LH.meth.rds")

I was told to save every output, since the code can fail so I used saveRDS, which has worked before when running MethylKit.

I tried reading and creating the matrices with my files, it was supposedly working , but no output was generated. Everything loaded well, but then, it didn't finish generating the matrices.

[read.bismark] Parsing files and constructing valid loci ...
Done in 64.9 secs
[read.bismark] Parsing files and constructing 'M' and 'Cov' matrices ...

The files look okay:

gzip -cd V00864.deduplicated.bismark.cov.gz | head
NW_022882922.1 4856 4856 0 0 4
NW_022882922.1 4866 4866 0 0 4
NW_022882922.1 4889 4889 0 0 5
NW_022882922.1 4892 4892 0 0 5
NW_022882922.1 4904 4904 0 0 5
NW_022882922.1 4953 4953 0 0 6
NW_022882922.1 4962 4962 0 0 7
NW_022882922.1 4966 4966 0 0 6
NW_022882922.1 4976 4976 0 0 5
NW_022882922.1 4977 4977 0 0 1

I am analyzing methylation data in bumblebees and each file has about 20.M lines, though not all the sites are present in all individuals.

Any ideas why this code did not work? I am running it in a server with 500GB of Ram and 48 cores.

I really want to figure and solve out this issue, because I have 44 files, and now I just tested 6 (six) files. So, at least it should work with a small sample size.

Could you tell if my code okay? I would really appreciate if you could help figure out what is going on.

Thank you very much;

> sessionInfo( )
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /home/applications/R/4.3.1/lib64/R/lib/libRblas.so 
LAPACK: /home/applications/R/4.3.1/lib64/R/lib/libRlapack.so;  LAPACK version 3.11.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] dmrseq_1.20.1               bsseq_1.36.0               
 [3] SummarizedExperiment_1.30.2 Biobase_2.60.0             
 [5] MatrixGenerics_1.12.3       matrixStats_1.2.0          
 [7] GenomicRanges_1.52.0        GenomeInfoDb_1.36.4        
 [9] IRanges_2.34.1              S4Vectors_0.38.2           
[11] BiocGenerics_0.46.0        

loaded via a namespace (and not attached):
  [1] DBI_1.2.2                     bitops_1.0-7                 
  [3] permute_0.9-7                 biomaRt_2.56.1               
  [5] rlang_1.1.3                   magrittr_2.0.3               
  [7] compiler_4.3.1                RSQLite_2.3.5                
  [9] GenomicFeatures_1.52.2        DelayedMatrixStats_1.22.6    
 [11] reshape2_1.4.4                png_0.1-8                    
 [13] vctrs_0.6.5                   stringr_1.5.1                
 [15] pkgconfig_2.0.3               crayon_1.5.2                 
 [17] fastmap_1.1.1                 dbplyr_2.3.4                
[19] XVector_0.40.0                ellipsis_0.3.2               
 [21] utf8_1.2.4                    Rsamtools_2.16.0             
 [23] promises_1.2.1                tzdb_0.4.0                   
 [25] bit_4.0.5                     zlibbioc_1.46.0              
 [27] cachem_1.0.8                  progress_1.2.2               
 [29] blob_1.2.4                    later_1.3.1                  
 [31] rhdf5filters_1.12.1           DelayedArray_0.26.7          
 [33] Rhdf5lib_1.22.1               BiocParallel_1.34.2          
 [35] interactiveDisplayBase_1.38.0 prettyunits_1.2.0            
 [37] parallel_4.3.1                R6_2.5.1                     
 [39] RColorBrewer_1.1-3            stringi_1.8.3                
 [41] limma_3.56.2                  rtracklayer_1.60.1           
 [43] iterators_1.0.14              Rcpp_1.0.11                  
 [45] R.utils_2.12.3                readr_2.1.4                  
 [47] splines_4.3.1                 httpuv_1.6.11                
 [49] Matrix_1.6-5                  tidyselect_1.2.0             
 [51] abind_1.4-5                   yaml_2.3.8                   
 [53] codetools_0.2-19              curl_5.2.1                   
 [55] doRNG_1.8.6                   plyr_1.8.8                   
 [57] regioneR_1.32.0               lattice_0.21-8               
 [59] tibble_3.2.1                  shiny_1.7.5.1                
 [61] KEGGREST_1.40.1               BiocFileCache_2.8.0          
 [63] xml2_1.3.6                    Biostrings_2.68.1            
 [65] pillar_1.9.0                  BiocManager_1.30.22          
 [67] filelock_1.0.3                rngtools_1.5.2               
 [69] foreach_1.5.2                 generics_0.1.3               
 [71] RCurl_1.98-1.14               ggplot2_3.4.3                
 [73] hms_1.1.3                     BiocVersion_3.17.1           
 [75] sparseMatrixStats_1.12.2      munsell_0.5.0                
 [77] scales_1.3.0                  bumphunter_1.42.0            
 [79] gtools_3.9.5                  xtable_1.8-4                 
 [81] glue_1.7.0                    tools_4.3.1                  
 [83] AnnotationHub_3.8.0           BiocIO_1.10.0                
 [85] data.table_1.15.2             BSgenome_1.68.0              
 [87] locfit_1.5-9.9                GenomicAlignments_1.36.0     
 [89] XML_3.99-0.16.1               rhdf5_2.44.0                 
 [91] grid_4.3.1                    AnnotationDbi_1.62.2         
 [93] colorspace_2.1-0              nlme_3.1-164                 
 [95] GenomeInfoDbData_1.2.10       HDF5Array_1.28.1             
 [97] restfulr_0.0.15               annotatr_1.26.0              
[99] cli_3.6.2                     rappdirs_0.3.3               
[101] fansi_1.0.6                   S4Arrays_1.0.6               
[103] dplyr_1.1.3                   gtable_0.3.4                 
[105] outliers_0.15                 R.methodsS3_1.8.2            
[107] digest_0.6.35                 rjson_0.2.21                 
[109] memoise_2.0.1                 htmltools_0.5.7              
[111] R.oo_1.26.0                   lifecycle_1.0.4              
[113] httr_1.4.7                    mime_0.12                    
[115] bit64_4.0.5
bsseq methylation • 452 views
ADD COMMENT
0
Entering edit mode

If the function didn't finish running, you shouldn't expect any results. Did you kill the process? How did it not finish?

ADD REPLY

Login before adding your answer.

Traffic: 762 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6