EPIC v2 annotation and BigWig
0
0
Entering edit mode
MajaVu • 0
@9b236236
Last seen 3 months ago
Netherlands

Dear EPIC users,

I have question about EPIC v2. I analyzed my data using minfi and got beta values for all the probes that passed quality ect. Now I would like to create bigwig files so that I can look at my data in UCSC or another genome browser. However, that is where I encountered a problem.

This is the code I used:


library(IlluminaHumanMethylationEPICv2anno.20a1.hg38)
library(BSgenome.Hsapiens.UCSC.hg38)
library(rtracklayer)
data(Locations)

annotation <- getAnnotation(IlluminaHumanMethylationEPICv2anno.20a1.hg38)
annotation_df <- as.data.frame(annotation)

positions_EPIC=data.frame(row.names = row.names(annotation_df),
                     chr=annotation_df$chr,
                     start=annotation_df$pos,
                     end=ifelse(annotation_df$strand == "+", annotation_df$pos + 1, annotation_df$pos - 1),
                     strand=annotation_df$strand)

# Function to swap start and end if start is greater than end
swap_if_needed <- function(start, end) {
  if (start > end) {
    temp <- start
    start <- end
    end <- temp
  }
  return(c(start, end))
}

# Apply the function to each row of the data frame
for (i in 1:nrow(positions_EPIC)) {
  result <- swap_if_needed(positions_EPIC$start[i], positions_EPIC$end[i])
  positions_EPIC$start[i] <- result[1]
  positions_EPIC$end[i] <- result[2]
}

# Print the updated data frame
print(positions_EPIC)

hg38info=seqinfo(BSgenome.Hsapiens.UCSC.hg38)
mybeta2 = na.omit(mybeta)

for (i in 1:ncol(mybeta2)){
  mySample=merge(positions_EPIC,mybeta2[,i],by="row.names")
  names(mySample)[6]="score"
  mySample$Row.names=c()
  print(mySample)
  name=colnames(mybeta2)[i]
  write.table(mySample[,c("chr","start","end","score")],
              sep="\t",
              row.names = FALSE,
              col.names = FALSE,
              quote = FALSE,
              file=paste(name,".bedGraph",sep=""))
  mySample$strand = NULL
  print(mySample)
  myRanges=makeGRangesFromDataFrame(mySample,
                                    keep.extra.columns = TRUE)
  seqinfo(myRanges)=hg38info
  export.bw(myRanges,paste0(name,".bw"))
}

this worked till the very last line - export.bw

Error in FUN(extractROWS(unlisted_X, IRanges(X_elt_start[i], X_elt_end[i])),  : 
  BigWig ranges cannot overlap
In addition: Warning message:
In valid.GenomicRanges.seqinfo(x, suggest.trim = TRUE) :
  GRanges object contains 4 out-of-bound ranges located on sequences chr6,
  chr11, chr14, and chr18. Note that ranges located on a sequence whose length
  is unknown (NA) or on a circular sequence are not considered out-of-bound
  (use seqlengths() and isCircular() to get the lengths and circularity flags
  of the underlying sequences). You can use trim() to trim these ranges. See
  ?`trim,GenomicRanges-method` for more information.

Then I decided to skip coupe of list lines and just write all the bedgraph files (since that worked), and then I tried to convert them using bedgraphToBogwig finction from UCSC but then error was different - End coordinate 140288269 bigger than chr11 size of 135086622 line 168183 of file

And indeed in EPIC annotation this is the position of the prob/cpg enter image description here

however that is out of the coordinates in hg38 genome enter image description here

Does anyone knows why is this, and how to generate bigwig files?

Thank you all, Maja

> sessionInfo( )
R version 4.4.0 (2024-04-24 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)

Matrix products: default


locale:
[1] LC_COLLATE=Dutch_Netherlands.utf8  LC_CTYPE=Dutch_Netherlands.utf8   
[3] LC_MONETARY=Dutch_Netherlands.utf8 LC_NUMERIC=C                      
[5] LC_TIME=Dutch_Netherlands.utf8    

time zone: Europe/Amsterdam
tzcode source: internal

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DNAmArray_2.0.0                                   
 [2] pls_2.8-3                                         
 [3] FDb.InfiniumMethylation.hg19_2.2.0                
 [4] org.Hs.eg.db_3.19.1                               
 [5] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2           
 [6] GenomicFeatures_1.56.0                            
 [7] AnnotationDbi_1.66.0                              
 [8] BiocManager_1.30.23                               
 [9] devtools_2.4.5                                    
[10] usethis_2.2.3                                     
[11] IlluminaHumanMethylationEPICv2manifest_1.0.0      
[12] BSgenome.Hsapiens.UCSC.hg38_1.4.5                 
[13] BSgenome_1.72.0                                   
[14] rtracklayer_1.64.0                                
[15] BiocIO_1.14.0                                     
[16] IlluminaHumanMethylationEPICv2anno.20a1.hg38_1.0.0
[17] minfi_1.50.0                                      
[18] bumphunter_1.46.0                                 
[19] locfit_1.5-9.10                                   
[20] iterators_1.0.14                                  
[21] foreach_1.5.2                                     
[22] Biostrings_2.72.1                                 
[23] XVector_0.44.0                                    
[24] SummarizedExperiment_1.34.0                       
[25] Biobase_2.64.0                                    
[26] MatrixGenerics_1.16.0                             
[27] matrixStats_1.3.0                                 
[28] GenomicRanges_1.56.1                              
[29] GenomeInfoDb_1.40.1                               
[30] IRanges_2.38.0                                    
[31] S4Vectors_0.42.0                                  
[32] BiocGenerics_0.50.0                               
[33] lubridate_1.9.3                                   
[34] forcats_1.0.0                                     
[35] stringr_1.5.1                                     
[36] dplyr_1.1.4                                       
[37] purrr_1.0.2                                       
[38] tidyr_1.3.1                                       
[39] tibble_3.2.1                                      
[40] ggplot2_3.5.1                                     
[41] tidyverse_2.0.0                                   
[42] readr_2.1.5                                       

loaded via a namespace (and not attached):
  [1] splines_4.4.0             later_1.3.2               bitops_1.0-7             
  [4] preprocessCore_1.66.0     XML_3.99-0.16.1           lifecycle_1.0.4          
  [7] processx_3.8.4            lattice_0.22-6            vroom_1.6.5              
 [10] MASS_7.3-60.2             base64_2.0.1              scrime_1.3.5             
 [13] magrittr_2.0.3            limma_3.60.3              yaml_2.3.8               
 [16] remotes_2.5.0             httpuv_1.6.15             doRNG_1.8.6              
 [19] askpass_1.2.0             sessioninfo_1.2.2         pkgbuild_1.4.4           
 [22] DBI_1.2.3                 RColorBrewer_1.1-3        abind_1.4-5              
 [25] pkgload_1.4.0             zlibbioc_1.50.0           quadprog_1.5-8           
 [28] RCurl_1.98-1.14           GenomeInfoDbData_1.2.12   genefilter_1.86.0        
 [31] annotate_1.82.0           DelayedMatrixStats_1.26.0 codetools_0.2-20         
 [34] DelayedArray_0.30.1       xml2_1.3.6                tidyselect_1.2.1         
 [37] UCSC.utils_1.0.0          beanplot_1.3.1            illuminaio_0.46.0        
 [40] GenomicAlignments_1.40.0  jsonlite_1.8.8            multtest_2.60.0          
 [43] ellipsis_0.3.2            survival_3.7-0            tools_4.4.0              
 [46] Rcpp_1.0.12               glue_1.7.0                SparseArray_1.4.8        
 [49] HDF5Array_1.32.0          withr_3.0.0               fastmap_1.2.0            
 [52] rhdf5filters_1.16.0       fansi_1.0.6               openssl_2.2.0            
 [55] callr_3.7.6               digest_0.6.35             timechange_0.3.0         
 [58] R6_2.5.1                  mime_0.12                 colorspace_2.1-0         
 [61] RSQLite_2.3.7             utf8_1.2.4                generics_0.1.3           
 [64] data.table_1.15.4         httr_1.4.7                htmlwidgets_1.6.4        
 [67] S4Arrays_1.4.1            pkgconfig_2.0.3           gtable_0.3.5             
 [70] blob_1.2.4                siggenes_1.78.0           htmltools_0.5.8.1        
 [73] profvis_0.3.8             scales_1.3.0              png_0.1-8                
 [76] rstudioapi_0.16.0         reshape2_1.4.4            tzdb_0.4.0               
 [79] rjson_0.2.21              nlme_3.1-164              curl_5.2.1               
 [82] cachem_1.1.0              rhdf5_2.48.0              miniUI_0.1.1.1           
 [85] restfulr_0.0.15           desc_1.4.3                GEOquery_2.72.0          
 [88] pillar_1.9.0              grid_4.4.0                reshape_0.8.9            
 [91] vctrs_0.6.5               urlchecker_1.0.1          promises_1.3.0           
 [94] xtable_1.8-4              cli_3.6.1                 compiler_4.4.0           
 [97] Rsamtools_2.20.0          rlang_1.1.3               crayon_1.5.3             
[100] rngtools_1.5.2            nor1mix_1.3-3             mclust_6.1.1             
[103] ps_1.7.6                  plyr_1.8.9                fs_1.6.4                 
[106] stringi_1.8.4             BiocParallel_1.38.0       htm2txt_2.2.2            
[109] munsell_0.5.1             Matrix_1.7-0              hms_1.1.3                
[112] sparseMatrixStats_1.16.0  bit64_4.0.5               Rhdf5lib_1.26.0          
[115] KEGGREST_1.44.1           statmod_1.5.0             shiny_1.8.1.1            
[118] memoise_2.0.1             bit_4.0.5
EPICv2manifest IlluminaHumanMethylationEPICv2anno.20a1.hg38 • 247 views
ADD COMMENT

Login before adding your answer.

Traffic: 591 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6