Find genes in pathways enrichment
1
0
Entering edit mode
@yoursbassanio-12717
Last seen 4.3 years ago

Below is the command I used to run Dmrcate and enrichment analysis.

dmrCate<-cpg.annotate("array", BetaValue, what="Beta", arraytype ="EPIC", analysis.type="differential",fdr=0.01) 
dmrcoutput <- dmrcate(dmrCate,lambda=500, C=5, pcutoff="fdr",min.cpgs=1,mc.cores=1)
wgbs.ranges_all <- extractRanges(dmrcoutput, genome = "hg19")

enrichment_GO <- goregion(wgbs.ranges_all,collection = "GO", array.type = "EPIC") enrichment_GO <-
enrichment_GO[order(enrichment_GO$P.DE),]

I have got the following results too

> "ONTOLOGY"    "TERM"  "N" "DE"    "P.DE"  "FDR"
> "GO:0005654"  "CC"    "nucleoplasm"   3501    1911.66666666667    5.78302322878527e-28    1.31685221942669e-23
> "GO:0043231"  "CC"    "intracellular membrane-bounded
> organelle"    11367   5316.54285714286    8.60779986601549e-27    9.80041053745193e-23
> "GO:0043229"  "CC"    "intracellular
> organelle"    13162   6092.37619047619    8.98325926857439e-24    6.25634156984302e-20

Is there any ways I could fetch the genes that are used/involved to identify in each of these pathways?

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /scratch/mv83/Software/DMRcate/lib/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] splines   stats4    parallel  stats     graphics  grDevices utils    
 [8] datasets  methods   base     

other attached packages:
 [1] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0
 [2] missMethyl_1.20.3                                  
 [3] celltypes450_0.1                                   
 [4] limma_3.36.5                                       
 [5] DMRcate_1.18.0                                     
 [6] DMRcatedata_1.16.0                                 
 [7] DSS_2.30.0                                         
 [8] bsseq_1.18.0                                       
 [9] minfi_1.28.0                                       
[10] bumphunter_1.24.5                                  
[11] locfit_1.5-9.1                                     
[12] iterators_1.0.10                                   
[13] foreach_1.4.4                                      
[14] Biostrings_2.50.2                                  
[15] XVector_0.22.0                                     
[16] SummarizedExperiment_1.12.0                        
[17] DelayedArray_0.8.0                                 
[18] BiocParallel_1.16.2                                
[19] matrixStats_0.54.0                                 
[20] Biobase_2.42.0                                     
[21] GenomicRanges_1.34.0                               
[22] GenomeInfoDb_1.18.1                                
[23] IRanges_2.16.0                                     
[24] S4Vectors_0.20.1                                   
[25] BiocGenerics_0.28.0                                

loaded via a namespace (and not attached):
  [1] backports_1.1.3                                   
  [2] Hmisc_4.1-1                                       
  [3] plyr_1.8.4                                        
  [4] lazyeval_0.2.2                                    
  [5] ggplot2_3.1.1                                     
  [6] digest_0.6.18                                     
  [7] ensembldb_2.6.3                                   
  [8] htmltools_0.3.6                                   
  [9] GO.db_3.7.0                                       
 [10] magrittr_1.5                                      
 [11] checkmate_1.9.1                                   
 [12] memoise_1.1.0                                     
 [13] BSgenome_1.50.0                                   
 [14] cluster_2.0.8                                     
 [15] readr_1.3.1                                       
 [16] annotate_1.60.0                                   
 [17] R.utils_2.8.0                                     
 [18] askpass_1.1                                       
 [19] siggenes_1.56.0                                   
 [20] prettyunits_1.0.2                                 
 [21] colorspace_1.4-1                                  
 [22] blob_1.1.1                                        
 [23] BiasedUrn_1.07                                    
 [24] xfun_0.6                                          
 [25] dplyr_0.8.0.1                                     
 [26] crayon_1.3.4                                      
 [27] RCurl_1.95-4.12                                   
 [28] genefilter_1.64.0                                 
 [29] GEOquery_2.50.0                                   
 [30] IlluminaHumanMethylationEPICmanifest_0.3.0        
 [31] survival_2.44-1.1                                 
 [32] VariantAnnotation_1.28.3                          
 [33] glue_1.3.1                                        
 [34] ruv_0.9.6                                         
 [35] registry_0.5                                      
 [36] gtable_0.3.0                                      
 [37] zlibbioc_1.28.0                                   
 [38] Rhdf5lib_1.4.3                                    
 [39] HDF5Array_1.10.1                                  
 [40] scales_1.0.0                                      
 [41] DBI_1.0.0                                         
 [42] rngtools_1.3.1                                    
 [43] bibtex_0.4.2                                      
 [44] Rcpp_1.0.1                                        
 [45] xtable_1.8-3                                      
 [46] progress_1.2.0                                    
 [47] htmlTable_1.13.1                                  
 [48] foreign_0.8-71                                    
 [49] bit_1.1-14                                        
 [50] mclust_5.4.3                                      
 [51] preprocessCore_1.44.0                             
 [52] Formula_1.2-3                                     
 [53] htmlwidgets_1.3                                   
 [54] httr_1.4.0                                        
 [55] RColorBrewer_1.1-2                                
 [56] acepack_1.4.1                                     
 [57] pkgconfig_2.0.2                                   
 [58] reshape_0.8.8                                     
 [59] XML_3.98-1.16                                     
 [60] R.methodsS3_1.7.1                                 
 [61] Gviz_1.26.4                                       
 [62] nnet_7.3-12                                       
 [63] tidyselect_0.2.5                                  
 [64] rlang_0.3.3                                       
 [65] AnnotationDbi_1.44.0                              
 [66] munsell_0.5.0                                     
 [67] tools_3.5.1                                       
 [68] RSQLite_2.1.1                                     
 [69] stringr_1.4.0                                     
 [70] org.Hs.eg.db_3.7.0                                
 [71] knitr_1.22                                        
 [72] bit64_0.9-7                                       
 [73] beanplot_1.2                                      
 [74] methylumi_2.28.0                                  
 [75] purrr_0.3.2                                       
 [76] AnnotationFilter_1.6.0                            
 [77] nlme_3.1-137                                      
 [78] doRNG_1.7.1                                       
 [79] nor1mix_1.2-3                                     
 [80] R.oo_1.22.0                                       
 [81] xml2_1.2.0                                        
 [82] biomaRt_2.38.0                                    
 [83] compiler_3.5.1                                    
 [84] rstudioapi_0.10                                   
 [85] curl_3.3                                          
 [86] statmod_1.4.30                                    
 [87] tibble_2.1.1                                      
 [88] stringi_1.4.3                                     
 [89] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.0
 [90] GenomicFeatures_1.34.1                            
 [91] lattice_0.20-38                                   
 [92] ProtGenerics_1.14.0                               
 [93] Matrix_1.2-17                                     
 [94] IlluminaHumanMethylation450kmanifest_0.4.0        
 [95] permute_0.9-5                                     
 [96] multtest_2.38.0                                   
 [97] pillar_1.3.1                                      
 [98] data.table_1.12.2                                 
 [99] bitops_1.0-6                                      
[100] rtracklayer_1.42.2                                
[101] R6_2.4.0                                          
[102] latticeExtra_0.6-28                               
[103] gridExtra_2.3                                     
[104] codetools_0.2-16                                  
[105] dichromat_2.0-0                                   
[106] MASS_7.3-51.3                                     
[107] gtools_3.8.1                                      
[108] assertthat_0.2.1                                  
[109] rhdf5_2.26.2                                      
[110] openssl_1.3                                       
[111] pkgmaker_0.27                                     
[112] withr_2.1.2                                       
[113] GenomicAlignments_1.18.1                          
[114] Rsamtools_1.34.0                                  
[115] GenomeInfoDbData_1.1.0                            
[116] hms_0.4.2                                         
[117] quadprog_1.5-7                                    
[118] grid_3.5.1                                        
[119] rpart_4.1-13                                      
[120] tidyr_0.8.3                                       
[121] base64_2.0                                        
[122] DelayedMatrixStats_1.4.0                          
[123] illuminaio_0.24.0                                 
[124] biovizBase_1.30.1                                 
[125] base64enc_0.1-3
dmrcate enrichment analysis epic array goregion missmethyl • 1.6k views
ADD COMMENT
0
Entering edit mode

Hi,

Can one please help me on this please. I could provide more information if needed

ADD REPLY
0
Entering edit mode

Hi Bassanio,

I'm noticing that your object is called wgbs_ranges_all, yet you specify arraytype="EPIC" in your call to goregion(). Your enrichment terms won't be correct because goregion() assumes a background of CpGs from the EPIC or 450K arrays, not from the entire human genome. To my knowledge there isn't a routine that corrects for varying numbers of CpGs mapped to genes for WGBS, but if you're happy to retain that bias and simply assume that each gene in the genome has an equal chance of being differentially methylated, then perhaps something like the RITAN package to find term enrichment may be sufficient:

library(RITAN)
library(RITANdata)
dmgenes <- unique(unlist(lapply(wgbs_ranges_all$overlapping.genes, function (x) strsplit(x, ", "))))
enrichment <- term_enrichment(dmgenes, resources=names(geneset_list), report_resources_separately = TRUE)

Note that RITAN will, as you requested, report the genes involved in the results.

Cheers, Tim

ADD REPLY
0
Entering edit mode

Hi,

My data is from Epic array and not bi-sulphate sequencing.

ADD REPLY
0
Entering edit mode

Hi Bassanio,

Looking at your code properly I can see that now. Your naming convention threw me though - why would you name the GRanges object wgbs.ranges_all if it came from EPIC? Nevermind.

You should be able to get the gene names this way:

1) Get your subset of CpGs from your DMRs using Jim's method from your previous post: https://support.bioconductor.org/p/125350/ - let's call this character vector sigCpGs.

2) Use missMethyl::getMappedEntrezIDs() to get your Entrez IDs for the relevant genes:

entrezids <- getMappedEntrezIDs(sig.cpg = sigCpGs, all.cpg = rownames(BetaValue), array.type = "EPIC")$sig.eg

3) Use biomaRt to convert to hgnc:

library(biomaRt)
ensembl <- useMart("ensembl",dataset="hsapiens_gene_ensembl")
bm <- getBM(attributes=c('entrezgene_id', 'hgnc_symbol'), filters = 'entrezgene_id', values = entrezids, mart = ensembl)
mygenes <- bm$hgnc_symbol

Presto.

Good luck, Tim

ADD REPLY
0
Entering edit mode

Hi,

My data is from Epic array and not bi-sulphate sequencing.

ADD REPLY
0
Entering edit mode
@jovana-maksimovic-4422
Last seen 4 months ago
Australia

Hi,

We recently implemented a parameter in gometh, gsameth, goregion and gsaregion, sig.genes, allowing you to output an extra column in the results table showing the genes in the gene set that overlap the DM CpGs or CpGs underlying your regions. It is set to FALSE by default.

Ensure that you have the latest version of missMethyl installed to be able to use it.

Cheers, Jovana

ADD COMMENT

Login before adding your answer.

Traffic: 367 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6