Entering edit mode
yoursbassanio
▴
10
@yoursbassanio-12717
Last seen 4.2 years ago
Below is the command I used to run Dmrcate and enrichment analysis.
dmrCate<-cpg.annotate("array", BetaValue, what="Beta", arraytype ="EPIC", analysis.type="differential",fdr=0.01)
dmrcoutput <- dmrcate(dmrCate,lambda=500, C=5, pcutoff="fdr",min.cpgs=1,mc.cores=1)
wgbs.ranges_all <- extractRanges(dmrcoutput, genome = "hg19")
enrichment_GO <- goregion(wgbs.ranges_all,collection = "GO", array.type = "EPIC") enrichment_GO <-
enrichment_GO[order(enrichment_GO$P.DE),]
I have got the following results too
> "ONTOLOGY" "TERM" "N" "DE" "P.DE" "FDR"
> "GO:0005654" "CC" "nucleoplasm" 3501 1911.66666666667 5.78302322878527e-28 1.31685221942669e-23
> "GO:0043231" "CC" "intracellular membrane-bounded
> organelle" 11367 5316.54285714286 8.60779986601549e-27 9.80041053745193e-23
> "GO:0043229" "CC" "intracellular
> organelle" 13162 6092.37619047619 8.98325926857439e-24 6.25634156984302e-20
Is there any ways I could fetch the genes that are used/involved to identify in each of these pathways?
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /scratch/mv83/Software/DMRcate/lib/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] splines stats4 parallel stats graphics grDevices utils
[8] datasets methods base
other attached packages:
[1] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0
[2] missMethyl_1.20.3
[3] celltypes450_0.1
[4] limma_3.36.5
[5] DMRcate_1.18.0
[6] DMRcatedata_1.16.0
[7] DSS_2.30.0
[8] bsseq_1.18.0
[9] minfi_1.28.0
[10] bumphunter_1.24.5
[11] locfit_1.5-9.1
[12] iterators_1.0.10
[13] foreach_1.4.4
[14] Biostrings_2.50.2
[15] XVector_0.22.0
[16] SummarizedExperiment_1.12.0
[17] DelayedArray_0.8.0
[18] BiocParallel_1.16.2
[19] matrixStats_0.54.0
[20] Biobase_2.42.0
[21] GenomicRanges_1.34.0
[22] GenomeInfoDb_1.18.1
[23] IRanges_2.16.0
[24] S4Vectors_0.20.1
[25] BiocGenerics_0.28.0
loaded via a namespace (and not attached):
[1] backports_1.1.3
[2] Hmisc_4.1-1
[3] plyr_1.8.4
[4] lazyeval_0.2.2
[5] ggplot2_3.1.1
[6] digest_0.6.18
[7] ensembldb_2.6.3
[8] htmltools_0.3.6
[9] GO.db_3.7.0
[10] magrittr_1.5
[11] checkmate_1.9.1
[12] memoise_1.1.0
[13] BSgenome_1.50.0
[14] cluster_2.0.8
[15] readr_1.3.1
[16] annotate_1.60.0
[17] R.utils_2.8.0
[18] askpass_1.1
[19] siggenes_1.56.0
[20] prettyunits_1.0.2
[21] colorspace_1.4-1
[22] blob_1.1.1
[23] BiasedUrn_1.07
[24] xfun_0.6
[25] dplyr_0.8.0.1
[26] crayon_1.3.4
[27] RCurl_1.95-4.12
[28] genefilter_1.64.0
[29] GEOquery_2.50.0
[30] IlluminaHumanMethylationEPICmanifest_0.3.0
[31] survival_2.44-1.1
[32] VariantAnnotation_1.28.3
[33] glue_1.3.1
[34] ruv_0.9.6
[35] registry_0.5
[36] gtable_0.3.0
[37] zlibbioc_1.28.0
[38] Rhdf5lib_1.4.3
[39] HDF5Array_1.10.1
[40] scales_1.0.0
[41] DBI_1.0.0
[42] rngtools_1.3.1
[43] bibtex_0.4.2
[44] Rcpp_1.0.1
[45] xtable_1.8-3
[46] progress_1.2.0
[47] htmlTable_1.13.1
[48] foreign_0.8-71
[49] bit_1.1-14
[50] mclust_5.4.3
[51] preprocessCore_1.44.0
[52] Formula_1.2-3
[53] htmlwidgets_1.3
[54] httr_1.4.0
[55] RColorBrewer_1.1-2
[56] acepack_1.4.1
[57] pkgconfig_2.0.2
[58] reshape_0.8.8
[59] XML_3.98-1.16
[60] R.methodsS3_1.7.1
[61] Gviz_1.26.4
[62] nnet_7.3-12
[63] tidyselect_0.2.5
[64] rlang_0.3.3
[65] AnnotationDbi_1.44.0
[66] munsell_0.5.0
[67] tools_3.5.1
[68] RSQLite_2.1.1
[69] stringr_1.4.0
[70] org.Hs.eg.db_3.7.0
[71] knitr_1.22
[72] bit64_0.9-7
[73] beanplot_1.2
[74] methylumi_2.28.0
[75] purrr_0.3.2
[76] AnnotationFilter_1.6.0
[77] nlme_3.1-137
[78] doRNG_1.7.1
[79] nor1mix_1.2-3
[80] R.oo_1.22.0
[81] xml2_1.2.0
[82] biomaRt_2.38.0
[83] compiler_3.5.1
[84] rstudioapi_0.10
[85] curl_3.3
[86] statmod_1.4.30
[87] tibble_2.1.1
[88] stringi_1.4.3
[89] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.0
[90] GenomicFeatures_1.34.1
[91] lattice_0.20-38
[92] ProtGenerics_1.14.0
[93] Matrix_1.2-17
[94] IlluminaHumanMethylation450kmanifest_0.4.0
[95] permute_0.9-5
[96] multtest_2.38.0
[97] pillar_1.3.1
[98] data.table_1.12.2
[99] bitops_1.0-6
[100] rtracklayer_1.42.2
[101] R6_2.4.0
[102] latticeExtra_0.6-28
[103] gridExtra_2.3
[104] codetools_0.2-16
[105] dichromat_2.0-0
[106] MASS_7.3-51.3
[107] gtools_3.8.1
[108] assertthat_0.2.1
[109] rhdf5_2.26.2
[110] openssl_1.3
[111] pkgmaker_0.27
[112] withr_2.1.2
[113] GenomicAlignments_1.18.1
[114] Rsamtools_1.34.0
[115] GenomeInfoDbData_1.1.0
[116] hms_0.4.2
[117] quadprog_1.5-7
[118] grid_3.5.1
[119] rpart_4.1-13
[120] tidyr_0.8.3
[121] base64_2.0
[122] DelayedMatrixStats_1.4.0
[123] illuminaio_0.24.0
[124] biovizBase_1.30.1
[125] base64enc_0.1-3
Hi,
Can one please help me on this please. I could provide more information if needed
Hi Bassanio,
I'm noticing that your object is called
wgbs_ranges_all
, yet you specifyarraytype="EPIC"
in your call togoregion()
. Your enrichment terms won't be correct becausegoregion()
assumes a background of CpGs from the EPIC or 450K arrays, not from the entire human genome. To my knowledge there isn't a routine that corrects for varying numbers of CpGs mapped to genes for WGBS, but if you're happy to retain that bias and simply assume that each gene in the genome has an equal chance of being differentially methylated, then perhaps something like theRITAN
package to find term enrichment may be sufficient:Note that
RITAN
will, as you requested, report the genes involved in the results.Cheers, Tim
Hi,
My data is from Epic array and not bi-sulphate sequencing.
Hi Bassanio,
Looking at your code properly I can see that now. Your naming convention threw me though - why would you name the GRanges object
wgbs.ranges_all
if it came from EPIC? Nevermind.You should be able to get the gene names this way:
1) Get your subset of CpGs from your DMRs using Jim's method from your previous post: https://support.bioconductor.org/p/125350/ - let's call this character vector
sigCpGs
.2) Use
missMethyl::getMappedEntrezIDs()
to get your Entrez IDs for the relevant genes:3) Use biomaRt to convert to hgnc:
Presto.
Good luck, Tim
Hi,
My data is from Epic array and not bi-sulphate sequencing.