Hi all,
in DMRcate results I got no.probes associated to each significant DMR.
How do I get the ID of each of these probes clustered together into each DMR?
Thanks in advance, Regards
Giovanni
Hi all,
in DMRcate results I got no.probes associated to each significant DMR.
How do I get the ID of each of these probes clustered together into each DMR?
Thanks in advance, Regards
Giovanni
Hi Giovanni,
I think these are good general post-filtering options. However, your options depend on the type of hypothesis you are testing. Do you only want long blocks of differential methylation or do you believe that very localised methylation may also affect your phenotype? This will inform your cutoff for no.probes. Are you specifically looking at promoter methylation? If yes then taking only those with a gene symbol association makes sense.
Thanks for using DMRcate as intended and good luck!
Best,
Tim
Hi Giovanni,
The following should get you the IDs. Say you're using EPIC array data and your DMRs are in results.ranges
:
locs <- IlluminaHumanMethylationEPICanno.ilm10b2.hg19@data$Locations locs.ranges <- GRanges(locs$chr, IRanges(locs$pos, locs$pos)) names(locs.ranges) <- rownames(locs) DMR.IDs <- lapply(results.ranges, function (x) names(locs.ranges[locs.ranges %over% x]))
Best,
Tim
Hi Tim,
I'm doing some differential analyses on my processed data and in one of these I got a too big number of DMRs.
I would pre-filter probes but following your package's threads I found this will bias the method.
So I think it would be better post-filter the results:
- on no.probes > 1 (because no.probes = 1, would indicate DMPs instead of DMRs)
- on DMRs that effectively have a Gene Symbol associated
- on absolute value of maxbetafc greater than of 0.2 as minimum threshold of differential methylation level
What do you think and suggest about?
Thanks in advance, Best
Giovanni
Hi Tim,
I have been using this code for a while now and it always worked for me and returned the CpGs constituting the DMRs. However, when I tried to use it today, locs turns out empty and the same with loc.ranges. I switched to using another function (makeGRanges) in package humarray. I used the annotation file object I had for 450k probes (fdata.all). Following is the command I used to generate loc.ranges which looks like what I used to obtain using the above commands specified by you. But using this in the last command line does not return CpG IDs. I am not sure where I am making a mistake. I am not able to find the function(x) on github to be able to evaluate what it does. Please take a look at the codes and results below and let me know if you are able to spot any error. It will also be very useful to know any alternate methods to extract the CpGs from DMRcate results.
>loc.ranges<-makeGRanges(fdata.all$CHR, pos = fdata.all$MAPINFO, row.names = rownames(fdata.all), build = "hg19")
> head(loc.ranges)
GRanges object with 6 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
>DMR.IDs <- lapply(results.ranges, function (x) names(locs.ranges[locs.ranges %over% x]))
> DMR.IDs[1]
$`844`
cg00000029 16 [ 53468112, 53468112] *
cg00000108 3 [ 37459206, 37459206] *
cg00000109 3 [171916037, 171916037] *
cg00000165 1 [ 91194674, 91194674] *
cg00000236 8 [ 42263294, 42263294] *
cg00000289 14 [ 69341139, 69341139] *
-------
seqinfo: 24 sequences from hg19 genome; no seqlengths
Thank you,
Poojitha Rajasekar
Hi Tim,
I have used these set of commands before and it worked perfectly fine. However, I ran into the following error today,
DMR.IDs <- lapply(DMR1, function (x) names(locs.ranges[locs.ranges %over% x])) Error during wrapup: GRanges objects don't support [[, as.list(), lapply(), or unlist() at the moment
Is there a replacement for lappy or any new set of commands I can use to retrieve corresponding CpGs for regions in results.ranges?
Thank you.
Poojitha
Hi Tim,
I have used these set of commands before and it worked perfectly fine. However, I ran into the following error today,
DMR.IDs <- lapply(DMR1, function (x) names(locs.ranges[locs.ranges %over% x])) Error during wrapup: GRanges objects don’t support [[, as.list(), lapply(), or unlist() at the moment
Is there a replacement for lappy or any new set of commands I can use to retrieve corresponding CpGs for regions in results.ranges?
Thank you.
Poojitha
Hi Poojitha,
A bit of digging reveals this is a temporary situation: https://github.com/Bioconductor/GenomicRanges/blob/master/R/GenomicRanges-class.R#L597-L599 https://support.bioconductor.org/p/109137/
So I guess for the time being you'll need to use a for loop:
constituent.cpgs <- list()
for (i in 1:length(results.ranges)){
constituent.cpgs <- c(constituent.cpgs, locs.ranges[locs.ranges %over% results.ranges[i]])
}
Best, Tim
Hello Tim,
I have used package called DMRcate to analyse 450k data. I want to find the gene which associates the DMRs.And I met some questions.The outputs include gene_assoc, group, hg19coord, no.probes, minpval, meanpval and maxbetafc,when I used the old version to analyse 450k data. But when I update the package,the outputs became coord, no.cpgs, minfdr, Stouffer, maxbetafc,meanbetafc.There is no result of "gene_assoc".I want to find the gene names associating "coord",can you help me ? Can you please tel me how to associate the gene by using the newest DMRcate packages.
There follow the output results of the newest DMRcate. coord no.cpgs minfdr Stouffer maxbetafc meanbetafc 63999 chr6:33156164-33181870 265 0 0 -0.5008031 -0.02648790 63997 chr6:33128825-33149777 150 0 0 0.4176126 0.08611966 63917 chr6:32144195-32161004 128 0 0 -0.2574513 -0.03184096 63914 chr6:32114490-32123701 124 0 0 -0.4377015 -0.06195576 63889 chr6:31935801-31940855 101 0 0 -0.1555205 -0.02401999 12564 chr11:31817810-31841980 100 0 0 -0.4611059 -0.17113506
Hi,
Just like with any epigenetic mark, not all DMRs have an unambiguous gene assignment. This is why I've restricted the gene information to overlapping promoters. If you're interested in finding the nearest genes, I would use a method such as this: https://bioconductor.org/packages/release/bioc/vignettes/ChIPpeakAnno/inst/doc/quickStart.html#use-case-3-annotate-the-peaks-in-both-sides-with-nearest-transcription-start-sites-within-5k-bps.
Best, Tim
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Hi Tim,
I agree with you but one clarification:
- I'm looking also at Body methylation and also to this region obviously there's a considerable Gene Symbol association; what I was wondering for some time if is it possible to know other informations, in addition to the hg19 coord., about probes without G.S. association because you know that there are some of these both in 450k array and in EPIC array.
Maybe I should post a separate question on this aspect.
Thanks for everything, Best
Giovanni
Hi Giovanni,
The annotations available through DMRcate are the Gencode transcript coordinates and the per-probe characteristics found in the attached annotation packages. The boilerplate I posted earlier in the thread should help you attach these to your DMRs. Anything else you'll need to import yourself.
Best,
Tim
Hi Tim,
I know the attached annotation packages available through DMRcate.
My comment is on the probes of these annotations that don't have more informations like Gene Symbol, Genomic Regions, etc.
Thanks for everything posted in this thread and good luck,
Giovanni
Hello Tim,
I have used package called DMRcate to analyse 450k data. I want to find the gene which associates the DMRs.And I met some questions.The outputs include gene_assoc, group, hg19coord, no.probes, minpval, meanpval and maxbetafc,when I used the old version to analyse 450k data. But when I update the package,the outputs became coord, no.cpgs, minfdr, Stouffer, maxbetafc,meanbetafc.There is no result of "gene_assoc".I want to find the gene names associating "coord",can you help me ? Can you please tel me how to associate the gene by using the newest DMRcate packages.
There follow the output results of the newest DMRcate. coord no.cpgs minfdr Stouffer maxbetafc meanbetafc 63999 chr6:33156164-33181870 265 0 0 -0.5008031 -0.02648790 63997 chr6:33128825-33149777 150 0 0 0.4176126 0.08611966 63917 chr6:32144195-32161004 128 0 0 -0.2574513 -0.03184096 63914 chr6:32114490-32123701 124 0 0 -0.4377015 -0.06195576 63889 chr6:31935801-31940855 101 0 0 -0.1555205 -0.02401999 12564 chr11:31817810-31841980 100 0 0 -0.4611059 -0.17113506