Question

Is it possible to run DMRcate directly with genomic positions instead of probe names?

Jeni ▴ 10

@jeni-23673

Last seen 2.4 years ago

Spain

Hi!

I am trying to run DMRcate on a dataset downloaded from a paper. Instead of probe names with their corresponding beta values, this dataset consists of genomic positions with their corresponding beta values.

When I run cpg.annotate I have to specify arraytype = "EPIC" or "450K", but these data do not come from either array. So, is it possible to pass just genomic positions to cpg.annotate and get an annotation that can be used to find differentially methylated regions?

Thanks!

DMRcate • 1.5k views

updated 4.6 years ago by Tim Peters ▴ 230 • written 4.6 years ago by Jeni ▴ 10

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 2 days ago

United States

You don't need to specify the array type if you provide a GenomicRatioSet. And you can look at ?GenomicRatioSet-class to see how you construct one of those.

4.6 years ago James W. MacDonald 68k

The problem is that I cannot create a GenomicRatioSet because I only have the matrix of beta values; I couldn't obtain anything else (such as the IDAT files). My intention is to perform a differential methylation analysis using the beta value matrix that I downloaded from the supplementary material.

4.6 years ago Jeni ▴ 10

Of course you can create a GenomicRatioSet. Why do you think you can't? Is there something in the help page that you don't understand? Here's a fake example:

 > library(minfi)
 > fakebetas <- matrix(runif(10000), 1000)
 > fakegr <- GRanges("chr1", IRanges(seq(1, 2000, 2), width = 1))
 > fakeGRatioSet <- GenomicRatioSet(fakegr, Beta = fakebetas)
 > fakeGRatioSet
 class: GenomicRatioSet
 dim: 1000 10
 metadata(0):
 assays(1): Beta
 rownames: NULL
 rowData names(0):
 colnames: NULL
 colData names(0):
 Annotation
   array:
 Preprocessing
   Method: NA
   minfi version: NA
   Manifest version: NA

You would obviously use your real beta values, and construct an appropriate GRanges object using the 'corresponding genomic positions' that you say you have. Or if it's really Illumina data (as Tim Peters seems to think), then you can just do what he suggested.
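For real data, the same construction could look roughly like the sketch below. It assumes (this is not stated in the thread) that the downloaded matrix has row names of the form "chr:pos"; the `betas` matrix and its values are made up for illustration, so adjust the parsing to whatever format your file actually uses.

```r
library(minfi)  # also attaches GenomicRanges and IRanges

# Hypothetical input: beta values with "chr:pos" row names (made-up numbers).
betas <- matrix(
  c(0.10, 0.80, 0.50, 0.20, 0.90, 0.40),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("chr1:1000", "chr1:1050", "chr2:500"),
    c("sample1", "sample2")
  )
)

# Split each row name into a chromosome and a position.
parts <- do.call(rbind, strsplit(rownames(betas), ":", fixed = TRUE))

# One width-1 range per row of the beta matrix, in the same order.
gr <- GRanges(
  seqnames = parts[, 1],
  ranges = IRanges(start = as.integer(parts[, 2]), width = 1)
)
names(gr) <- rownames(betas)

grset <- GenomicRatioSet(gr = gr, Beta = betas)
```

The resulting `grset` is the kind of object the earlier answer suggests passing on instead of specifying an array type.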

4.6 years ago James W. MacDonald 68k

score 2 · Accepted Answer · 2020-10-01

Hi Jeni,

It would be a very odd paper indeed that didn't specify which platform their assay was run on. Doesn't the Methods section say anything at all about this? The first thing I'd do is read it to find out how the beta values were generated.

Assuming that the Methods section does say which array type was used, you can rename the rows of your matrix by matching the chromosome and position to the probe ID. For example, for EPIC, load the data like so:

library(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)
data(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)
data(Locations)
Locations
DataFrame with 865859 rows and 3 columns
                   chr       pos      strand
           <character> <integer> <character>
cg18478105       chr20  61847650           -
cg09835024        chrX  24072640           -
cg14361672        chr9 131463936           +
cg01763666       chr17  80159506           +
cg12950382       chr14 105176736           +
...                ...       ...         ...
cg23079522        chr3 160569628           -
cg16818145        chr3 182782277           -
cg14585103        chr8 139940608           -
cg10633746       chr17  18164442           +
cg12623625        chr1  17946923           +

And then use whatever string formatting is appropriate to your rownames to rename them to the probe IDs.
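As a rough sketch of that renaming step, the base R snippet below matches "chr:pos" keys against a small stand-in for the Locations table. The `locs` data frame, the `betas` matrix, and the "chr:pos" row name format are all assumptions for illustration; with the real annotation you would build the keys from Locations itself, and the key format has to mirror your own row names.

```r
# Toy stand-in for the annotation's Locations table (probe IDs as row names).
locs <- data.frame(
  chr = c("chr20", "chrX", "chr9"),
  pos = c(61847650L, 24072640L, 131463936L),
  row.names = c("cg18478105", "cg09835024", "cg14361672")
)

# Hypothetical beta matrix whose row names are "chr:pos" strings.
betas <- matrix(
  c(0.10, 0.80, 0.50, 0.20, 0.90, 0.40, 0.30, 0.70),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("chr9:131463936", "chr20:61847650", "chr1:12345", "chrX:24072640"),
    c("sample1", "sample2")
  )
)

# Build the same key for every annotated probe, then look up each matrix row.
probe_keys <- paste(locs$chr, locs$pos, sep = ":")
idx <- match(rownames(betas), probe_keys)

# Drop positions with no matching probe and rename the rest to probe IDs.
matched <- !is.na(idx)
betas_probes <- betas[matched, , drop = FALSE]
rownames(betas_probes) <- rownames(locs)[idx[matched]]
```

Rows whose position does not correspond to any probe (here the made-up "chr1:12345") are dropped. If a large share of your rows fail to match, that is a hint that the data are on a different genome build or did not come from this array.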

Best, Tim