dmpFinder for other objects
neyousha • 0

I have been looking at the dmpFinder function, and I was wondering whether it can work with beta values (as a matrix) and a phenotype (as a vector, for example race) without a MethylSet or other methylation object. My only worry is how the function would match each phenotype value to the beta values of the correct sample. Since the sample IDs are the columns of the beta matrix, I am not sure what to do. Any suggestions would be appreciated. Thank you!

methylset minfi dmpFinder betaHMM beta • 505 views
@james-w-macdonald-5106

The help page is helpful for answering your question:

Usage:

     dmpFinder(dat, pheno, type = c("categorical", "continuous"),
         qCutoff = 1, shrinkVar = FALSE)

Arguments:

     dat: A 'MethylSet' or a 'matrix'.

   pheno: The phenotype to be tested for association with methylation.

The expectation is that you have samples in columns and beta values (ordered along the genome) in rows. The phenotypic data is also assumed to be in the same order as the columns of your matrix.
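Since nothing checks the ordering for you, it may help to align the phenotype to the matrix columns by sample ID first. A minimal base R sketch, assuming a toy beta matrix and a phenotype table with a hypothetical `sample_id` column (all names and values here are made up for illustration):

```r
# Toy data: rows are CpG sites, columns are samples (all names are made up).
set.seed(1)
beta <- matrix(runif(12, 0.05, 0.95), nrow = 3,
               dimnames = list(c("cg01", "cg02", "cg03"),
                               c("S1", "S2", "S3", "S4")))

# Phenotype table deliberately stored in a different sample order.
pheno <- data.frame(sample_id = c("S3", "S1", "S4", "S2"),
                    race = c("B", "A", "B", "A"),
                    stringsAsFactors = FALSE)

# Reorder the phenotype rows to follow the matrix columns.
idx <- match(colnames(beta), pheno$sample_id)
stopifnot(!anyNA(idx))            # every column must have a phenotype row
pheno <- pheno[idx, ]

# Confirm the orders now agree before passing anything to dmpFinder.
stopifnot(identical(pheno$sample_id, colnames(beta)))
race <- factor(pheno$race)
```

After this, `race` is in the same order as the columns of `beta`, which is what the `pheno` argument is assumed to be.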


Hi James, thank you for responding. So if I have the matrix average_beta below:

[image: average_beta_matrix]

and the phenotype dataset pheno below:

[image: pheno]

where the sample IDs match between the matrix columns and the phenotype rows, would dmpFinder give me a correct answer? Thank you! Neyousha


Yes, but you should probably use M-values instead of betas (convert using logit2).

However, dmpFinder is just a way to allow people to use lmFit from limma to fit a model without having to figure out how to use lmFit. If you care to fit more than just one coefficient you will have to use lmFit directly.
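As a rough sketch of what using lmFit directly with more than one coefficient could look like: the toy M-value matrix, the `race` and `age` covariates, and all values below are hypothetical. The limma step is skipped if the package is not installed.

```r
# Hypothetical toy inputs: 5 CpGs x 6 samples of M-values plus two covariates.
set.seed(42)
M <- matrix(rnorm(30), nrow = 5,
            dimnames = list(paste0("cg", 1:5), paste0("S", 1:6)))
pheno <- data.frame(race = factor(rep(c("A", "B"), each = 3)),
                    age  = c(34, 51, 47, 29, 62, 55),
                    row.names = colnames(M))

# Design matrix with an intercept, a race effect and an age effect.
design <- model.matrix(~ race + age, data = pheno)

# Fit only if limma is installed; otherwise the sketch stops at the design.
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::eBayes(limma::lmFit(M, design))
  print(limma::topTable(fit, coef = "raceB"))
}
```

Here `coef = "raceB"` asks for the race effect adjusted for age; the rows of `pheno` must again be in the same order as the columns of `M`.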


Thank you very much James. Would you mind explaining why M-values rather than beta values? I only have beta values available at this point.


You are better off using M-values instead of beta values because you are fitting a conventional linear regression, in which case it's better if the underlying distribution of your data is at least 'hump shaped'. This will be true for M-values, but not necessarily for beta values, which are bounded between 0 and 1 and tend to be clustered at either extreme. If you have a large enough N you can assume the central limit theorem will be in effect, in which case the underlying distribution doesn't matter (but the less hump-shaped, the larger the N you need for the CLT to kick in).

It's just easier to defend using M-values because they are 'normal-ish' whereas beta values are not. If you really want to use betas, you might consider using DSS which models the data assuming a beta distribution.

But anyway, the logit2 function will convert your betas to M-values, so you can have M-values if you so desire.
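For illustration, here is the conversion written out in base R as M = log2(beta / (1 - beta)). The helper name `beta_to_m` and the clamping offset are assumptions added for this sketch (they are not part of minfi); the clamp only keeps betas of exactly 0 or 1 from turning into infinite M-values.

```r
# Base R stand-in for the beta -> M-value conversion, M = log2(beta / (1 - beta)).
# The clamping offset is an assumption, added so betas of exactly 0 or 1
# do not produce infinite M-values.
beta_to_m <- function(beta, offset = 1e-6) {
  beta <- pmin(pmax(beta, offset), 1 - offset)
  log2(beta / (1 - beta))
}

# Toy beta matrix: rows are CpG sites, columns are samples (made-up values).
beta <- matrix(c(0.5, 0.8, 0.2, 0.9, 0.1, 0.5), nrow = 2,
               dimnames = list(c("cg01", "cg02"), c("S1", "S2", "S3")))
M <- beta_to_m(beta)
```

A beta of 0.5 maps to an M-value of 0, and the values spread out symmetrically as beta moves toward 0 or 1, which is where the 'hump-shaped' behaviour comes from.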
