Hi,
I am trying to find a list of genes significantly associated with a continuous variable (e.g. glucose) so I used edgeR to create a design matrix as follows:
design <- model.matrix(~glucose)
Followed by:
disp <- estimateDisp(y, design, robust = TRUE)
fit <- glmFit(disp, design, robust = TRUE)
lrt <- glmLRT(fit, coef = 2)
topTags(lrt, n = Inf, adjust.method = "BH", p.value = 0.05)
to give a list of genes associated with glucose.
In addition, I wanted to adjust for gender, so I included gender as a factor.
design <- model.matrix(~glucose+gender)
However, doing so resulted in gender-related genes (from the Y chromosome) coming up as significantly associated. At this point, I am not sure what's going on.
Separately, I tried out limma-voom to replicate what I have done above.
limma.design <- model.matrix(~glucose+gender)
limvm <- voom(y, limma.design, plot = TRUE)
fit <- lmFit(limvm, limma.design)
fit <- eBayes(fit)
topTable(fit, coef = 2, adjust.method = "BH", p.value = 0.05, number = Inf)
which I think should have done the same thing, but instead there are no gender-related genes, and far fewer genes overall. Am I doing this correctly to find glucose-associated genes while adjusting for gender? Thank you.
When using edgeR, are you saying that testing for glucose (coef=2) gave Y chromosome genes as DE when using ~glucose+gender but not when using ~glucose alone? Or are you saying that you got Y chromosome genes as DE with both design matrices?
Y chromosome genes were given when ~glucose+gender was used but not when ~glucose alone was used.
But from what I understand from your comments below, I can either:
- remove sex-linked genes from the analysis or,
- leave sex out of the model (i.e. just use ~glucose), as robust = TRUE can down-weight the sex-linked genes?
Yes.
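For the first option, a minimal sketch of dropping sex-chromosome genes before fitting might look like the following. It runs on simulated counts so that it is self-contained. The `chr` annotation column, the simulated `glucose` values and the choice to drop both X and Y are assumptions made for illustration, not something taken from the original data.

```r
library(edgeR)

# Simulated stand-in data (assumption: real counts and glucose values would be used instead)
set.seed(1)
ngenes <- 200
nsamples <- 12
counts <- matrix(rnbinom(ngenes * nsamples, mu = 50, size = 5), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))

# Hypothetical annotation column `chr` giving each gene's chromosome
genes <- data.frame(chr = sample(c(1:22, "X", "Y"), ngenes, replace = TRUE))
y <- DGEList(counts = counts, genes = genes)
glucose <- rnorm(nsamples, mean = 5.5, sd = 1)

# Drop sex-chromosome genes; adjust this vector (e.g. to "Y" only) as appropriate
sex_chr <- c("X", "Y")
keep <- !(y$genes$chr %in% sex_chr)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)

# Test for association with glucose only
design <- model.matrix(~glucose)
y <- estimateDisp(y, design, robust = TRUE)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef = 2)
topTags(lrt)
```

With real data, the chromosome information would come from whatever gene annotation is attached to the DGEList, and the filter would be applied before normalization and dispersion estimation, as above.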