Hi,
I have a few questions about using the design matrix and the voom function in limma.
First question:
I'm working on RNA-seq data to find differentially expressed genes across three conditions. I would like to include gender, age and the expression of a control gene as confounders in the design matrix, to remove the effects of sex, age and control gene expression. How should I specify the model when there are both categorical and continuous confounders?
Second question:
When using the model.matrix function, should the intercept be included or not? How does it change the results? Please find my code below.
## Differential expression analysis
# Without intercept (~0 + ...)
design.adipose=model.matrix(~0+AdiposeCPMCountGT1_min20percentsamples$samples$group.1+AdiposeCPMCountGT1_min20percentsamples$samples$Gender)
# With intercept (this assignment overwrites the design above)
design.adipose = model.matrix(~AdiposeCPMCountGT1_min20percentsamples$samples$group.1+AdiposeCPMCountGT1_min20percentsamples$samples$Gender)
# To rename column names
colnames(design.adipose)[1] = "MAO"
colnames(design.adipose)[2] = "MHNW"
colnames(design.adipose)[3] = "MHO"
colnames(design.adipose)[4] = "Gender"
# To do comparisons between groups
contr.matrix.Adipose = makeContrasts(MHOvsMHNW = MHO-MHNW,MAOvsMHNW = MAO-MHNW,MAOvsMHO = MAO-MHO,levels = colnames(design.adipose)[1:4])
# Voom function
v = voom(AdiposeDGE,design.adipose,lib.size = colSums(AdiposeDGE$counts)*AdiposeDGE$samples$norm.factors,normalize.method="quantile",plot = TRUE)
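For readers comparing the two parameterisations in the code above, here is a minimal sketch in base R. The `samples` data frame, its values and the short column names are made up for illustration and are not the poster's data. It only shows which columns `model.matrix` produces with and without an intercept when the formula mixes a categorical and a continuous confounder.

```r
# Hypothetical sample annotation; group levels mirror the post (MAO, MHNW, MHO).
samples <- data.frame(
  group  = factor(c("MAO", "MAO", "MHNW", "MHNW", "MHO", "MHO")),
  Gender = factor(c("F", "M", "F", "M", "F", "M")),
  Age    = c(34, 51, 29, 40, 38, 60)   # continuous covariate, entered as-is
)

# No intercept: one column per group level, so contrasts such as
# groupMHO - groupMHNW can be written directly.
design0 <- model.matrix(~ 0 + group + Gender + Age, data = samples)

# With intercept: the first level (MAO) becomes the reference, and the
# remaining group columns are differences from MAO.
design1 <- model.matrix(~ group + Gender + Age, data = samples)

print(colnames(design0))
print(colnames(design1))
```

Both designs describe the same fitted model; only the meaning of the group columns differs. With the intercept form, column 1 is the intercept (the MAO baseline) and columns 2 and 3 are MHNW versus MAO and MHO versus MAO, so renaming them to MAO, MHNW and MHO would mislabel them. Contrasts written as MHO-MHNW, as in the post, fit the no-intercept form.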
Thanks a lot, Smyth. My data is now adjusted for covariates such as age, sex and control gene expression. Where is this adjusted data stored? As per the R script mentioned below, it won't be the v$E expression data. My question is: is it in vfit or efit?
It's not stored anywhere in the sense that you are thinking. These covariates are just accounted for inside the linear model.
If you want to explore what your data looks like when these covariates are accounted for, take a look at limma's removeBatchEffect function.
Thanks for that.
Regarding the removeBatchEffect function: I have TMM-normalised log-CPM data with covariates such as sex, age and control gene expression. I don't have batch information. Can I leave the batch argument as NULL? Can I pass all my covariates to the covariates argument?
Thanks in advance,
Pratap.
You do have batch information. For this purpose, gender is the batch and age and control gene expression are the covariates. The batch arguments are for categorical confounders and the covariates argument is for continuous confounders.
However, you should not be using removeBatchEffect() because it is not the correct way to do anything you asked about in your original question.
I think you are misunderstanding the process somewhat. The statistical tests for differential expression have been adjusted for the confounders, but this does not require that the data itself be adjusted in any way. The tests are adjusted, not the data.
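To illustrate the point that the tests are adjusted rather than the data, here is a small sketch using base R's `lm` on a single simulated gene. Everything in it (the simulated values, the variable names, the effect sizes) is an assumption for illustration; limma's lmFit applies the same kind of linear model to every gene.

```r
set.seed(1)
n <- 60
group  <- factor(rep(c("MAO", "MHNW", "MHO"), each = n / 3))
gender <- factor(rep(c("F", "M"), times = n / 2))
age    <- round(runif(n, 25, 65))

# Simulated log-expression for one gene: driven by gender and age, not by group.
y <- 5 + 1.5 * (gender == "M") + 0.05 * age + rnorm(n, sd = 0.5)
y_before <- y

# Group coefficients are estimated jointly with the confounders, so the
# group comparisons are adjusted for gender and age.
fit <- lm(y ~ 0 + group + gender + age)
print(round(coef(fit), 2))

# The expression values themselves are unchanged by fitting the model.
print(identical(y, y_before))
```

The adjustment lives in the coefficients and their test statistics; there is no separate "adjusted" copy of the expression values in the fit.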
Thanks, Gordon, for your prompt reply.
I have one more question. In my case, is the 1st coefficient the comparison of MHO with MHNW, the 2nd coefficient the comparison of MAO with MHNW, and the 3rd coefficient the comparison of MAO with MHO? Please let me know if I have misunderstood. Please find my code below.
design.adipose = model.matrix(~AdiposeCPMCountGT1_min20percentsamples$samples$group.1+AdiposeCPMCountGT1_min20percentsamples$samples$Gender)
# To rename column names
colnames(design.adipose)[1] = "MAO"
colnames(design.adipose)[2] = "MHNW"
colnames(design.adipose)[3] = "MHO"
colnames(design.adipose)[4] = "Gender"
# To do comparisons between groups
contr.matrix.Adipose = makeContrasts(MHOvsMHNW = MHO-MHNW,MAOvsMHNW = MAO-MHNW,MAOvsMHO = MAO-MHO,levels = colnames(design.adipose)[1:4])
# Voom function
v = voom(AdiposeDGE,design.adipose,lib.size = colSums(AdiposeDGE$counts)*AdiposeDGE$samples$norm.factors,normalize.method="quantile",plot = TRUE)
You are now asking a new question about contrasts that was not included in your original post. It would be better to post a new question about how to form contrasts rather than tacking additional questions onto replies and comments.