Question

Help with constructing design matrix for edgeR/limma

adrianna_07 • 0

Hi all,

I am unsure how to construct a design matrix for the study below. [image of the study design; not reproduced here]

Our comparison would be contr.matrix <- makeContrasts(Condition = Infertile-Healthy, levels = colnames(design)). We are interested in the average differences driven by human fertility, ideally taking into account the infertile subpopulations (4 levels: RIF, RM, RFL and RIF_RM). PCA shows that Lib_batch accounts for the highest proportion of variance in the data (followed by Pathology), so we want to account for it as well.

However, none of the healthy donors has an associated pathology, so design <- model.matrix(~0 + Condition + Lib_prep + Pathology, data = dgeList$samples) gives linearly dependent columns, i.e. a design matrix that is not of full rank. We saw no significant differential expression when Pathology was not accounted for, and we understand that the sample size of each subpopulation is too small for it to be assessed independently. Could you offer some advice on constructing an appropriate design matrix for this study?
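For readers who want to see the rank problem concretely, here is a minimal sketch in base R. The sample sheet is entirely hypothetical (group sizes are invented, and it assumes healthy donors are coded with a Pathology level named "Healthy"); Lib_prep is left out to isolate the Condition/Pathology dependency.

```r
# Hypothetical sample sheet: level names and group sizes are illustrative only.
path_levels <- c("Healthy", "RIF", "RM", "RFL", "RIF_RM")
samples <- data.frame(
  Condition = factor(c(rep("Healthy", 4), rep("Infertile", 8)),
                     levels = c("Healthy", "Infertile")),
  Pathology = factor(c(rep("Healthy", 4),
                       rep(c("RIF", "RM", "RFL", "RIF_RM"), each = 2)),
                     levels = path_levels)
)

# Condition and Pathology together: the Infertile indicator equals the sum
# of the four pathology indicators, so one column is redundant.
design_both <- model.matrix(~0 + Condition + Pathology, data = samples)
rank_both <- qr(design_both)$rank

# Pathology alone: every column carries independent information.
design_path <- model.matrix(~Pathology, data = samples)
rank_path <- qr(design_path)$rank

print(c(columns = ncol(design_both), rank = rank_both))
print(c(columns = ncol(design_path), rank = rank_path))
```

Comparing the number of columns with the rank is a quick check to run on any candidate design before fitting.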

Thanks very much!!

Cheers, A

limma design.matrix model.matrix edgeR • 1.2k views

Posted 2.7 years ago by adrianna_07

score 2 · Accepted Answer · 2022-05-27

If the infertile subpopulations have different expression profiles then you need

design <- model.matrix(~Pathology)

and leave Condition out of the model, because it is redundant: it is fully determined by Pathology. You can then form a contrast between the average of the infertile groups and the healthy group.
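As a rough illustration of that contrast, the sketch below uses base R only. It assumes "Healthy" is the reference level of Pathology (so the intercept is the healthy mean and the other coefficients are differences from healthy); the group sizes and the single simulated "gene" are made up for the check.

```r
set.seed(1)

# Hypothetical grouping; "Healthy" as the first level is an assumption.
path_levels <- c("Healthy", "RIF", "RM", "RFL", "RIF_RM")
Pathology <- factor(rep(path_levels, times = c(4, 2, 2, 2, 2)),
                    levels = path_levels)
design <- model.matrix(~Pathology)

# With an intercept, each non-intercept coefficient is (group - Healthy),
# so the average infertile-vs-healthy effect weights each of them by 1/4.
avg_infertile_vs_healthy <- c(0, 1/4, 1/4, 1/4, 1/4)
names(avg_infertile_vs_healthy) <- colnames(design)

# Sanity check on one simulated expression vector.
y <- rnorm(length(Pathology))
coefs <- lm.fit(design, y)$coefficients
estimate <- sum(avg_infertile_vs_healthy * coefs)

group_means <- tapply(y, Pathology, mean)
direct <- mean(group_means[c("RIF", "RM", "RFL", "RIF_RM")]) -
  group_means[["Healthy"]]

print(avg_infertile_vs_healthy)
print(all.equal(estimate, direct))
```

In a limma workflow, a vector like this could be supplied as the contrasts argument of contrasts.fit() after lmFit(); note that it averages the four subgroup effects equally, regardless of how many samples each subgroup has.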

I would also compute array quality weights in limma, to adjust for poorer quality samples.

However, if you have a substantial batch effect associated with Lib_batch, then you are in trouble. As a factor, Lib_batch is highly confounded with Pathology so the two effects can only be separated to a small extent. This problem may be unsolvable.
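One way to gauge how bad the confounding is would be to cross-tabulate the two factors and see how much the variance of a pathology coefficient grows once batch is added to the model. The sketch below uses an invented layout (three pathology levels, two batches named B1 and B2) purely for illustration; the real sample sheet would replace it.

```r
# Hypothetical layout: Healthy only in B1, RM only in B2, RIF split 1 vs 3.
samples <- data.frame(
  Pathology = factor(rep(c("Healthy", "RIF", "RM"), each = 4),
                     levels = c("Healthy", "RIF", "RM")),
  Lib_batch = factor(c(rep("B1", 5), rep("B2", 7)))
)

# Empty or near-empty cells in this table signal confounding.
print(table(samples$Lib_batch, samples$Pathology))

X_path  <- model.matrix(~Pathology, data = samples)
X_batch <- model.matrix(~Pathology + Lib_batch, data = samples)

# Unscaled variance of the RM-vs-Healthy coefficient in each model.
var_without <- solve(crossprod(X_path))["PathologyRM", "PathologyRM"]
var_with    <- solve(crossprod(X_batch))["PathologyRM", "PathologyRM"]

# Ratio > 1 means adjusting for batch costs precision for this comparison.
inflation <- var_with / var_without
print(inflation)
```

Here the design is still of full rank, but the batch effect is estimated only from the few samples of the group that spans both batches, which is why the pathology comparison becomes much less precise.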

In the end, your question here is more of a research question than a software question, so you need to consult with your principal investigator and/or with a senior bioinformatician at your institution.