Question

EdgeR design matrix with 3 factors

jackiesalzbank • 0

@jackiesalzbank-24090

Hello, I am running a GLM analysis with edgeR on 64 biologically independent samples with 3 factors: sex (M and F), age (P0, P7, P15, P30), and genotype (Control, KO). I have two separate questions: 1) I would like to look at the main effect of genotype over time, and 2) I would like to look at the interaction between sex and genotype. I started by pasting all the factors together to use as a single factor without an intercept. I am having trouble understanding what exactly each coefficient means and which contrasts can be used to answer my questions.

> design <- model.matrix(~0 + grouping)
> colnames(design)
 [1] "groupingControl0F."  "groupingControl0M."  "groupingControl15F." "groupingControl15M." "groupingControl30F." "groupingControl30M."
 [7] "groupingControl7F."  "groupingControl7M."  "groupingKO0F."       "groupingKO0M."       "groupingKO15F."      "groupingKO15M."     
[13] "groupingKO30F."      "groupingKO30M."      "groupingKO7F."       "groupingKO7M."      
>

For example, if I contrast groupingControl0F. and groupingControl0M., does this mean I am holding genotype and age constant and looking at the main effect of sex?

Any help you can provide would be greatly appreciated!

edgeR glm RNAseq • 1.2k views

ADD COMMENT • link updated 4.2 years ago by James W. MacDonald 67k • written 4.2 years ago by jackiesalzbank • 0

score 2 · Answer 1 · 2020-09-14

James W. MacDonald 67k

@james-w-macdonald-5106

You are fitting what I was taught to call a cell means model, meaning you are simply computing the mean of each 'cell' or group of samples. So for example, groupingControl0F is simply the mean of the female controls at P0 (on the log scale, since edgeR fits a log-link GLM). Given that, you can make any comparisons you want.
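As a rough sketch of what that cell means design looks like, assuming a balanced layout of 4 replicates per group (64 samples / 16 groups, which the question does not actually state) and a made-up sample sheet called `samples`:

```r
# Hypothetical sample sheet: 2 genotypes x 4 ages x 2 sexes x 4 replicates = 64.
# The replicate count per cell is an assumption, not something from the post.
samples <- expand.grid(
  rep      = 1:4,
  sex      = c("F", "M"),
  age      = c("0", "7", "15", "30"),
  genotype = c("Control", "KO"),
  stringsAsFactors = FALSE
)

# Paste the three factors into one grouping factor, one level per cell.
grouping <- factor(paste0(samples$genotype, samples$age, samples$sex))

# No-intercept design: one indicator column per cell, so each coefficient
# is that cell's own (log) mean rather than a difference from a baseline.
design <- model.matrix(~ 0 + grouping)

colnames(design)   # "groupingControl0F", "groupingControl0M", ...
colSums(design)    # number of samples contributing to each cell
```

The column names here have no trailing dot, unlike the ones in the question; whatever names your own `colnames(design)` returns are the ones to use in contrasts.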

For instance, let's say you want to know if there are any differences between male and female controls at P0. In that case you want

groupingControl0M - groupingControl0F

And the contrast matrix for that would be

> matrix(c(-1,1,rep(0,14)), ncol = 1)
      [,1]
 [1,]   -1
 [2,]    1
 [3,]    0
 [4,]    0
 [5,]    0
 [6,]    0
 [7,]    0
 [8,]    0
 [9,]    0
[10,]    0
[11,]    0
[12,]    0
[13,]    0
[14,]    0
[15,]    0
[16,]    0

The first two coefficients are the ones of interest, so you zero out the others. This is all just simple algebra. If you want the interaction of KO and sex at P0, it's (groupingKO0M - groupingControl0M) - (groupingKO0F - groupingControl0F), which is the same thing as groupingKO0M - groupingControl0M - groupingKO0F + groupingControl0F; you just construct a column of your contrast matrix with two 1s and two -1s at the correct positions to make that comparison (noting that the rows of your contrast matrix correspond to the columns of your design matrix). So

> matrix(c(1,-1,rep(0,6),-1,1,rep(0,6)), ncol = 1)
      [,1]
 [1,]    1
 [2,]   -1
 [3,]    0
 [4,]    0
 [5,]    0
 [6,]    0
 [7,]    0
 [8,]    0
 [9,]   -1
[10,]    1
[11,]    0
[12,]    0
[13,]    0
[14,]    0
[15,]    0
[16,]    0

And to make sure you're getting what you want, you can do silly things like

> z <- colnames(design)
> zz <- matrix(c(1,-1,rep(0,6),-1,1,rep(0,6)), ncol = 1)
> paste0(zz[as.logical(zz)], z[as.logical(zz)])
[1] "1groupingControl0F"  "-1groupingControl0M" "-1groupingKO0F"   
[4] "1groupingKO0M"

Which confirms that the specified contrast column does the comparison I wanted.
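Counting positions by hand is easy to get wrong, so here is a minimal sketch of building the same contrast by coefficient name instead. `make_contrast` is a hypothetical helper, not an edgeR or limma function, and the coefficient names are assumed to match the ones printed above (no trailing dots).

```r
# Coefficient names in the order shown above (an assumption about your design).
cells <- paste0(rep(c("Control", "KO"), each = 8),
                rep(rep(c("0", "15", "30", "7"), each = 2), times = 2),
                rep(c("F", "M"), times = 8))
coef_names <- paste0("grouping", cells)

# Hypothetical helper: turn named weights into one contrast column.
make_contrast <- function(weights, coef_names) {
  stopifnot(all(names(weights) %in% coef_names))   # catch misspelled names
  con <- setNames(numeric(length(coef_names)), coef_names)
  con[names(weights)] <- weights
  matrix(con, ncol = 1, dimnames = list(coef_names, NULL))
}

# Sex-by-genotype interaction at P0:
# (KO0M - Control0M) - (KO0F - Control0F)
inter_P0 <- make_contrast(
  c(groupingKO0M = 1, groupingControl0M = -1,
    groupingKO0F = -1, groupingControl0F = 1),
  coef_names
)

# Compare against the positional version.
by_position <- c(1, -1, rep(0, 6), -1, 1, rep(0, 6))
identical(as.vector(inter_P0), by_position)

# A difference of differences always has weights summing to zero.
sum(inter_P0)
```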

ADD COMMENT • link 4.2 years ago James W. MacDonald 67k

Or I suppose you could do something less silly

> row.names(zz) <- z
> zz
                   [,1]
groupingControl0F     1
groupingControl0M    -1
groupingControl15F    0
groupingControl15M    0
groupingControl30F    0
groupingControl30M    0
groupingControl7F     0
groupingControl7M     0
groupingKO0F         -1
groupingKO0M          1
groupingKO15F         0
groupingKO15M         0
groupingKO30F         0
groupingKO30M         0
groupingKO7F          0
groupingKO7M          0

Which is what you would get (plus a descriptive column header) if you used makeContrasts, which might be the easier way to go, assuming you have fewer than a bazillion contrasts to make.

ADD REPLY • link 4.2 years ago James W. MacDonald 67k

Thanks so much for your quick response! Would I be able to use the coefficient terms directly with the makeContrasts() function, or must I construct a contrast matrix by hand in order to test for specific interaction terms?

I am also concerned because it seems from the vignettes that for these contrasts there needs to be a reference term. This would be totally fine for the effects of genotype because controls would be the reference, but for time and for sex I am not sure choosing a reference would be appropriate. For example, for the effect of time, I don't want P0 to be used as the "baseline", as I am interested in the differences that happen at that time point as well.

ADD REPLY • link 4.2 years ago jackiesalzbank • 0

Yes, you can use the coefficient terms directly. You could get the same thing as my second example by doing

makeContrasts((groupingKO0M - groupingControl0M) - (groupingKO0F - groupingControl0F), levels = design)

It's just really boring (to me) if you have lots of contrasts to type all that out, rather than just generating the contrast matrix by hand.
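If you do go the makeContrasts() route, naming each contrast keeps the output readable. A small sketch, assuming a balanced hypothetical sample sheet (4 replicates per cell) and column names without trailing dots; the contrast names `Sex_Control_P0` and `SexByGenotype_P0` are made up for illustration:

```r
library(limma)  # makeContrasts() lives in limma, which edgeR depends on

# Hypothetical design, rebuilt here so the example is self-contained.
samples <- expand.grid(rep = 1:4, sex = c("F", "M"),
                       age = c("0", "7", "15", "30"),
                       genotype = c("Control", "KO"),
                       stringsAsFactors = FALSE)
grouping <- factor(paste0(samples$genotype, samples$age, samples$sex))
design <- model.matrix(~ 0 + grouping)

# One named column per contrast; the names must match colnames(design) exactly.
contrasts <- makeContrasts(
  Sex_Control_P0   = groupingControl0M - groupingControl0F,
  SexByGenotype_P0 = (groupingKO0M - groupingControl0M) -
                     (groupingKO0F - groupingControl0F),
  levels = design
)

# Show only the rows that carry a non-zero weight.
contrasts[rowSums(contrasts != 0) > 0, ]
```

If your column names really do end in a dot, as in the question, the expressions would need the dot too (for example `groupingKO0M.`).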

You are misunderstanding the vignette. Or more correctly, the User's Guide. You don't have to have a reference, but the default for model.matrix is to construct a design matrix using treatment contrasts, where you define a baseline level, and all the other coefficients are contrasts between a given group and the baseline. In your case you will be much better off constructing the design matrix as you have (using ~ 0 + grouping), which tells R you don't want a baseline level.
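To see the difference between the two parameterizations, here is a toy sketch using an ordinary linear model on made-up numbers (not edgeR's negative binomial GLM, and not real data), again assuming 4 replicates per cell. The algebra relating the coefficients is the same idea:

```r
# Hypothetical balanced layout: 16 cells x 4 replicates (an assumption).
samples <- expand.grid(rep = 1:4, sex = c("F", "M"),
                       age = c("0", "7", "15", "30"),
                       genotype = c("Control", "KO"),
                       stringsAsFactors = FALSE)
grouping <- factor(paste0(samples$genotype, samples$age, samples$sex))

# Made-up response, for illustration only.
set.seed(1)
y <- rnorm(nrow(samples))

fit_baseline <- lm(y ~ grouping)      # treatment contrasts: Control0F is the baseline
fit_cells    <- lm(y ~ 0 + grouping)  # cell means: no baseline

b <- coef(fit_baseline)
m <- coef(fit_cells)

# With a baseline, this coefficient is already "Control0M minus Control0F" ...
b["groupingControl0M"]
# ... which is the same number as an explicit contrast of two cell means.
m["groupingControl0M"] - m["groupingControl0F"]

# With cell means, comparisons not involving the baseline are just as easy.
m["groupingControl15F"] - m["groupingControl7F"]
```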

ADD REPLY • link 4.2 years ago James W. MacDonald 67k

Ah, I see. So if I were interested in the effect of time I wouldn't necessarily have to use P0 as the reference? I could contrast how P15 changes from P7, for example?

I suppose I am still a bit confused about how to look at the impact of time.

ADD REPLY • link 4.2 years ago jackiesalzbank • 0

Yes. That's what I meant when I said it's just simple algebra. Any comparison can be made, although it might be difficult to interpret. So you could hypothetically be interested in time-dependent changes between P7 and P15 that are different for males and females, which is, again, simple algebra

(groupingControl15M - groupingControl7M) - (groupingControl15F - groupingControl7F)

or if you want the interaction of KO and time within males, between P15 and P7,

(groupingKO15M - groupingKO7M) - (groupingControl15M - groupingControl7M)
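With many contrasts it can be less tedious to generate them programmatically. The sketch below writes each contrast as arithmetic on indicator vectors; `cell` is a hypothetical helper, the coefficient names are assumed to have no trailing dots, and the last two contrasts are only one possible reading of the original two questions (averaging over sex, or over age), not the only defensible choice.

```r
# Coefficient names in design-column order (an assumption about your design).
cells <- paste0(rep(c("Control", "KO"), each = 8),
                rep(rep(c("0", "15", "30", "7"), each = 2), times = 2),
                rep(c("F", "M"), times = 8))
coef_names <- paste0("grouping", cells)
ages <- c("0", "7", "15", "30")

# Hypothetical helper: indicator vector picking out one cell mean.
cell <- function(genotype, age, sex) {
  nm <- paste0("grouping", genotype, age, sex)
  stopifnot(nm %in% coef_names)
  v <- setNames(numeric(length(coef_names)), coef_names)
  v[nm] <- 1
  v
}

# Sex difference in the P7-to-P15 change, within controls.
sex_time <- (cell("Control", "15", "M") - cell("Control", "7", "M")) -
            (cell("Control", "15", "F") - cell("Control", "7", "F"))

# KO effect on the P7-to-P15 change, within males.
ko_time_M <- (cell("KO", "15", "M") - cell("KO", "7", "M")) -
             (cell("Control", "15", "M") - cell("Control", "7", "M"))

# One reading of question 1: KO vs Control at each age, averaged over sex.
geno_by_age <- sapply(ages, function(a)
  (cell("KO", a, "F") + cell("KO", a, "M")) / 2 -
  (cell("Control", a, "F") + cell("Control", a, "M")) / 2)
colnames(geno_by_age) <- paste0("KOvsControl_P", ages)

# One reading of question 2: sex-by-genotype interaction, averaged over age.
sex_geno <- Reduce(`+`, lapply(ages, function(a)
  (cell("KO", a, "M") - cell("Control", a, "M")) -
  (cell("KO", a, "F") - cell("Control", a, "F")))) / length(ages)

geno_by_age[geno_by_age[, "KOvsControl_P7"] != 0, "KOvsControl_P7"]
```

Each vector (or a single column of `geno_by_age`) can be supplied as the `contrast` argument to edgeR's glmQLFTest() or glmLRT().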

or whatever. But this is starting to get beyond the scope of this support site, and into experimental questions or statistical design. For that you are either going to have to figure out what you care about yourself, or, if you are getting hung up on the design aspects, you will need to either do the required research to know what to do or find someone local who can help you.

ADD REPLY • link 4.2 years ago James W. MacDonald 67k