Two-way interaction, linear redundancy
1
1
Entering edit mode
Sindre ▴ 110
@sindre-6193
Last seen 4.4 years ago

I want to conduct an (example) experiment. I have 3 skinny rats and 3 fat rats. First, I give them a standardized meal and will take blood samples to measure levels of protein "X" before and after this meal. Second, all rats undergoes weight loss for 1 week. Third, after weight loss, I again give them a standardized meal and will take blood samples to measure levels of protein "X" before and after this meal.

Here is the design:

rats <- factor(rep(rep(1:3,each=2),4)) # 3 rats in each group
weightloss <- factor(rep(c("Start", "End"), each=12), levels = c("Start","End"))
meal <- factor(rep(c("Before", "After"), 12),  levels = c("Before","After"))
group <- rep(factor(rep(c("Skinny", "Fat"), levels = c("Skinny","Fat"), each=6)),2)

There are a few questions I want to answer:
1. The response in protein "X" levels in skinny mice before weight loss: Simple paired design
2. The response in protein "X" levels in skinny mice after weight loss: Simple paired design
3. The difference in response in protein "X" levels in skinny mice before vs. after weight loss: Interaction
4. The response in protein "X" levels in fat mice before weight loss: Simple paired design
5. The response in protein "X" levels in fat mice after weight loss: Simple paired design
6. The difference in response in protein "X" levels in fat mice before vs. after weight loss: Interaction
7. The difference in response in protein "X" levels in fat mice before vs. after weight loss different from the difference in response in protein "X" levels in skinny mice before vs. after weight loss: interaction of the interactions


I have no problems with comparisons 1:6, only 7 gives me trouble. Here is my attempt:

design <- model.matrix(~0+group:meal:weightloss+rats)
colnames(design)[4:11] <- gsub(colnames(design)[4:11], pattern=":", replacement=".")
library("caret")
findLinearCombos(design) # Column 11 gives me a problem

# The intuitive way is to write this interaction of interactions formula as follows:
((groupFat.mealAfter.weightlossEnd-groupFat.mealBefore.weightlossEnd)-((groupFat.mealAfter.weightlossStart-groupFat.mealBefore.weightlossStart))-((groupSkinny.mealAfter.weightlossEnd-groupSkinny.mealBefore.weightlossEnd)-(groupSkinny.mealAfter.weightlossStart-groupSkinny.mealBefore.weightlossStart)))

However, I am not quite sure how to re-write it to remove the linear redundant column 11? Could it be like this:

design <- design[,!grepl("Before", colnames(design))] # get to full rank

# Then the interaction of interactions is the following:
(groupFat.mealAfter.weightlossEnd-groupSkinny.mealAfter.weightlossEnd)-(groupFat.mealAfter.weightlossStart-groupSkinny.mealAfter.weightlossStart)

Correct?

limma caret lm lmer lme4 • 1.1k views
ADD COMMENT
3
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 2 hours ago
The city by the bay

I wonder what makes people inclined to use interaction terms, when their lives could be so much easier.

G <- paste(group, meal, weightloss, sep="_")

# This is what your rats really are, Skinny "1" != Fat "1"
true.rats <- paste0(group, rats)

G <- relevel(factor(G), "Fat_Before_Start") # to ensure model.matrix behaves
design <- model.matrix(~0 + true.rats + G)
design <- design[,!grepl("Before_Start", colnames(design))] # get to full rank

The first 6 columns represent rat-specific blocking factors - more specifically, the fitted log-expression of each rat before the meal at the start of the experiment. The remaining coefficients represent the average log-fold change of each treatment relative to the before-start baseline.

(By relevel'ing G, and then removing the other Before_Start coefficient from design, we shaped the design matrix to achieve this interpretation for the coefficients. Technically, we could have done ~ 0 + G + true.rats, but I have deliberately formulated it in this manner to ensure that the coefficients of interest are nested within the true.rats levels. This means that any comparisons between G coefficients are comparisons of responses. The ~0 + G + true.rats approach invites direct comparisons between G coefficients that represent contrasts of average log-expression levels in the first skinny rat to the first fat rat - this is unlikely to be what was intended.)

At this point, your contrasts are easy, so I'll cut to the chase.

The difference in response in protein "X" levels in fat mice before vs. after weight loss different from the difference in response in protein "X" levels in skinny mice before vs. after weight loss

I will omit the makeContrasts call for clarity.

# Response in fat at the start of the week.
GFat_After_Start

# Response in fat at the end of the week.
GFat_After_End - GFat_Before_End

# Difference in response in fat between the start and end of the week.
(GFat_After_End - GFat_Before_End) - GFat_After_Start

# Difference in differences in response between fat and skinny.
((GFat_After_End - GFat_Before_End) - GFat_After_Start)
  - (GSkinny_After_End - GSkinny_Before_End) - GSkinny_After_Start)

Probably best not to use the phrasing "after weight loss", because that's easily confused with "after the meal".

ADD COMMENT
0
Entering edit mode

Wow. That was advanced, and beautiful! I am glad this gives the same result as my approach, but I learned something new. Thank you!

ADD REPLY

Login before adding your answer.

Traffic: 645 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6