Automatically generate contrasts from a model matrix / formula
2
0
Entering edit mode
Jonathan ▴ 10
@c31cf0e5
Last seen 5 days ago
United States

I wonder whether there is a function/tool/package that will generate a contrast matrix from a model-matrix or a formula, without manually specifying the compared groups.
For example, assuming 3 groups A, B, C and given a formula such as ~0 + Group + Age + Sex and/or its generated model.matrix(), I'm looking for a way to generate a contrast matrix such as

GroupAvsB GroupAvsC GroupBvsC
GroupA 1 1 0
GroupB -1 0 1
GroupC 0 -1 -1
Age 0 0 0
Sex 0 0 0

This can be even more complicated if a formula has interacting terms such as ~ 0 + Group*Sex + Age, or when there is a substantial number of groups.

The CRAN package emmeans is doing a similar parsing of a model:

pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
pairwise.table <- emmeans(pigs.lm, pairwise ~ source)[[2]]
coef(pairwise.table) %>% magrittr::set_colnames(c(colnames(.)[1], summary(pairwise.table)$contrast))

#>      source fish - soy fish - skim soy - skim
#> fish   fish          1           1          0
#> soy     soy         -1           0          1
#> skim   skim          0          -1         -1
limma • 384 views
ADD COMMENT
1
Entering edit mode
@gordon-smyth
Last seen 6 minutes ago
WEHI, Melbourne, Australia

I am very concerned that you are attempting to extract pairwise comparisons from an interaction formula like 0 + Group*Sex + Age, because doing so is not statistically meaningfull. It is not a statistically correct analysis. You cannot analyse "main effects" in an interaction model without resolving the interactions first.

I have explained before on this forum that factorial interaction models are not usually helpful in genomic analyses. In your example, a much more biologically meaningful approach is usually to convert Group and Sex into a factor with 6 levels and to extract Sex-specific group comparisons. The limma makeContrasts() function makes this sort of analysis explicit and straightforward. By contrast, the statistical concepts of "main effects" do not correspond to anything that could be published in a good medical journal.

makeContrasts()` requires explicit contrasts because I think it is helpful for users to understand the comparisons they are making rather than extracting comparisons they might not understand from a complex model.

limma allows users to fit completely general linear models, more general than can be generated using R's model formulas. In general, it is not possible to predict what contrasts will be meaningful for the user, so the idea of automatically generating contrasts is simply not possible. Analyses need to be designed with the scientific hypotheses in mind. It is not the case that there is a standard universal analysis for each model formula.

In the simple case of an additive Group variable, it would be possible to automatically generate pairwise comparisons but I have been reluctant to do so. In the three group comparison you mention at the start of your question, it is the quite easy to specify the contrasts using makeContrasts() and the resulting code would be shorter and easier to understand than the emmeans code that you show. If the Group has a substantial number of levels, then I don't really want to encourage pairwise comparisons because I think that will generate a confusing analysis that is hard to interpret. I'd rather force users to think about what comparisons are most important, or perhaps do an F-test.

ADD COMMENT
1
Entering edit mode
ATpoint ★ 4.6k
@atpoint-13662
Last seen 16 hours ago
Germany

You only need a few lines of code to make all pairwise unique contrasts automatically. This assumes that you are using syntactically valid names all the way.

group <- c("A", "A", "B", "B", "C", "C")
sex <- c("m", "f", "m", "f", "m", "f")
age <- rnorm(6, 30, 5)

design <- model.matrix(~0 + group + sex + age + sex:age)
colnames(design) <- gsub("^group", "", colnames(design))
design
#>   A B C sexm      age sexm:age
#> 1 1 0 0    1 35.86183 35.86183
#> 2 1 0 0    0 26.52460  0.00000
#> 3 0 1 0    1 39.94285 39.94285
#> 4 0 1 0    0 35.35639  0.00000
#> 5 0 0 1    1 26.20887 26.20887
#> 6 0 0 1    0 26.35095  0.00000
#> attr(,"assign")
#> [1] 1 1 1 2 3 4
#> attr(,"contrasts")
#> attr(,"contrasts")$group
#> [1] "contr.treatment"
#> 
#> attr(,"contrasts")$sex
#> [1] "contr.treatment"

# Use combn() to get all unique pairwise combinations of the group levels
combinations_group <- combn(x = unique(group), m = 2, function(x) paste(x[1], x[2], sep = "-"))
combinations_group
#> [1] "A-B" "A-C" "B-C"

# Use limma to make the contrasts matrix based on the unique groupwise combinations and the design matrix
contrasts <- limma::makeContrasts(contrasts = combinations_group, levels = make.names(colnames(design)))
contrasts
#>           Contrasts
#> Levels     A-B A-C B-C
#>   A          1   1   0
#>   B         -1   0   1
#>   C          0  -1  -1
#>   sexm       0   0   0
#>   age        0   0   0
#>   sexm.age   0   0   0
ADD COMMENT

Login before adding your answer.

Traffic: 386 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6