Equivalence tests with edgeR

roberto.spreafico ▴ 20

Hi,

I was wondering how to use edgeR to perform equivalence tests. I would like to compare two groups and extract genes that are significantly similar. I cannot do that by selecting large p-values from a regular difference test, because absence of evidence is not evidence of absence. I understand the principles of TOST (two one-sided tests) equivalence testing, but I am not sure whether applying it to the negative binomial tests built into edgeR is straightforward.

Thank you for your help,

Roberto

ADD COMMENT • link updated 10.1 years ago by Aaron Lun ★ 28k • written 10.1 years ago by roberto.spreafico ▴ 20

score 5 · Accepted Answer · 2015-04-01

Aaron Lun ★ 28k

This is not an easy thing to do in edgeR, because we cannot work with the log-fold changes directly as we would in limma. Instead, we need to test against the upper and lower log-fold change thresholds of the TOST by shifting the offsets.
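To see why shifting the offsets amounts to testing against a non-zero threshold, here is a minimal base-R sketch. It uses a Poisson GLM on simulated counts as a stand-in for edgeR's negative binomial fit (an assumption made only so that the example runs without edgeR); the variable names and simulated values are hypothetical.

```r
# Simulated two-group data; 'x' plays the role of design[,coef].
set.seed(1)
x <- rep(c(0, 1), each = 50)
counts <- rpois(100, lambda = exp(2 + 0.5 * x))

# Ordinary fit: the coefficient of x is the natural-log fold change.
fit0 <- glm(counts ~ x, family = poisson)

# Same fit with log(3) * x added to the offset.
fit.shift <- glm(counts ~ x + offset(log(3) * x), family = poisson)

# The shifted coefficient is the original one minus log(3), so testing
# "coefficient = 0" in the shifted fit tests "fold change = 3" in the
# original parameterization.
shift <- unname(coef(fit0)["x"] - coef(fit.shift)["x"])
```

Here `shift` should match `log(3)` up to the convergence tolerance of `glm`.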

Assume we have a DGEList named y for which we have estimated the dispersions, a design matrix named design, and a coefficient of interest coef for our DE comparison. We can then test the null hypothesis that the coefficient equals the upper TOST threshold. In this case, we use a 3-fold change as the threshold; the shift is log(3) because edgeR's GLM coefficients are on the natural log scale.

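# Keep a copy of the original offsets so they can be reused for both thresholds.
orig.offs <- getOffset(y)
# Add log(3), scaled by the design column for 'coef', so that testing
# coef = 0 in this fit tests a 3-fold increase in the original model.
y$offset <- orig.offs + log(3)*design[,coef]
fit.up <- glmFit(y, design)
lrt.up <- glmLRT(fit.up, coef=coef)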

We do the same for the lower TOST threshold.

# Subtract log(3) instead, to test against a 3-fold decrease.
y$offset <- orig.offs - log(3)*design[,coef]
fit.down <- glmFit(y, design)
lrt.down <- glmLRT(fit.down, coef=coef)

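We then compute a one-sided p-value for each direction and take the larger of the two for each gene, so that a gene is significant only if its log-fold change is both significantly below the upper TOST boundary and significantly above the lower one. Genes whose estimated log-fold change lies outside the boundaries get a p-value of 1.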

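# One-sided p-value for "log-fold change is below the upper threshold".
p.up <- ifelse(lrt.up$table$logFC < 0, lrt.up$table$PValue/2, 1)
# One-sided p-value for "log-fold change is above the lower threshold".
p.down <- ifelse(lrt.down$table$logFC > 0, lrt.down$table$PValue/2, 1)
# TOST p-value: both one-sided tests must reject, so take the larger.
p.tost <- pmax(p.up, p.down)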

So, as you can see, it's not a pleasant process, and it's pretty conservative. You'll probably need to use fairly wide TOST thresholds to get any significance. If you're planning to apply this genome-wide, you'll also have to apply a BH correction to p.tost.
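As a minimal sketch of that last step, the BH correction can be done with base R's p.adjust. The p-values below are hypothetical placeholders standing in for p.tost, and the 5% FDR cutoff is an assumption, not something fixed by the method.

```r
# Hypothetical TOST p-values, one per gene (placeholders for p.tost).
p.tost <- c(geneA = 0.0004, geneB = 0.03, geneC = 1, geneD = 0.2)

# Benjamini-Hochberg adjustment across all genes tested.
fdr.tost <- p.adjust(p.tost, method = "BH")

# Genes called equivalent at an (assumed) 5% FDR threshold.
equivalent <- names(fdr.tost)[fdr.tost <= 0.05]
```

Genes that received a p-value of 1 because they fell outside the thresholds should stay in the vector, so that the correction accounts for every gene tested.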

ADD COMMENT • link 10.1 years ago Aaron Lun ★ 28k

Thanks Aaron for the awesome solution! And just to make sure: if I need to define contrasts, I would just need to edit the offset of one of the two mates, right?

ADD REPLY • link 10.1 years ago roberto.spreafico ▴ 20

I'm not entirely sure what you mean by the mates, but the problem becomes more complicated if you have a contrast vector instead of simply dropping a coefficient. I'd suggest reparameterizing your design matrix such that your DE comparison can be expressed by just dropping a coefficient. This should be easy enough to do for most comparisons. Otherwise, you'll have to do something like what glmLRT does internally for contrasts, which involves some complicated stuff with QR decompositions to get the null design matrix.
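A minimal sketch of such a reparameterization, assuming a hypothetical three-group experiment (A, B, C) where the comparison of interest is B versus C. With a group-means design this would need a contrast, but releveling the factor turns it into a single coefficient:

```r
# Hypothetical grouping: two samples each in groups A, B and C.
group <- factor(c("A", "A", "B", "B", "C", "C"))

# Group-means design: B vs C would require a contrast vector.
design.means <- model.matrix(~0 + group)

# Make C the reference level, so that the coefficient "group2B"
# directly represents the B-vs-C log-fold change.
group2 <- relevel(group, ref = "C")
design <- model.matrix(~group2)
coef <- "group2B"

# design[,coef] is 1 for the B samples and 0 elsewhere, which is the
# vector used to shift the offsets.
design[, coef]
```

Both designs describe the same fitted model; only the meaning of the coefficients changes.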

ADD REPLY • link 10.1 years ago Aaron Lun ★ 28k

Cool! I'll change my design matrix as you suggest then. Thanks a lot for your suggestions!

ADD REPLY • link 10.1 years ago roberto.spreafico ▴ 20