Edwin Groot
Hello all,
For calculating the significance of the overlap of 2 gene lists I use
either the Fisher exact test or the hypergeometric distribution.
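For reference, the two-list case can be sketched as follows. The numbers (2000 genes in total, lists of 50 and 100 genes, 5 genes shared) and the variable names are assumptions chosen purely for illustration. A one-sided Fisher exact test and the upper tail of the hypergeometric distribution should give the same p-value:

```r
# Illustrative numbers only (assumed, not taken from a real experiment)
n.total <- 2000   # genes in the experiment
n.list1 <- 50     # genes in list 1
n.list2 <- 100    # genes in list 2
n.both  <- 5      # genes found in both lists

# 2x2 table, filled column by column:
# rows = list 1 (in/not), columns = list 2 (in/not)
tab2 <- matrix(c(n.both, n.list2 - n.both,
                 n.list1 - n.both, n.total - n.list1 - n.list2 + n.both),
               nrow = 2,
               dimnames = list(List1 = c("in1", "not1"),
                               List2 = c("in2", "not2")))

# One-sided Fisher exact test for enrichment of the overlap
p.fisher <- fisher.test(tab2, alternative = "greater")$p.value

# P(X >= n.both) when drawing n.list1 genes from a pool that contains
# n.list2 "list 2" genes and n.total - n.list2 other genes
p.hyper <- phyper(n.both - 1, n.list2, n.total - n.list2, n.list1,
                  lower.tail = FALSE)

all.equal(p.fisher, p.hyper)
```

The `n.both - 1` in `phyper` is there because `lower.tail = FALSE` gives P(X > q), so subtracting one includes the observed overlap itself.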
When I try this on the overlap of 3 gene lists, I have to make a 2x2x2
contingency table of what is IN or NOT IN each list. Unfortunately, the
Fisher test cannot handle multidimensional contingency tables.
Can someone with a statistics background check whether the Mantel-Haenszel
test can be used to calculate the significance of the overlap of 3 gene
lists? My sample code is below.
# Here is a case that has an overlap close to that expected by chance.
# The experiment has 2000 genes, and 3 lists of 50, 100 and 200 genes
# each. In the overlap of all 3 lists, 5 genes are found.
> c.table <- c(5,15, 5,175, 0,80, 40,1680)
> dim(c.table) <- c(2,2,2)
> dimnames(c.table) <- list(List1=c("in1","not1"),
+ List2=c("in2","not2"), List3=c("in3","not3"))
> c.table
, , List3 = in3

      List2
List1  in2 not2
  in1    5    5
  not1  15  175

, , List3 = not3

      List2
List1  in2 not2
  in1    0   40
  not1  80 1680

> sum(c.table)
[1] 2000
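As a sanity check on the table, the list sizes can be recovered from its margins, and the observed triple overlap can be set next to the count one would expect if membership in the three lists were mutually independent. The independence assumption and the variable names below are illustrative assumptions, not part of the original session:

```r
# Rebuild the 2x2x2 table so this snippet runs on its own
c.table <- array(c(5, 15, 5, 175, 0, 80, 40, 1680), dim = c(2, 2, 2),
                 dimnames = list(List1 = c("in1", "not1"),
                                 List2 = c("in2", "not2"),
                                 List3 = c("in3", "not3")))

n.genes <- sum(c.table)

# Size of each list = margin total of its "in" level
list.sizes <- c(List1 = unname(margin.table(c.table, 1)["in1"]),
                List2 = unname(margin.table(c.table, 2)["in2"]),
                List3 = unname(margin.table(c.table, 3)["in3"]))
list.sizes

# Observed number of genes present in all three lists
observed.triple <- c.table["in1", "in2", "in3"]

# Expected triple overlap if the three memberships were mutually
# independent (an assumption made only for this check)
expected.triple <- n.genes * prod(list.sizes / n.genes)

c(observed = observed.triple, expected = expected.triple)
```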
> fisher.test(c.table)
Error in fisher.test(c.table) : if 'x' is not a matrix, 'y' must be
given
> mantelhaen.test(c.table)

        Mantel-Haenszel chi-squared test with continuity correction

data:  c.table
Mantel-Haenszel X-squared = 1.1764, df = 1, p-value = 0.2781
alternative hypothesis: true common odds ratio is not equal to 1
95 percent confidence interval:
 0.764363 5.403287
sample estimates:
common odds ratio
         2.032258

The test tells me that the overlap is probably due to chance alone,
which is what I expected.
From what I understand, the Mantel-Haenszel test checks whether the
overlap of List 1 and List 2 is independent of List 3. To me, that
seems like extending the Fisher test to a third dimension (gene list).
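To make the conditioning explicit: according to its help page, `mantelhaen.test` treats the third dimension as strata and tests whether the first two factors are conditionally independent within each stratum, assuming no three-way interaction. The sketch below (the helper `odds.ratio` is a hypothetical name introduced here) prints the List1-by-List2 odds ratio separately for genes in and not in List3, next to the pooled Mantel-Haenszel estimate:

```r
# Rebuild the 2x2x2 table so this snippet runs on its own
c.table <- array(c(5, 15, 5, 175, 0, 80, 40, 1680), dim = c(2, 2, 2),
                 dimnames = list(List1 = c("in1", "not1"),
                                 List2 = c("in2", "not2"),
                                 List3 = c("in3", "not3")))

# Sample odds ratio of a single 2x2 table (hypothetical helper)
odds.ratio <- function(m) (m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1])

# One odds ratio per List3 stratum ("in3" and "not3")
stratum.or <- apply(c.table, 3, odds.ratio)
stratum.or

# Pooled (common) odds ratio reported by the Mantel-Haenszel test
common.or <- unname(mantelhaen.test(c.table)$estimate)
common.or
```

If the per-stratum odds ratios differ a lot, a single common odds ratio may not summarise the table well, which is worth keeping in mind when reading the p-value.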
Thanks for your insights,
Edwin
--
Dr. Edwin Groot, postdoctoral associate
AG Laux
Institut fuer Biologie III
Schaenzlestr. 1
79104 Freiburg, Deutschland
+49 761-2032948