In the MAITAnalysis vignette, section 5, a competitive gene set enrichment is performed
zlmCond <- zlm(~condition + cngeneson, sca)
# bootstrap, resampling cells
# R should be set to >50 if you were doing this for real.
boots <- bootVcov1(zlmCond, R=4)
gsea <- gseaAfterBoot(zlmCond, boots, sets_indices, CoefficientHypothesis("conditionStim"))
Where sets_indices
is a list of indices of genes in sca
that provides the different gene sets. This tests whether in cells with the same value of cngeneson
(the cellular detection rate), the difference between `condition=='Stim'` and `condition=='Unstim'` is different between the geneset and the background. `zlmCond` is passed to `gseaAfterBoot` because it contains "true" values of the `conditionStim` coefficients. According to the documentation ?gseaAfterBoot
:
A 4D array is returned, with dimensions "set" (each module), "comp" ('disc'rete or 'cont'inuous), "metric" ('stat' gives the average of the coefficient, 'var' gives the variance of that average, 'dof' gives the number of genes that were actually tested in the set), "group" ('test' for the genes in test-set, "null" for all genes outside the test-set).
The @tests
slot is not intended to be user-facing and doesn't have an accessor method. Instead, it's recommended to call calcZ
or summary
. But we can verify the documentation is accurate using the example provided in the ?gseaAfterBoot
help.
data(vbetaFA)
vb1 = subset(vbetaFA, ncells==1)
vb1 = vb1[,freq(vb1)>.1][1:15,]
zf = zlm(~Stim.Condition, vb1)
boots = bootVcov1(zf, 5)
sets=list(A=1:5, B=3:10, C=15, D=1:5)
gsea=gseaAfterBoot(zf, boots, sets, CoefficientHypothesis('Stim.ConditionUnstim'))
dimnames(gsea@tests)
$set [1] "A" "B" "C" "D"
$comp [1] "cont" "disc"
$metric [1] "stat" "var" "dof" "avgCor"
$group [1] "test" "null"
So this is a 4 x 2 x 4 x 2 array with leading dimension giving the gene set, second dimension giving the model component, 3rd dimension collecting statistics from the bootstraps, and 4th dimension holding the values from within the gene set vs the background. The components "stat", "var", "dof" and "avgCor" represent the mean, variance, degrees of freedom and average correlation for a gene set, model component and gene-group combination.
For example
gsea@tests['A','cont','stat','test']
is the set A, continuous, mean of the test group.
Thanks Andrew! I think I get the idea.