Dear all,
I would like to get some advice over how to do a post hoc test on a nested interaction model. This is related to another question I asked previously ( Nested design in edgeR ) .
The question is similar to the previous one (with higher number of levels), we would like to know how do different brain regions (13 regions) correlate to the level of a protein B measure by immuno-staining.
Here is the current approach i am taking:
## Nested model B_region_design=model.matrix(~0+region+B:region,meta) ## calculate commond, trend, tagwise dispersion for EdgeR model fitting subcounts_disp=estimateDisp(subcounts,B_region_design,robust=T) fit=glmQLFit(design = B_region_design,y = subcounts_disp,robust = T) ## interaction coefficient starts from column 14 to 26 res_QL=glmQLFTest(fit,coef =14:26) tb=topTags(res_QL,n= Inf )$table ## create a contrast matrix with pairwise comparison between all regions head(contrast)
COM:B-AUD:B |
CTXsp:B-AUD:B |
ENTI:B-AUD:B |
FB:B-AUD:B |
HPd:B-AUD:B |
HPs:B-AUD:B |
HY:B-AUD:B |
OLF:B-AUD:B |
PTL:B-AUD:B |
RSP:B-AUD:B |
⋯ |
PTL:B-OLF:B |
RSP:B-OLF:B |
SSp:B-OLF:B |
TH:B-OLF:B |
RSP:B-PTL:B |
SSp:B-PTL:B |
TH:B-PTL:B |
SSp:B-RSP:B |
TH:B-RSP:B |
TH:B-SSp:B |
|
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AUD |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
⋯ |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
COM |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
⋯ |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
CTXsp |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
⋯ |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
ENTI |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
⋯ |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
FB |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
⋯ |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
HPd |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
⋯ |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
##
res_QL_contrast=glmQLFTest(fit,contrast = contrast)
I used res_QL_contrast
to find genes which have a regional difference in association with protein B, and the topTags
outputs a table with contains logFC
for each comparison and a single pvalue
.
So i guess any significant test will only indicate that there is at least one comparison (between two regions) are different.
And to know which two regions are significantly different, I have to test on individual contrast separately.
I am planning to do something as the following :
toptags_list[] for (i in 1:ncol(contrast){ res_list=glmQLFTest(fit,contrast = level2_contrast[,i]) toptags_list[i]=topTags(res_list,n=Inf)$table }
And after finished testing all comparisons, I will concatenate all the tables in the toptags_list
and run a global FDR p-values correction.
I would like to know if this a valid way to do post-hoc comparison.
And also is there a better way of doing this analysis, as my data is very large (40000 genes 4700 samples), and doing all these comparison in one combined contrast already taking a long time, not mentioning if I want to do it by individual columns.
Thank you very much in advance,
Ashley