Competitive gene set testing between two sets of genes
le2336 ▴ 20
@le2336-10789

Hello,

I ran camera() in edgeR to test whether 2 gene sets are highly ranked in my mutant data compared to my wild-type data in terms of differential expression relative to other genes.

design <- model.matrix(~0 + genotype)
contrast <- makeContrasts(mutant - wildtype, levels=design)
camera_test <- camera(y, id_matrix, design=design, contrast = contrast)

wildtype vs mutant   NGenes   Direction   PValue     FDR
Gene set 1           1879     Down        1.92E-20   4.2E-20
Gene set 2           4196     Down        2.76E-13   3.1E-13

To follow up on these results, I would like to test whether the difference in rank between these 2 gene sets is significant, i.e. to test whether Gene set 1 is more significantly downregulated in the mutant than Gene set 2. What is the best way to approach this? Thank you.

       
@gordon-smyth
WEHI, Melbourne, Australia

Well, this isn't a standard thing to do. I guess you could simply compute a two-sample t-test between the test statistics for the two sets. Something like this:

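# Fit the negative binomial GLM and test the mutant - wildtype contrast
fit <- glmFit(y, design)
lrt <- glmLRT(fit, contrast=contrast)
# Signed square root of the likelihood ratio statistic: one z-like score per gene
z <- sign(lrt$table$logFC) * sqrt(lrt$table$LR)
# Pull out the scores for each gene set
z.geneset1 <- z[ id_matrix[["Gene set 1"]] ]
z.geneset2 <- z[ id_matrix[["Gene set 2"]] ]
# Two-sample t-test comparing the two sets of scores
t.test(z.geneset1, z.geneset2)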
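
For readers wondering why the signed square root is used: for a single contrast the likelihood ratio statistic has 1 degree of freedom, so its signed root behaves like a z-score. The following is only a sketch with simulated numbers (z_true, LR and logFC here are made up for illustration, not edgeR output), showing that the construction recovers the z-scores and gives the same p-values:

```r
# Simulated illustration only: these values are made up, not edgeR output.
set.seed(1)
n_genes <- 1000
z_true  <- rnorm(n_genes)   # null z-scores for hypothetical genes
LR      <- z_true^2         # corresponding 1-df likelihood ratio statistics
logFC   <- 0.5 * z_true     # any quantity that shares the sign of the effect

# Same construction as in the answer: signed square root of the LR statistic
z <- sign(logFC) * sqrt(LR)

# The signed root recovers the original z-scores (up to rounding error)
max_abs_diff <- max(abs(z - z_true))

# The two-sided normal p-value of z matches the chi-square p-value of LR
p_chisq  <- pchisq(LR, df = 1, lower.tail = FALSE)
p_normal <- 2 * pnorm(-abs(z))
```

With real data, z would come from lrt$table as in the code above; the simulation only illustrates the relationship between LR and z.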

You could also visualize the differences using:

barcodeplot(z, index=id_matrix[["Gene set 1"]], index2=id_matrix[["Gene set 2"]] )

 

 


Hi Gordon,

Thank you for your response. I ran the t-tests and every result is "p-value < 2.2e-16". However, the barcode plots suggest that Gene set 1 tends to have more downregulated genes, with more negative log-fold-changes, than Gene set 2. I wonder whether the two-sample t-test is too sensitive to be informative about the size of this difference.

Would the following type of comparison be reasonable? Within each genotype, I first obtain a test statistic for Gene set 1 and Gene set 2. Using these values, I perform a second comparison of the test statistics between genotypes. The comparison would thus be: Mutant(Gene set 1 vs Gene set 2) vs wildtype (Gene set 1 vs Gene set 2).
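
One possible reading of this proposal, and it is only an assumption because the post does not say which within-genotype statistic is meant, is that the statistic is the difference in average log-expression between the two sets. Under that reading, the "difference of differences" is arithmetically the same quantity as the between-set difference in mean log-fold-change. A small sketch with simulated values (wt, mut, set1 and set2 are hypothetical names and numbers):

```r
# Hypothetical per-gene average log2-expression values; all numbers are simulated.
set.seed(2)
n_genes <- 500
wt  <- rnorm(n_genes, mean = 5)                    # wild-type mean log-expression
mut <- wt + rnorm(n_genes, mean = -0.3, sd = 0.5)  # mutant mean log-expression
set1 <- 1:100      # hypothetical index vector for "Gene set 1"
set2 <- 101:300    # hypothetical index vector for "Gene set 2"

# (set 1 vs set 2 within mutant) minus (set 1 vs set 2 within wild-type)
diff_of_diffs <- (mean(mut[set1]) - mean(mut[set2])) -
                 (mean(wt[set1])  - mean(wt[set2]))

# Set 1 vs set 2 difference in mean log-fold-change (mutant - wild-type)
logFC <- mut - wt
diff_in_logFC <- mean(logFC[set1]) - mean(logFC[set2])
```

The two quantities agree up to rounding, so under this reading the two-stage comparison would target the same kind of between-set difference in mutant-vs-wildtype statistics as the comparison suggested earlier in the thread. A different choice of within-genotype statistic could of course behave differently.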


I don't understand what you mean. You say that you ran multiple t-tests, but I advised you to do only one t-test.


My apologies for the confusion! I did only run one t-test as you recommended for this specific comparison, and obtained "p-value < 2.2e-16" as the result. The other t-tests I mentioned were run for other gene sets in the same dataset, again with only one t-test per comparison -- in those instances I also obtained "p-value < 2.2e-16". Perhaps this is why this isn't a standard thing to do. Many thanks again for your help.
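
A note on "p-value < 2.2e-16": this is how R prints any p-value below machine epsilon, so identical printed results do not mean identical tests. With thousands of genes per set, even a modest shift in mean can fall below that floor, and the t-test also treats genes as independent observations, which camera is designed not to assume. The sketch below uses simulated z-statistics (the means and the names z1, z2 are assumptions; only the set sizes are taken from the NGenes column above) to show how the stored p-value and the effect size might be extracted:

```r
# Simulated z-statistics only; set sizes mirror the NGenes column in the question.
set.seed(3)
z1 <- rnorm(1879, mean = -1.0)   # hypothetical statistics for "Gene set 1"
z2 <- rnorm(4196, mean = -0.6)   # hypothetical statistics for "Gene set 2"

tt <- t.test(z1, z2)             # Welch two-sample t-test (R's default)

# print(tt) reports "p-value < 2.2e-16" when the p-value is below machine epsilon
below_floor <- tt$p.value < .Machine$double.eps

# The stored p-value and the effect size remain available from the result object
exact_p   <- tt$p.value
mean_diff <- unname(tt$estimate[1] - tt$estimate[2])  # mean(z1) - mean(z2)
ci        <- tt$conf.int                               # 95% CI for that difference
```

Comparing mean_diff and ci across gene-set pairs may be more informative than the printed p-values when every test falls below the printing floor.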

I am still curious if the alternative comparison I described makes sense to perform, specifically for Gene set 1 and another set of genes that do not change in my mutant. As this is now a separate analysis that deviates from the original question, I could open this as a new question if you would prefer.
