Hi all,
I am doing a DE analysis of transcriptomics data (Quantseq) where I compare 2 groups consisting of 300 patients in each group. I have normalized my libraries for library size and composition, have filtered out lowly expressed genes (with a count of <10) and did a glmQLFTest.
Using the decideTests function I get the following output:
summary(decideTests(glf))
-1*x 1*y
Down 414
NotSig 13509
Up 638
If I then use the glmTreat function where I use a FC cut-off of 1.5 (or logFC cut-off of 0.58), I only retain 19 genes!
tr <- glmTreat(fit, contrast = xvsy, lfc = log2(1.5))
is.de <- decideTestsDGE(tr)
summary(is.de)
-1*x 1*y
Down 9
NotSig 14542
Up 10
So my question is: how can it be that there are only this ''many" DE genes left? Of course a conclusion can be that the groups are more similar than previous thought, but I think that is highly unlikely. Are there other factors that might have influenced this result? Can I for example take an other lfc cut-off? And can I also take other (clinical) factors into account with the glmQLFTest (for example, age, comorbidities etc.)
Furthermore, in the result are some genes with a logFC of >0.58 and <-0.58, but with a FDR of >0.05, so I assume this is the reason these are not taken into account in the decideTests table for DE genes. In the below table you can see the downregulated genes and there are 28 downregulated genes with a logFC of <-0.58.
> test <- topTags(tr, n = Inf)
> sort(test$table$logFC)
[1] -2.6109802 -1.7722134 -1.4860788 -1.4045712 -1.2123096 -1.0688111 -0.9362862 -0.9275250 -0.8460224 -0.7968878 -0.7695722
[12] -0.7607765 -0.7589310 -0.7300097 -0.7214861 -0.6902481 -0.6802954 -0.6769655 -0.6632714 -0.6614588 -0.6494449 -0.6491878
[23] -0.6425783 -0.6393694 -0.6289857 -0.6231364 -0.6192043 -0.6020387 -0.5798458 -0.5759566 -0.5705645 -0.5683261 -0.5673250
[34] -0.5652899 -0.5617309 -0.5606385 -0.5543450 -0.5532832 -0.5505854 -0.5444628 -0.5315794 -0.5298383 -0.5251693 -0.5226470
[45] -0.5220916 -0.5196427 -0.5173993 -0.5146078 -0.5070766 -0.5040420 -0.4991255 -0.4955685 -0.4951795 -0.4932777 -0.4902114
Please, help! :)
Dear Gordon,
I am using the EdgeR manual as reference, but as a starter I sometimes worry that I am missing or skipping an essential step.
I could use glmTreat to get just a list of genes which are highly up/downregulated and in my case I can manually annotate them (those 19 genes). With the results of the glmQLFTest I will try to do a network/pathway analysis.
Anyway, many thanks for your answer, much appreciated!