edgeR: mixing glmQLFit with glmLRT
@catalina-aguilar-hurtado-6554
United States

Hi, sorry for the random question. I was using edgeR and ran "fit <- glmQLFit(y, design)" followed by "lrt <- glmLRT(fit, coef=6)" instead of using glmQLFTest, and got some interesting results. These of course changed to very few DEGs once I saw the mistake and used the correct glmQLFTest, or glmFit with glmLRT. I am just wondering what happens when you mix these tests.

Thanks,

Cata

 

Aaron Lun ★ 28k
@alun
The city by the bay

Don't mix the tests.

glmQLFit uses the trended NB dispersion, and relies on the QL dispersion to model variability around the trend. However, glmLRT does not have any concept of QL dispersions and will ignore any such values in the supplied DGEGLM object. By mixing up the tests, you are effectively demanding that edgeR only consider the trended dispersion, without any gene-to-gene variability around the trend. This would be inappropriate for the vast majority (all?) of the data sets I have worked with, probably manifesting in anticonservativeness.

If you had used glmFit, it would try to use the tagwise NB dispersions from estimateDisp (or friends). This would be more suitable for glmLRT, as the tagwise NB dispersions do give stronger weighting to the gene-specific information. Of course, if you then tried to use glmQLFTest, it would fail as there are no QL dispersions to work with.

As you may already know, our general recommendation is that glmQLFit plus glmQLFTest is the way to go for most data sets. This is because the uncertainty of the dispersion estimates and the variability around the trend is handled more accurately than with glmFit and glmLRT. Needless to say, getting more DEGs doesn't tell you anything about the correctness of the analysis.
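
To make the two valid pairings concrete, here is a minimal sketch. The counts are simulated purely for illustration (two groups of three samples, 1000 genes), and the object names and coef=2 are assumptions for this toy design, not the original poster's setup.

```r
library(edgeR)

set.seed(1)
# Simulated counts (illustrative only): 1000 genes, 6 samples in two groups.
ngenes <- 1000
group <- factor(rep(c("ctrl", "treated"), each = 3))
counts <- matrix(rnbinom(ngenes * 6, mu = 50, size = 5), nrow = ngenes)
rownames(counts) <- paste0("gene", seq_len(ngenes))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
design <- model.matrix(~ group)
y <- estimateDisp(y, design)

# Pairing 1 (generally recommended): quasi-likelihood fit + QL F-test.
fit_ql <- glmQLFit(y, design)
res_ql <- glmQLFTest(fit_ql, coef = 2)

# Pairing 2: NB GLM fit using the tagwise dispersions + likelihood ratio test.
fit_nb <- glmFit(y, design)
res_lrt <- glmLRT(fit_nb, coef = 2)

# Number of genes called DE at FDR 5% by each valid pairing.
n_ql  <- sum(topTags(res_ql,  n = Inf)$table$FDR < 0.05)
n_lrt <- sum(topTags(res_lrt, n = Inf)$table$FDR < 0.05)
```

Note that fit_ql is only ever passed to glmQLFTest, and fit_nb only to glmLRT.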


Ryan and Aaron, thanks so much for your detailed explanations. I now understand what happened to my data, and it is good to know that it is wrong to mix them (even by mistake). I know that getting more DEGs doesn't mean the analysis is good; it is just that with a poorly annotated organism it can be hard to answer the biological question with only a few genes. But of course those are the results.

Thanks.  

@ryan-c-thompson-5618
Icahn School of Medicine at Mount Sinai…

As you have realized, the correct function to use before glmLRT is glmFit, while glmQLFit must be used with glmQLFTest. You should not expect glmQLFit followed by glmLRT to give you any meaningful result.

(In the current version of edgeR, I believe this combination (QLFit + LRT) will give you a likelihood ratio test using only the dispersion trend instead of the genewise dispersions. This is almost certainly not what you want, and in any case it only works that way due to an implementation detail, so even in the unlikely case that it was what you wanted, it would still be wrong to use it.)


Thanks a lot, I appreciate the clarification. I also ran the tests and found the same discrepancy: QLFit + LRT gives a massive number of DEGs, whereas QLFit + QLFTest gives 1 DEG at FDR 0.05 and glmFit + LRT gives none. Admittedly my design is not clean: I have no WT, my controls are diseased animals, and the test groups are diseased animals treated with 2 different drugs. I have seen a number of people using QLFit + LRT, and this should be retired in the new version of edgeR, since as you both highlighted it is incorrect to use it this way: we do not want a dispersion trend without gene-to-gene variability. It would be great if this were added to the manual; or please point it out if it is already there and I missed it. Thanks


Where would people get the idea to mix-and-match the GLM fitting and test functions? There is no section in the user's guide that even states that this is possible. Indeed, the documentation for the two test functions explicitly indicates what kind of object is expected by each function, based on the description of the glmfit argument.

Perhaps a case could be made for modifying the code or documentation to protect users from themselves. However, I would suggest that the real problem is that you've been hanging around people who haven't been following the instructions properly.

At any rate, glmLRT will not be retired, as it is still useful in a few contexts.


I do not think there is a need to retire glmLRT; the problem is definitely people using it without reading the manuals or without properly understanding it. Until now I never needed glmQLFit, so I was pretty happy with glmFit + glmLRT. The need arose with a collaborator, so I started digging and found such threads and uses of mixed tests. Yes, I have already reported this to the people using the tests wrongly, but I appreciate the great clarification. Sometimes it is hard to convince a biologist why they cannot see a DEG without a proper design, or when a drug response is simply not strong enough in specific tissues.

P.S. I never asked for any test to be retired; rather, mixing the tests in this way should throw an error. I hope that clarifies things, but it is just a suggestion.
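
A user-side version of such a guard could look like the sketch below. Both helpers are hypothetical and not part of edgeR. The check assumes that glmQLFit output can be recognised by a "df.prior" component (the prior degrees of freedom for the QL dispersions), which I believe plain glmFit output does not carry; treat that as an assumption and verify it against your edgeR version.

```r
# Hypothetical helper (not part of edgeR): TRUE if the fit appears to come from
# glmQLFit, judged by the presence of a "df.prior" component.
is_ql_fit <- function(fit) "df.prior" %in% names(fit)

# Hypothetical guard: stop if a quasi-likelihood fit is about to be passed to
# glmLRT; otherwise return the fit unchanged (invisibly).
check_fit_for_lrt <- function(fit) {
  if (is_ql_fit(fit)) {
    stop("fit looks like glmQLFit output: use glmQLFTest, ",
         "or refit with glmFit before calling glmLRT")
  }
  invisible(fit)
}
```

It would be used as "lrt <- glmLRT(check_fit_for_lrt(fit), coef=6)", which stops with an error if fit came from glmQLFit.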
