The basis for the full rank design matrix requirement in DESeq2
Nik Tuzov ▴ 90
@nik-tuzov-8783
Last seen 11 months ago
United States

Dear Prof. Love:

There have been numerous questions about the “not full rank” error in DESeq2. The vignette addresses the issue, and I understand when the error is generated. What I would like to find out is why you decided to include that feature in the first place. The point is that both SAS (PROC GLM, MIXED, and GENMOD with a negative binomial response) and R (lm) are quite comfortable with rank-deficient designs.

When one factor is perfectly confounded with another (called “Linear combinations” in the vignette) I suppose it was a good idea to generate an error, but, strictly speaking, SAS and R will produce an answer even in that case (R will produce some missing values and a warning “not defined because of singularities”).
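
To make the lm behaviour concrete, here is a minimal base R sketch on made-up data. The data frame `d`, its column names, and the random response are assumptions for illustration only; nothing here calls DESeq2.

```r
# Toy data (hypothetical): 'batch' is perfectly confounded with 'condition',
# so the two factors carry exactly the same information.
set.seed(1)
d <- data.frame(
  condition = factor(rep(c("ctrl", "trt"), each = 4)),
  batch     = factor(rep(c("b1", "b2"), each = 4))
)
d$y <- rnorm(nrow(d))

# lm() does not stop on the rank-deficient design; it drops the aliased column.
fit <- lm(y ~ batch + condition, data = d)

coef(fit)                 # the 'conditiontrt' coefficient is NA
fit$rank                  # 2
ncol(model.matrix(fit))   # 3, so one column is linearly dependent
```

Calling `summary(fit)` on this fit reports one coefficient as not defined because of singularities, which is the message referred to above.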

Was your intention just to force the user to “consult a statistician”, or were there some estimation difficulties when fitting the model with a rank-deficient matrix?

Regards, Nik Tuzov

@mikelove
Last seen 1 day ago
United States

In 100% of cases, plus or minus epsilon, users encounter the full rank error because they have accidentally included covariates that have a linear dependency. Typically this is batch confounded with condition, or an attempt to control for within-subject effects as well as subject traits such as sex or age, which are already determined by subject.
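
The second situation can be checked before fitting anything. The following is a minimal base R sketch, not DESeq2's internal check: the sample table and the helper name `is_full_rank` are hypothetical, and the rank is taken from a QR decomposition of the model matrix.

```r
# Hypothetical sample table: 6 subjects, each measured before and after treatment.
samples <- data.frame(
  subject   = factor(rep(paste0("s", 1:6), each = 2)),
  sex       = factor(rep(c("F", "M"), each = 6)),
  treatment = factor(rep(c("before", "after"), times = 6),
                     levels = c("before", "after"))
)

# Helper (hypothetical name): TRUE when the design matrix has full column rank.
is_full_rank <- function(formula, data) {
  X <- model.matrix(formula, data)
  qr(X)$rank == ncol(X)
}

# Sex is constant within each subject, so its column is a sum of subject columns.
is_full_rank(~ subject + sex + treatment, samples)   # FALSE

# Dropping the subject trait leaves a full rank design.
is_full_rank(~ subject + treatment, samples)         # TRUE
```

In this toy table the subject indicators already absorb any sex difference, which is why adding `sex` contributes no new column space.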


I was almost sure that some estimation difficulties were the reason, because catching statistical design errors is well outside DESeq2's mandate. It would be better if that feature were optional, not hard-coded. However, if the user were allowed to use any design matrix (including those in GLM coding, which you call EMM), it might have some side effects in lfcShrink().


Catching design errors is as important as estimating the parameters, in my opinion. Adding support for a non-full-rank X would involve additional complexity for little to no gain.
