Hello everyone, I am a beginner with the DESeq2 package, and I am trying to understand the mathematical principles behind this package.
In statistical modeling and hypothesis testing, a contrast vector c is used to construct a specific linear combination of coefficients in order to test whether that combination equals a hypothesized value. An example follows.
If I have a condition with four groups, then to avoid the dummy variable trap and a design matrix that is not full rank, the regression equation must look like: Y = Beta_0 + Beta_1 * X_1 + Beta_2 * X_2 + Beta_3 * X_3 + error, where "Y" is the read count, "X_1" to "X_3" are dummy variables, and "Beta_1" to "Beta_3" are their coefficients.
Question1: When one factor level has been absorbed into the intercept, and I want to compare Beta_1 with that level, how should I set the contrast vector? Which of the contrast vectors ( [1,-1,0,0] or [0,1,0,0] ) is correct?
Question2: log2FoldChange represents the logarithmic expression difference between two groups. Suppose the contrast vector multiplied by the Beta vector equals q. If q is set to 0, doesn't taking the logarithm result in infinity? How does the software avoid infinite values?
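A small numeric sketch may help with both questions. It assumes treatment (reference-level) coding with a hypothetical group A absorbed into the intercept, and coefficients that are already on the log2 scale, as in the log2FoldChange values DESeq2 reports. The coefficient values and group names are made up for illustration, and Python is used only for the arithmetic; this is not DESeq2 code.

```python
# Hypothetical log2-scale coefficients [Beta_0, Beta_1, Beta_2, Beta_3].
# Beta_0 is the reference group A; Beta_1..Beta_3 are B, C, D relative to A.
beta = [5.0, 1.0, -0.5, 0.0]


def dot(c, b):
    """Linear combination c . b of a contrast (or design row) and coefficients."""
    return sum(ci * bi for ci, bi in zip(c, b))


# Fitted log2 mean of each group: one design-matrix row per group.
log2_mean = {
    "A": dot([1, 0, 0, 0], beta),  # Beta_0
    "B": dot([1, 1, 0, 0], beta),  # Beta_0 + Beta_1
    "C": dot([1, 0, 1, 0], beta),  # Beta_0 + Beta_2
    "D": dot([1, 0, 0, 1], beta),  # Beta_0 + Beta_3
}

# Question 1: "B versus A" is the difference of the two design rows,
# [1,1,0,0] - [1,0,0,0] = [0,1,0,0], so the contrast picks out Beta_1.
q_b_vs_a = dot([0, 1, 0, 0], beta)
assert q_b_vs_a == log2_mean["B"] - log2_mean["A"]

# [1,-1,0,0] instead gives Beta_0 - Beta_1, which is not a difference
# between any two group means under this coding.
q_other = dot([1, -1, 0, 0], beta)

# Question 2: q is itself a log2 fold change, so no logarithm is taken of it.
# The null hypothesis q = 0 corresponds to a fold change of 2**0 = 1.
fold_change_b_vs_a = 2 ** q_b_vs_a
fold_change_under_null = 2 ** 0.0

print(q_b_vs_a, q_other, fold_change_b_vs_a, fold_change_under_null)
```

Under these assumptions, [0,1,0,0] is the contrast that compares the Beta_1 group with the reference level, and testing q = 0 means testing for a fold change of 1 rather than taking the logarithm of zero.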
Oh, and I don't understand your second question. If you set a contrast vector to be all zeros (which I think is what you are asking), then you are telling DESeq2 that you don't want any comparison and it tells you that is not a thing.
Thank you for your reply.
After a long period of study, I realized that I had misunderstood the workings of DESeq2. Thank you for your clear explanation.
While reading the literature, I misunderstood when the design matrix is not full rank. That situation arises in the context of shrinkage estimation, not during the MLE stage, which led to my confusion about testing linear combinations of coefficients with the Wald test.