How does the function model.matrix (used to define the experimental design) really work?
Aurora ▴ 20
@aurora-15104
Last seen 6.0 years ago

Good morning,

I am using edgeR to perform differential expression analysis on RNA-seq data.

I have a design with two variables: type (obese/normal) and treatment.

type is a vector with two levels (obese/normal).

treatment is a vector with 5 levels (5 different treatments).

But when I run this line of code: design_matrix = model.matrix(~0+type+treatment)

the resulting design_matrix has only 6 columns: typeLean, typeObese, and only 4 columns for the treatments (the treatment that appears to be the first level of the treatment vector is not in my design_matrix!).

Does anyone know why? How could I have all the treatments in my design_matrix?

Thank you,

Have a good day

experimental design model.matrix edger • 1.3k views
@ryan-c-thompson-5618
Last seen 6 weeks ago
Icahn School of Medicine at Mount Sinai…

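It is important to note that a factor with K levels only adds K-1 coefficients to a design matrix that already contains an intercept, because there are only K-1 independent differences between K groups. With an intercept, your design matrix would have 1 + (2-1) + (5-1) = 6 coefficients. In your no-intercept model, type takes the place of the intercept and keeps both of its columns, so you get 2 + (5-1) = 6 coefficients, which is exactly what you observe. For more information on how factors are encoded into a design matrix, have a look here. You are most likely using the "dummy coding" since that is the default.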

You can sidestep the problem of factor coding for one of your factors by using a model with no intercept (i.e. ~0), but the rest of the factors must still be coded as normal.
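To make the column counts concrete, here is a minimal sketch in base R. The sample layout (one sample per type/treatment combination) and the treatment names T1 to T5 are made up for illustration and are not taken from the original data.

```r
# Hypothetical sample annotation: 2 types x 5 treatments, one sample each.
type <- factor(rep(c("Lean", "Obese"), each = 5))
treatment <- factor(rep(paste0("T", 1:5), times = 2))

# No-intercept model: the first factor (type) gets one column per level,
# while treatment is still dummy-coded against its first level (T1).
design_matrix <- model.matrix(~ 0 + type + treatment)
colnames(design_matrix)
# "typeLean" "typeObese" "treatmentT2" "treatmentT3" "treatmentT4" "treatmentT5"

# With an intercept, both factors drop their first level; still 6 columns.
design_intercept <- model.matrix(~ type + treatment)
colnames(design_intercept)
# "(Intercept)" "typeObese" "treatmentT2" "treatmentT3" "treatmentT4" "treatmentT5"

# Forcing in a column for T1 gives 7 columns but the rank stays at 6:
# typeLean + typeObese equals the sum of all five treatment indicators.
with_t1 <- cbind(design_matrix, treatmentT1 = as.numeric(treatment == "T1"))
qr(with_t1)$rank
# 6
```

The last step suggests why the "missing" treatment cannot simply be added back: the extra column is a linear combination of the others, so its coefficient would not be estimable. The effect of the first treatment level is absorbed into the typeLean and typeObese columns.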

Thanks a lot for this explanation!
