I have a question about analyzing microarray data collected from an Affymetrix mouse transcriptome array. I have four groups:
Control, Special Diet, Special Diet+Drug1, Special Diet+Drug2
When I used limma to get DE genes, I tried setting up the design matrix in four ways.
1. 4 groups: Control, Special Diet, Special Diet+Drug1, Special Diet+Drug2
design1 <- model.matrix(~ 0+factor(c(1,1,1,2,2,2,3,3,3,4,4,4)))
colnames(design1) <- c("Control", "SpecialDiet", "SpecialDietDrug1", "SpecialDietDrug2")
contrast.matrix <- makeContrasts(SpecialDiet-Control, SpecialDietDrug1-Control, SpecialDietDrug2-Control, levels=design1)
2. 3 groups: Control, Special Diet, Special Diet+Drug1
design2 <- model.matrix(~ 0+factor(c(1,1,1,2,2,2,3,3,3)))
colnames(design2) <- c("Control", "SpecialDiet", "SpecialDietDrug1")
contrast.matrix <- makeContrasts(SpecialDiet-Control, SpecialDietDrug1-Control, levels=design2)
3. 3 groups: Control, Special Diet, Special Diet+Drug2
design3 <- model.matrix(~ 0+factor(c(1,1,1,2,2,2,3,3,3)))
colnames(design3) <- c("Control", "SpecialDiet", "SpecialDietDrug2")
contrast.matrix <- makeContrasts(SpecialDiet-Control, SpecialDietDrug2-Control, levels=design3)
4. 2 groups: Control, Special Diet
design4 <- model.matrix(~ 0+factor(c(1,1,1,2,2,2)))
colnames(design4) <- c("Control", "SpecialDiet")
contrast.matrix <- makeContrasts(SpecialDiet-Control, levels=design4)
When I compared the results, I found that the DE genes for the same comparison differ between designs. For example, for Special Diet vs Control, design1 gives 102 genes, design2 gives 163, design3 gives 128, and design4 gives 160.
Why? How should I set up the design and contrast matrices for my experiment?
Please use the 'Add Comment' button to respond to a post. Anyway, if it's just a change in the number of DE genes, I wouldn't worry about it; the numbers involved are quite small. The difference arises because limma estimates each gene's variance from all of the arrays in the fit, so fitting a subset of the arrays gives different variance estimates, and therefore different test statistics and p-values, even for the same contrast. Also, you shouldn't split the data set into subsets. Even though you could have done the two drug treatments in separate experiments, the fact is that you actually did them in a single experiment, so it makes sense to analyze all of the data as coming from a single experiment. Remember, the variance estimate is more precise when more samples contribute to it (with 12 arrays in 4 groups you have 8 residual degrees of freedom, versus 4 with only 6 arrays in 2 groups), so if you generated the samples at the same time, with the same cell type and on the same platform, you should analyze them all together to exploit this improved precision.
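For reference, here is a minimal sketch of fitting all four groups in one model and pulling out the three comparisons against Control. It assumes three replicates per group in the order given in the question; the contrast names (DietVsControl, etc.) are just labels I made up, and expr_mat is simulated placeholder data standing in for your normalized log2 expression matrix, so the resulting table is meaningless apart from showing the workflow.

```r
library(limma)

# Group labels in array order; assumes 3 replicates per group, as in the question.
groups <- c("Control", "SpecialDiet", "SpecialDietDrug1", "SpecialDietDrug2")
group <- factor(rep(groups, each = 3), levels = groups)

# One coefficient per group (no intercept), with readable column names.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Contrast names are arbitrary labels chosen for this sketch.
contrast.matrix <- makeContrasts(
  DietVsControl  = SpecialDiet - Control,
  Drug1VsControl = SpecialDietDrug1 - Control,
  Drug2VsControl = SpecialDietDrug2 - Control,
  levels = design
)

# Placeholder: simulated values standing in for real normalized log2 data.
set.seed(1)
expr_mat <- matrix(rnorm(1000 * 12), nrow = 1000,
                   dimnames = list(paste0("probe", 1:1000), NULL))

# Fit once on all 12 arrays, then test each contrast from the same fit.
fit  <- lmFit(expr_mat, design)
fit2 <- eBayes(contrasts.fit(fit, contrast.matrix))

# Top genes for Special Diet vs Control, using variances from all 12 arrays.
topTable(fit2, coef = "DietVsControl", number = 5)
```

Every contrast here is tested from the same fit, so the Special Diet vs Control result uses the residual variance from all 12 arrays rather than from a subset.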
Understood! Thanks so much for your help.