Hi, I am a bit confused with what design matrix to use to test for differential expression. I've used two examples of design matrix for time course analyzis with many time points of two conditions, and I get different results when using voom. For design1 I got 884 genes with adjusted p-value <0.05 while for design 2 I got 146 genes with adjusted p-value<0.05. The code I used was:
> counts <- read.delim("counts.txt", row.names=1, stringsAsFactors=FALSE, header=TRUE)
> y <- DGEList(counts=counts)
> A <- aveLogCPM(y)
> y2 <- y[A>1,]
> y2 <- calcNormFactors(y2)
> library(splines)
> targets <- read.delim("targets.txt",header=TRUE)
> X <- ns(targets$Time, df=5)
> Group <- factor(targets$Group)
> design1 <- model.matrix(~Group+Group:X)
> v1 <- voom(y2, design1, plot=TRUE)
> fit1 <- lmFit(v1, design1)
> fit1 <- eBayes(fit1)
> toptable_voomd1 <-topTable(fit1, coef=ncol(design1), number=30000, adjust="BH")
> write.table(toptable_voomd1, file="DE voom timecourse different trend design1.txt", sep="\t", quote=FALSE)
> design2 <- model.matrix(~Group*X)
> v2 <- voom(y2, design2, plot=TRUE)
> fit2 <- lmFit(v2, design2)
> fit2 <- eBayes(fit2)
> toptable_voomd2 <-topTable(fit2, coef=ncol(design2), number=30000, adjust="BH")
> write.table(toptable_voomd2, file="DE voom timecourse different trend design2.txt", sep="\t", quote=FALSE)
Design1 column names are:
[1] "X.Intercept." "GroupControl" "GroupAA.X1" "GroupControl.X1" "GroupAA.X2" "GroupControl.X2"
[7] "GroupAA.X3" "GroupControl.X3" "GroupAA.X4" "GroupControl.X4" "GroupAA.X5" "GroupControl.X5"
Design2 column names are:
[1] "X.Intercept." "GroupControl" "X1" "X2" "X3" "X4"
[7] "X5" "GroupControl.X1" "GroupControl.X2" "GroupControl.X3" "GroupControl.X4" "GroupControl.X5"
Is the result in voom showing different trends in time between group1 and group 2?
Which design is more suitable to find the different trends in time?
Below are the top ten genes obtained for each design with voom, and the targets description:
<caption>Design1</caption>
|
|
|
Thank you,
Camila
Note that by using
coef = ncol(design)
in topTable you are extracting the last contrast of the design matrix. Which is not the same between the design1 and the design2. You should probably set different contrast matrix to test the same hypothesis with different designs.