If I look the raw countdata, ~55% of my genes have zero counts in all the samples (64 samples in total for 8 genotypes). Though filterByExpr effectively eliminated several high FC yielding but low level expressed unwanted genes, I noticed that some of the biologically interesting genes are eliminated. For ex, if I take a subset of the samples for a single genotype having two time points for control and stress, following pattern (ie, 2h stress induction) is interesting to my biological question,
2h_C1_1 2h_C1_2 4h_C1_1 4h_C1_2 2h_S1_1 2h_S1_2 4h_S1_1 4h_S1_2
Gene_x 0 0 18 15 48 54 29 35
How can I deal with elimination?
I use code,
group <- factor(paste(targets$Treat,targets$Time,targets$Genotype,sep="."))
cbind(targets,Group=group)
y <- DGEList(counts=x)
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y)
design <- model.matrix(~0+group)
colnames(design) <- levels(group)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
my.contrasts <- makeContrasts(--contrast--,levels=design)
qlf <- glmQLFTest(fit, contrast=my.contrasts[--contrast--])
topTags(qlf)
Now the list looks perfect..Thank you Gordon Symth!