Entering edit mode
Xiaohui Wu
▴
280
@xiaohui-wu-4141
Last seen 10.3 years ago
Hi all,
I'm using edgeR for multiple groups comparison.
I have two treatments (WT and mutant), each treatment in three tissues
(root, leaf, seed), each tissue either has replicates or not.
I want to compare the expression between WT and mutant in general
(considering all the 3 tissues) and in each tissue seperately.
The following is my code to find DE between WT and mutant considering
all the 3 tissues:
treatment=factor(c(rep('WT',8),rep('MUTANT',8)))
tissue=factor(
c('root','root','root','leaf','seed','seed','seed','seed',
'root','root','root','leaf','seed','seed','seed','seed'))
design <- model.matrix(~treatment+tissue)
dge<- DGEList(dat1, group=rep(c("WT","OXT"),each=8))
dge <- estimateGLMCommonDisp(dge, design)
dge$common.dispersion
glmfit.dge <- glmFit(dge, design,dispersion=dge$common.dispersion)
lrt.dge <- glmLRT(dge, glmfit.dge, coef=2) # I want to compare WT and
MUTANT, adjusting for tissue root, seed, and leaf, so the coef=2 which
is treatmentWT
I have some questions:
1) How to interpret the results? I thought the final DE considers DE
in root between WT and Mutant, and DE in leaf between WT and Mutant,
and leaf, right? So If I want to know the foldChange, should I get
seperated FC of root, leaf, and seed?
2) If to compare leaf and seed, then I should use "lrt.dge <-
glmLRT(dge, glmfit.dge, coef=4", right?
But if I want to compare root and seed, can I mannually change the
levels in tissue, like using "levels(tissue)=c('seed','leaf','root')",
then do the same comparison?
3) Some experiments in root are paired, like WT_root1 paired with
Mutant_root1, need I consider this information? I'm afraid this will
make the design matrix too complicate, there will be three factors,
and only 1 or 2 samples are paired from the 16 samples.
4) From your experience, is it better to use pooled data to compare WT
and Mutant (exactTest), or adjust multiple factors (GLMfit)? Is there
any way to evaluate the result? because I don't know which gene is
actually DE.
5) What can I do if there are very few DE genes after adjusting
pvalue? I have filtered genes by CPM>1 for 8 or 4 samples. There are
only 20 genes can be found from the filtered 14,000 genes. (the total
gene is ~120,000).
--> Output design matrix is like this:
(Intercept) treatmentWT tissueroot tissueseed
1 1 1 1 0
2 1 1 1 0
3 1 1 1 0
4 1 1 0 0
5 1 1 0 1
6 1 1 0 1
7 1 1 0 1
8 1 1 0 1
9 1 0 1 0
10 1 0 1 0
11 1 0 1 0
12 1 0 0 0
13 1 0 0 1
14 1 0 0 1
15 1 0 0 1
16 1 0 0 1
> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
[3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_People's Republic of China.936
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] edgeR_2.4.0
loaded via a namespace (and not attached):
[1] limma_3.10.0
I would really appreciate your comments or suggestions.
Many thanks!
Xiaohui Wu