Entering edit mode
Dear Laura,
Here is a reproducible example of edgeR code for finding genes that
are DE
between any of your cultivars, regardless of which cultivars:
nlibs <- 24
cultivar <-
factor(c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12))
design <- model.matrix(~cultivar)
y <- matrix(rnbinom(ngenes*nlibs,mu=50,size=10),ngenes,nlibs)
d <- DGEList(counts=y)
d <- estimateGLMCommonDisp(d,design)
d <- estimateGLMTrendedDisp(d,design)
d <- estimateGLMTagwiseDisp(d,design)
fit <- glmFit(d,design)
lrt <- glmLRT(d,fit,coef=2:12)
topTags(lrt)
This test performs a likelihood ratio test, similar to an F-test on 11
degrees of freedom.
decideTestsDGE() however doesn't work when the contrast is on more
than 1
degree of freedom. I'm not sure what to do about that yet, because
decideTestsDGE shows the number of up and down genes, and the concept
of
up and down doesn't exist for F-like tests.
Best wishes
Gordon
> Date: Wed, 16 Nov 2011 11:08:47 +0100
> From: lpascual <laura.pascual at="" avignon.inra.fr="">
> To: bioconductor at r-project.org
> Subject: [BioC] bayseq and edgeR for multi groups comparisons
> Message-ID: <4EC38BAF.3050000 at avignon.inra.fr>
> Content-Type: text/plain
>
> Dear all,
>
>
> I am starting to analyze the results of a DGE experiment, and I have
> some doubts about how to perform multi-group comparisons. Our data
set
> consists in just one developmental stage, but I have data from 12
> different genotypes (I have duplicates of the samples).
>
> I will like to identify all the differentially expressed genes among
the
> different genotypes. I will like to know your opinion about edgeR
and
> baySeq packages as both seam to be able to perform multi-group
analysis.
>
> And I will also appreciate if you could give me any idea about the
best
> way to design mi contrast hypothesis.
>
>
> *When working with edgeR I have employ this type of contrast matrix
for
> my analysis*
>
>
> > cultivar <-
> factor(c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12))
>
> > design <- model.matrix(~cultivar)
>
> > design
>
> (Intercept) cultivar2 cultivar3 cultivar4 cultivar5 cultivar6
cultivar7
> ?....
>
> 1 1 0 0 0 0 0 0
>
> 2 1 0 0 0 0 0 0
>
> 3 1 1 0 0 0 0 0
>
> 4 1 1 0 0 0 0 0
>
> 5 1 0 1 0 0 0 0
>
> 6 1 0 1 0 0 0 0
>
> 7 1 0 0 1 0 0 0
>
> 8 1 0 0 1 0 0 0
>
> 9 1 0 0 0 1 0 0
>
> 10 1 0 0 0 1 0 0
>
> :
>
> :
>
> :
>
> attr(,"assign")
>
> [1] 0 1 1 1 1 1 1 1 1 1 1 1
>
> attr(,"contrasts")
>
> attr(,"contrasts")$cultivar
>
> [1] "contr.treatment"
>
>
> *However I believe that will just help me to identify the genes that
are
> differentially expressed in one of the cultivars with respect to the
> rest. Besides if I keep going doing all the contrast at the same
time as
> people from the bioconductor mail list suggest me to extract the
> differentially expressed genes I get a mistake.*
>
>
> > glmfit.d2 <- glmFit(d2, design, dispersion = d2$common.dispersion)
>
> > names(glmfit.d2)
>
> [1] "coefficients" "fitted.values" "deviance" "df.residual"
>
> [5] "abundance" "design" "offset" "dispersion"
>
> [9] "method" "counts" "samples"
>
> > lrt.d2 <- glmLRT(d2, glmfit.d2, coef =2:12)
>
>
> > summary(decideTestsDGE(lrt.d2, adjust.method="BH", p.value=0.05))
>
> Error in array(x, c(length(x), 1L), if (!is.null(names(x)))
> list(names(x), :
>
> attempt to set an attribute on NULL
>
>
> *If I check just one of my possible contrasts it works*
>
>
> > lrt.d2test <- glmLRT(d2, glmfit.d2, coef =2)
>
> > summary(decideTestsDGE(lrt.d2test, adjust.method="BH",
p.value=0.05))
>
> [,1]
>
> -1 875
>
> 0 14026
>
> 1 1460
>
>
> *However I think I am just getting genes whose expression is
different
> in one of the cultivars, but not changing in the rest. And I will
like
> just to get all the genes differentially expressed no matter in
which
> cultivar or cultivars.*
>
>
> *I have also try to carry these analysis with baySeq package, and I
have
> design this contrast hypothesis.*
>
>
> Slot "groups":
>
> $NDE
>
> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>
> $DE
>
> [1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12
>
>
> *But, I believe with these approach I will not be able to identify
genes
> that for example may have the same expression in cultivars 1,2,3,4
but
> different to the expression in cultivars 5,6,7,8. What do you think
> about it? Do you think I need to add model for each of the possible
> combinations? *
>
> *I will appreciate any suggestion that you could give me about the
best
> way to analyze those data.*
>
>
> *Thanks in advance for your attention*
>
>
>
> *Laura Pascual*
______________________________________________________________________
The information in this email is confidential and
intend...{{dropped:5}}