Hi Lisa,
Lisa Luo wrote:
> Hi List, I have a question regarding Limma and 2way ANOVA. I have a
> data set containing 2 cell lines and a gene knockout. So in the
> design file, I have cell1.KO, cell1.WT, cell2.KO and cell2.WT. I
> want to get the differentially expressed genes between KO and WT. Is
> the contrast (0.5*(cell1.KO-cell1.WT+cell2.KO-cell2.WT)) right? Is
> this the same as looking at the knockout effect in 2way anova? When I
> take a look at the heatmap, the genes identified seem to be
> differentially expressed in either one of the comparisons?
This is not the same as a conventional main effect in a two-way ANOVA,
but it _does_ measure the difference between KO and WT, just not in the
way that you might think.

Note that the statistic you are constructing has the average difference
between KO and WT in the numerator, and a moderated measure of the
standard error associated with each coefficient in the denominator.

What this means is that you will select those genes where the average
difference between KO and WT is 'large' and the variability within each
group (cell1.KO, cell2.KO, cell1.WT, cell2.WT) is low. Because of this,
you can get a significant contrast if, e.g., cell1.KO - cell1.WT is
large but cell2.KO - cell2.WT is very small, provided the variance
estimate for each term is small, which is what I think you are seeing.
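
To make this concrete, here is a minimal sketch in base R for a single made-up gene. The expression values are invented for illustration, and the t-statistic is the ordinary one from lm(), not limma's moderated version, so treat it only as an approximation of what limma would report.

```r
# Hypothetical log-expression values for ONE gene, two replicates per group.
# The knockout changes expression in cell line 1 only.
group <- factor(rep(c("cell1.KO", "cell1.WT", "cell2.KO", "cell2.WT"), each = 2))
y <- c(10.1, 9.9, 6.1, 5.9, 8.1, 7.9, 8.1, 7.9)

# Cell-means model: one coefficient per group, in the order of levels(group).
fit <- lm(y ~ 0 + group)
cf <- coef(fit)

# The contrast 0.5*(cell1.KO - cell1.WT + cell2.KO - cell2.WT).
w <- c(0.5, -0.5, 0.5, -0.5)
est <- sum(w * cf)                            # average KO - WT difference
se <- sqrt(drop(t(w) %*% vcov(fit) %*% w))    # ordinary (unmoderated) standard error
tstat <- est / se

# Per-cell-line differences, for comparison.
diff_cell1 <- unname(cf[1] - cf[2])
diff_cell2 <- unname(cf[3] - cf[4])

print(c(estimate = est, se = se, t = tstat, cell1 = diff_cell1, cell2 = diff_cell2))
```

Here the knockout effect exists only in cell line 1, yet the averaged contrast is still large relative to its standard error because the replicates within each group agree closely.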

If you are really looking for a standard main effect (i.e., KO vs WT
ignoring cell type), then you can do one of two things: you can set up
your parameterization so you are explicitly fitting a model with main
effects, or you can fit a simpler model where you are just doing a
t-test comparing KO vs WT and pooling the cell types.

As an example, let's say you have two replicates of each sample, and the
samples look like this:

> rep(c("cell1.KO","cell1.WT","cell2.KO","cell2.WT"), each=2)
[1] "cell1.KO" "cell1.KO" "cell1.WT" "cell1.WT" "cell2.KO" "cell2.KO" "cell2.WT" "cell2.WT"

Now you can set up your model like this:
> KO <- factor(rep(1:2, each = 2, times = 2))
> KO
[1] 1 1 2 2 1 1 2 2
Levels: 1 2
> CELL <- factor(rep(1:2, each = 4))
> CELL
[1] 1 1 1 1 2 2 2 2
Levels: 1 2
> design <- model.matrix(~KO + CELL)
> design
  (Intercept) KO2 CELL2
1           1   0     0
2           1   0     0
3           1   1     0
4           1   1     0
5           1   0     1
6           1   0     1
7           1   1     1
8           1   1     1
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$KO
[1] "contr.treatment"

attr(,"contrasts")$CELL
[1] "contr.treatment"

Now your second coefficient (KO2) measures the difference between the
two genotypes (here WT minus KO, since KO is level 1) while ignoring
cell type, just like in a conventional two-way ANOVA.
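
As a rough check of what this additive design estimates, the sketch below fits it to one invented gene with lm() from base R and also runs the pooled t-test mentioned earlier. The expression values are hypothetical. In limma itself the analogous steps would be lmFit() on the design, then eBayes(), then topTable() with coef = "KO2" (not run here).

```r
# Same factor coding as above: KO level 1 = knockout, level 2 = wild type.
KO   <- factor(rep(1:2, each = 2, times = 2))
CELL <- factor(rep(1:2, each = 4))

# Hypothetical values for one gene; the knockout acts in cell line 1 only.
y <- c(10.1, 9.9, 6.1, 5.9, 8.1, 7.9, 8.1, 7.9)

# Additive (main effects only) model.
fit_add <- lm(y ~ KO + CELL)
ko_row  <- summary(fit_add)$coefficients["KO2", ]
ko_est  <- unname(ko_row["Estimate"])   # WT minus KO
ko_t    <- unname(ko_row["t value"])

# Simpler alternative: pool the cell lines and do a two-sample t-test.
pooled      <- t.test(y[KO == 1], y[KO == 2], var.equal = TRUE)
pooled_diff <- unname(pooled$estimate[1] - pooled$estimate[2])  # KO minus WT

print(c(KO2 = ko_est, t = ko_t, pooled_diff = pooled_diff))
```

With these numbers the KO2 estimate is -2 (WT minus KO), but its t-statistic is modest (about -2.2) because the response that occurs in only one cell line now counts as residual variation.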
HTH,
Jim
>
> Thanks,
>
> Lisa
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
Hi Lisa and James,
You are both correct, I think. The contrast given by Lisa is the
KO vs WT effect under the contr.sum (sum to zero) parametrization, while
the contrast given by James is the KO vs WT effect under the
contr.treatment (treatment) parametrization. See ?contr.sum. The sum to
zero parametrization is the classical parametrization used by
statistics textbooks for factorial anova. The treatment
parametrization is the default used by R for linear models and anova.
It was popularised many years ago by programs such as GLIM.
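
For reference, the two codings can be inspected directly in base R. This is only a small illustration for a two-level factor; the factor name KO is borrowed from James's example.

```r
# Coding matrices for a factor with two levels.
trt <- contr.treatment(2)   # level 1 -> 0, level 2 -> 1
sm  <- contr.sum(2)         # level 1 -> 1, level 2 -> -1
print(trt)
print(sm)

# How the choice changes the design matrix column for a two-level factor.
KO <- factor(rep(1:2, each = 2))
design_trt <- model.matrix(~KO)                                        # default coding
design_sum <- model.matrix(~KO, contrasts.arg = list(KO = "contr.sum"))
print(design_trt)
print(design_sum)
```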
The moral is that there is no unique definition of main effect in a
two-way anova. Neither parametrization is right or wrong. The sum to
zero parametrization gives you the genotype effect averaged over the
two cell types. The treatment parametrization, when the model includes
an interaction term, gives you the genotype effect for cell type 1
only. The lack of a unique definition is the reason why I in effect
force limma users to specify the contrasts explicitly.
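
A small sketch of the difference, assuming a model with an interaction term (KO * CELL) and invented values for a single gene. It uses lm() from base R rather than limma, so it shows only the coefficient estimates, not moderated statistics.

```r
# Factor coding as in James's example: KO level 1 = knockout, level 2 = wild type.
KO   <- factor(rep(1:2, each = 2, times = 2))
CELL <- factor(rep(1:2, each = 4))

# Hypothetical values for one gene; the knockout acts in cell line 1 only.
y <- c(10.1, 9.9, 6.1, 5.9, 8.1, 7.9, 8.1, 7.9)

# Treatment coding (R's default): KO2 is WT minus KO in cell line 1 only.
fit_trt <- lm(y ~ KO * CELL)
trt_ko  <- unname(coef(fit_trt)["KO2"])

# Sum-to-zero coding: KO1 is half of (KO minus WT) averaged over both cell lines.
fit_sum <- lm(y ~ KO * CELL,
              contrasts = list(KO = "contr.sum", CELL = "contr.sum"))
sum_ko  <- unname(coef(fit_sum)["KO1"])

print(c(treatment_KO2 = trt_ko, sum_KO1 = sum_ko, averaged_KO_minus_WT = 2 * sum_ko))
```

Twice the sum-to-zero coefficient reproduces the averaged contrast 0.5*(cell1.KO-cell1.WT+cell2.KO-cell2.WT), while the treatment-coded coefficient reflects cell line 1 alone.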
I have tried to explain the different parametrizations in Section 8.7
"Factorial Designs" of the limma User's Guide. Please have a look at
this. I'd be pleased for any feedback on how helpful it is.
Best wishes
Gordon
>Date: Wed, 16 Aug 2006 11:31:32 -0400
>From: "James W. MacDonald" <jmacdon at med.umich.edu>
>Subject: Re: [BioC] limma and 2way anova
>To: Lisa Luo <lisaluo_bioc at yahoo.com>
>Cc: bioconductor at stat.math.ethz.ch