Hello,
I am assisting in the setup of an experiment, in which 3 groups, each
consisting of 8 subjects, will be fed 3 diets:
Group 1 - Diet A
Group 2 - Diet B
Group 3 - Diet C
We plan on using limma to identify the differentially expressed genes.
Reading the limma User's Guide, a factorial design matrix seems to be
appropriate. I am, however, wondering if we, by using this setup, can
elucidate the differentially expressed genes for each diet, and not
just the ones between groups, e.g. when comparing Group 1 - Group 2.
What is your advice on this?
Thanks in advance!
Best regards,
David Westergaard
On Wed, Sep 5, 2012 at 4:55 AM, David Westergaard <david@harsk.dk>
wrote:
Hi, David.
Are groups 1, 2, and 3 different, or do they differ only in the diet
being fed?
Sean
On 05.09.2012 09:55, David Westergaard wrote:
> [...]
> I am, however, wondering if we, by using this setup, can
> elucidate the differentially expressed genes for each diet, and not
> just the ones between groups, e.g. when comparing Group 1 - Group 2.
From your reply to Sean it's not clear what you mean by this last
sentence. What are the 'differentially expressed genes for each diet'?
Any differential expression analysis must compare groups of samples by
definition, no?
You could compare, say, Diet A with the average of Diet B and Diet C
(or even the average of all three). Is that what you mean? Whether that
makes any sense depends on your experimental design. Most obviously, is
one of the three diets a 'control' diet? If not, then would it be
appropriate to consider an average of the three diets a kind of
meta-control (probably not a word, but hopefully you know what I
mean!)?
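
For illustration, both comparisons can be written as contrast weights on the three group means. The sketch below is plain Python rather than limma, the expression values are made up, and the names `group_means`, `a_vs_others` and `a_vs_all` are hypothetical.

```python
from fractions import Fraction

# Hypothetical mean log2 expression of one gene under each diet.
group_means = {"DietA": 9, "DietB": 7, "DietC": 8}

# Diet A versus the average of Diet B and Diet C.
a_vs_others = {"DietA": Fraction(1),
               "DietB": Fraction(-1, 2),
               "DietC": Fraction(-1, 2)}

# Diet A versus the average of all three diets (the 'meta-control').
a_vs_all = {"DietA": Fraction(2, 3),
            "DietB": Fraction(-1, 3),
            "DietC": Fraction(-1, 3)}


def contrast(weights, means):
    """Weighted sum of group means; the weights must sum to zero."""
    assert sum(weights.values()) == 0
    return sum(w * means[group] for group, w in weights.items())


est_others = contrast(a_vs_others, group_means)  # 9 - (7 + 8)/2
est_all = contrast(a_vs_all, group_means)        # 9 - (9 + 7 + 8)/3

print(est_others, est_all)
```

Under these assumptions the second contrast is exactly two thirds of the first, so the two test the same hypothesis and differ only in scale. In limma, weights like these would normally be supplied through `makeContrasts`, with the column names taken from whatever design matrix is used.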
--
Alex Gutteridge
Hi Alex,
There is no control group as such. One of the diets is somewhat of a
control, but not quite, because it is still a diet that has some
'special' properties. I am used to working with experiments that have
at least one control group, so this setup is a bit out of my domain,
which is the reason I'm asking this list for advice.
I guess what I meant by 'differentially expressed genes for each
diet' was a list of genes that can be attributed to this exact diet.
Now that I think about it, it may be more appropriate to collect mRNA
at the start, middle and end of the experiment, and measure the change
in each group, instead of comparing the groups with each other. The
experiment is set to run for 4 months. I have not dealt with
experiments that have run for so long before.
Would the data collected be suited for microarray analysis? And if so,
when should the microarray analysis be performed? When each sample is
collected, or all together at the end?
Best,
David
2012/9/5 Alex Gutteridge <alexg at ruggedtextile.com>:
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
On 05.09.2012 13:53, David Westergaard wrote:
> [...]
> Now that I think about it, it may be more appropriate to collect mRNA
> at the start, mid and end of the experiment, and measure the change
> in each group, instead of comparing these. The experiment is set to
> run for 4 months. I have not before dealt with experiments which have
> run for so long.
Collecting a baseline measure sounds sensible. If these are human
subjects you should expect a lot of variation (more than in an inbred
animal model); the baseline measure can help correct for that.
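
A small sketch of why the baseline helps, in plain Python with made-up numbers (the lists `start` and `end` are hypothetical log2 values for one gene in four subjects on the same diet): subtracting each subject's own starting value removes most of the subject-to-subject spread.

```python
from statistics import stdev

# Hypothetical log2 expression of one gene in four subjects on one diet.
start = [6.0, 8.5, 7.2, 9.4]   # before the diet: subjects differ a lot
end = [7.1, 9.4, 8.3, 10.3]    # after 4 months on the diet

# Within-subject change: each subject acts as their own baseline.
change = [e - s for s, e in zip(start, end)]

spread_end = stdev(end)        # spread if only the end point is used
spread_change = stdev(change)  # spread of the baseline-corrected values

print(round(spread_end, 2), round(spread_change, 2))
```

With these invented values the spread of the end-point measurements is about 1.38 log2 units, while the spread of the within-subject changes is about 0.12, so a diet effect of roughly 1 log2 unit is far easier to see after baseline correction.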
Your question is still quite hard though. I often find it useful to
think through some scenarios for patterns of expression that might
appear and plot them out before deciding which ones will be
interesting and then how to design the experiment to find them. E.g.:
say Gene X goes up two-fold after 4 months on Diet A and eight-fold
after 4 months on Diet B. Do you consider that a Diet B 'specific'
gene or not? It goes up in both A and B, but much more in B, so either
interpretation is possible. If you do consider that gene Diet B
specific then you could do a contrast like
(DietBEnd - DietBStart) - (DietAEnd - DietAStart), which shows you
genes where the effect was greater in Diet B than in Diet A without
excluding genes that still showed an effect in A.
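
The Gene X scenario can be worked through on the log2 scale. This is only a sketch in plain Python: the starting level of 5.0 is invented, and the dictionary keys simply reuse the group names from the contrast above.

```python
from math import log2

# Hypothetical mean log2 expression of Gene X per diet and time point,
# chosen to match the scenario: two-fold up on Diet A, eight-fold up on B.
means = {
    "DietAStart": 5.0, "DietAEnd": 5.0 + log2(2),
    "DietBStart": 5.0, "DietBEnd": 5.0 + log2(8),
}

change_a = means["DietAEnd"] - means["DietAStart"]  # log2 change on Diet A
change_b = means["DietBEnd"] - means["DietBStart"]  # log2 change on Diet B

# (DietBEnd - DietBStart) - (DietAEnd - DietAStart)
b_vs_a = change_b - change_a

# Back on the fold-change scale: how much larger the response is on B.
ratio = 2 ** b_vs_a

print(change_a, change_b, b_vs_a, ratio)
```

Here the changes are 1 and 3 log2 units, so the contrast is 2, meaning the response on Diet B is four times the response on Diet A. Gene X is picked up by this contrast even though it also goes up on Diet A.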
In experiments like these I am always quite wary of the temptation to
get differentially expressed gene sets and then do set subtraction,
i.e. Diet A 'specific' genes = Diet A DE genes - Diet B DE genes -
Diet C DE genes. I always find that approach is very sensitive to the
cutoff used to define DE, but it can be easier to interpret I suppose.
Again, if there is really no control diet then creating a mean
'meta-diet' might simplify the analysis (at the cost of the
interpretation being more abstract). So something like:
(DietAEnd - DietAStart) - ((DietAEnd - DietAStart) + (DietBEnd -
DietBStart) + (DietCEnd - DietCStart))/3.
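
Both points can be sketched in plain Python with invented numbers. The first part shows how set subtraction changes its answer when the cutoff moves slightly; the second evaluates the 'meta-diet' contrast and rewrites it as weights on the six group means. The gene names, values and the helper `specific_to_a` are all hypothetical.

```python
from fractions import Fraction

# --- Part 1: set subtraction depends on the cutoff -------------------
# Hypothetical absolute log2 changes (End - Start) per gene and diet.
abs_change = {
    "gene1": {"A": 2.00, "B": 0.10, "C": 0.20},  # changes on Diet A only
    "gene2": {"A": 1.05, "B": 0.95, "C": 0.10},  # A and B almost equal
}


def specific_to_a(cutoff):
    """Diet A DE genes minus Diet B DE genes minus Diet C DE genes."""
    de = {diet: {gene for gene, c in abs_change.items() if c[diet] >= cutoff}
          for diet in "ABC"}
    return de["A"] - de["B"] - de["C"]


strict = specific_to_a(1.0)   # gene2 counts as A-specific: B just misses
relaxed = specific_to_a(0.9)  # gene2 drops out: B now passes as well

# --- Part 2: the 'meta-diet' contrast --------------------------------
# Hypothetical group means (log2 scale) for one gene.
means = {"DietAStart": 5, "DietAEnd": 8,
         "DietBStart": 5, "DietBEnd": 6,
         "DietCStart": 5, "DietCEnd": 4}


def change(diet):
    """End minus Start for one diet."""
    return means[f"Diet{diet}End"] - means[f"Diet{diet}Start"]


meta_diet = Fraction(change("A") + change("B") + change("C"), 3)
a_vs_meta = change("A") - meta_diet

# The same contrast written as weights on the six group means.
third = Fraction(1, 3)
weights = {"DietAEnd": 2 * third, "DietAStart": -2 * third,
           "DietBEnd": -third, "DietBStart": third,
           "DietCEnd": -third, "DietCStart": third}
weighted = sum(w * means[group] for group, w in weights.items())

print(sorted(strict), sorted(relaxed), a_vs_meta, weighted)
```

In this toy data gene2 responds almost identically on Diets A and B, yet it is labelled A-specific at a cutoff of 1.0 and not at 0.9. The meta-diet contrast gives the same value whether it is computed from the three changes or from the expanded weights, which sum to zero.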
> Would the data collected be suited for microarray analysis? And if
> so, when should the microarray analysis be performed? When each
> sample is collected, or all together at the end?
I would go for all together at the end. RNA is very prone to
degradation though, so you need to take all necessary steps to
preserve the samples (remove RNases and get to -80C) as soon as
possible after collection.
--
Alex Gutteridge
Hi Alex,
That was a very informative reply to a difficult question; thank you
very much for your advice. I might revive this thread at some point,
when we get closer to the actual statistical analysis.
Best,
David