Dear Pedro,
The strategy you are proposing is to ignore experimental factors
which you think will have relatively small effects, so as to generate
some degrees of freedom for error. This is an ok strategy, long used
in statistics, as long as you understand clearly what you are testing
for. If you do this, limma will try to find genes which have
differential expression which stands out relative to the effects you
have ignored.
Power is not the issue here. This approach is actually conservative,
in that the residual variability will be larger than if you had true
replicate arrays, hence you will find fewer DE genes than you might
otherwise.
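To make concrete where the error degrees of freedom come from, here is a minimal base-R sketch built on the design from the quoted message below. The expression values are simulated, and the names group, df_residual, s2 and pooled_diff are illustrative assumptions rather than anything limma defines.

```r
# Layout from the quoted message: 8 arrays, arrays 3 and 4 pooled into "g1".
group <- factor(c(1, 2, 3, 3, 5, 6, 7, 8))
design <- model.matrix(~ 0 + group)
colnames(design) <- c("WT", "upa", "g1", "f5", "f6", "f7", "f8")

# Residual df = number of arrays minus number of estimable coefficients.
df_residual <- nrow(design) - qr(design)$rank

# Simulated log-expression values for a single gene (not real data).
set.seed(1)
y <- rnorm(nrow(design))
fit <- lm.fit(design, y)

# Residual variance for this gene.
s2 <- sum(fit$residuals^2) / df_residual

# The same quantity computed only from the two pooled arrays: the residual
# variance is entirely the ignored difference between arrays 3 and 4.
pooled_diff <- (y[3] - y[4])^2 / 2
```

With this layout only one residual degree of freedom is available per gene, and it measures the treatment difference inside g1 rather than true replicate noise, which is why the resulting tests lean conservative.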
Best wishes
Gordon
>Date: Fri, 31 Mar 2006 12:48:20 +0200
>From: Pedro López Romero <plopez at cnic.es>
>Subject: [BioC] using limma with no replicates
>To: <bioconductor at stat.math.ethz.ch>
>
>Dear list,
>
>I have been given some data to analyze. Unfortunately they only gave 1
>replicate per experimental condition, so I do not expect to draw
>meaningful information from it. Anyway, I would like to use limma, since
>I expect that this could be more powerful than mere inspection of the
>log2 fold change.
>
>Although I do not have "true biological replicates", I think that I can
>group (in the design matrix) some arrays as if they were replicates,
>according to the correlations that I expect from the experimental
>conditions and how the data have been generated. For example, I can
>group 2 arrays that belong to the same strain, although they have been
>treated slightly differently, or I can group 2 arrays that belong to the
>same strain and treatment but differ in the age of the mouse. These
>"grouped data" are not going to be part of the contrast. My intention
>(and I do not know if it is right) is to group some correlated data so
>that some degrees of freedom are available to estimate the variance, and
>then to make contrasts between 2 other non-replicated arrays. I think
>that this would be somewhat more powerful than inspecting the log2 fold
>change, since the information is better handled through the empirical
>Bayes approach that limma implements, but I would feel better if someone
>backed me up, because I am not sure whether this is a good idea.
>
>
>A piece of my code:
>
>design= model.matrix(~ -1 + factor(c(1,2,3,3,5,6,7,8)))
>colnames(design) =c("WT","upa","g1","f5","f6","f7","f8")
>
>Here g1 groups arrays of the same strain (different from the other
>strains) and the same mouse age, but with slightly different
>pharmacological treatment, and I will compare f5 vs f6 (these are the
>same strain, different from g1, and the same age, but the treatments
>are different).
>
>CM= makeContrasts(f5-f6,levels=design)
>
>
>Doing this, the M values that I observe in the top list are quite high
>(> 6), but the differences are not significant. I think that this is due
>to the absence of replication in a very noisy system.
>
>      ID         M                 A                t                 P.Value B
>23620 mCG147262  -9.0828928978708  7.04453315872284 -20.6287756557693         -0.823196144084987
>19275 mCG1047122 -6.22956426050092 .91829704792039  -15.5769614644597 1       -0.940793980765775
>
>If I use genefilter to filter out some genes, however, some genes do
>appear significantly DE. Could this be explained simply by saying that
>FDR-like techniques become more sensitive as fewer comparisons are made?
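
The arithmetic behind the question just above can be sketched with base R's p.adjust. The two raw p-values are made up for illustration, and the test counts 20000 and 500 are assumed stand-ins for "before" and "after" filtering. The sketch shows only how the Benjamini-Hochberg adjustment scales with the number of comparisons; it says nothing about whether a particular filter is valid.

```r
# Two hypothetical raw p-values (invented for illustration only).
p <- c(1e-6, 2e-6)

# Benjamini-Hochberg adjustment, treating these as the two smallest p-values
# among n tests in total; `n` is the number of comparisons assumed.
adj_all      <- p.adjust(p, method = "BH", n = 20000)  # no filtering
adj_filtered <- p.adjust(p, method = "BH", n = 500)    # after filtering

# Same raw p-values, smaller adjusted values when fewer tests are counted.
print(rbind(adj_all, adj_filtered))
```

The same raw p-values receive smaller adjusted values when fewer tests enter the correction, which is the sense in which FDR control becomes less stringent after filtering.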
>
>     ID        M                 A               t                 P.Value             B
>263  mCG142389 -7.97481171094547 .73475871266083 -5.3168578969303  0.00832939443377308 6.57330274986848
>6756 BC027122  -7.40473059624002 .77564203692944 -4.93678117706839 0.0313305586976585  4.89829085664067
>
>
>I would very much appreciate any comments or suggestions.
>Thank you.
>
>plr.-