Entering edit mode
You seem to have a major problem with your analysis, which is that you
don't seem to have incorporated the dye swaps into your design matrix
at
all. You define a vector call 'vector', but then make no use of it.
Perhaps
you should investigate the use of the functions in limma such as
modelMatrix() which use targets frame to construct design matrices.
As far as the 'block' argument is concerned, what makes you think that
it
is "supposed" to return simple averages? There wouldn't be much point
in
this functionality if it simply returned the same answers as you would
get
without it.
Gordon
>Date: Sun, 27 Feb 2005 16:12:53 -0800
>From: "xiaocui zhu" <xzhu@caltech.edu>
>Subject: [BioC] block design and how it handles missing values in
> Limma
>To: <bioconductor@stat.math.ethz.ch>
>
>Hello all,
>
>I have a cDNA array data sets collected from a time-course
experiment. The
>experiment design was similar to the following:
> 1)Treat cells with ligands A, ligand B, or a vector control
> 2)Harvest cells at 1h and 2h
> 3)Measure expression changes in treated cells relative to
>time-matched-controls using 2-color cDNA arrays with a dyeSwap design
(each
>treated and time-matched-control sample pairs were hybridized onto
two
>arrays with a dyeSwap).
>
>Step 1) to 3) were repeated three times, so that for each treatment
>condition, we have three biological and two technical repeats.
>
>I used Limma to identify differentially expressed genes in response
to each
>ligand, and genes differentially expressed in response to ligand A
vs. to
>ligand B at each time point. Since each time the experiment was
repeated,
>the cell preparation and other experimental conditions might vary
slightly,
>I thought that the data collected from one experiment can be
considered a
>block to account for the batch variance. Parts of the codes taking
into
>account the dyeSwap design and block factor are as the following:
>
>#Identify differentially expressed genes at each time point to each
ligand
>treatments <- factor(c(1,1,1,1,1,1, 2,2,2,2,2,2, 3,3,3,3,3,3,
4,4,4,4,4,4))
>vector<- c(1,-1,1,-1,1,-1, 1,-1,1,-1,1,-1, 1,-1,1,-1,1,-1,
>1,-1,1,-1,1,-1)
>design <- model.matrix(~ 0+treatments)
>colnames(design) <- c("A.1h","A.2h","B.1h","B.2h")
>fit<- lmFit(MA, design, block=c(rep(c(1,1,2,2,3,3),4)))
>efit <-eBayes(fit)
>for (i in 1:length(colnames(design))){
> output<-topTable(efit, coef = i, number=16200, adj="fdr")
> write.table(output, file = paste(colnames(design)[i],
".txt",
>sep=""), sep="\t")
> }
>
>When I examined the output files from the above codes, I was
concerned that
>the M value for some of the array features did not equal to the
average of
>the replicates, even though it's supposed to. This is only seen with
>features if a pair of its dyeSwap measurements had a "NA" value in
only one
>of the arrays. If both arrays of a dyeSwap pair gave a "NA" value for
the
>feature, the M value would still be equal to the average of
replicates as
>it's supposed to. This problem seemed to be caused by including the
block
>factor in the lmFit statement, because no such inconsistency was
found in
>the output if I removed the block factor. I don't know whether this
>inconsistency is due to some errors in my codes, or whether block
design
>somehow handles missing value in a dyeSwap pair differently.
>
>My questions are:
>1) Is it appropriate to use block design in my case?
>2) How does block design handles missing values of a dyeSwap pair?
Why do I
>see that in a block design, if a pair of dyeSwap measurements has
only one
>missing value for a feature, the M value of that feature does not
equal to
>the average of the replicates?
>
>Any help to this matter will be greatly appreciated!
>
>Xiaocui