Entering edit mode
Dear Gordon and All,
I have a question about creating the proper design matrix for my data.
I am using Affymetrix GeneChips and the experimental design is like
that:
- We have two groups, which are "normal" and "diseased". For each
person in
each group, we collected tissues from two zones of the same organ. So
my
data looks like this:
- normal
tissue1
tissue2
- diseased
tissue1
tissue2
So these are paired samples.
Our questions are as follows:
1- Which genes are expressed differentially because of being normal or
diseased. (the effect of being normal and diseased on gene expression)
2- Which genes are expressed differentially because of being tissue1
or
tissue2. (the effect of being tissue type 1 or 2 on gene expression)
3- Which genes are expressed differentially in normal group between
tissue1
and tissue2
4- Which genes are expressed differentially in diseased group between
tissue1 and tissue2
For this purpose, I prepared a targets file in which I encoded the
samples
like I indicated. Which looks like:
SlideNumber FileName Condition TissueType 1 AF01L.CEL AF LA 2
AF01R.CEL
AF RA 3 AF02L.CEL AF LA 4 AF02R.CEL AF RA 5 AF03L.CEL AF LA 6
AF03R.CEL
AF RA
.
.
.
42 SR10L.CEL SR LA 43 SR10R.CEL SR RA 44 SR11L.CEL SR LA
There are 45 CEL files, 22 with pairs and one single (would this be a
problem?Should I take out this sample?).
So I followed the code in limma users guide section 8.3 Paired
Samples,
which is:
SibShip <- factor(targets$TissueType)
Treat <- factor(targets$Condition, levels=c("AF","SR"))
design <- model.matrix(~SibShip+Treat)
fit <- lmFit(eset, design)
fit <- eBayes(fit)
With this, I think :) I formed a model with two factors: Condition as
the
main factor, and TissueType as the blocking factor.
After fitting, my coefficients in MArrayLM is
(Intercept) SibShipRA TreatSR
1007_s_at 7.828460 0.15501147 0.232908384
1053_at 3.868841 -0.01223435 -0.134041754
117_at 3.016848 0.08888230 0.197765562
121_at 4.721647 -0.04720335 0.052025910
1255_g_at 2.520547 -0.07208771 0.009031277
.
.
.
So;
1- For my first question, which questions the effect of the main
factor
"Condition", (saved as Treat <- factor(targets$Condition,
levels=c("AF","SR")))
topTable(fit, coef="TreatSR")
gives me the answer. Is it true?
2- Same for the blocking factor,
topTable(fit, coef="SibShipRA")
gives me the answer for TissueType effect. Is this also true?
>From previous correspondences I concluded that I could not get
answers for
my questions 3 and 4 with this model. And it might be meaningful to
add an
interaction term to the model. So for 3 and 4, I tried this:
design<-model.matrix(~Sibship + Treat + Sibship:Treat)
(because I thought Treat is varying faster, but I don't know how to
check
it.)
After applying this,
1- Results for Treat drastically changed. I got nearly completely
different
top genes with topTable
2- Results for SibShip also changed, not that much, but p values were
higher.
3- The fold change values for the same genes were different between
two
models.
So I am suspicious. Am I doing something wrong here, or am I missing
something?
Any help would be great!
Thank you in advance,
Zeynep
[[alternative HTML version deleted]]