Hello everybody,
I have started to use the limma package for a couple of two-colour
hybridization slides RNA/RNA... can anybody tell me if I can still use
it
When I have features in the array at one spacing (controls) and other
features at another spacing? Or do I have to modify my gpr file?
(generated by genepix)
And a second issue
I want to compare the above RNA(mutant)/RNA(wild type) direct method
with the option: RNA(mutant)/gDNA plus RNA(wildtype)/gDNA . Could
anyone
advice me in type of package to carry out this type of normalization?
(RNA/gDNA gives a very skewed kind of data)
Thank you very much in advance
Mar?a
Hello!
I have a question concerning data interpretation in LIMMA. First of
all I'll give you the commands:
contrast.matrix<-makeContrasts(APC-ISCH,IPC-ISCH,levels=design)
fitPRECOND2<-contrasts.fit(fit,contrast.matrix)
fitPRECOND2<-eBayes(fitPRECOND2)
clas3<-classifyTestsP(fitPRECOND2,p.value=0.01,method="fdr")
vennDiagram(clas3,include="up")
vennDiagram(clas3,include="down")
So far, no problem! I've got nice Venn diagrams. After that my input
was:
topTable(fitPRECOND2,number=20,genelist=NULL,coef=1,adjust="fdr")
which yielded:
M t P.Value B
2380 -1.0241489 -6.058845 0.01147946 5.0373154
379 -0.8440311 -5.464009 0.02976549 3.6259203
2382 -0.9576517 -5.261334 0.02976549 3.1387221
8436 -0.6746846 -5.216491 0.02976549 3.0306287
2383 -0.7343412 -5.064616 0.03638846 2.6639293
7974 -1.0281853 -4.889561 0.04943332 2.2404780
7925 -0.8519961 -4.833610 0.04952967 2.1050515
2384 -0.7728411 -4.698536 0.05865728 1.7781412
432 0.5732701 4.656867 0.05865728 1.6773365
7680 0.5717780 4.640318 0.05865728 1.6373117
7927 -0.7303740 -4.598890 0.05865728 1.5371476
2569 0.5033168 4.579322 0.05865728 1.4898512
1688 0.6904956 4.540636 0.06028835 1.3963860
6603 0.5590289 4.506030 0.06162635 1.3128274
5487 -0.7299432 -4.425216 0.07196002 1.1179075
3247 0.6442329 4.377510 0.07698329 1.0030059
5501 -0.8589314 -4.348990 0.07839810 0.9343814
7434 0.7573137 4.293568 0.07860314 0.8011835
7230 0.6309675 4.277457 0.07860314 0.7625059
8527 -0.8768646 -4.252795 0.07860314 0.7033388
It occurred to me that none of the p-palues in the ranking was below
p=0.01. Why were there significantly differentially regulated genes in
my Venn diagrams (constructed using p=0.01)?
Second question:
Do the numbers in the first column correspont to the row number in the
eset (expression set, I used RMA to do the analysis)?
THANKS IN ADVANCE!
ELIANA
===============================
Eliana Lucchinetti Zaugg, PhD
Institute of Pharmacology and Toxicology
Section of Cardiovascular Research-Room 17 J 28
University of Zurich
Winterthurerstr. 190
CH-8057 Zürich (Switzerland)
Phone: +41-1-635 59 18
Fax: +41-1-635 68 71
e-mail: eliana.zaugg@pharma.unizh.ch
[[alternative HTML version deleted]]
At 11:13 PM 23/03/2004, Eliana Lucchinetti Zaugg wrote:
>Hello!
>I have a question concerning data interpretation in LIMMA. First of
all
>I'll give you the commands:
>
>contrast.matrix<-makeContrasts(APC-ISCH,IPC-ISCH,levels=design)
>fitPRECOND2<-contrasts.fit(fit,contrast.matrix)
>fitPRECOND2<-eBayes(fitPRECOND2)
>clas3<-classifyTestsP(fitPRECOND2,p.value=0.01,method="fdr")
>vennDiagram(clas3,include="up")
>vennDiagram(clas3,include="down")
>
>So far, no problem! I've got nice Venn diagrams. After that my input
was:
>
>topTable(fitPRECOND2,number=20,genelist=NULL,coef=1,adjust="fdr")
>
>which yielded:
>
> M t P.Value B
>2380 -1.0241489 -6.058845 0.01147946 5.0373154
>379 -0.8440311 -5.464009 0.02976549 3.6259203
>2382 -0.9576517 -5.261334 0.02976549 3.1387221
>8436 -0.6746846 -5.216491 0.02976549 3.0306287
>2383 -0.7343412 -5.064616 0.03638846 2.6639293
>7974 -1.0281853 -4.889561 0.04943332 2.2404780
>7925 -0.8519961 -4.833610 0.04952967 2.1050515
>2384 -0.7728411 -4.698536 0.05865728 1.7781412
>432 0.5732701 4.656867 0.05865728 1.6773365
>7680 0.5717780 4.640318 0.05865728 1.6373117
>7927 -0.7303740 -4.598890 0.05865728 1.5371476
>2569 0.5033168 4.579322 0.05865728 1.4898512
>1688 0.6904956 4.540636 0.06028835 1.3963860
>6603 0.5590289 4.506030 0.06162635 1.3128274
>5487 -0.7299432 -4.425216 0.07196002 1.1179075
>3247 0.6442329 4.377510 0.07698329 1.0030059
>5501 -0.8589314 -4.348990 0.07839810 0.9343814
>7434 0.7573137 4.293568 0.07860314 0.8011835
>7230 0.6309675 4.277457 0.07860314 0.7625059
>8527 -0.8768646 -4.252795 0.07860314 0.7033388
>
>It occurred to me that none of the p-palues in the ranking was below
>p=0.01. Why were there significantly differentially regulated genes
in my
>Venn diagrams (constructed using p=0.01)?
This is a good question. The problem is that classifyTestsP() is doing
two
things which are probably unexpected. The first thing is really an
error on
my part. classifyTestsP() command is not extracting the degrees of
freedom
from the fitted model object, so it is computing p-values on the basis
of
normal distributions rather than t-distributions. You could give it
the
correct degrees of freedom as follows:
> df <- fitPRECOND2$df.residual + fitPRECOND2$df.prior
> clas3<-classifyTestsP(fitPRECOND2, df=df,
p.value=0.01,method="fdr")
The second thing is more important. The three functions
classifyTestsP(),
classifyTestsT() and classifyTest() are all designed to control the
false
discovery rate *across contrasts* rather than across genes. This means
that
the 'fdr' adjustment is applied across rows of the p-values rather
than
down the columns. The idea is that you can do false discovery rate
control
across genes separately and simply input the resulting p-value that
you
want to cut on. You are supposed to change the p-value that you give
to
classifyTestsP() until you get the number of significant results that
you
want to have, e.g., to make one of the columns agree with topTable().
The above means that you can't get topTable() and classifyTestsP() to
agree
exactly. I agree that this is unintuitive and I'll give some thought
to
making a version of classifyTestsP() that can agree with topTable().
>Second question:
>Do the numbers in the first column correspont to the row number in
the
>eset (expression set, I used RMA to do the analysis)?
Yes. And to get gene names you could use
topTable(fitPRECOND2,number=20,genelist=geneNames(eset),coef=1,adjust=
"fdr")
Gordon
>THANKS IN ADVANCE!
>ELIANA
>
>===============================
>Eliana Lucchinetti Zaugg, PhD
>Institute of Pharmacology and Toxicology
>Section of Cardiovascular Research-Room 17 J 28
>University of Zurich
>Winterthurerstr. 190
>CH-8057 Z?rich (Switzerland)
>
>Phone: +41-1-635 59 18
>Fax: +41-1-635 68 71
>e-mail: eliana.zaugg@pharma.unizh.ch
Dear all,
In order to perform eBayes command in R, I should log-transfrom the
raw
data so that to be normalized. However, I am not sure if I should also
standardize them in order to find significant expression with eb$t
values.
Is that the case?
Standardization is the correct procedure or it is enough just to
subtract
log(intensity)- row logmean (without dividing by std dev)?
The SAM procedure requires the same steps?
Thanks in advance,
Makis Motakis
----------------------
E Motakis, Mathematics
E.Motakis@bristol.ac.uk
Hi all,
It has been a while since I have used the affy package. I have done
everything in the help manual and installed the HGU133 plus2 cdf
environment. When I start the program I put in the correct directory
and
load the affy library successfully. When I try to import the cel files
using
the following command
data<-ReadAffy()
I get the error message:
Error in read.affybatch(filenames = filenames, phenoData = phenoData,
:
The file C:/Documents and Settings/smoore/Desktop/nat evbr cel
files/BR1.CEL does not look like a CEL file
These files were taken from GCOS, does anyone know what may be
happening
here, am I maybe doing something wrong that I haven't noticed? any
help
would be greatly appreciated.
Many Thanks
Steve.
Dr. Stephen Moore
Dept. of Oncology
Queens University of Belfast
U-Floor
Belfast City Hospital
Belfast
N.Ireland
BT9 7AB
Install version 1.3.28 of the affy package. It handles the binary
format
cel files (which is what you get from GCOS).
Ben
On Thu, 2004-03-25 at 09:11, Steve Moore wrote:
> Hi all,
>
> It has been a while since I have used the affy package. I have done
> everything in the help manual and installed the HGU133 plus2 cdf
> environment. When I start the program I put in the correct
directory and
> load the affy library successfully. When I try to import the cel
files using
> the following command
>
> data<-ReadAffy()
>
> I get the error message:
>
> Error in read.affybatch(filenames = filenames, phenoData =
phenoData, :
> The file C:/Documents and Settings/smoore/Desktop/nat evbr
cel
> files/BR1.CEL does not look like a CEL file
>
> These files were taken from GCOS, does anyone know what may be
happening
> here, am I maybe doing something wrong that I haven't noticed? any
help
> would be greatly appreciated.
>
> Many Thanks
>
> Steve.
>
>
>
> Dr. Stephen Moore
> Dept. of Oncology
> Queens University of Belfast
> U-Floor
> Belfast City Hospital
> Belfast
> N.Ireland
> BT9 7AB
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
--
Ben Bolstad <bolstad@stat.berkeley.edu>
http://www.stat.berkeley.edu/~bolstad
Hello everyone.
I would really appreciate some comments/hints/help with a pretty long
question.
I have an experiment consisting of 18 hybridizations. On the 30K cDNA
arrays knee joint bioipsies (from different patients) before and after
a
certain treatment is hybridized. What I want to find out is the effect
of
the treatment, not the difference between the patients. The problem is
how
to deal with different levels of replicates and how to create a
correct
target file since I have no common reference?
This is how the experimental set-up looks like.
Patient Hybridization Cy3 Cy5
1 1A Biopsy 1 before
treatment Biopsy 1 after treatment
1B Biopsy 1 after
treatment Biopsy 1 before treatment
3 2A Biopsy 1 before
treatment Biopsy 1 after treatment
2B Biopsy 1 after
treatment Biopsy 1 before treatment
3A Biopsy 2 before
treatment Biopsy 2 after treatment
3B Biopsy 2 after
treatment Biopsy 2 before treatment
4 4A Biopsy 1 before
treatment Biopsy 1 after treatment
4B Biopsy 1 after
treatment Biopsy 1 before treatment
5A Biopsy 2 before
treatment Biopsy 2 after treatment
5B Biopsy 2 after
treatment Biopsy 2 before treatment
5 6A Biopsy 1 before
treatment Biopsy 1 after treatment
6B Biopsy 1 after
treatment Biopsy 1 before treatment
6 7A Biopsy 1 before
treatment Biopsy 1 after treatment
7B Biopsy 1 after
treatment Biopsy 1 before treatment
7 8A Biopsy 1 before
treatment Biopsy 1 after treatment
8B Biopsy 1 after
treatment Biopsy 1 before treatment
10 9A Biopsy 1 before
treatment Biopsy 1 after treatment
9B Biopsy 1 after
treatment Biopsy 1 before treatment
As you can see different patients have one or two biopsies taken from
them.
Since I realize it would be a mistake to include all those into the
target
file because if I have more measurements of a certain patient that
would
bias the ranking of the B-stat towards the patient having the most
biopsies
in the end, right? Or?
Since the differentially expressed genes in the patient with more
biopsies
will get smaller variance?
My solution to the problem was just to create an artificial Mmatrix
twice
as long as the original MA object. For the patients with two biopsies
I
averaged over the technical replicates (dye-swaps) and put the values
from
biopsy one and then the values from biopsy two in the matrix. From
patients
with just a technical replicate I put the values from hybridization 1A
and
then hybridization 1B into the matrix.
The M-values of that matrix object would look something like:
patient
1 patient3 ....
Rows 1-30000 Hybridization 1A Average of hybridization 2A
and
2B ....
Rows 30001-60000 Hybridization 1B Average of
hybridization 3A
and 3B ....
After this I plan to use dupcor on the new matrix of M-values, as if I
would have a slide with replicate spots on it.
So far so good or? Is this a good way of treating replicates on
different
levels or has anyone else some better idea of how to do this. Comments
please.....
And now, how to create a correct targets file since I have no common
reference.
I guess it would look something like this:
SlideNumber Name FileName Cy3 Cy5
1 pat1_p test1.gpr Before_p1 After_p1
2 pat3_p test2.gpr Before_p2 After_p2
3 pat4_p test3.gpr Before_p3 After_p3
4 pat6_p test4.gpr Before_p4 After_p4
5 pat7_p test5.gpr Before_p5 After_p5
6 pat10_p test6.gpr Before_p6 After_p6
But when I want to make my contrast matrix I am lost since I do not
have
anything to write as ref.
design <- modelMatrix(targets, ref="????????")
If I redo the matrix to
SlideNumber Name FileName Cy3 Cy5
1 pat1_p test1.gpr Before_p After_p
2 pat3_p test2.gpr Before_p After_p
3 pat4_p test3.gpr Before_p After_p
4 pat6_p test4.gpr Before_p After_p
5 pat7_p test5.gpr Before_p After_p
6 pat10_p test6.gpr Before_p After_p
wouldnt that be the same as treating this as a common reference design
when
it is not? And wouldnt that effect the variance of the experiment? How
do I
do this in a correct way.
I looked at the Zebra fish example in the LIMMA user guide but isnt
that
wrong as well. Because technical and biological replicates are treated
the
same way in the targets file of the zebra fish.
I realize that many of these questions should have been considered
before
conducting the lab part but unfortunately they were not. So I will not
be
surprised if someone sends me the same quote as I got yesterday from a
friend:
"To consult a statistician after an experiment is finished is often
merely
to ask him to conduct a post mortem examination. He can perhaps say
what
the experiment died of."
- R.A. Fisher, Presidential Address to the First Indian Statistical
Congress, 1938
Best regards
/Johan Lindberg