Hi list,
I have run into a strange problem with normalization methods.
For an experiment with 24 arrays (a time series), I normalized the data
with both RMA and GCRMA. I then computed, for each gene, the correlation
between its RMA-normalized and GCRMA-normalized values. Surprisingly,
about 25% of the genes have a correlation below 0.7, and fewer than 50%
have a correlation above 0.9. I looked at the profiles of some genes,
and they look quite different under the two methods.
Has anybody met this problem before? Which method should we trust? Any
comments or ideas are appreciated. It is also possible that I did
something wrong, but I could not find a mistake myself.
Thanks a lot!
Fangxin
--
Fangxin Hong, Ph.D.
Plant Biology Laboratory
The Salk Institute
10010 N. Torrey Pines Rd.
La Jolla, CA 92037
E-mail: fhong@salk.edu
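
The comparison described in this message can be sketched as follows. This is only a minimal illustration in plain Python, not the code that was actually run: the function names, the dictionary layout (gene ID mapped to a list of log2 values, one per array) and the toy numbers are all assumptions made for the example. The 0.7 and 0.9 cutoffs are the ones quoted in the message.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (None if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # a completely flat profile has no defined correlation
    return sxy / sqrt(sxx * syy)


def per_gene_correlation(rma, gcrma):
    """Map gene ID -> correlation between its two normalized profiles."""
    return {g: pearson(rma[g], gcrma[g]) for g in rma if g in gcrma}


def summarize(cors, low=0.7, high=0.9):
    """Fractions of genes with r < low and r > high (defined values only)."""
    vals = [c for c in cors.values() if c is not None]
    return (sum(c < low for c in vals) / len(vals),
            sum(c > high for c in vals) / len(vals))


# Hypothetical toy data: gene ID -> log2 expression over 4 arrays.
rma = {"geneA": [8.0, 9.0, 10.0, 9.0], "geneB": [3.1, 3.0, 3.2, 3.0]}
gcrma = {"geneA": [7.5, 8.9, 10.4, 9.1], "geneB": [1.2, 1.9, 1.1, 1.6]}

cors = per_gene_correlation(rma, gcrma)
frac_low, frac_high = summarize(cors)
print(cors)
print("fraction below 0.7:", frac_low, "fraction above 0.9:", frac_high)
```

In the toy data, geneA has a clear profile that both methods reproduce, while geneB is nearly flat and its two profiles disagree.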
Hi Fangxin,
do you expect that 100% of the genes assayed by your chips are
expressed all the time in the system you are investigating? (You never
told us which chips, or which plant or animal.)
If not, say only 50% of the genes are expressed, then the data for
the remaining 50% should be pure noise, and there is no reason why
intensities from RMA and GCRMA should be correlated.
I think you have just learned something about your measurement
instrument (and this has little to do with normalization methods).
Best wishes
Wolfgang
-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Fax: +44 1223 494486
Http: www.ebi.ac.uk/huber
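
Wolfgang's argument can be illustrated with a small simulation. The sketch below is a deliberately simplified model and not a description of how RMA or GCRMA work: it assumes each "method" reports the true profile plus its own independent noise, whereas the real methods start from the same probe intensities. The amplitude, noise level, gene count and seed are arbitrary assumptions.

```python
import random
from math import pi, sin, sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n_genes=200, n_arrays=24, amplitude=2.0, noise_sd=0.3, seed=1):
    """Return (median r for expressed genes, median r for silent genes)."""
    rng = random.Random(seed)

    def noisy(signal):
        # One "method": the true signal plus independent measurement noise.
        return [s + rng.gauss(0.0, noise_sd) for s in signal]

    expressed, silent = [], []
    for _ in range(n_genes):
        phase = rng.uniform(0.0, 2.0 * pi)
        # Expressed gene: one sinusoidal cycle across the time course.
        rhythm = [amplitude * sin(2.0 * pi * t / n_arrays + phase)
                  for t in range(n_arrays)]
        # Silent gene: no signal at all, so both profiles are pure noise.
        flat = [0.0] * n_arrays
        expressed.append(pearson(noisy(rhythm), noisy(rhythm)))
        silent.append(pearson(noisy(flat), noisy(flat)))
    return median(expressed), median(silent)


med_expressed, med_silent = simulate()
print(f"median r, expressed genes: {med_expressed:.2f}")
print(f"median r, silent genes:    {med_silent:.2f}")
```

Under these assumptions the expressed genes correlate strongly between the two noisy versions, while the silent genes scatter around zero, which is the pattern Wolfgang describes.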
Thank you. I actually just found this out from one of my tests: the
genes with low correlation are all at the low-intensity end. I think
this gives me a clue for deleting those non-expressed genes from further
study.
This is hard evidence that we should filter genes first.
Thanks.
Fangxin
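
One way to check the observation above is to group genes by mean intensity and look at the typical between-method correlation in each group. The following is a rough sketch only. The function name, the equal-sized binning and the toy values are assumptions for illustration, and the mean intensity could come from either method.

```python
from statistics import median


def median_correlation_by_intensity(mean_intensity, correlation, n_bins=4):
    """Sort genes by mean intensity, split them into roughly equal bins and
    return (lowest intensity, highest intensity, median r) for each bin."""
    genes = sorted((g for g in correlation if g in mean_intensity),
                   key=mean_intensity.get)
    if not genes:
        return []
    size = -(-len(genes) // n_bins)  # ceiling division
    rows = []
    for start in range(0, len(genes), size):
        chunk = genes[start:start + size]
        rows.append((mean_intensity[chunk[0]],
                     mean_intensity[chunk[-1]],
                     median(correlation[g] for g in chunk)))
    return rows


# Hypothetical toy values: 8 genes with increasing mean log2 intensity.
ids = [f"g{i}" for i in range(8)]
mean_intensity = dict(zip(ids, [2, 3, 4, 5, 6, 7, 8, 9]))
correlation = dict(zip(ids, [0.1, 0.3, 0.5, 0.6, 0.85, 0.9, 0.95, 0.99]))

rows = median_correlation_by_intensity(mean_intensity, correlation)
for lo, hi, r in rows:
    print(f"intensity {lo}-{hi}: median r = {r:.2f}")
```

If the low-correlation genes really sit at the low-intensity end, the median correlation should rise from the first bin to the last.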
I know very little biology, but my biologist collaborators are usually
more interested in low-signal genes, so you might want to think
carefully before deleting the genes with low correlation.
Furthermore, if you compared expression values from RMA (or GCRMA) with
MAS 5.0, I believe you might find similar results, i.e. good correlation
among high-signal genes but poor correlation for low-signal genes.
Your results might simply be saying that the RMA and GCRMA expression
measures are very similar for high-signal genes but differ for
low-signal genes.
Regards, Adai
On Fri, 2005-03-04 at 13:32 -0800, Fangxin Hong wrote:
> Thank you. Actually I just found this out from one of my tests: genes
> with low correlation are all in the low intensity end. I am thinking
> this actually gives me a clue to delete those non-expressed genes from
> further study.
>
> This is hard evidence that we should filter genes first.
>
> Thanks.
> Fangxin
In our lab, we are using the Affy ATH1 chip to study circadian patterns
in Arabidopsis (time-course data).
What we found is that, for genes with low intensities, the normalized
profiles from RMA and GCRMA differ quite a lot. The peak time and the
pattern of change are so different that you would not believe the two
profiles come from the same gene. Thus there is no way to draw a
conclusion about such a gene. However, is there any good way to delete
the genes with low intensities other than the MAS 5.0 calls?
Best,
Fangxin
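
One commonly used alternative to detection calls is a plain intensity filter: keep a gene only if its expression exceeds some level A in at least k arrays. This is similar in spirit to the kOverA filter in the Bioconductor genefilter package, but the sketch below is a stand-alone Python illustration rather than that package's code. The function names, the threshold values and the toy data are assumptions, and suitable values of k and A would have to be chosen for the data at hand.

```python
def k_over_a(profile, k, a):
    """True if at least k values in the profile are greater than a."""
    return sum(v > a for v in profile) >= k


def filter_expressed(expr, k, a):
    """Keep only genes whose profile passes the k-over-A rule."""
    return {g: p for g, p in expr.items() if k_over_a(p, k, a)}


# Hypothetical log2 expression values over 4 arrays.
expr = {
    "g1": [3.0, 3.2, 2.9, 3.1],   # low everywhere
    "g2": [3.0, 7.0, 8.0, 3.0],   # high at two time points only
    "g3": [9.0, 9.5, 8.7, 9.1],   # high everywhere
}

kept = filter_expressed(expr, k=2, a=6.0)
print(sorted(kept))
```

Because the rule only asks for k arrays above the threshold, a gene that peaks at a few time points, as a circadian gene might, is not discarded just because it is low at the others.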
Please help:
We have done a time-course experiment (12 time points) using the Affy ATH1
array (Arabidopsis) without replication.
The normalized expression profiles from RMA and GCRMA do not agree well
for some genes, especially for genes with low intensity values. This might
mean that those genes are not expressed, or are expressed at a low level.
I then tried the MAS 5.0 P/A/M calls and deleted genes with more than 30%
"A" calls across time points. However, some of the remaining genes still
have different profiles under RMA and GCRMA (I use the correlation between
the RMA- and GCRMA-normalized profiles as the measure).
Since I want to draw conclusions based on the expression pattern of each
gene, it seems that the choice of normalization method would change my
conclusions, although there is only one truth.
Should I trust GCRMA more?
Or does the disagreement between RMA and GCRMA mean that there is no clear
pattern for that gene at all, so that only genes for which RMA and GCRMA
agree well should be identified?
Any suggestion is appreciated.
Fangxin
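
The call-based filter described in this message (drop a gene when more than 30% of its calls are "A") can be written down as a short sketch. It assumes the P/A/M calls are already available as one string per array for each gene. The function names and toy calls are hypothetical, and treating "M" (marginal) as not absent is an assumption of this example, not something stated in the message.

```python
def absent_fraction(calls):
    """Fraction of arrays on which the gene was called absent ('A')."""
    return sum(c == "A" for c in calls) / len(calls)


def keep_by_calls(calls_by_gene, max_absent=0.30):
    """Gene IDs whose fraction of 'A' calls does not exceed max_absent."""
    return [g for g, calls in calls_by_gene.items()
            if absent_fraction(calls) <= max_absent]


# Hypothetical calls over 12 time points.
calls_by_gene = {
    "g1": list("PPPPPPPPPPPP"),   # always present
    "g2": list("PPPAPPMPAPPA"),   # 3 of 12 absent (25%): kept
    "g3": list("APAPAPPPAPPP"),   # 4 of 12 absent (33%): dropped
    "g4": list("AAAAAAAAAAAA"),   # never detected
}

kept = keep_by_calls(calls_by_gene)
print(kept)
```

With 12 time points, the 30% cutoff means a gene survives with up to 3 absent calls and is removed from 4 onwards.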
Can somebody share some experience on how to use the
"globaltest" function in the "globaltest" package?
Specifically, I would like to use it to test the
association between the genes in a pathway and survival. I
also have 6 covariates (phenotype variables; the
exprSet object has already been created with these variables)
to adjust for. After several tries, I could not make
it work (the package documentation does not
give details on this kind of analysis), and I would really
appreciate someone's help.
Thank you in advance.
Jeff Sun