Dear All,
While looking at the Limma user guide, I came across
the following example
> targets <- readTargets("SwirlSample.txt")
> RG <- read.maimages(targets$FileName, source="spot")
> RG$genes <- readGAL()
> RG$printer <- getLayout(RG$genes)
> MA <- normalizeWithinArrays(RG)
> MA <- normalizeBetweenArrays(MA)
> fit <- lmFit(MA, design=c(-1,1,-1,1))
> fit <- eBayes(fit)
> options(digits=3)
> topTable(fit, n=30, adjust="fdr")
ID       Name        M     A      t   P.Value     B
control  BMP2    -2.21  12.1  -21.1  0.000357  7.96
control  BMP2    -2.30  13.1  -20.3  0.000357  7.78
control  Dlx3    -2.18  13.3  -20.0  0.000357  7.71
control  Dlx3    -2.18  13.5  -19.6  0.000357  7.62
fb94h06  20-L12   1.27  12.0   14.1  0.002067  5.78
fb40h07  7-D14    1.35  13.8   13.5  0.002067  5.54
I have omitted a few rows and columns.
Here we see that after all the data transformations,
we get an output where the ranking for the probes in
an array is done on the basis of the B value.
Notice that there are repeating names for genes;
therefore for a set of replicates, within and across
arrays, each spot is reported separately as an
individual entity.
In the case of BMP2 from the above example, which
result do I consider?
Is there a way in which I can get a single result for
a set of replicates?
I am new to this package, so please do let me know if
there is a problem in my understanding the concept.
Thank you,
-Ankit
There are ways of combining replicate spots in limma, and it is all in
the user guide :-)
However, many people, myself included, prefer things reported on a
spot-by-spot basis. If all replicate spots for a particular gene are
reported as significant, I take that as further proof that i) the gene
is differentially expressed, ii) my arrays are of good quality, iii)
my experimental procedure was of good quality. Think about the case
where only one out of two spots is reported - is that because one of
the spots was of poor quality? Or because the values for each spot
differ by a lot? You would lose this valuable information if you just
took the average between replicates.
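
One way to keep that information visible is to tabulate, per gene, how many spots made it into the table and how far apart their M values are. The following is only a minimal sketch in base R: the data frame `tab` is a hand-typed stand-in for the topTable output shown above, and `spot.summary` is a name made up for this illustration.

```r
# Toy stand-in for topTable() output; the values are copied from the
# table in the original question (assumption: Name identifies the gene).
tab <- data.frame(
  Name = c("BMP2", "BMP2", "Dlx3", "Dlx3", "20-L12", "7-D14"),
  M    = c(-2.21, -2.30, -2.18, -2.18, 1.27, 1.35),
  B    = c(7.96, 7.78, 7.71, 7.62, 5.78, 5.54),
  stringsAsFactors = FALSE
)

# Per gene: how many spots appear in the table, and the spread of M.
n.spots <- tapply(tab$M, tab$Name, length)
m.range <- tapply(tab$M, tab$Name, function(m) diff(range(m)))

spot.summary <- data.frame(
  Name    = names(n.spots),
  n.spots = as.vector(n.spots),
  M.range = as.vector(m.range)
)
print(spot.summary)
```

A gene with fewer spots in the table than on the array, or with a large M.range, is one whose individual spots deserve a closer look.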
If you *really* want an average value for each gene, simply take the
average M value from the output of topTable.
Mick
-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch on behalf of Ankit Pal
Sent: Tue 10/05/2005 6:15 AM
To: bioconductor@stat.math.ethz.ch
Cc:
Subject: [BioC] Limma final gene expression report
_______________________________________________
Bioconductor mailing list
Bioconductor@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Dear Mick,
Thanks a lot for the reply.
I am interested in the spots individually but for
further analysis of the spots I need a single
representative value for each gene.
I have looked through the manual but did not find a way to
combine replicate spots into a single value.
Could you tell me what the method is, or which section
of the manual describes it?
It will be of great help to me.
Thank you,
-Ankit
One other question--are these replicate spots (i.e., the same DNA) or
different oligos/clones for the same gene?
Sean
There are different oligos of the same gene and
replicates with the same sequence present on the
array.
Splice variants have been mapped to the same accession
number, but I'm working on separating them out
by naming them differently.
-Ankit
On May 10, 2005, at 6:54 AM, Ankit Pal wrote:
> There are different oligos of the same gene and
> replicates with the same sequence present on the
> array.
> Splice variants have been mapped to the same accession
> number, but I'm working on separating them out
> by naming them differently.
> -Ankit
>
So, for the different oligos/splice variant spots, averaging is
probably NOT a good idea in most situations--they could be measuring
quite different things. When there is a discrepancy, it MAY be due to
array issues (quality of one spot, for instance), but it could also be
a true finding; determining which is which often requires some human
intervention.
Perhaps others will weigh in on the issue, but I don't think ad-hoc
averaging of truly duplicated spots really serves a purpose, either.
Unless your array design allows you to treat the duplicated spots at
the level of the analysis (lmFit, in limma) and not post-processing, I
don't think you benefit much from averaging (i.e., you don't have more
"confidence" in a gene that has been averaged).
Sean
Dear Ankit,
In addition to Mick's arguments, you also need to be careful when averaging
over spots if you use the empirical Bayes methods in limma and you have
different numbers of replicates for different genes. If, for example, gene A
has one spot but gene B has four replicated spots, then the variance of the
mean of the replicates is very different for gene B than for gene A, which
would violate some of the assumptions behind the empirical Bayes procedure in
limma. I think this is mentioned in the user guide (and/or Gordon Smyth's
paper), and it has also been mentioned on this list.
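
The variance point can be checked with a small simulation. This is a sketch under simplifying assumptions (independent, normally distributed spot values with the same standard deviation `sigma`); none of it comes from limma itself, and the variable names are made up for the illustration.

```r
set.seed(1)
n.sim <- 10000   # number of simulated genes of each kind
sigma <- 1       # assumed spot-level standard deviation

# Gene A: a single spot per array, so the "mean" is just the spot value.
one.spot <- rnorm(n.sim, mean = 0, sd = sigma)

# Gene B: four replicate spots, averaged into one value.
four.spots <- rowMeans(matrix(rnorm(4 * n.sim, mean = 0, sd = sigma), ncol = 4))

var(one.spot)    # close to sigma^2
var(four.spots)  # close to sigma^2 / 4
```

Under these assumptions the averaged values for gene B are about four times less variable than the single-spot values for gene A, so the two kinds of genes no longer share a common variance distribution.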
If you really want an average, you can do as Mick suggests:
> > If you *really* want an average value for each spot,
> > simply take the average M value from the output of
> > topTable.
you can use something like
tapply(the.top.table.M.values, the.top.table.gene.identifiers, mean)
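
Spelled out on a small example, that call could look like the following. This is a sketch only: `tab` is a hand-typed stand-in for the topTable output from the original question, and `gene.means` is a name chosen for the illustration.

```r
# Toy stand-in for the topTable() output shown in the question.
tab <- data.frame(
  Name = c("BMP2", "BMP2", "Dlx3", "Dlx3", "20-L12", "7-D14"),
  M    = c(-2.21, -2.30, -2.18, -2.18, 1.27, 1.35),
  stringsAsFactors = FALSE
)

# Average M over all rows that share the same gene name.
gene.means <- tapply(tab$M, tab$Name, mean)
print(gene.means)
```

Note that this only averages the rows present in the table; spots that did not make it into the top n are silently left out of the mean.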
Best,
R.
--
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900
http://ligarto.org/rdiaz
PGP KeyID: 0xE89B3462
(http://ligarto.org/rdiaz/0xE89B3462.asc)
I can see where it may be handy though - e.g. for input into clustering.
Really, at that stage, most people would prefer one gene - one value.
There's no way I would average over the values for different oligos
though, even if they were for the same gene!
In limma, in "usersguide.pdf", section 14 "Within Array Replicate Spots"
deals with the issue :-)
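
For reference, the approach in that section handles regularly duplicated spots inside the linear model rather than by averaging afterwards. The sketch below is not taken from the thread: it uses simulated data in place of a real MA object, assumes every probe is printed exactly twice in adjacent rows (ndups=2, spacing=1), and assumes the limma and statmod packages are installed. The object names (`shared`, `M`, `corfit`) are made up for the illustration.

```r
library(limma)   # duplicateCorrelation() also needs the statmod package

set.seed(1)
n.genes <- 100
design  <- cbind(DyeSwap = c(-1, 1, -1, 1))   # same dye-swap design as above

# Simulated log-ratios: a component shared by both copies of a gene on an
# array, plus independent spot-level noise for each copy.
shared <- matrix(rnorm(n.genes * 4, sd = 0.3), n.genes, 4)
M <- shared[rep(seq_len(n.genes), each = 2), ] +
     matrix(rnorm(2 * n.genes * 4, sd = 0.3), 2 * n.genes, 4)

# Estimate the correlation between duplicate spots, then use it in the fit.
corfit <- duplicateCorrelation(M, design, ndups = 2, spacing = 1)
fit <- lmFit(M, design, ndups = 2, spacing = 1,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)

nrow(M)                 # two rows (spots) per gene
nrow(fit$coefficients)  # one row per gene after the fit
```

The fitted object then has one row per gene, which is the "one gene - one value" form wanted for clustering, but only because the duplication is regular across the whole array.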
Mick
Dear Sean,
I agree averaging is not a good idea.
So how do I get a single value for a set of replicate
probes?
In case of different values for the same gene which
result do I consider to be representative of the
whole?
It would definitely help at the clustering level.
-Ankit
On May 10, 2005, at 7:45 AM, Ankit Pal wrote:
> Dear Sean,
> I agree averaging is not a good idea.
> So how do I get a single value for a set of replicate
> probes?
> In case of different values for the same gene which
> result do I consider to be representative of the
> whole?
That is the problem, isn't it? If you have duplicate spots (same
sequence), you will need to look at the quality, etc., to see which you
believe. If you have oligos that map to the same gene but behave
differently, you will need to look at spot quality as well as other
issues like the cross-hybridization potential (which often requires
blasting), location in the gene (3' bias?), and splice variants that
may be tissue specific. As I said, all of these require a bit of human
intervention. In practice, though, you have to validate array results
biologically--that is the real answer to your question. The
"representative" spot is the one that validates; sometimes that will be
the one that suggests differential expression, and sometimes not.
Sean
> Date: Mon, 9 May 2005 22:15:17 -0700 (PDT)
> From: Ankit Pal <pal_ankit2000@yahoo.com>
> Subject: [BioC] Limma final gene expression report
> To: bioconductor@stat.math.ethz.ch
>
> In the case of BMP2 from the above example, which
> result do I consider?
> Is there a way in which I can get a single result for
> a set of replicates.
No, there isn't. The limma facility to handle duplicate spots applies only
when every single probe on your array is replicated the same number of times
in a regular pattern. (The intention is to accommodate repeated printing from
the same DNA wells, not irregularly repeated occurrences of similar DNA in
different wells of the DNA plates.) For the Swirl dataset which you're using
here, the only probes which are repeated are control probes. There seems to me
to be no purpose in averaging results for repeated control probes because then
they would be treated differently from library probes and hence would no
longer be comparable to the library probes in the statistical analysis.
Similar treatment is necessary for them to be truly control probes. The fact
that you get both copies of the BMP2 control probe at the top in the above
list is useful information -- it shows that the top ranking is no fluke.
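
Whether an array meets that "same number of times in a regular pattern" condition can be checked from the probe identifiers before fitting. The helper below is a hypothetical base-R sketch, not a limma function; it assumes the identifiers are listed in print order and that `spacing` means the row distance between consecutive copies of the same probe.

```r
# Hypothetical helper: TRUE when every ID occurs exactly `ndups` times and
# consecutive copies of each ID are `spacing` rows apart.
regularDuplicates <- function(ids, ndups = 2, spacing = 1) {
  counts <- table(ids)
  if (any(counts != ndups)) return(FALSE)
  positions <- split(seq_along(ids), ids)
  all(vapply(positions, function(p) all(diff(p) == spacing), logical(1)))
}

# Every probe printed twice, side by side: regular.
regularDuplicates(c("g1", "g1", "g2", "g2", "g3", "g3"))

# Every probe printed twice, second copy three rows later: regular.
regularDuplicates(c("g1", "g2", "g3", "g1", "g2", "g3"), spacing = 3)

# Only the controls are repeated, as in the Swirl table: not regular.
regularDuplicates(c("BMP2", "BMP2", "Dlx3", "Dlx3", "fb94h06", "fb40h07"))
```

The first two calls return TRUE and the third returns FALSE, which is the Swirl situation described above.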
Gordon