Hi, All -
James MacDonald (I think) answered my previous posting, and I promptly
lost the message. Thanks, James, and apologies.
The issue at hand was strange and (I believe) incorrect reporting of
probe affinities from justPlier (plier package). At James' suggestion
I have update to R 2.2.0 and plier 1.2.0, but the affinities are still
coming back in a sparse <# probe pairs> X <# arrays> matrix, rather
than
as a useful vector. The colnames of this matrix are the sampleNames
from
the eset provided to justPlier; the rownames are the probeNames.
James, you said this doesn't happen to you. How do you retrieve the
affinities? Maybe I'm just looking at the wrong slot (see below).
Looking at the justPlier source code, though, I don't see any other
way to get them.
Also, does justPlier allow one to pass the affinities back to another
invocation of the method, rather than computing them from the current
data?
Thanks,
- Jeremy Gollub
The session (edited for readability):
#
---------------------------------------------------------------------
> sessionInfo())
R version 2.2.0, 2005-10-06, i386-pc-mingw32
attached base packages:
[1] "tools" "methods" "stats" "graphics" "grDevices"
"utils"
[7] "datasets" "base"
other attached packages:
rae230acdf plier affy Biobase qvalue
"1.10.0" "1.2.0" "1.8.1" "1.8.0" "1.4.0"
> data <- ReadAffy()
> data
AffyBatch object
size of arrays=602x602 features (50972 kb)
cdf=RAE230A (15923 affyids)
number of samples=18
number of genes=15923
annotation=rae230a
> res <- justPlier(data, get.affinities = TRUE)
> dim(res at description@preprocessing$affinity)
[1] 175477 18
> sum(res at description@preprocessing$affinity != 0)
[1] 175407
#
----------------------------------------------------------------------
--
Jeremy Gollub
jeremy at gollub.net
Hi, James -
Thanks for clarifying that.
Looking at the source code for justPlier, it seems there's a
misapprehension
about the probe affinity values - the justPlier code expects to get
back
a value for each probe (note, not probe set) in each sample, whereas
the
algorithm/C++ code actually returns a single value for each probe
pair,
which is applied to all samples. I think the padding with many extra
zeros is done by justPlier, not by the C++ code...?
Crispin, unless I'm missing something, this should probably be
considered
a bug. In the meantime, I should be able to deconvolute the matrix
back
into the expected vector, but I'm not sure which probe pair to
associate
with each value.
- Jeremy
James W. MacDonald wrote:
>
> Hi Jeremy,
>
> Seems I misunderstood your earlier email. I somehow thought you were
> talking about expression values, not the affinities. I get the same
sort
> of result for the affinities that you are reporting.
>
> However it doesn't appear to be a sparse matrix per se, but a matrix
> with a bunch of zeros padded on the end.
>
> > pset <- justPlier(dat, get.affinities = TRUE)
> > pset
> Expression Set (exprSet) with
> 54675 genes
> 10 samples
> phenoData object with 1 variables and 10 cases
> varLabels
> sample: arbitrary numbering
> > dim(pset at description@preprocessing$affinity)
> [1] 604258 10
> > sum(rowSums(pset at description@preprocessing$affinity) != 0)
> [1] 54676
> > a <- which(rowSums(pset at description@preprocessing$affinity) !=
0)
> > range(a)
> [1] 1 54676
>
> This is what I get with a HG-U133Plus_2 chip. It looks to me like
there
> are indeed affinities for each probeset (rather than each probe),
but
> the vector of affinities that is output by the C++ code is padded
with a
> bunch of zeros. Maybe the result is different for other chips?
>
> Anyway, this is probably a question for Crispin Miller, who
maintains
> the package.
>
> Best,
>
> Jim
>
>
>
> Jeremy Gollub wrote:
> > Hi, All -
> >
> > James MacDonald (I think) answered my previous posting, and I
promptly
> > lost the message. Thanks, James, and apologies.
> >
> > The issue at hand was strange and (I believe) incorrect reporting
of
> > probe affinities from justPlier (plier package). At James'
suggestion
> > I have update to R 2.2.0 and plier 1.2.0, but the affinities are
still
> > coming back in a sparse <# probe pairs> X <# arrays> matrix,
rather than
> > as a useful vector. The colnames of this matrix are the
sampleNames from
> > the eset provided to justPlier; the rownames are the probeNames.
> >
> > James, you said this doesn't happen to you. How do you retrieve
the
> > affinities? Maybe I'm just looking at the wrong slot (see below).
> > Looking at the justPlier source code, though, I don't see any
other
> > way to get them.
> >
> > Also, does justPlier allow one to pass the affinities back to
another
> > invocation of the method, rather than computing them from the
current
> > data?
> >
> > Thanks,
> >
> > - Jeremy Gollub
> >
> >
> > The session (edited for readability):
> >
> > #
---------------------------------------------------------------------
> >
> >
> >>sessionInfo())
> >
> > R version 2.2.0, 2005-10-06, i386-pc-mingw32
> >
> > attached base packages:
> > [1] "tools" "methods" "stats" "graphics" "grDevices"
"utils"
> > [7] "datasets" "base"
> >
> > other attached packages:
> > rae230acdf plier affy Biobase qvalue
> > "1.10.0" "1.2.0" "1.8.1" "1.8.0" "1.4.0"
> >
> >
> >>data <- ReadAffy()
> >>data
> >
> > AffyBatch object
> > size of arrays=602x602 features (50972 kb)
> > cdf=RAE230A (15923 affyids)
> > number of samples=18
> > number of genes=15923
> > annotation=rae230a
> >
> >
> >>res <- justPlier(data, get.affinities = TRUE)
> >>dim(res at description@preprocessing$affinity)
> >
> > [1] 175477 18
> >
> >
> >>sum(res at description@preprocessing$affinity != 0)
> >
> > [1] 175407
> >
> > #
----------------------------------------------------------------------
> >
>
>
> --
> James W. MacDonald
> Affymetrix and cDNA Microarray Core
> University of Michigan Cancer Center
> 1500 E. Medical Center Drive
> 7410 CCGC
> Ann Arbor MI 48109
> 734-647-5623
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
--
Jeremy Gollub
jeremy at gollub.net
Jeremy Gollub wrote:
> Hi, James -
>
> Thanks for clarifying that.
>
> Looking at the source code for justPlier, it seems there's a
misapprehension
> about the probe affinity values - the justPlier code expects to get
back
> a value for each probe (note, not probe set) in each sample, whereas
the
> algorithm/C++ code actually returns a single value for each probe
pair,
> which is applied to all samples. I think the padding with many
extra
> zeros is done by justPlier, not by the C++ code...?
No, there is nothing in the code of justPlier that would do that
(anyway, I checked it out). The C++ code returns a vector of length
<number of="" probes=""> x <number of="" chips=""> that has affinity values for
the
first <number of="" *probesets*=""> x <number of="" chips="">, followed by the
zeros.
The R code in justPlier simply puts this vector into a matrix and then
transposes (note that the size of the matrix will be determined by the
length of r$affinity, since no ncol argument is given).
if (get.affinities) {
a <- t(matrix(r$affinity, nrow = num_exp))
colnames(a) <- sampleNames(eset)
rownames(a) <- probeNames(eset)
res at description@preprocessing$affinity = a
}
Best,
Jim
>
> Crispin, unless I'm missing something, this should probably be
considered
> a bug. In the meantime, I should be able to deconvolute the matrix
back
> into the expected vector, but I'm not sure which probe pair to
associate
> with each value.
>
> - Jeremy
--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
Hi Jeremy,
Seems I misunderstood your earlier email. I somehow thought you were
talking about expression values, not the affinities. I get the same
sort
of result for the affinities that you are reporting.
However it doesn't appear to be a sparse matrix per se, but a matrix
with a bunch of zeros padded on the end.
> pset <- justPlier(dat, get.affinities = TRUE)
> pset
Expression Set (exprSet) with
54675 genes
10 samples
phenoData object with 1 variables and 10 cases
varLabels
sample: arbitrary numbering
> dim(pset at description@preprocessing$affinity)
[1] 604258 10
> sum(rowSums(pset at description@preprocessing$affinity) != 0)
[1] 54676
> a <- which(rowSums(pset at description@preprocessing$affinity) !=
0)
> range(a)
[1] 1 54676
This is what I get with a HG-U133Plus_2 chip. It looks to me like
there
are indeed affinities for each probeset (rather than each probe), but
the vector of affinities that is output by the C++ code is padded with
a
bunch of zeros. Maybe the result is different for other chips?
Anyway, this is probably a question for Crispin Miller, who maintains
the package.
Best,
Jim
Jeremy Gollub wrote:
> Hi, All -
>
> James MacDonald (I think) answered my previous posting, and I
promptly
> lost the message. Thanks, James, and apologies.
>
> The issue at hand was strange and (I believe) incorrect reporting of
> probe affinities from justPlier (plier package). At James'
suggestion
> I have update to R 2.2.0 and plier 1.2.0, but the affinities are
still
> coming back in a sparse <# probe pairs> X <# arrays> matrix, rather
than
> as a useful vector. The colnames of this matrix are the sampleNames
from
> the eset provided to justPlier; the rownames are the probeNames.
>
> James, you said this doesn't happen to you. How do you retrieve the
> affinities? Maybe I'm just looking at the wrong slot (see below).
> Looking at the justPlier source code, though, I don't see any other
> way to get them.
>
> Also, does justPlier allow one to pass the affinities back to
another
> invocation of the method, rather than computing them from the
current
> data?
>
> Thanks,
>
> - Jeremy Gollub
>
>
> The session (edited for readability):
>
> #
---------------------------------------------------------------------
>
>
>>sessionInfo())
>
> R version 2.2.0, 2005-10-06, i386-pc-mingw32
>
> attached base packages:
> [1] "tools" "methods" "stats" "graphics" "grDevices"
"utils"
> [7] "datasets" "base"
>
> other attached packages:
> rae230acdf plier affy Biobase qvalue
> "1.10.0" "1.2.0" "1.8.1" "1.8.0" "1.4.0"
>
>
>>data <- ReadAffy()
>>data
>
> AffyBatch object
> size of arrays=602x602 features (50972 kb)
> cdf=RAE230A (15923 affyids)
> number of samples=18
> number of genes=15923
> annotation=rae230a
>
>
>>res <- justPlier(data, get.affinities = TRUE)
>>dim(res at description@preprocessing$affinity)
>
> [1] 175477 18
>
>
>>sum(res at description@preprocessing$affinity != 0)
>
> [1] 175407
>
> #
----------------------------------------------------------------------
>
--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
Hi, Matt -
This was indeed useful.
Examining r$affinity for a varying number of samples on RAE230A
arrays,
I find the following behavior:
- length(r$affinity) = <# of samples> * 175,477
- I believe 175,477 is the number of useful probe pairs on the
array.
- sum(r$affinity != 0) <= min(<# probe sets> * <# samples>, <# probe
pairs>)
That is, the number of non-zero values increases to a maximum of the
number
of probe pairs, but is always less than the number of probe sets *
samples.
This explains why your results suggest probe sets * samples - if you
used
more samples, you'd see the non-zero affinities hit a maximum count of
the number of probe pairs.
It seems pretty clear that this behavior arises from the C++ code, not
from justPlier as I previously suggested. I still suspect it's
incorrect,
though, since my understanding of the PLIER algorithm is that there
should
always be one affinity value per probe pair (so I should always be
seeing
175,477 non-zero values, minus the values that are legitimately zero).
I'll
see whether I can confirm that, and report back if so. Comments from
those
more knowledgeable than I would be greatly appreciated.
Thanks,
- Jeremy
Matthew Hannah wrote:
>
> Jeremy,
>
> I got distracted whilst still in draft and in the meantime Jim has
responded. Anyway to add -
>
> I pasted your code into my fresh BioC install and it doesn't work
with the ATH1121501 array either, it's inherent in the justPlier code.
> Looking at justPlier code shows that it returns a matrix of
probeNames(eset) x sampleNames(eset) which is consistent with what we
find. After investigating it seems the r$affinity (within justPlier)
is 159683 values followed by zeros until length 1757546. This is
~100000 short of what is needed for 1 affinity per probepair but if
you divide it by the number of arrays I had = 159683/7 = 22811.86 it
is pretty close to the number of probesets on the array(22810). So
this would imply a probesets x samples output matrix.
>
> Not sure if this is still helpful after Jim's response.
>
> Cheers,
> Matt
>
> Dr. Matt Hannah
> Max-Planck Institute of Molecular Plant Physiology
> Am Mühlenburg 1
> 14476 Golm
> Germany
>
> + 49 (331) 567 8255 (phone)
> + 49 (331) 567 8250 (fax)
>
>
>
>
>
>
> >>>>>>>>>>>>>>>>>>
> Hi, All -
>
> James MacDonald (I think) answered my previous posting, and I
promptly
> lost the message. Thanks, James, and apologies.
>
> The issue at hand was strange and (I believe) incorrect reporting of
> probe affinities from justPlier (plier package). At James'
suggestion
> I have update to R 2.2.0 and plier 1.2.0, but the affinities are
still
> coming back in a sparse <# probe pairs> X <# arrays> matrix, rather
than
> as a useful vector. The colnames of this matrix are the sampleNames
from
> the eset provided to justPlier; the rownames are the probeNames.
>
> James, you said this doesn't happen to you. How do you retrieve the
> affinities? Maybe I'm just looking at the wrong slot (see below).
> Looking at the justPlier source code, though, I don't see any other
> way to get them.
>
> Also, does justPlier allow one to pass the affinities back to
another
> invocation of the method, rather than computing them from the
current
> data?
>
> Thanks,
>
> - Jeremy Gollub
>
>
> The session (edited for readability):
>
> #
---------------------------------------------------------------------
>
> > sessionInfo())
> R version 2.2.0, 2005-10-06, i386-pc-mingw32
>
> attached base packages:
> [1] "tools" "methods" "stats" "graphics" "grDevices"
"utils"
> [7] "datasets" "base"
>
> other attached packages:
> rae230acdf plier affy Biobase qvalue
> "1.10.0" "1.2.0" "1.8.1" "1.8.0" "1.4.0"
>
> > data <- ReadAffy()
> > data
> AffyBatch object
> size of arrays=602x602 features (50972 kb)
> cdf=RAE230A (15923 affyids)
> number of samples=18
> number of genes=15923
> annotation=rae230a
>
> > res <- justPlier(data, get.affinities = TRUE)
> > dim(res at description
<https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">
@preprocessing$affinity)
> [1] 175477 18
>
> > sum(res at description
<https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">
@preprocessing$affinity != 0)
> [1] 175407
>
> #
----------------------------------------------------------------------
>
> --
> Jeremy Gollub
> jeremy at gollub.net
<https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
>
--
Jeremy Gollub
jeremy at gollub.net
Hi all,
Sorry for not being part of the conversation earlier...
Let me take a look at things and put a patch together...
Thanks for pointing this out,
Crispin
-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor-
bounces@stat.math.ethz.ch] On Behalf Of Jeremy Gollub
Sent: 18 October 2005 19:12
To: Matthew Hannah
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] PLIER affinities redux
Hi, Matt -
This was indeed useful.
Examining r$affinity for a varying number of samples on RAE230A
arrays, I find the following behavior:
- length(r$affinity) = <# of samples> * 175,477
- I believe 175,477 is the number of useful probe pairs on the
array.
- sum(r$affinity != 0) <= min(<# probe sets> * <# samples>, <# probe
pairs>)
That is, the number of non-zero values increases to a maximum of the
number of probe pairs, but is always less than the number of probe
sets * samples.
This explains why your results suggest probe sets * samples - if you
used more samples, you'd see the non-zero affinities hit a maximum
count of the number of probe pairs.
It seems pretty clear that this behavior arises from the C++ code, not
from justPlier as I previously suggested. I still suspect it's
incorrect, though, since my understanding of the PLIER algorithm is
that there should always be one affinity value per probe pair (so I
should always be seeing
175,477 non-zero values, minus the values that are legitimately zero).
I'll see whether I can confirm that, and report back if so. Comments
from those more knowledgeable than I would be greatly appreciated.
Thanks,
- Jeremy
Matthew Hannah wrote:
>
> Jeremy,
>
> I got distracted whilst still in draft and in the meantime Jim has
> responded. Anyway to add -
>
> I pasted your code into my fresh BioC install and it doesn't work
with the ATH1121501 array either, it's inherent in the justPlier code.
> Looking at justPlier code shows that it returns a matrix of
probeNames(eset) x sampleNames(eset) which is consistent with what we
find. After investigating it seems the r$affinity (within justPlier)
is 159683 values followed by zeros until length 1757546. This is
~100000 short of what is needed for 1 affinity per probepair but if
you divide it by the number of arrays I had = 159683/7 = 22811.86 it
is pretty close to the number of probesets on the array(22810). So
this would imply a probesets x samples output matrix.
>
> Not sure if this is still helpful after Jim's response.
>
> Cheers,
> Matt
>
> Dr. Matt Hannah
> Max-Planck Institute of Molecular Plant Physiology Am M?hlenburg 1
> 14476 Golm
> Germany
>
> + 49 (331) 567 8255 (phone)
> + 49 (331) 567 8250 (fax)
>
>
>
>
>
>
> >>>>>>>>>>>>>>>>>>
> Hi, All -
>
> James MacDonald (I think) answered my previous posting, and I
promptly
> lost the message. Thanks, James, and apologies.
>
> The issue at hand was strange and (I believe) incorrect reporting of
> probe affinities from justPlier (plier package). At James'
suggestion
> I have update to R 2.2.0 and plier 1.2.0, but the affinities are
still
> coming back in a sparse <# probe pairs> X <# arrays> matrix, rather
> than as a useful vector. The colnames of this matrix are the
> sampleNames from the eset provided to justPlier; the rownames are
the probeNames.
>
> James, you said this doesn't happen to you. How do you retrieve the
> affinities? Maybe I'm just looking at the wrong slot (see below).
> Looking at the justPlier source code, though, I don't see any other
> way to get them.
>
> Also, does justPlier allow one to pass the affinities back to
another
> invocation of the method, rather than computing them from the
current
> data?
>
> Thanks,
>
> - Jeremy Gollub
>
>
> The session (edited for readability):
>
> #
>
---------------------------------------------------------------------
>
> > sessionInfo())
> R version 2.2.0, 2005-10-06, i386-pc-mingw32
>
> attached base packages:
> [1] "tools" "methods" "stats" "graphics" "grDevices"
"utils"
> [7] "datasets" "base"
>
> other attached packages:
> rae230acdf plier affy Biobase qvalue
> "1.10.0" "1.2.0" "1.8.1" "1.8.0" "1.4.0"
>
> > data <- ReadAffy()
> > data
> AffyBatch object
> size of arrays=602x602 features (50972 kb) cdf=RAE230A (15923
affyids)
> number of samples=18 number of genes=15923 annotation=rae230a
>
> > res <- justPlier(data, get.affinities = TRUE) dim(res at
description
> > <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">
> > @preprocessing$affinity)
> [1] 175477 18
>
> > sum(res at description
> > <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">
> > @preprocessing$affinity != 0)
> [1] 175407
>
> #
>
----------------------------------------------------------------------
>
> --
> Jeremy Gollub
> jeremy at gollub.net
> <https: stat.ethz.ch="" mailman="" listinfo="" bioconductor="">
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
>
--
Jeremy Gollub
jeremy at gollub.net
--------------------------------------------------------
This email is confidential and intended solely for the use
o...{{dropped}}