The slowdown you are observing is due to just a few probesets on the
array. These probesets contain many thousands of probes. In the current
implementation, when you use the command that you specified (fitting
the default model), fitPLM uses a procedure optimized for probesets
with relatively few probes across many arrays, and so it is pretty
quick most of the time (my experience is that it is not completely
unacceptable even up to about 1000 probes across a large number of
arrays, at least on my machine).
E.g., both of the following contain the same number of data points:
Case I: 11 probes and 1000 arrays
Case II: 1000 probes and 11 arrays
but Case I will be a lot quicker than Case II in the current
implementation.
Demonstration code:
> library(affyPLM)
### note to any developers out there, the following is UNSUPPORTED
### and subject to change. DO NOT USE.
> rlm.default.rma.model <- function(y,PsiCode=0,PsiK=1.345){
+ .Call("R_rlm_rma_default_model",y,PsiCode,PsiK,PACKAGE="affyPLM")
+ }
# Case I: 11 probes (rows) x 1000 arrays (columns)
> y <- matrix(rnorm(11*1000),11,1000)
> system.time(test <- rlm.default.rma.model(y))
[1] 0.735 0.032 0.788 0.000 0.000
# Case II: 1000 probes (rows) x 11 arrays (columns)
> y <- matrix(rnorm(11*1000),1000,11)
> system.time(test <- rlm.default.rma.model(y))
[1] 19.776 0.508 21.730 0.000 0.000
As for workarounds, I am pretty sure that these extremely large
probesets are control probesets of some kind that could be safely
ignored. It is possible to pass fitPLM a vector of probeset names
specifying the subset of probesets to use.
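A rough sketch of that workaround, under some assumptions:
small_probesets() is a hypothetical helper (not part of affy or
affyPLM), the 1000-probe cutoff is arbitrary, and the commented-out
lines assume that indexProbes() gives one vector of PM indices per
probeset for this CDF and that the names go in through fitPLM's subset
argument.

```r
# Hypothetical helper (not part of affy or affyPLM): given a named list with
# one vector of probe indices per probeset, return the names of the probesets
# that have at most `max_probes` probes.
small_probesets <- function(probe_index, max_probes = 1000) {
  n_probes <- vapply(probe_index, length, integer(1))
  names(probe_index)[n_probes <= max_probes]
}

# Toy stand-in for the per-probeset index list of a real array.
toy_index <- list(ps_a = 1:4, ps_b = 5:15, ps_control = 16:5015)
keep <- small_probesets(toy_index, max_probes = 1000)
print(keep)

# With a real AffyBatch `ab` (not run here; assumes affy and affyPLM are loaded
# and that indexProbes() works on this CDF):
# keep <- small_probesets(indexProbes(ab, which = "pm"), max_probes = 1000)
# plm  <- fitPLM(ab, subset = keep)
```

Check the names that get dropped before trusting the result, since the
assumption that they are all control probesets is only a guess.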
Best,
Ben
On Tue, 2007-05-01 at 12:36 -0700, Allen Day wrote:
> I suspect so, although I haven't tried running rma() directly.
> just.rma() works fine, and fitPLM is able to RMA normalize
> internally.
>
> I was able to move this a little further along by patching the mm()
> function to return empty list in the case of a dimensionless pset
> variable. Apparently it is usually a two-column matrix with pm in
> psets[,1] and mm in psets[,2]. Here's the patch:
>
> http://paste.turbogears.org/paste/1253/plain
>
> This allows me to successfully background correct and normalize with
> RMA through wrapper function fitPLM from the affyPLM library. It's
> taking forever though, even running with minimal options. Here's my
> call:
>
> fitPLM(ab, output.param=list(residuals=FALSE,weights=FALSE,resid.SE=FALSE),verbosity.level=10);
>
> Any advice?
>
> -Allen
>
> On 5/1/07, Crispin Miller <cmiller at picr.man.ac.uk> wrote:
> > Hi Allen,
> > Does rma() work with your cdf?
> >
> > We've also produced one that works OK with rma() (see the 'exonmap'
> > package vignette for more details, including how to get it). Don't
> > know if that helps?
> >
> > Crispin
> >
> >
> >
> >
> > > -----Original Message-----
> > > From: bioconductor-bounces at stat.math.ethz.ch
> > > > [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Allen Day
> > > Sent: 01 May 2007 01:32
> > > To: bioconductor at stat.math.ethz.ch
> > > Subject: [BioC] affyPLM and exon array question
> > >
> > > Hi,
> > >
> > > I've been trying to get NUSE, RLE, and RMA values for
> > > HuEx-1_0-st-v2 (Human "all exon") Affymetrix arrays.
> > >
> > > > So far I have successfully read the arrays into an AffyBatch object.
> > > This required creating the CDF environment, which I have
> > > already done with makecdfenv. I'll be submitting that for
> > > inclusion shortly, but that's another topic.
> > >
> > > After creating the AffyBatch, I try to use affyPLM to do an
> > > RMA model fit. R = 2.4.1, affyPLM = 1.12.0, affy = 1.12.2.
> > > That's where there's trouble, and it appears to be caused by
> > > the lack of mismatch probes on the array. Here's code
> > > illustrating the problem:
> > >
> > > > > library( 'affy' );
> > > > > library( 'affyPLM' );
> > > > > ab = read.affybatch( filenames='/home/allenday/cel/0001.CEL' );
> > > > > ab; # works, output omitted
> > > > > pm( ab ); # works, output omitted
> > > > > mm( ab ); # fails!
> > > > Error in FUN(X[[1411190]], ...) : subscript out of bounds
> > > > > plm = fitPLM( ab ); # same failure in fitPLM, caused by a call to mm() on variable ab
> > > > Error in FUN(X[[1411190]], ...) : subscript out of bounds
> > >
> > > > I'm only proficient enough in R and C to track this down --
> > > > I don't know R or Bioconductor well enough to know how to
> > > fix it. If I can get this going I will submit a new package
> > > that provides just.nuse() and just.rle() functions. Can
> > > someone give me a pointer for how to make this work?
> > >
> > > Thanks.
> > >
> > > -Allen
> > >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at stat.math.ethz.ch
> > >
https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > Search the archives:
> > >
http://news.gmane.org/gmane.science.biology.informatics.conductor
> > >
> >