Dear Bill - thanks for the interesting work related to scalable
microarray preprocessing.
We have recently submitted a closely related manuscript for review.
Similar to your work, the proposed Online-RPA algorithm reads CEL
files in batches to update the hyperparameters of a probabilistic
probe-level model. This yields a fully scalable algorithm (linear time
with respect to sample size) that systematically outperforms standard RMA
in various benchmarking tests, is readily applicable to all Affymetrix
and other short oligo arrays (in contrast to fRMA), and has been used
to preprocess data sets with tens of thousands of arrays. The
preprint is available on arXiv (arxiv.org/abs/1212.5932, "A fully
scalable online-preprocessing algorithm for short oligonucleotide
microarray atlases").
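To give a rough feel for the batch-wise idea, here is a minimal Python
sketch. It is not the Online-RPA model or the rpa.online code; it assumes
a much simpler toy setting (zero-mean Gaussian residuals for a single
probe, with an inverse-gamma prior on the variance), and all function
names and prior values are hypothetical. The point is only that the
posterior from one batch can serve as the prior for the next, so memory
use depends on the batch size rather than on the total number of arrays.

```python
import random


def update_inverse_gamma(alpha, beta, residuals):
    """Conjugate update of an inverse-gamma(alpha, beta) prior on the
    variance of zero-mean Gaussian residuals (toy assumption)."""
    n = len(residuals)
    return alpha + n / 2.0, beta + 0.5 * sum(r * r for r in residuals)


def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def fit_online(residuals, batch_size, alpha0=1.0, beta0=1.0):
    """Process the data batch by batch; the posterior hyperparameters
    after one batch become the prior for the next batch."""
    alpha, beta = alpha0, beta0
    for batch in batches(residuals, batch_size):
        alpha, beta = update_inverse_gamma(alpha, beta, batch)
    return alpha, beta


# Simulated residuals stand in for probe-level data read from CEL files.
rng = random.Random(0)
residuals = [rng.gauss(0.0, 0.5) for _ in range(1000)]

online = fit_online(residuals, batch_size=50)
full = update_inverse_gamma(1.0, 1.0, residuals)

# Posterior mean of the variance, beta / (alpha - 1).
posterior_variance = online[1] / (online[0] - 1.0)
```

In this toy case the batch-wise result matches a single update over all
the data (up to floating-point rounding), which is what makes a linear,
single-pass scheme possible.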
The implementation (function rpa.online) is available through
the Bioconductor RPA package:
http://www.bioconductor.org/packages/devel/bioc/html/RPA.html
It would be interesting to compare the two approaches experimentally.
best regards,
Leo Lahti, Finland
http://www.iki.fi/Leo.Lahti
> Date: Mon, 31 Dec 2012 11:20:18 -0800 (PST)
> From: "wlangdon [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, w.langdon at cs.ucl.ac.uk
> Cc: affy Maintainer <rafa at jhu.edu>
> Subject: [BioC] normalise many cel files TCBB-2007-11-0161_noCEL.tar
> Message-ID: <20121231192018.BDD99133105 at mamba.fhcrc.org>
>
>
> Today I wrote to Rafael Irizarry about this and he suggested I post
> my message here.
>
> Some time back I wrote some R code to normalise from
> one to several tens of thousands of Affymetrix CEL files on a
> (Linux) PC.
>
> The advantage is that it does not keep all cel files in memory
> all the time and so the usual memory limits which restrict the
> number of cel files do not apply.
> http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/gp-code/R/TCBB-2007-11-0161_noCEL.tar
>
> The R-code also reports spatial defects:
> A Survey of Spatial Defects in Homo Sapiens Affymetrix GeneChips,
> W. B. Langdon and G. J. G. Upton and R. da Silva Camargo and A. P.
> Harrison, IEEE/ACM Transactions on Computational Biology and
> Bioinformatics, 7(4) 647-653, Oct-Dec 2010. PubMed 21030732
>
> If we could incorporate this into your bioconductor affy package
> that would be great.
>
> Bill
>
> Dr. W. B. Langdon,
> Department of Computer Science,
> University College London
> Gower Street, London WC1E 6BT, UK
> http://www.cs.ucl.ac.uk/staff/W.Langdon/