I would like to analyze a set of 216 HTA2.0 arrays using affyPLM (for QC-ing and normalization). The files are loaded into R with ReadAffy() using a remapped CDF, mainly because a 'remapped' PdInfo database isn't (yet?) available for this array. However, even when running this on our server with 128 GB of memory it fails with a memory allocation error. This surprises me because I recently analyzed a set of ~1100 Human Gene ST 1.1 arrays on the same system without any issue. On the other hand, I realize these are really high-density arrays and thus require a lot of memory, but the error states that roughly 18 exabytes (~18 million terabytes) of working memory would be needed, which IMO is rather unrealistic, so I wondered whether something else may be going on. A memory leak? Or...?
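For what it's worth, a quick sanity check in R (just a sketch, taking the byte count verbatim from the error message below) suggests the requested size is a negative 64-bit value reinterpreted as unsigned, i.e. an integer overflow somewhere rather than a genuine ~18 EB requirement:

requested <- 18446744061507524608        # bytes reported by the 'Realloc' error
2^64 - requested                         # ~1.2e10, i.e. roughly -12 GB if read as a signed value

# size of the full probe-level intensity matrix in doubles:
n.probes <- 2572 * 2680                  # features per array (from the AffyBatch summary)
n.arrays <- 216
n.probes * n.arrays * 8 / 2^30           # ~11 GB -- large, but nowhere near 128 GB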
Alternatively, could someone suggest how to process this dataset in separate 'batches'? In other words, how can fitPLM() be run in such a way that the same (fixed) reference distribution values are always used, analogous to e.g. the RefPlus package's RMA+?
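Something along the following lines is what I had in mind (just a sketch, untested on HTA2.0; the batch size, the use of the first batch as the reference, and the reliance on preprocessCore's normalize.quantiles.determine.target()/normalize.quantiles.use.target() are my own assumptions). Each batch is background-corrected and quantile-normalized against a fixed target distribution, and fitPLM() is then run with its own background correction and normalization switched off:

library(affyPLM)                                      # also loads affy and preprocessCore

cel.files <- list.celfiles(full.names = TRUE)         # the 216 CEL files
batches   <- split(cel.files, ceiling(seq_along(cel.files) / 54))  # e.g. 4 batches of 54 arrays

# fixed reference distribution, here taken from the first batch only
ref    <- ReadAffy(filenames = batches[[1]], cdfname = "hta20hsentrezg")
ref    <- bg.correct.rma(ref)
target <- normalize.quantiles.determine.target(pm(ref))

fit.batch <- function(files) {
  ab <- ReadAffy(filenames = files, cdfname = "hta20hsentrezg")
  ab <- bg.correct.rma(ab)
  pm(ab) <- normalize.quantiles.use.target(pm(ab), target)
  # background correction and normalization were already applied against the
  # fixed target, so both are disabled inside fitPLM()
  fitPLM(ab, background = FALSE, normalize = FALSE)
}

plm.list <- lapply(batches, fit.batch)

One obvious caveat is that the probe effects would then be estimated per batch rather than in a single joint fit, so I'm not sure the chip-level estimates (and NUSE/RLE plots) remain strictly comparable across batches, which is exactly why I'd appreciate suggestions.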
Thanks,
Guido
> library(affyPLM)
> affy.data <- ReadAffy(cdfname = 'hta20hsentrezg')
> affy.data
AffyBatch object
size of arrays=2572x2680 features (377020 kb)
cdf=hta20hsentrezg (26876 affyids)
number of samples=216
number of genes=26876
annotation=hta20hsentrezg
> x.norm <- fitPLM(affy.data)
Error in fitPLM(affy.data) :
  'Realloc' could not re-allocate memory (18446744061507524608 bytes)
>