Reading 80 CEL files.
Pure Lu ▴ 10
@pure-lu-2488
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071114/9a3b902a/attachment.pl
@sean-davis-490
On Nov 14, 2007 4:27 AM, Pure Lu wrote:
> Hello~~~
>
> I am trying to read 80 HG-U133A arrays, so I did as follows:
>
> > memory.limit(size=4095)
> > options(object.size=10000000000, digits=8, scipen=100, memory=3200483647, contrasts=c("contr.treatment", "contr.poly"))
> > library(affy)
> > cel.file <- ReadAffy(celfile.path = "D://CEL")
>
> However, it showed
>
> Error: cannot allocate vector of size 309.4 Mb
>
> I have tried adding the --max-mem-size=2Gb tag onto my shortcut.
> The machine has 2G RAM and a 3.30GHz processor.
> Is there any way to let R use more memory?
>
> Thank you~
>
> Best Regards,
> Pure

Hi, Pure.

Since the machine you are using has only 2 GB of RAM and is (it appears) a Windows machine, it is unlikely that you will be able to load all 80 of the CEL files at once using ReadAffy. You can load them in a few chunks if you just want to check QC measures, etc. If you simply want to normalize all those arrays, try the justRMA() function, which is much less memory-intensive than ReadAffy.

Sean
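A minimal sketch of both suggestions, assuming the affy package is installed and the CEL files sit in the D:/CEL directory from the question. The helper chunk_files(), the chunk size of 20 and the per-array median used as a QC measure are illustrative choices, not something taken from the thread.

```r
# Split a vector of file names into consecutive chunks of at most
# `chunk_size` files (hypothetical helper, base R only).
chunk_files <- function(files, chunk_size) {
  split(files, ceiling(seq_along(files) / chunk_size))
}

cel_dir <- "D:/CEL"  # directory from the original question

# The affy part only runs if the package and the directory are available.
if (requireNamespace("affy", quietly = TRUE) && dir.exists(cel_dir)) {
  library(affy)

  # Suggestion 1: go straight from the CEL files in cel_dir to RMA
  # expression values, without keeping a full AffyBatch around.
  eset <- justRMA(celfile.path = cel_dir)

  # Suggestion 2: read the arrays a chunk at a time for QC only.
  cel_files <- list.celfiles(cel_dir)  # base names of the CEL files
  qc <- list()
  for (chunk in chunk_files(cel_files, 20)) {
    batch <- ReadAffy(filenames = chunk, celfile.path = cel_dir)
    # Illustrative QC measure: median log2 probe intensity per array.
    qc[[length(qc) + 1]] <- apply(log2(exprs(batch)), 2, median)
    rm(batch)  # drop the probe-level data before reading the next chunk
    gc()
  }
  median_log2_intensity <- unlist(qc)
}
```

Note that arrays read in separate chunks are not normalized together, so the chunked route is only suitable for per-array checks like the one above.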
@michal-blazejczyk-2231
Hi,

Yeah, typical...

Try using the function just.rma instead of calling ReadAffy. just.rma operates directly on the files and requires far less memory.

Or run it on Linux; I think there are fewer memory issues there.

Good luck,

Best,
Michal Blazejczyk
FlexArray Lead Developer
http://genomequebec.mcgill.ca/FlexArray/
You can also use the R package aroma.affymetrix:

http://groups.google.com/group/aroma-affymetrix/

It is cross-platform, and you should be able to process any number of arrays with roughly 1-2 GB of RAM. All you need is the CEL files and a CDF file.

Cheers,
Henrik
olsen ▴ 20
@olsen-2491
Pure,

My advice: run them on Linux. We have run up to 1000 arrays on a Linux cluster.

Oscar Puig
Oscar,

I am curious how long it takes to run, and what the configuration of the Linux cluster is.

Weiwei

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.
Since so many people have replied already, let me mention the package xps, which allows you to run RMA on 80 CEL files on laptops with only 1-2 GB of RAM. You can even process Exon arrays or HuGene arrays. xps is available from the Bioconductor development branch, but works with R-2.5.0 and R-2.6.0, too. However, xps is currently supported only on Linux and Mac OS X.

Best regards,
Christian

Christian Stratowa
Vienna, Austria
e-mail: cstrato at aon.at
> I am curious how long it runs and the configuration of the linux cluster.

I've run approximately 1400 hgu133plus2 CEL files on a machine with 16 GB of memory. IIRC it took a few days (it's been a while since I've done anything that large). The CPUs (2 of them, although this process is clearly not threaded) were 2.4 GHz Opterons.
Weiwei,

1000 hgu133plus2 arrays take 1 hr with RMA and 16 hr with GCRMA. The configuration is:

8 cores, AMD Opteron 885, 64-bit processor, 1.8 GHz
64 GB RAM
SuSE Linux version 9 SP3

Any big institution has this kind of computing power.

Oscar Puig
Pure,

The error message tells you that at *some point in time* during the execution of "ReadAffy", a vector of the given size could not be allocated because there was not enough free memory on your system. This does not mean that your R session cannot handle anything larger than 309.4 Mb.

To check how much memory is currently used by your session, you can call the garbage collector ("gc()"); a side effect is that the memory used will be printed. It is also a good idea to monitor which other processes on your machine are using significant amounts of memory.

So to get R to use more memory, the obvious solution is to get more RAM (nowadays, 2 Gb is quickly becoming limiting), and if you get a lot more than 2 Gb, make sure that your OS and hardware are capable of making use of it.

If you are only planning to process your data with RMA, the function "justRMA" uses tricks to use less memory, and should do the job on a system with 2 Gb. If you really want to look at probe-level data with that much memory, there are other strategies, but they will currently require extra effort (I think).

Hoping this helps,

Laurent
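As a rough worked example of where the 309.4 Mb in the error message could come from: assuming (this is not stated in the thread) that each HG-U133A CEL file holds a 712 x 712 grid of cells and that intensities are stored as 8-byte doubles, a single matrix holding all 80 arrays comes to about that size. The variable names are illustrative only.

```r
# Assumed chip layout and storage type (not taken from the thread).
n_rows           <- 712
n_cols           <- 712
n_arrays         <- 80
bytes_per_double <- 8

# Size of one cells-by-arrays matrix of doubles for all 80 arrays.
matrix_bytes <- n_rows * n_cols * n_arrays * bytes_per_double
matrix_mb    <- matrix_bytes / 1024^2
print(round(matrix_mb, 1))  # about 309.4

# gc() reports the memory currently used by this R session.
print(gc())
```

Under these assumptions the failing allocation is a single copy of the full intensity matrix. Reading and processing the data may need more than one such copy at a time, which would fit the point that the error reflects free memory at that moment rather than a hard cap at 309.4 Mb.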