parallel mods to affy package
@warnes-gregory-r-43
Last seen 10.3 years ago
I'm just starting to look at integrating Luke Tierney's 'snow' package with the 'affy' package in order to parallelize the work.

Initially, I'm planning on modifying 'express' by adding a new parameter "cl" for cluster. Next I'll probably tackle ReadAffy and friends.

When 'cl' is a valid snow cluster, parRapply, parCapply, clusterApply, ... will be called instead of 'apply(..., 1, ...)', 'apply(..., 2, ...)', lapply, etc.

1) Comments on the plan?

2) Can I get CVS access (I'll do my work in a branch until OK'ed by the group)? This will save me the trouble of creating a local CVS archive for this and trying to keep it in sync.

-Greg

LEGAL NOTICE Unless expressly stated otherwise, this message is confidential and may be privileged. It is intended for the addressee(s) only. Access to this E-mail by anyone else is unauthorized. If you are not an addressee, any disclosure or copying of the contents of this E-mail or any action taken (or not taken) in reliance on it is unauthorized and may be unlawful. If you are not an addressee, please inform the sender immediately.
@vincent-j-carey-jr-4
Last seen 3 days ago
United States
> I'm just starting to look at integrating Luke Tierney's 'snow' package with
> the 'affy' package in order to parallelize the work.
>
> Initially, I'm planning on modifying 'express' by adding a new parameter
> "cl" for cluster. Next I'll probably tackle ReadAffy and friends.
>
> When 'cl' is a valid snow cluster, parRapply, parCapply, clusterApply, ...
> will be called instead of 'apply(..., 1, ...)', 'apply(..., 2, ...)',
> lapply, etc.
>
> 1) Comments on the plan?

Seems a worthy endeavor, but does the source of the affy package really need to be modified for this? Can't wrappers be written that break up the problem and reassemble the results? That would keep the package distinct from the various modes of execution.
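A minimal sketch of what such an external wrapper might look like, for a step that treats each chip independently. It assumes the 'parallel' package that ships with R (which provides the snow-derived clusterApply) in place of 'snow' itself. The names scale_chips and par_by_chip are hypothetical and are not part of affy.

```r
library(parallel)  # assumption: used here in place of 'snow'; it ships with R

# Hypothetical per-chip step (not an affy function): rescale every column
# so that its trimmed mean equals `target`.
scale_chips <- function(m, target = 300, trim = 0.02) {
  centers <- apply(m, 2, mean, trim = trim)
  sweep(m, 2, centers, "/") * target
}

# Hypothetical wrapper: split the columns (chips) into one block per worker,
# run `fun` on each block, and bind the results back together in order.
par_by_chip <- function(cl, m, fun, ...) {
  idx <- Filter(length, splitIndices(ncol(m), length(cl)))
  blocks <- lapply(idx, function(i) m[, i, drop = FALSE])
  do.call(cbind, clusterApply(cl, blocks, fun, ...))
}

cl <- makeCluster(2)
set.seed(1)
chips <- matrix(rexp(600, rate = 1 / 200), nrow = 100, ncol = 6)
serial_result <- scale_chips(chips)                     # unmodified function
wrapped_result <- par_by_chip(cl, chips, scale_chips)   # same function, run per block
stopCluster(cl)
```

The wrapped function is left untouched. The wrapper is only valid when the step has no dependencies across chips, and that has to be known in advance for each function it is applied to.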
@warnes-gregory-r-43
Last seen 10.3 years ago
> -----Original Message-----
> From: Vincent Carey 525-2265 [mailto:stvjc@channing.harvard.edu]
[...]
> > I'm just starting to look at integrating Luke Tierney's 'snow' package with
> > the 'affy' package in order to parallelize the work.
> >
> > Initially, I'm planning on modifying 'express' by adding a new parameter
> > "cl" for cluster. Next I'll probably tackle ReadAffy and friends.
> >
[...]
> > 1) Comments on the plan?
>
> seems a worthy endeavor, but
> does the source of the affy package really need to be modified
> for this? can't wrappers be written that break up the
> problem and reassemble the results? keep the package distinct
> from the various modes of execution

Actually, it does look simplest to modify the source of the affy package. 'apply' and friends are already being used in the right places, and the changes are simple substitutions like:

    if( missing(cl) )
        # do the normal apply thing
    else
        # do the parallel apply thing

Splitting up the data beforehand, running the affy functions on the subsets, and then reassembling the data would require quite a bit more thought, and probably synchronization. The basic problem is knowing which functions have data dependencies that prevent parallelization and which don't. It would be painful, from the outside, to do

    split data for fun1
    run fun1 in parallel
    join the data for un-parallelizable fun2
    run fun2 on all the data
    split the data for fun3
    run fun3 in parallel

especially since some of the alternative approaches to a particular step allow easy parallelization, and some don't. So, for instance, quantile normalization isn't trivially parallelizable by splitting along chips, while globally scaling the trimmed mean to, say, 300 is trivially parallelizable.

Knowing when to split and join will be a problem that requires examining the code for each potential function. Once you hit that level, it's easier to just modify the functions themselves.
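As an illustration only, the substitution could look roughly like this for the trimmed-mean scaling case. The sketch assumes the 'parallel' package bundled with R (whose parCapply derives from snow) rather than 'snow' itself. The function scale_to_target is a hypothetical stand-in and is not the actual affy code.

```r
library(parallel)  # assumption: stands in for 'snow'; parCapply exists in both

# Hypothetical stand-in for an affy-style step: scale each chip (column) so
# that its trimmed mean equals `target`. Supplying `cl` switches the
# column-wise apply to its parallel counterpart.
scale_to_target <- function(x, target = 300, trim = 0.02, cl) {
  centers <- if (missing(cl)) {
    apply(x, 2, mean, trim = trim)        # the normal apply thing
  } else {
    parCapply(cl, x, mean, trim = trim)   # the parallel apply thing
  }
  sweep(x, 2, centers, "/") * target
}

set.seed(42)
chips <- matrix(rexp(400, rate = 1 / 150), nrow = 100, ncol = 4)

cl <- makeCluster(2)
scaled_serial <- scale_to_target(chips)
scaled_parallel <- scale_to_target(chips, cl = cl)
stopCluster(cl)
```

Each column's trimmed mean depends only on that column, which is why the swap is safe here. Quantile normalization, by contrast, averages sorted intensities across all chips, so it would need a different decomposition than a per-column apply.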
-Greg