Decomposing and recomposing BioConductor data sets
@jarle-snertingdalen-593
Hi,

We're working on a project that uses R and similar programs/languages for all sorts of computing. The project is written in Python, and we use RPy to communicate with R. However, data sets (like swirl) from BioConductor are not compatible with RPy. Through some research we found that they are made up of empty lists with a range of attributes attached to them.

I have to take apart the data sets, create 'clones' in Python, and then reconstruct them in R. The solution I have come up with so far is to recursively retrieve the attributes using 'attributes(foo)' and the 'foo$bar' syntax, place the retrieved attributes in a nested list, and convert the list to Python with RPy.

My questions are: Why isn't the information put in nested lists in the first place, is my suggested approach valid for all BioConductor data sets, and is it the easiest way to decompose the data sets generically?

regards,

Jarle Snertingdalen
Software Developer
http://www.zherlock.org
NTNU Norway (Norwegian University of Science and Technology)

PS: Some background information about the project: it is called SciCraft (formerly known as Zherlock), and is a graphical data analysis tool that uses third-party software (like R and Octave) for computation. A typical use of the program could be to read data in R format, apply some Octave function to the data, send the data back into R for further work, and finally plot the results with some plotting tool and perhaps export them to some format. Whether a function is run in R or Octave is invisible to the user.
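The recursive approach described above can be sketched in plain Python. This is only an illustration under stated assumptions: `RObject`, `decompose`, and `recompose` are hypothetical names, and `RObject` is an in-memory stand-in for an R object (a payload plus named attributes), not part of RPy. The attribute names and values in the example are placeholders, not the real contents of swirl.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

MARKER = "__robject__"  # hypothetical tag that marks a decomposed object


@dataclass
class RObject:
    """Hypothetical stand-in for an R object: a payload plus named attributes."""
    value: List[Any] = field(default_factory=list)
    attributes: Dict[str, Any] = field(default_factory=dict)


def decompose(obj: Any) -> Any:
    """Recursively turn RObject instances into plain nested dicts and lists."""
    if isinstance(obj, RObject):
        return {
            MARKER: True,
            "value": [decompose(v) for v in obj.value],
            "attributes": {k: decompose(v) for k, v in obj.attributes.items()},
        }
    if isinstance(obj, list):
        return [decompose(v) for v in obj]
    return obj  # scalars and strings pass through unchanged


def recompose(data: Any) -> Any:
    """Inverse of decompose: rebuild RObject instances from nested dicts."""
    if isinstance(data, dict) and data.get(MARKER) is True:
        return RObject(
            value=[recompose(v) for v in data["value"]],
            attributes={k: recompose(v) for k, v in data["attributes"].items()},
        )
    if isinstance(data, list):
        return [recompose(v) for v in data]
    return data


# Placeholder object: an empty payload whose attributes carry all the data,
# with one attribute that is itself an object.
layout = RObject(attributes={"class": "marrayLayout", "maNgr": 4, "maNgc": 4})
swirl_like = RObject(
    attributes={"class": "marrayRaw", "maLayout": layout, "maRf": [1.0, 2.0]}
)

nested = decompose(swirl_like)
clone = recompose(nested)
```

The marker key is one way to tell a decomposed object apart from an ordinary dictionary when rebuilding; without some such tag the round trip would be ambiguous.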
rgentleman ★ 5.5k
@rgentleman-7725
United States
On Fri, Jan 09, 2004 at 05:00:36PM +0100, Jarle Snertingdalen wrote:

> We're working on a project that uses R and similar programs/languages
> for all sorts of computing. The project is written in Python, and we use
> RPy to communicate with R. However, data sets (like swirl) from
> BioConductor are not compatible with RPy. Through some research we found
> that they are made up of empty lists with a range of attributes attached
> to them.
>
> I have to take apart the data sets, create 'clones' in Python, and then
> reconstruct them in R. The solution I have come up with so far is to
> recursively retrieve the attributes using 'attributes(foo)' and the
> 'foo$bar' syntax, place the retrieved attributes in a nested list, and
> convert the list to Python with RPy.

They are S4 objects, and you are not using the class and method definitions. You might want to look into the methods package a bit more; there is also John Chambers' book, Programming with Data, and some notes on the developer page for the Bioconductor project.

Robert
--
Robert Gentleman
Associate Professor
Department of Biostatistics
Harvard School of Public Health
phone: (617) 632-5250
fax: (617) 632-2444
office: M1B20
email: rgentlem@jimmy.harvard.edu
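One way to read the advice above is to let the class definition, rather than the raw attribute list, drive the decomposition. The following is a minimal sketch in plain Python, not R or RPy code: `CLASS_SLOTS`, `S4Like`, `to_plain`, and `from_plain` are hypothetical names, and the class and slot names are invented for illustration. In R itself, the methods package offers `slotNames()`, `slot()`, and `new()` for listing, reading, and rebuilding slots.

```python
from typing import Any, Dict, List

# Hypothetical class definitions: class name -> declared slot names.
CLASS_SLOTS: Dict[str, List[str]] = {
    "Layout": ["rows", "cols"],
    "RawData": ["layout", "red", "green"],
}


class S4Like:
    """Hypothetical stand-in for an S4 object: a class name plus declared slots."""

    def __init__(self, cls: str, **slots: Any) -> None:
        declared = CLASS_SLOTS[cls]
        # Reject objects whose slots do not match the class definition.
        if sorted(slots) != sorted(declared):
            raise ValueError(f"slots for {cls!r} must be exactly {declared}")
        self.cls = cls
        self.slots = dict(slots)


def to_plain(obj: Any) -> Any:
    """Walk the declared slots and emit nested dicts tagged with the class name."""
    if isinstance(obj, S4Like):
        return {
            "class": obj.cls,
            "slots": {name: to_plain(obj.slots[name]) for name in CLASS_SLOTS[obj.cls]},
        }
    if isinstance(obj, list):
        return [to_plain(v) for v in obj]
    return obj


def from_plain(data: Any) -> Any:
    """Rebuild objects through the constructor so the definition is re-checked."""
    if isinstance(data, dict) and set(data) == {"class", "slots"}:
        rebuilt = {k: from_plain(v) for k, v in data["slots"].items()}
        return S4Like(data["class"], **rebuilt)
    if isinstance(data, list):
        return [from_plain(v) for v in data]
    return data


raw = S4Like(
    "RawData",
    layout=S4Like("Layout", rows=4, cols=4),
    red=[1.0, 2.0],
    green=[3.0, 4.0],
)
plain = to_plain(raw)
clone = from_plain(plain)
```

Keeping the class name alongside the slots means the receiving side can rebuild the object through its constructor instead of reattaching attributes one by one.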