Experiment export in Gene Expression Omnibus (GEO) SOFT format
@henrik-hornshj-jensen-1553
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060524/42d715fe/attachment.pl
@sean-davis-490
On 5/24/06 5:23 AM, "Henrik Hornshøj Jensen" <henrikh.jensen at agrsci.dk> wrote:

> Hi,
>
> Anyone know if there is an R script or package for exporting microarray
> experiments in SOFT file format for submission to GEO?
> Could be from expression data set objects, MA objects or other.
> I know there is GEOquery package, but I believe this is only for retrieving
> data from GEO.

Henrik,

You are correct in assuming that GEOquery only retrieves data from GEO. I have thought about trying to make some tools for submission, but I don't see an easy way to make these general. In addition, much of the data that we store in Bioc data structures is already processed; GEO benefits from including as much raw data as possible and these data are not available in an expression data set.

In practice, we use a set of scripts (perl, in this case, but R would work just fine) to produce the SOFT format files from a set of "spreadsheets" that describe the files, their subsets, etc. The GEO website describes the formats necessary to produce--they are not that complicated. For each project and array format, we modify things slightly, but the gist remains the same. However, there are enough variations in file formats and experimental designs that producing a "fully automated" set of scripts for doing GEO submissions is quite challenging.

Sean
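As a rough illustration of the "spreadsheet" step described above, here is a minimal Python sketch (perl or R would do the same job). It assumes the spreadsheet has been saved as tab-delimited text with one row per sample. The column names `name`, `title`, and `raw_file` and the function `read_sample_sheet` are hypothetical, chosen for this example only; they are not taken from the scripts mentioned in the post.

```python
import csv
import io

# Hypothetical column names; a real sheet would carry whatever
# fields the project and array format need.
REQUIRED = ("name", "title", "raw_file")


def read_sample_sheet(handle):
    """Parse a tab-delimited sheet into one dict per sample.

    Raises ValueError if a row lacks any of the REQUIRED columns.
    """
    rows = list(csv.DictReader(handle, delimiter="\t"))
    for line_no, row in enumerate(rows, start=2):  # line 1 is the column header
        missing = [c for c in REQUIRED if not (row.get(c) or "").strip()]
        if missing:
            raise ValueError(f"line {line_no}: missing {', '.join(missing)}")
    return rows


# Made-up example data standing in for an exported spreadsheet.
sheet = io.StringIO(
    "name\ttitle\traw_file\n"
    "s1\tcontrol rep 1\tslide01.txt\n"
    "s2\ttreated rep 1\tslide02.txt\n"
)
samples = read_sample_sheet(sheet)
```

Checking the sheet up front keeps per-project tweaks confined to the spreadsheet and the `REQUIRED` tuple rather than the script logic.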
@henrik-hornshj-jensen-1553
Thank you for clearing this up. To me it seems obvious to do the SOFT export in R as well. Perhaps you could send the perl/R scripts you have been using.

Henrik

-----Original message-----
From: Sean Davis [mailto:sdavis2 at mail.nih.gov]
Sent: Wednesday, May 24, 2006 12:55 PM
To: Henrik Hornshøj Jensen; Bioconductor
Subject: Re: [BioC] Experiment export in Gene Expression Omnibus (GEO) SOFT format
On 5/26/06 3:17 AM, "Henrik Hornshøj Jensen" <henrikh.jensen at agrsci.dk> wrote:

> Thank you for clearing this up.
> To me it seems obvious to do the SOFT export in R as well.

The main problem with doing so is that the raw data will typically not be included if done from R. The raw data is, in my mind, much more important than any normalized or processed data, as re-normalization of raw data is easy, while the usefulness of the normalized data is very limited (likely limited to only the project at hand).

> Perhaps you could send the perl/R scripts you have been using.

I could, but they are not in a "distributable form". We have plans to make them slightly more useful and general, but we don't really have a goal of releasing them. Again, generality is a difficult-to-attain goal.

Essentially, what we do is to construct the SOFT format header based on a template and fill the template from an Excel spreadsheet--R or perl could be used for this. After the header, we concatenate the raw tab-delimited text file, then do the same for all the datafiles associated with an experiment. SOFT is nice in that all of this text is simply concatenated. There are examples of the types of headers that one needs to fill located on the batch deposit guide on the GEO website.

Sean
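The template-and-concatenate approach described above could look roughly like the following Python sketch (R or perl would work equally well). It is not the scripts discussed in this thread. The header template lists only a few sample attributes for illustration, the metadata keys and the `GPL0000` platform accession are placeholders, and the function names are hypothetical. The full set of required header fields should be taken from the GEO batch deposit guide.

```python
from string import Template

# Illustrative SOFT sample header; deliberately incomplete.
HEADER_TEMPLATE = Template(
    "^SAMPLE = $name\n"
    "!Sample_title = $title\n"
    "!Sample_source_name_ch1 = $source\n"
    "!Sample_organism_ch1 = $organism\n"
    "!Sample_platform_id = $platform\n"
)


def sample_record(meta, raw_table_text):
    """Return one sample record: the filled header followed by the raw table."""
    header = HEADER_TEMPLATE.substitute(meta)
    table = raw_table_text.strip("\n")
    return f"{header}!sample_table_begin\n{table}\n!sample_table_end\n"


def build_soft(samples):
    """Concatenate the records for all (meta, raw_table_text) pairs."""
    return "".join(sample_record(meta, raw) for meta, raw in samples)


# Made-up metadata and data; in practice raw_table_text would be the
# contents of the raw tab-delimited file for that sample.
example = [
    (
        {"name": "s1", "title": "control rep 1", "source": "liver",
         "organism": "Sus scrofa", "platform": "GPL0000"},
        "ID_REF\tVALUE\nprobe1\t1.5\nprobe2\t0.7\n",
    ),
]
soft_text = build_soft(example)
```

Because SOFT records are plain text that is simply concatenated, adding another data file only means appending one more `(meta, raw_table_text)` pair to the list.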