getGEO vs GEOmetadb
Jack Zhu ▴ 170
@jack-zhu-3338
Hi Thomas,

Sorry that I missed your posts on the Bioconductor mailing list. We did have issues with updating recent GEO data, and that seems to have been fixed:

-----------------------
> con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
> dat <- dbGetQuery(con, "select * from gds where gds = 'GDS4252'")
> dat
    ID     gds title
1 3354 GDS4252 Cystic fibrosis bronchial epithelial cells exposure to Pseudomonas aeruginosa PA01 biofilms
  description
1 Analysis of cystic fibrosis (CF) bronchial epithelial CFBE41o- cells exposed to Pseudomonas aeruginosa PA01 biofilms. Cells overexpressing 508del-CFTR and cells rescued with wild type CFTR were examined. CFTR mutations enhance the inflammatory response in the lung to PA01 infection.
  type pubmed_id gpl platform_organism platform_technology_type feature_count sample_organism sample_type channel_count
1 Expression profiling by array 22821996 GPL570 Homo sapiens in situ oligonucleotide 54675 Homo sapiens RNA 1
  sample_count value_type gse order update_date
1 16 transformed count GSE30439 none 2013-04-23
----------------------------

Could you re-download GEOmetadb.sqlite.gz and try again? Please don't hesitate to contact me directly if you still see any problems.

Thanks.

Jack

On Thu, Jun 13, 2013 at 9:05 AM, Thomas H. Hampton <thomas.h.hampton at dartmouth.edu> wrote:
> Hi Sean and Jack,
>
> Sorry to pester you with this. I posted it to BioC twice and got no response so I thought I should try contacting you more directly.
>
> The following getGEO query retrieves data files and meta data for a recent GEO submission of mine,
> one that has been curated:
>
> GDS4252 <- getGEO("GDS4252")
> Columns(GDS4252)
>> str(Columns(GDS4252))
> 'data.frame': 16 obs. of 4 variables:
> $ sample : Factor w/ 16 levels "GSM754979","GSM754980",..: 5 6 7 8 1 2 3 4 13 14 ...
> $ genotype/variation: Factor w/ 2 levels "CFTR mutant",..: 1 1 1 1 1 1 1 1 2 2 ...
> $ agent : Factor w/ 2 levels "PA01","unexposed": 1 1 1 1 2 2 2 2 1 1 ...
>
> The folks at NCBI have correctly created two factors with two levels to describe the 16 samples in my experiment.
>
> I am interested in retrieving similar information using GEOmetadb, but this has proved problematic.
>
> getSQLiteFile(destdir = getwd(), destfile = "GEOmetadb.sqlite.gz")
>
> con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
> dat <- dbGetQuery(con, "select * from gds where gds = 'GDS4252'")
>
>> dat
> [1] ID gds title
> [4] description type pubmed_id
> [7] gpl platform_organism platform_technology_type
> [10] feature_count sample_organism sample_type
> [13] channel_count sample_count value_type
> [16] gse order update_date
> <0 rows> (or 0-length row.names)
>
> It seems, for starters, that this GDS identifier for my particular submission isn't accounted for in the current
> database.
>
> Others are, so it looks like my syntax and so forth is ok:
>
>> dat <- dbGetQuery(con, "select gds from gds limit 10")
>> dat
> gds
> 1 GDS5
> 2 GDS6
> 3 GDS10
> 4 GDS12
> 5 GDS15
> 6 GDS16
> 7 GDS17
> 8 GDS18
> 9 GDS19
> 10 GDS20
>
> There is also the question of where a set of fields (variable in number) describing sample factors and their levels would actually "live"
> in the SQLite database.
>
> This information does not seem to be an attribute of the GDS in any case:
>
>> dat <- dbGetQuery(con, "select fieldname from geodb_column_desc where TableName = 'gds'")
>> dat
> FieldName
> 1 ID
> 2 channel_count
> 3 description
> 4 feature_count
> 5 gds
> 6 order
> 7 platform
> 8 platform_organism
> 9 platform_technology_type
> 10 pubmed_id
> 11 reference_series
> 12 sample_count
> 13 sample_organism
> 14 sample_type
> 15 title
> 16 type
> 17 update_date
> 18 value_type
>
> Nor does it seem to be a feature stored in the samples:
>
>> dat <- dbGetQuery(con, "select fieldname from geodb_column_desc where TableName = 'gsm'")
>> dat
> FieldName
> 1 ID
> 2 channel_count
> 3 characteristics_ch1
> 4 characteristics_ch2
> 5 contact
> 6 data_processing
> 7 data_row_count
> 8 description
> 9 extract_protocol_ch1
> 10 extract_protocol_ch2
> 11 gpl
> 12 gse
> 13 gsm
> 14 hyb_protocol
> 15 label_ch1
> 16 label_ch2
> 17 label_protocol_ch1
> 18 label_protocol_ch2
> 19 last_update_date
> 20 molecule_ch1
> 21 molecule_ch2
> 22 organism_ch1
> 23 organism_ch2
> 24 source_name_ch1
> 25 source_name_ch2
> 26 status
> 27 submission_date
> 28 supplementary_file
> 29 title
> 30 treatment_protocol_ch1
> 31 treatment_protocol_ch2
> 32 type
>
> Any advice greatly appreciated.
>
> Tom
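
The two checks in this thread (is a given GDS accession present, and which table holds a given field) can be sketched in a few lines of R. This is only an illustration under stated assumptions: it runs against an in-memory SQLite database filled with toy rows rather than the real GEOmetadb.sqlite, and the helper names has_gds and list_columns are hypothetical, not part of GEOmetadb. Only standard DBI calls are used (dbConnect, dbWriteTable, dbGetQuery, dbListTables, dbListFields).

```r
# Sketch only: toy in-memory database, not the real GEOmetadb.sqlite file.
library(DBI)
library(RSQLite)

con <- dbConnect(SQLite(), ":memory:")

# Toy stand-ins for two GEOmetadb tables (made-up rows, reduced columns).
dbWriteTable(con, "gds", data.frame(
  gds = c("GDS5", "GDS6"),
  title = c("toy title A", "toy title B"),
  stringsAsFactors = FALSE
))
dbWriteTable(con, "gsm", data.frame(
  gsm = c("GSM1", "GSM2"),
  characteristics_ch1 = c("agent: PA01", "agent: unexposed"),
  stringsAsFactors = FALSE
))

# Hypothetical helper: TRUE if the accession has at least one row in gds.
# A bound parameter avoids pasting the accession into the SQL string.
has_gds <- function(con, accession) {
  res <- dbGetQuery(con, "select gds from gds where gds = ?",
                    params = list(accession))
  nrow(res) > 0
}

# Hypothetical helper: one row per (table, column) pair, so a field can be
# searched for by name across the whole database.
list_columns <- function(con) {
  tabs <- dbListTables(con)
  do.call(rbind, lapply(tabs, function(tb) {
    data.frame(table = tb, column = dbListFields(con, tb),
               stringsAsFactors = FALSE)
  }))
}

present <- has_gds(con, "GDS5")      # accession that is in the toy table
absent  <- has_gds(con, "GDS4252")   # accession that is not
cols    <- list_columns(con)
hits    <- cols[grepl("characteristics", cols$column), ]

dbDisconnect(con)
```

Against a real download, the same helpers would take the connection opened with dbConnect(SQLite(), "GEOmetadb.sqlite"). Listing tables first is also how to see what exists beyond gds and gsm: the GEOmetadb vignette lists a gds_subset table alongside gds, but whether it holds the factor levels for a particular GDS should be confirmed against your own copy of the database.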
Tags: Lung, Homo sapiens, Pseudomonas aeruginosa, GEOmetadb