Hi all,
I am having some trouble with expression sets created by the GEOquery package from GEO series matrix files. There seems to be a problem with parsing featureNames.
I begin by downloading a series matrix file:
> data <- getGEO(GEO="GSE63252",destdir=getwd())
> eset <- data[[1]]
> eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 54675 features, 27 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: GSM1544474 GSM1544475 ... GSM1544500 (27 total)
  varLabels: title geo_accession ... data_row_count (33 total)
  varMetadata: labelDescription
featureData
  featureNames: 1007_s_at 1053_at ... NA.17590 (54675 total)
  fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL570
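In the expression sets created this way, the feature names are missing from row 37085 onward: row 37084 still has a valid probe ID, and every later row is named "NA", "NA.1", and so on.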
> featureNames(eset)[37084]
[1] "227829_at"
> featureNames(eset)[37085]
[1] "NA"
> featureNames(eset)[37086]
[1] "NA.1"
> featureNames(eset)[50000]
[1] "NA.12915"
This happens with all series matrix files, not just with this one, BUT everything is fine when creating expression sets from GDS records. Thanks so much in advance for any help.
Kamila
> sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.3 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] affy_1.44.0 siggenes_1.40.0 multtest_2.22.0 GEOquery_2.32.0 Biobase_2.26.0 BiocGenerics_0.12.1

loaded via a namespace (and not attached):
[1] affyio_1.34.0 BiocInstaller_1.16.4 bitops_1.0-6 MASS_7.3-40 preprocessCore_1.28.0 RCurl_1.95-4.5 stats4_3.1.3 survival_2.38-1
[9] tools_3.1.3 XML_3.98-1.1 zlibbioc_1.12.0
Can you do me a favor and let me know the output of:
Hi Sean - so sorry for the late reply! I did not realize my email notifications were turned off, and I did not see your post until today. Thanks so much for your help.
> file.info('GPL570.soft')$size
[1] 51941825
Looks like the GPL570.soft file is probably truncated. I would suggest removing it and refetching a new copy. I get a file size of 65028051.
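Something along these lines should do the removal. It is only a rough sketch: `remove_if_truncated` is a made-up helper name, not part of GEOquery, and it assumes the cached platform file sits in the working directory (the `destdir` used above). The expected size is the 65028051 bytes I see today, and the file on GEO may change over time.

```r
# Hypothetical helper (not a GEOquery function): delete a cached file
# when it is smaller than the size we expect for a complete download.
remove_if_truncated <- function(path, expected_size) {
  actual <- file.info(path)$size
  if (is.na(actual)) return(FALSE)   # no such file, nothing to remove
  if (actual < expected_size) {
    file.remove(path)
    return(TRUE)
  }
  FALSE
}

# Assumption: 65028051 bytes is the size of a complete GPL570.soft,
# as reported in this thread; adjust if GEO has updated the file.
removed <- remove_if_truncated(file.path(getwd(), "GPL570.soft"), 65028051)

# After the file is gone, rerunning getGEO() fetches a fresh copy:
# library(GEOquery)
# eset <- getGEO(GEO = "GSE63252", destdir = getwd())[[1]]
# tail(featureNames(eset))
```

If `removed` is TRUE, the cached copy was short and has been deleted; the commented lines then rebuild the expression set, and the last feature names should be real probe IDs instead of "NA.x".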
Wonderful, problem solved! Thank you so much!