GEOquery: Missing featureNames after creating expression set from series matrix file
knaxerova ▴ 10
@knaxerova-7541
Last seen 3.6 years ago
United States

Hi all, 

I am having some trouble with expression sets created by the GEOquery package from GEO series matrix files. There seems to be a problem with parsing featureNames.

I begin by downloading a series matrix file:

> library(GEOquery)
> data <- getGEO(GEO="GSE63252",destdir=getwd())
> eset <- data[[1]]
> eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 54675 features, 27 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: GSM1544474 GSM1544475 ... GSM1544500 (27 total)
  varLabels: title geo_accession ... data_row_count (33 total)
  varMetadata: labelDescription
featureData
  featureNames: 1007_s_at 1053_at ... NA.17590 (54675 total)
  fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL570 

The expression sets created in this way have missing feature names from row 37085 onward. Row 37084 is the last one with a real probe ID:

> featureNames(eset)[37084]
[1] "227829_at"
> featureNames(eset)[37085]
[1] "NA"
> featureNames(eset)[37086]
[1] "NA.1"
> featureNames(eset)[50000]
[1] "NA.12915"
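
For anyone wanting to locate where the names go missing without probing indices by hand, here is a minimal sketch. The helper name first_missing_name is hypothetical (it is not part of GEOquery or Biobase), and it assumes the placeholder names follow the "NA", "NA.1", "NA.2" pattern shown above.

```r
# Hypothetical helper: return the index of the first feature name that is
# either a real NA or a placeholder such as "NA", "NA.1", "NA.17590".
first_missing_name <- function(feature_names) {
  bad <- is.na(feature_names) | grepl("^NA(\\.[0-9]+)?$", feature_names)
  if (any(bad)) which(bad)[1] else NA_integer_
}

# Example on a small character vector
first_missing_name(c("1007_s_at", "227829_at", "NA", "NA.1"))
```

On the expression set above this would be called as first_missing_name(featureNames(eset)).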

This happens with every series matrix file I have tried, not just this one, but everything is fine when creating expression sets from GDS records. Thanks so much in advance for any help.

Kamila

> sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.3 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] affy_1.44.0         siggenes_1.40.0     multtest_2.22.0     GEOquery_2.32.0     Biobase_2.26.0      BiocGenerics_0.12.1

loaded via a namespace (and not attached):
 [1] affyio_1.34.0         BiocInstaller_1.16.4  bitops_1.0-6          MASS_7.3-40           preprocessCore_1.28.0 RCurl_1.95-4.5        stats4_3.1.3          survival_2.38-1      
 [9] tools_3.1.3           XML_3.98-1.1          zlibbioc_1.12.0  

Can you do me a favor and let me know the output of:

file.info('GPL570.soft')$size

Hi Sean - so sorry for the late reply! I did not realize my email notifications were turned off, and I did not see your post until today. Thanks so much for your help.

> file.info('GPL570.soft')$size
[1] 51941825

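It looks like your local GPL570.soft file is truncated: yours is 51941825 bytes, whereas I get a file size of 65028051 bytes. A truncated platform file would explain why the feature annotation runs out partway through. I would suggest removing it and fetching a fresh copy.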
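
The check and cleanup could be sketched roughly as follows. The helper remove_if_truncated is hypothetical (not a GEOquery function), and the reference size is simply the one quoted in this thread, which may no longer match the file GEO currently serves.

```r
# Hypothetical helper (not part of GEOquery): delete a local file whose size
# is below a reference size, so that a fresh copy can be downloaded.
# Returns TRUE if the file was removed, FALSE otherwise.
remove_if_truncated <- function(path, expected_size) {
  if (!file.exists(path)) {
    return(FALSE)
  }
  actual <- file.info(path)$size
  if (actual < expected_size) {
    message(sprintf("%s is %.0f bytes, expected at least %.0f; removing it",
                    path, actual, expected_size))
    return(file.remove(path))
  }
  FALSE
}

# Usage (65028051 is the size reported in this thread, an assumption here):
# remove_if_truncated(file.path(getwd(), "GPL570.soft"), 65028051)
# data <- getGEO(GEO = "GSE63252", destdir = getwd())
```

After the truncated file is gone, rerunning the original getGEO call with the same destdir should download the platform annotation again.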

Wonderful, problem solved! Thank you so much!
