error reading GSE file
Reema Singh ▴ 570
@reema-singh-4373
Last seen 10.3 years ago
Dear all,

I am trying to read a GSE file in R using the GEOquery package, but I am getting the following error. Could you tell me why? I have searched on Google, but with no luck.

u <- getGEO(filename="GSE1106_family.soft",GSEMatrix=TRUE)
Parsing....
Found 22 entities...
GPL199 (1 of 22 entities)
GSM18235 (2 of 22 entities)
GSM18236 (3 of 22 entities)
Error in substr(x, start = matches + patlen, stop = 1e+07) :
  invalid multibyte string at '<92>s pre'
In addition: There were 20 warnings (use warnings() to see them)

Regards,
Reema Singh
@sean-davis-490
Last seen 4 months ago
United States
On Tue, Jul 26, 2011 at 6:38 AM, Reema Singh <reema28sep at gmail.com> wrote:
> Dear all
>
> I am trying to read a GSE file in R using GEOquery package but i am getting
> following error.Kindly tell me why i am getting this error. I have tried to
> find out on google. But no luck...
>
> u <- getGEO(filename="GSE1106_family.soft",GSEMatrix=TRUE)
> Parsing....
> Found 22 entities...
> GPL199 (1 of 22 entities)
> GSM18235 (2 of 22 entities)
> GSM18236 (3 of 22 entities)
> Error in substr(x, start = matches + patlen, stop = 1e+07) :
>   invalid multibyte string at '<92>s pre'

Hi, Reema.

This is caused by an invalid character in the data from NCBI. I have contacted them to fix the problem. In the meantime, you can try:

u = getGEO('GSE1106')

This will grab the GSEMatrix file, which is apparently unaffected.

Sean
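For readers who hit the same message with a local SOFT file, here is a rough sketch of how one might locate and re-encode the offending bytes before parsing. It rests on assumptions that are not confirmed in this thread: that the stray byte 0x92 is a Windows-1252 right single quote, that the platform's iconv knows the name "CP1252", and that R is recent enough to provide validUTF8() (R 3.3.0 or later). The file contents below are a made-up stand-in, not the real GSE1106 data.

```r
# Build a tiny stand-in for a SOFT file that contains the stray byte 0x92.
# The sample line is hypothetical; only the byte 0x92 followed by "s pre"
# mirrors the error message reported above.
soft_path <- tempfile(fileext = ".soft")
bad_line <- c(charToRaw("!Sample_description = patient"), as.raw(0x92),
              charToRaw("s pre-treatment sample"))
writeBin(c(charToRaw("^SAMPLE = GSM0000\n"), bad_line, charToRaw("\n")),
         soft_path)

# Read the lines as-is, then flag those that are not valid UTF-8.
lines <- readLines(soft_path, warn = FALSE)
bad <- which(!validUTF8(lines))

# Re-encode only the offending lines, ASSUMING they are Windows-1252.
fixed <- lines
fixed[bad] <- iconv(lines[bad], from = "CP1252", to = "UTF-8")

# Write the cleaned copy byte-for-byte so no further conversion happens.
clean_path <- tempfile(fileext = ".soft")
writeLines(fixed, clean_path, useBytes = TRUE)

# The cleaned file could then be handed to GEOquery, for example:
# u <- getGEO(filename = clean_path)
```

Inspecting `lines[bad]` before converting is worthwhile, since the source encoding is a guess and a different one would call for a different `from` value.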
On Tue, Jul 26, 2011 at 8:39 AM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> On Tue, Jul 26, 2011 at 6:38 AM, Reema Singh <reema28sep at gmail.com> wrote:
>> Dear all
>>
>> I am trying to read a GSE file in R using GEOquery package but i am getting
>> following error.Kindly tell me why i am getting this error. I have tried to
>> find out on google. But no luck...
>>
>> u <- getGEO(filename="GSE1106_family.soft",GSEMatrix=TRUE)
>> Parsing....
>> Found 22 entities...
>> GPL199 (1 of 22 entities)
>> GSM18235 (2 of 22 entities)
>> GSM18236 (3 of 22 entities)
>> Error in substr(x, start = matches + patlen, stop = 1e+07) :
>>   invalid multibyte string at '<92>s pre'
>
> Hi, Reema.
>
> This is caused by an invalid character in the data from NCBI. I have
> contacted them to fix the problem.

One has to love the GEO staff. They've already fixed the problem. Thanks for the report, Reema.

Sean

> In the meantime, you can try:
>
> u = getGEO('GSE1106')
>
> This will grab the GSEMatrix file which is apparently unaffected.
>
> Sean
acd13 • 0
@acd13-7209
Last seen 10.0 years ago
United States

Sorry to pick up an old thread here, but I'm having the same issue, and also unable to find much of an answer after some searching.  

Thanks for any and all help!

 

I'm running the command below:

getGEO('GSE3118')

in the following session:

> sessionInfo()

R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] affy_1.44.0         ArrayExpress_1.26.0 GEOquery_2.32.0
[4] Biobase_2.26.0      BiocGenerics_0.12.0 RMySQL_0.9-3
[7] DBI_0.3.1

loaded via a namespace (and not attached):
[1] affyio_1.34.0         BiocInstaller_1.16.1  limma_3.22.1
[4] preprocessCore_1.28.0 RCurl_1.95-4.3        XML_3.98-1.1
[7] zlibbioc_1.12.0


You will need to give us more than that to go on. This works for me:

> geo <- getGEO('GSE3118')
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE3nnn/GSE3118/matrix/
Found 1 file(s)
GSE3118_series_matrix.txt.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  150k  100  150k    0     0  98938      0  0:00:01  0:00:01 --:--:-- 98890
File stored at:
/data3/tmp/RtmpC8WWiS/GPL2750.soft
> geo[[1]]
ExpressionSet (storageMode: lockedEnvironment)
assayData: 36541 features, 3 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: GSM69721 GSM69722 GSM69723
  varLabels: title geo_accession ... data_row_count (32 total)
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation: GPL2750
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] GEOquery_2.32.0     Biobase_2.26.0      BiocGenerics_0.12.1

loaded via a namespace (and not attached):
[1] bitops_1.0-6   RCurl_1.95-4.5 XML_3.98-1.1  

 
