Can't read in CEL files
OE • 0
@1b6d9c0f
Last seen 2.4 years ago
Morocco

Dear Bioinformaticists,

I'm new to Bioconductor and am having trouble with some raw data that I downloaded with getGEOSuppFiles(). I'm stuck at the step of reading in the CEL files with read.celfiles().

This is my code:

getGEOSuppFiles( "GSE68849")

untar("GSE68849/GSE68849_RAW.tar", exdir = "GSE68849/CEL")
list.files("GSE68849/CEL")

celf <- list.files("GSE68849/CEL", full.names = TRUE)

read.celfiles(celf)

It outputs the following error:

Error in read.celfile.header(x) :
  Is GSE68849/CEL/GPL10558_HumanHT-12_V4_0_R1_15002873_B.txt.gz really a CEL file? tried reading as text, gzipped text, binary, gzipped binary, command console and gzipped command console formats

Any help would be much appreciated!

sessionInfo()

R version 4.1.3 (2022-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252
[3] LC_MONETARY=French_France.1252 LC_NUMERIC=C
[5] LC_TIME=French_France.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

other attached packages:
[1] GEOquery_2.62.2 Biobase_2.54.0 BiocGenerics_0.40.0

loaded via a namespace (and not attached):
[1] compiler_4.1.3 pillar_1.7.0 BiocManager_1.30.18
[4] prettyunits_1.1.1 remotes_2.4.2 tools_4.1.3
[7] testthat_3.1.4 pkgbuild_1.3.1 pkgload_1.2.4
[10] memoise_2.0.1 lifecycle_1.0.1 tibble_3.1.7
[13] pkgconfig_2.0.3 rlang_1.0.2 DBI_1.1.2
[16] cli_3.3.0 curl_4.3.2 fastmap_1.1.0
[19] xml2_1.3.3 withr_2.5.0 dplyr_1.0.9
[22] hms_1.1.1 desc_1.4.1 generics_0.1.2
[25] fs_1.5.2 vctrs_0.4.1 devtools_2.4.3
[28] tidyselect_1.1.2 rprojroot_2.0.3 glue_1.6.2
[31] data.table_1.14.2 R6_2.5.1 processx_3.5.3
[34] fansi_1.0.3 sessioninfo_1.2.2 limma_3.50.3
[37] tidyr_1.2.0 tzdb_0.3.0 readr_2.1.2
[40] callr_3.7.0 purrr_0.3.4 magrittr_2.0.3
[43] ps_1.7.0 ellipsis_0.3.2 usethis_2.1.6
[46] assertthat_0.2.1 utf8_1.2.2 cachem_1.0.6
[49] crayon_1.5.1 brio_1.1.3

@james-w-macdonald-5106
Last seen 10 hours ago
United States

That's an Illumina array study, so there won't be any CEL files (those come from Affymetrix arrays).
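A quick way to confirm this before calling read.celfiles() is to look at what the tarball actually contains. The following is a minimal sketch, not part of the original answer; `split_cel_files` is a hypothetical helper name, and the directory is the one used in the question.

```r
# Hypothetical helper: split a directory listing into CEL and non-CEL files.
split_cel_files <- function(dir) {
  files <- list.files(dir, full.names = TRUE)
  # Match .CEL or .CEL.gz at the end of the name, ignoring case.
  is_cel <- grepl("\\.cel(\\.gz)?$", files, ignore.case = TRUE)
  list(cel = files[is_cel], other = files[!is_cel])
}

supp <- split_cel_files("GSE68849/CEL")
if (length(supp$cel) == 0) {
  message("No CEL files found; this is probably not an Affymetrix series.")
}
basename(supp$other)
```

If `supp$cel` is empty, there is nothing for read.celfiles() to read, whatever else is in the directory.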


If you are new to Bioconductor, you should probably just use getGEO to read in the data, and then use the limma package to process and compare the samples. See the limma User's Guide for more information.
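As a rough illustration of that route, here is a minimal sketch of a two-group comparison with limma. It assumes a log2-scale expression matrix with probes in rows and samples in columns; `compare_two_groups` is a hypothetical helper name, and the grouping variable has to come from the sample annotation of the series, which is not shown in this thread.

```r
library(limma)

# Hypothetical helper: two-group comparison on a log2 expression matrix.
# `expr` has probes in rows and samples in columns; `group` has two levels.
compare_two_groups <- function(expr, group) {
  group <- factor(group)
  stopifnot(nlevels(group) == 2, length(group) == ncol(expr))
  design <- model.matrix(~ group)        # intercept + effect of the second level
  fit <- eBayes(lmFit(expr, design))     # linear model with moderated statistics
  topTable(fit, coef = 2, number = Inf)  # all probes, ranked by topTable's default
}

# Usage with GEOquery (needs network access; the column name is hypothetical):
# library(GEOquery)
# eset <- getGEO("GSE68849")[[1]]
# tt <- compare_two_groups(exprs(eset), pData(eset)$my_group_column)
```

Check that the values returned by getGEO are already on the log2 scale before fitting; the limma User's Guide covers more realistic designs.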


Thank you for replying, sir!

I've already used getGEO() and compared the samples with the limma package.

Now, if possible, I want to download the raw data with getGEOSuppFiles(), process it, and compare the samples again, so that I can finally compare the results from the data downloaded by getGEO() with those from the data downloaded by getGEOSuppFiles().

That is my idea. So I'm now wondering: is there any way to get the raw data from an Illumina array study?

> z <- read.ilmn("GSE68849_non-normalized.txt.gz")
Reading file GSE68849_non-normalized.txt.gz ... ...
> z
An object of class "EListRaw"
$source
[1] "illumina"

$E
     5522887032_E 5522887032_F 5522887032_I 5522887032_J 5455178010_A
[1,]     89.86488    163.44520     82.05317    138.51570    124.28760
[2,]    115.68020     97.23217    106.04080    132.30830    139.93170
[3,]     96.33978     90.08559     87.47095     88.99722     99.52484
[4,]     95.74127     88.69660     91.60001     94.42528    122.61510
[5,]     84.73225     93.76852     82.92064     96.32566    112.70710
     5455178010_B 5455178010_E 5455178010_F 5455178010_I 5455178010_J
[1,]     159.7212    107.31450     190.2620     115.3796     200.1683
[2,]     160.8593    114.49950     119.8016     142.8275     148.5468
[3,]     114.2166    105.07950     109.3282     124.8171     116.1298
[4,]     101.6195     95.41341     100.5267     117.9327     106.8634
[5,]     106.2227    108.77680     111.7508     108.9232     124.2282
47318 more rows ...

$other
$Detection
     5522887032_E 5522887032_F 5522887032_I 5522887032_J 5455178010_A
[1,]  0.268831200    0.0000000   0.72207790  0.005194805   0.20649350
[2,]  0.009090909    0.2519481   0.03636364  0.007792208   0.03896104
[3,]  0.107792200    0.5103896   0.51298700  0.770129900   0.78181820
[4,]  0.119480500    0.5454546   0.33766230  0.558441600   0.23896100
[5,]  0.483116900    0.3610390   0.69480520  0.479220800   0.47272730
     5455178010_B 5455178010_E 5455178010_F 5455178010_I 5455178010_J
[1,]  0.005194805    0.2857143   0.00000000   0.37402600   0.00000000
[2,]  0.005194805    0.1389610   0.05194805   0.01948052   0.02597403
[3,]  0.505194800    0.3415585   0.19090910   0.16623380   0.45714290
[4,]  0.787013000    0.6415585   0.38961040   0.30259740   0.71428570
[5,]  0.687013000    0.2571429   0.15454550   0.55454550   0.25584410
47318 more rows ...
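For context, read.ilmn() is part of limma, and the non-normalized file is one of the supplementary files of the series. The sketch below shows one way to fetch just that file and normalize it; it is an assumption-laden outline rather than the exact steps used above. `read_ilmn_series` is a hypothetical helper name, the `filter_regex` argument is assumed to be available in your GEOquery version, and the default column names expected by read.ilmn() are assumed to match the file.

```r
library(GEOquery)
library(limma)

# Hypothetical helper: fetch and normalize the non-normalized Illumina table.
read_ilmn_series <- function(gse, file_regex = "non-normalized") {
  # Download only the supplementary files whose names match file_regex.
  # The returned data frame has the local file paths as row names.
  info <- getGEOSuppFiles(gse, filter_regex = file_regex)
  if (is.null(info) || nrow(info) == 0) {
    stop("No supplementary file matched '", file_regex, "' for ", gse)
  }
  # Read intensities and detection p-values into an EListRaw object.
  raw <- read.ilmn(rownames(info)[1])
  # Background-correct, quantile-normalize, and log2-transform.
  # Without control-probe intensities, neqc() relies on the detection p-values.
  neqc(raw, detection.p = "Detection")
}

# Usage (needs network access):
# norm <- read_ilmn_series("GSE68849")
# dim(norm$E)
```

The result is an EList on the log2 scale that can go straight into lmFit(), which makes it comparable with the analysis based on getGEO().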

It works wonders. Thank you again, sir!
