A problem reading in CEL files
@manuela-di-russo-4778
Dear all,

I am learning to analyse Affymetrix microarray data, but I have a problem reading in .cel files. I downloaded from GEO the raw data provided as supplementary files (GSE12345_RAW.tar), then extracted the cel files into a directory which I set as my working directory.

Here is the R code I used:

> setwd("C:/BACKUP/Dati/Progetti/Landi/meta-analisi MPM/GSE12345_RAW")
> library(affy)
Loading required package: Biobase

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'browseVignettes()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation("pkgname")'.

> dir()
 [1] "data analysis.txt"     "E-GEOD-12345.sdrf.txt" "E-GEOD-12345.sdrf.xls"
 [4] "GSM309986.CEL"         "GSM309987.CEL"         "GSM309988.CEL"
 [7] "GSM309989.CEL"         "GSM309990.CEL"         "GSM309991.CEL"
[10] "GSM310012.CEL"         "GSM310013.CEL"         "GSM310014.CEL"
[13] "GSM310015.CEL"         "GSM310016.CEL"         "GSM310068.CEL"
[16] "GSM310070.CEL"         "target.txt"            "target.xls"
> pd <- read.AnnotatedDataFrame("target.txt",header=TRUE,row.names=1,as.is=TRUE)
> pData(pd)
         FileName              Target
N1  GSM309986.CEL      pleural tissue
N2  GSM309987.CEL      pleural tissue
N3  GSM309988.CEL      pleural tissue
N4  GSM309989.CEL      pleural tissue
MM1 GSM309990.CEL mesothelioma tissue
MM2 GSM309991.CEL mesothelioma tissue
MM3 GSM310012.CEL mesothelioma tissue
MM4 GSM310013.CEL mesothelioma tissue
MM5 GSM310014.CEL mesothelioma tissue
MM6 GSM310015.CEL mesothelioma tissue
MM7 GSM310016.CEL mesothelioma tissue
MM8 GSM310068.CEL mesothelioma tissue
MM9 GSM310070.CEL mesothelioma tissue
> rawData <- read.affybatch(filenames=pData(pd)$FileName,phenoData=pd)
Error in try(.Call("ReadHeaderDetailed", filename, PACKAGE = "affyio")) :
  Is GSM310016.CEL really a CEL file? tried reading as text, gzipped text, binary, gzipped binary, command console and gzipped command console formats.

Error in read.celfile.header(filenames[i], info = "full") :
  Failed to get full header information for GSM310016.CEL
> rawData1<-ReadAffy()
Error in try(.Call("ReadHeaderDetailed", filename, PACKAGE = "affyio")) :
  Is C:/BACKUP/Dati/Progetti/Landi/meta-analisi MPM/GSE12345_RAW/GSM310016.CEL really a CEL file? tried reading as text, gzipped text, binary, gzipped binary, command console and gzipped command console formats.

Error in read.celfile.header(filenames[i], info = "full") :
  Failed to get full header information for C:/BACKUP/Dati/Progetti/Landi/meta-analisi MPM/GSE12345_RAW/GSM310016.CEL
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252
[3] LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
[5] LC_TIME=Italian_Italy.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] affy_1.32.1    Biobase_2.14.0

loaded via a namespace (and not attached):
[1] affyio_1.22.0         BiocInstaller_1.2.1   preprocessCore_1.16.0
[4] zlibbioc_1.0.0
> traceback()
7: stop("Failed to get full header information for ", filename)
6: read.celfile.header(filenames[i], info = "full")
5: FUN(1:13[[11L]], ...)
4: lapply(X = X, FUN = FUN, ...)
3: sapply(seq_len(length(filenames)), function(i) {
       sdate <- read.celfile.header(filenames[i], info = "full")[["ScanDate"]]
       if (is.null(sdate) || length(sdate) == 0)
           NA_character_
       else sdate
   })
2: read.affybatch(filenames = l$filenames, phenoData = l$phenoData,
       description = l$description, notes = notes, compress = compress,
       rm.mask = rm.mask, rm.outliers = rm.outliers, rm.extra = rm.extra,
       verbose = verbose, sd = sd, cdfname = cdfname)
1: ReadAffy()

Maybe there is a problem in reading the cel file header, so I opened one of the cel files with a text editor, but it seems correct.

Can anyone help me? Thank you very much!

Manuela

Manuela Di Russo, Ph.D. Student
Department of Experimental Pathology, MBIE
University of Pisa
Pisa, Italy
e-mail: manuela.dirusso@for.unipi.it
tel: +39050993538
James F. Reid
@james-f-reid-3148
Hi Manuela,

it looks like GSM310016.CEL starts with a blank line before the [CEL] header; I have no idea why. Removing this first empty line solves the issue. It may be worth checking the other CEL files too.

J.

On 10/02/12 10:40, Manuela Di Russo wrote:
> Dear all,
> I am learning to analyse Affymetrix microarray data but I have a problem in reading .cel files in.
> [...]
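The fix suggested in this reply (removing the empty line in front of the [CEL] header) could be scripted roughly as follows. This is only a minimal sketch in base R, assuming the affected files are text-format CEL files whose header line is exactly `[CEL]`; `strip_leading_blank_lines` is a hypothetical helper written for illustration, not a function from affy or affyio, and the demo file is a tiny fake, not real CEL data.

```r
# Hypothetical helper (not part of affy/affyio): remove blank lines that
# precede the "[CEL]" header of a text-format CEL file.
# Returns TRUE if the file was rewritten, FALSE if it was left untouched.
strip_leading_blank_lines <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # useBytes = TRUE avoids encoding errors if the file is not plain text
  is_blank <- grepl("^[[:space:]]*$", lines, useBytes = TRUE)
  if (all(is_blank)) return(FALSE)            # empty or blank-only file
  first <- which(!is_blank)[1L]               # first line with content
  # Rewrite only when blank lines come first AND the next line is the
  # text CEL header, so binary or unrelated files are not modified.
  is_header <- grepl("^\\[CEL\\][[:space:]]*$", lines[first], useBytes = TRUE)
  if (first == 1L || !is_header) return(FALSE)
  writeLines(lines[first:length(lines)], path)
  TRUE
}

# Demo on a small fake file that mimics the reported problem
demo <- tempfile(fileext = ".CEL")
writeLines(c("", "[CEL]", "Version=3"), demo)
fixed <- strip_leading_blank_lines(demo)
first_line <- readLines(demo, n = 1L)
```

To check every file in the working directory, something like `for (f in list.files(pattern = "\\.CEL$")) strip_leading_blank_lines(f)` should work; because the helper overwrites files in place, it is safer to run it on a copy of the data first.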