Error when trying to read cel affymetrix file with custom platform
rajkk1 • 0
@rajkk1-9211

Hello!

I am trying to build an ExpressionSet from CEL files. The files come from two different platforms, one of which is a custom platform. I have the latest versions of the 'ff' and 'ArrayExpress' packages. Here is my code:

# Load libraries
library(ArrayExpress)
library(gcrma)

# Example: GSE34289 corresponds to E-GEOD-34289 in ArrayExpress
filename <- 'E-GEOD-34289'

# Download the raw files, then try to build an ExpressionSet from them
rawData <- getAE(filename, type = "raw")
rawExpressionSet <- ae2bioc(rawData)

The raw data download fine with getAE, but when I try to build the ExpressionSet, R reads only the first 49 CEL files (the ones from the regular platform) without trouble. As soon as it reaches the first file from the custom array, it fails with this error:

Error in read.celfile.header(x) : 
  Cel file C:/Users/Rajiv/Dropbox (CS229bawssteam)/Classes/CS229/CS229SwagProject/GSE34289_RAW/GSM846728.CEL does not seem to be have cdf information
Error in readAEdata(path = path, files = dataFiles, dataCols = dataCols,  : 
  Unable to read cel files inC:/Users/Rajiv/Dropbox (CS229bawssteam)/Classes/CS229/CS229SwagProject/GSE34289_RAW
Error in ae2bioc(rawData) : ArrayExpress: Unable to read assay data

Could someone help me out? Thanks!

@james-w-macdonald-5106

Please don't cross-post. Either ask here or on stackoverflow, but not both places.

There are two problems here. First, you are expecting R to magically figure out that you have two different kinds of arrays, and then process the first 49 separately from however many remain. This isn't how things work - you have to process the two sets of arrays separately yourself, because R isn't an AI.
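As a rough sketch of what "process them separately" could look like, the helper below groups CEL files by the chip type recorded in each file's header. The function name group_cel_by_platform and the "unknown" label are made up for illustration. It assumes the affyio package is installed and that affyio::read.celfile.header() returns a list with a cdfName element; files whose header cannot be read (as with the custom arrays here) end up in the "unknown" group.

```r
# Hypothetical helper: split a vector of CEL file paths by platform.
# read_header defaults to affyio::read.celfile.header (assumed installed);
# it is a parameter so the grouping logic can be tried without real CEL files.
group_cel_by_platform <- function(cel_files,
                                  read_header = affyio::read.celfile.header) {
  platform <- vapply(cel_files, function(f) {
    # A file without usable cdf information raises an error; record NA instead
    name <- tryCatch(read_header(f)$cdfName,
                     error = function(e) NA_character_)
    if (length(name) != 1L) NA_character_ else as.character(name)
  }, character(1))

  # Label files whose platform could not be determined
  platform[is.na(platform)] <- "unknown"

  # One character vector of file paths per platform
  split(cel_files, platform)
}
```

With the files grouped, the set from the regular platform could then be read on its own, for example with affy::ReadAffy(filenames = ...) followed by gcrma(), while the "unknown" group is left for the custom-array handling.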

Secondly, as you note, the second set of arrays is custom. This will be a bit of a problem for you, as GEO only has the mps, pgf, and clf files. You can in principle use the oligo package to analyze these data, but the 'stock' method of building a pdInfoPackage for oligo requires both the probeset and transcript csv files as well, and I don't see them there. The xps package also requires the transcript csv file, so that's a problem too. Maybe you could contact the submitters and see if they will give you those files.

There is some new code in the pdInfoBuilder package that is intended to use a flat-file structure as input to make a pdInfoPackage, and if you really need to do this, then you might look at Benilton's github page. But that will be tough sledding to get things going, especially if you are new to R or coding in general. This isn't something that comes up regularly, so it isn't something that either Benilton or I am likely to put near the top of the list of things to get done, but it does look like a fun problem, so maybe if I get time I can add something to pdInfoBuilder.
