Processing Agilent data with limma
@agaz-hussain-wani-7620
India

I am trying to process Agilent data using the limma R package. For GSE10469, I used the following code:

raw_data <- read.maimages(pdata[,1], source = "agilent") # pdata contains the group information
I get the error:
Error in readGenericHeader(fullname, columns = columns, sep = sep) :
  Specified column headings not found in file

 

When I try

raw_data <- read.maimages(pdata[,1], source = "agilent", green.only = TRUE)
Read GSM264878.txt
Error in RG[[a]][, i] <- obj[, columns[[a]]] :
  number of items to replace is not a multiple of replacement length

 

And also

raw_data <- read.maimages(pdata[,1], source = "agilent", green.only = FALSE)
Error in readGenericHeader(fullname, columns = columns, sep = sep) :
  Specified column headings not found in file

 

For  GSE32006

raw_data <- read.maimages(pdata[,1], source = "agilent")
Error in readGenericHeader(fullname, columns = columns, sep = sep) :
  Specified column headings not found in file

 

And

raw_data <- read.maimages(pdata[,1], source = "agilent", green.only = TRUE)
Read GSM792633.txt
Read GSM792634.txt
Read GSM792635.txt
Read GSM792636.txt
Read GSM792637.txt
Read GSM792638.txt
Error in readGenericHeader(fullname, columns = columns) :
  Specified column headings not found in file

The GSE32006 files that are read successfully are gene expression arrays, whereas the files that fail are exon arrays.

How can I deal with all of these issues?

 

 


Code snippets are not useful! Unless you show exactly what you did, you are expecting people to guess at what you might have done, and most people are too busy to bother with such things. You need to show a short, self-contained (e.g., anybody can run) bit of code to show exactly what you did and where the error is.

@gordon-smyth
WEHI, Melbourne, Australia

The first problem occurs when you try to read in single-channel data as if it were two-color. The read.maimages() function requires that you tell it explicitly to read in the green channel only, by specifying green.only=TRUE.

 

The second problem occurs when people upload "raw" data files to GEO that have been edited or corrupted, and are therefore no longer in proper Agilent format.

In the case of GSE10469, it appears that someone (one of the authors presumably) has opened the first file GSM264878.txt in Excel, then written it out again but now with extra rows and an extra column. The other files are ok. You can fix the problem simply by changing the order of the files when you read them in, so that GSM264878 is not the first file:

> files
 [1] "GSM264878.txt.gz" "GSM264879.txt.gz" "GSM264880.txt.gz" "GSM264881.txt.gz"
 [5] "GSM264882.txt.gz" "GSM264883.txt.gz" "GSM264884.txt.gz" "GSM264885.txt.gz"
 [9] "GSM264886.txt.gz" "GSM264887.txt.gz" "GSM264888.txt.gz" "GSM264889.txt.gz"
> x <- read.maimages(files[c(2,1,3:12)], source="agilent", green.only=TRUE)
Read GSM264879.txt.gz 
Read GSM264878.txt.gz 
Read GSM264880.txt.gz 
Read GSM264881.txt.gz 
Read GSM264882.txt.gz 
Read GSM264883.txt.gz 
Read GSM264884.txt.gz 
Read GSM264885.txt.gz 
Read GSM264886.txt.gz 
Read GSM264887.txt.gz 
Read GSM264888.txt.gz 
Read GSM264889.txt.gz 
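
The reordering above can also be done by name rather than by index. The following is only a minimal sketch: demote() is a hypothetical helper, not part of limma, and it moves the problem file to the end rather than to second position, which should serve the same purpose of not reading it first.

```r
# Hypothetical helper (not part of limma): move the named files to the end
# of the vector so that none of them is read first.
demote <- function(files, bad) {
  c(setdiff(files, bad), intersect(files, bad))
}

# The twelve file names shown in the listing above
files <- sprintf("GSM%d.txt.gz", 264878:264889)
reordered <- demote(files, "GSM264878.txt.gz")

# reordered can then be passed to read.maimages(), for example:
# x <- limma::read.maimages(reordered, source = "agilent", green.only = TRUE)
```

The columns of the resulting object follow the order of the file names, so any targets or group table needs to be reordered in the same way.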

 

In the case of GSE32006, you can't expect to read in gene expression and exon arrays with the same read command because they have different probe sets. You naturally have to read and analyse the gene arrays and the exon arrays separately.
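
As a rough sketch of reading the two array types separately, the file names can be split by array type before calling read.maimages(). The targets table, its ArrayType column and the file names below are all made up for illustration; the real table would have to record which GSM files are gene arrays and which are exon arrays.

```r
# Assumed targets table: the ArrayType column and the file names are
# placeholders, not the real GSE32006 sample sheet.
targets <- data.frame(
  FileName  = c("gene1.txt", "gene2.txt", "exon1.txt", "exon2.txt"),
  ArrayType = c("gene", "gene", "exon", "exon"),
  stringsAsFactors = FALSE
)

# One character vector of file names per array type
files_by_type <- split(targets$FileName, targets$ArrayType)

# Each group is then read with its own call, for example:
# gene <- limma::read.maimages(files_by_type$gene, source = "agilent",
#                              green.only = TRUE)
```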


Thank you very much for the answer. I tried reading the exon arrays from GSE32006 separately,

raw_data <- read.maimages(pdata[,1], source = "agilent", green.only = TRUE)

but I get the same issue

Error in readGenericHeader(fullname, columns = columns, sep = sep) :
  Specified column headings not found in file

 

However, when I read the gene arrays separately, it works fine.

 


For some reason, the exon arrays have been hybridized using the Cy5 (red) channel instead of Cy3 (green). To read them, you'll have to use a trick to tell read.maimages() to use the red channel only:

x <- read.maimages(files, source="agilent", green.only=TRUE,
                   columns=list(G="rMedianSignal",Gb="rBGMedianSignal"))
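
When read.maimages() reports "Specified column headings not found in file", it can help to look at which signal columns a file actually contains before choosing the columns argument. The sketch below uses base R only and assumes the usual Agilent Feature Extraction layout, in which the column names sit on a tab-separated row beginning with FEATURES. The helper names agilent_header() and agilent_channels() are hypothetical, and the small mock file exists only to make the example runnable.

```r
# Hypothetical helper: return the column names from the FEATURES row of an
# Agilent Feature Extraction text file (plain or gzipped).
agilent_header <- function(file, max_lines = 100) {
  con <- gzfile(file, open = "rt")  # gzfile() also reads uncompressed text
  on.exit(close(con))
  lines <- readLines(con, n = max_lines, warn = FALSE)
  hit <- grep("^FEATURES\t", lines)
  if (length(hit) == 0) return(character(0))
  strsplit(lines[hit[1]], "\t", fixed = TRUE)[[1]]
}

# Hypothetical helper: report which median-signal columns are present.
agilent_channels <- function(file) {
  h <- agilent_header(file)
  c(green = "gMedianSignal" %in% h, red = "rMedianSignal" %in% h)
}

# Mock red-only file, made up for illustration
mock <- tempfile(fileext = ".txt")
writeLines(c(
  "TYPE\ttext",
  "FEPARAMS\tProtocol_Name",
  "DATA\tmock",
  "*",
  "TYPE\tinteger\ttext\tfloat\tfloat",
  "FEATURES\tFeatureNum\tProbeName\trMedianSignal\trBGMedianSignal",
  "DATA\t1\tA_1\t100\t20"
), mock)

channels <- agilent_channels(mock)
print(channels)
```

A file that reports only the red column would need the columns trick shown above, while a file with only the green column can be read with green.only=TRUE alone.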

 
