I am relatively new to preprocessing microarray data and am trying to analyze the GEO dataset "GSE56045". I downloaded the supplementary RAW files to process with lumi, but the file format does not seem to be compatible with the lumiR function. In case it helps, the header of the RAW file is as follows:
? Illumina, Inc.
[Heading]
Date 15/4/2010
ContentVersion 4.0
FormatVersion 1.0.0
Number of Probes 47231
Number of Controls 887
[Probes]
When I call the lumiR function, the error message is:
"Error in gregexpr("\t", dataLine1)[[1]] : subscript out of bounds"
This confuses me because the file appears to be a tab-separated document.
Is this data in a format readable by lumi? Should I use a different package instead?
The naming of the GEO series supplementary files is somewhat misleading. I guess you are trying to read the file GSE56045_RAW.tar, but that actually contains Illumina Bead Manifest files, which give probe annotation rather than expression data. The raw expression data is instead in the file GSE56045_non_normalized.txt.gz.
I was able to read the data using the limma package:
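The code from this answer did not survive in the thread, so what follows is only a rough sketch of how such a file might be read with limma's read.ilmn function, not the original commands. The file name is the one given above. The values passed to probeid and expr are assumptions about how the columns are named, and the sketch also assumes the first line of the file is the column header, so check the printed header and adjust before relying on it.

```r
# Sketch only: the file name comes from the answer above; the column
# names passed to read.ilmn() below are assumptions, not confirmed values.
f <- "GSE56045_non_normalized.txt.gz"

if (file.exists(f)) {
  # Look at the first line (assumed to be the header) to see how the
  # probe ID and intensity columns are actually named.
  header <- strsplit(readLines(f, n = 1), "\t")[[1]]
  print(head(header))

  # probeid: name of the probe identifier column (assumed "ID_REF").
  # expr: pattern shared by the intensity columns (assumed "intensity").
  x <- limma::read.ilmn(f, probeid = "ID_REF", expr = "intensity")
  print(dim(x))
}
```

If the header shows different names, change probeid and expr to match. The other.columns argument of read.ilmn controls which extra columns, such as detection p-values, are read alongside the intensities.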
Thank you so much Gordon! I really appreciate it!
Dear Gordon,
Sorry to bother you, but if you could take a look at my post and give me any suggestions, it would be very helpful. https://support.bioconductor.org/p/125225/
Thank you,