I'm trying to add annotation to an RMA-normalized spreadsheet of microarray expression data (probe IDs in the first column, sample names in the first row). The arrays are from Affymetrix.
Combing through the forums, it looks like I need to use the AnnotationDbi package. I've looked at the package documentation, but I can't figure out how to use it. I'm new to R and programming.
So far, I have succeeded in transforming my data into an ExpressionSet object:
You have to know the type of array, and from that divine the annotation package name. Usually the annotation package name is formed by taking the array name, converting it to all lower case, stripping out everything that is not a letter or a digit, and appending .db. So as an example, the HG-U133_plus2 array's annotation package is hgu133plus2.db. It is a bit more complex for the more recent arrays, as they can be summarized at different levels. As an example, the HuGene 1.0 ST array has two annotation packages, hugene10sttranscriptcluster.db and hugene10stprobeset.db, reflecting summarization at the transcript and probeset levels, respectively.
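The naming rule of thumb can be sketched in base R. The function name guess_annotation_pkg is made up for illustration, and the result is only a guess: it will not produce the transcript-cluster or probeset package names used for the ST arrays, so check that the package actually exists on Bioconductor before relying on it.

```r
# Hypothetical helper: guess an annotation package name from an array name
# by dropping every non-alphanumeric character, lower-casing, and adding ".db".
guess_annotation_pkg <- function(array_name) {
  cleaned <- gsub("[^[:alnum:]]", "", array_name)
  paste0(tolower(cleaned), ".db")
}

guess_annotation_pkg("HG-U133_plus2")  # "hgu133plus2.db"
```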
Once you know the annotation package, you simply install it and use select() to get what you want. For most arrays this will result in a one-to-many mapping, because a single probe ID can map to more than one gene or other annotation. You will have to decide how you want to deal with that complexity. One possibility is to add the annotations one at a time using mapIds(), which by default just takes the first match for any probe ID with multiple hits.
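As a rough sketch of both approaches, assuming the array is an HG-U133 Plus 2 and that hgu133plus2.db is already installed (the probe IDs and the choice of SYMBOL and ENTREZID columns are just examples):

```r
# Assumes the package was installed beforehand, e.g. with
#   BiocManager::install("hgu133plus2.db")
library(hgu133plus2.db)

# A few example probe IDs. With an ExpressionSet you would typically use
# Biobase::featureNames(eset) here (eset is a hypothetical object name).
probes <- c("1007_s_at", "1053_at", "117_at")

# select() returns a data.frame and may return several rows per probe ID.
ann_all <- AnnotationDbi::select(hgu133plus2.db, keys = probes,
                                 columns = c("SYMBOL", "ENTREZID"),
                                 keytype = "PROBEID")

# mapIds() returns one value per key; multiVals = "first" is the default
# and is spelled out here only to make the choice visible.
symbols <- AnnotationDbi::mapIds(hgu133plus2.db, keys = probes,
                                 column = "SYMBOL", keytype = "PROBEID",
                                 multiVals = "first")
entrez <- AnnotationDbi::mapIds(hgu133plus2.db, keys = probes,
                                column = "ENTREZID", keytype = "PROBEID",
                                multiVals = "first")

# One row per probe ID, in the same order as `probes`.
ann_one <- data.frame(PROBEID = probes, SYMBOL = symbols,
                      ENTREZID = entrez, row.names = probes)
```

Because ann_one has exactly one row per probe ID, it could then be merged with the expression spreadsheet by probe ID, or stored as the feature data of the ExpressionSet. Other multiVals settings, such as "list", keep all matches if taking the first one is not what you want.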