I asked a similar question yesterday and wanted to clarify and give more information. I am using limma to analyze microarray data from Affymetrix HuGene 1.0 ST arrays. I'm reading in the CEL files using ReadAffy. Both sources of annotation confirm that I'm using the HuGene 1.0 ST array:
> affybatch@cdfName
[1] "HuGene-1_0-st-v1"
> eset@annotation
[1] "hugene10stv1"
I fit a model, and now I want to annotate the results with gene symbols rather than the probeset IDs:
> fit <- lmFit(eset, design)
> head(fit$genes)
       ID
1 7892501
2 7892502
3 7892503
4 7892504
5 7892505
6 7892506
When I try to use getSYMBOL (as per Gordon's suggestion from a previous post: https://stat.ethz.ch/pipermail/bioconductor/2011-February/037866.html), none of these IDs map to a symbol:
> getSYMBOL(head(fit$genes$ID), "hugene10stprobeset.db")
7892501 7892502 7892503 7892504 7892505 7892506
     NA      NA      NA      NA      NA      NA
In fact, of my 32,321 probeset IDs, only 150 match up with the IDs in the hugene10stprobeset.db package:
> mapped_probes <- mappedkeys(hugene10stprobesetSYMBOL)
> head(mapped_probes)
[1] "7896741" "7896743" "7896745" "7896755" "7896757" "7896758"
> length(fit$genes$ID)
[1] 32321
> length(mapped_probes)
[1] 238111
> sum(fit$genes$ID %in% mapped_probes)
[1] 150
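The overlap check above can be generalized into a small helper, so that the same IDs can be compared against several candidate key sets at once. This is only a minimal sketch in base R: the function name `id_overlap` and the toy ID vectors are hypothetical and are not taken from the data or from any package. In a real session the candidate key sets would presumably come from calling `mappedkeys()` on each annotation map under consideration.

```r
# Hypothetical helper: for each candidate set of annotation keys,
# count how many of the query IDs it contains.
id_overlap <- function(ids, candidates) {
  ids <- as.character(ids)  # annotation keys are character strings
  hits <- vapply(candidates, function(keys) sum(ids %in% keys), integer(1))
  data.frame(source    = names(candidates),
             n_matched = unname(hits),
             fraction  = unname(hits) / length(ids),
             row.names = NULL)
}

# Toy data, made up for illustration only
query_ids <- c(7892501, 7892502, 7892503, 7892504)
candidates <- list(
  set_a = c("7896741", "7896743", "7892501"),
  set_b = c("7892501", "7892502", "7892503", "7892504", "7892505")
)

overlap <- id_overlap(query_ids, candidates)
print(overlap)
```

A key set that covers nearly all of the query IDs is a likely match for the ID type in `fit$genes$ID`, while a very low fraction (like the 150 of 32,321 above) suggests the IDs and the annotation package refer to different kinds of identifiers.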
Thanks in advance for any help!
Stephen
> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] C/en_US.UTF-8/C/C/C/C
attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] hugene10stv1probe_2.9.0    BiocInstaller_1.2.1        hugene10stv1cdf_2.9.1      hugene10stprobeset.db_8.0.1
 [5] org.Hs.eg.db_2.6.4         RSQLite_0.11.1             DBI_0.2-5                  annotate_1.32.1
 [9] AnnotationDbi_1.16.10      pvclust_1.2-2              calibrate_1.7              gplots_2.10.1
[13] KernSmooth_2.23-7          caTools_1.12               bitops_1.0-4.1             gdata_2.8.2
[17] gtools_2.6.2               limma_3.10.1               arrayQualityMetrics_3.10.0 affy_1.32.0
[21] Biobase_2.14.0