Dear Gurus,
I am doing an Illumina microarray analysis. The study design is a 2x2
(i.e. varying on two different conditions). As part of the analysis
I'm doing a GO analysis. There are a few GO categories of special
interest, so I want to extract data for the probes identified in these
categories and cluster the data.
The problem is that after performing the GO analysis, I essentially
cannot figure out how to extract the data for these probes. I have
done lots of googling and have figured out that "geneIdsByCategory"
(e.g. geneIdsByCategory(mfOver1)[["GO:0001077"]]) will tell me the
EntrezIDs for the genes, but I cannot figure out how to map those back
to the probeIDs.
I also came across "probeSetSummary," which maps between EntrezID and
ProbeID, but the data from this method does not seem to match that
from "geneIdsByCategory." Specifically, the number of unique EntrezIDs
in each GO category is different. Here is some example output for one
GO category, first from probeSetSummary and then from geneIdsByCategory:
  EntrezID         ProbeSetID selected
1    16600 0khLe85Huv0juQw.sQ        0
2    16600 35LRC1Xd1PNCJ05Ras        0
3    16600 rpUHFdf15SFI5LRC1U        0
4    18124 BteTYfS5fYo.qi6dh0        0
5    18124 TnIofrF1F97TYQnfX4        0
6    18124 rQIi6KJzkUI0QknwKE        0
7    21420 NVwtViinW54gHvi7Eg        0
8    21420 NZWVEWR3oXld_i3_4c        0
9    21420 xvEFZWVEWR3oXld_i0        0
[1] "13653" "16600" "18124" "21420" "22038"
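To make the goal concrete, here is a rough sketch in base R of the step I am trying to reach. The lookup table `probe2entrez` and the expression matrix `expr` are made up for illustration (only the IDs are copied from the output above); the lookup table stands in for the probe-to-Entrez mapping that I do not know how to obtain, which I assume would come from the chip annotation package.

```r
# Hypothetical probe-to-Entrez lookup table (toy data, not from a real
# annotation package; in practice this is the piece I am missing).
probe2entrez <- data.frame(
  ProbeID  = c("0khLe85Huv0juQw.sQ", "35LRC1Xd1PNCJ05Ras",
               "BteTYfS5fYo.qi6dh0", "NVwtViinW54gHvi7Eg"),
  EntrezID = c("16600", "16600", "18124", "21420"),
  stringsAsFactors = FALSE
)

# Entrez IDs for one GO category, as returned by geneIdsByCategory()
category_ids <- c("13653", "16600", "18124", "21420", "22038")

# Probes whose Entrez ID belongs to the category
category_probes <- probe2entrez$ProbeID[probe2entrez$EntrezID %in% category_ids]

# Entrez IDs in the category that have no probe in the lookup table
unmapped_ids <- setdiff(category_ids, probe2entrez$EntrezID)

# Toy expression matrix (random values): rows are probes, columns are samples
set.seed(1)
expr <- matrix(rnorm(16), nrow = 4,
               dimnames = list(probe2entrez$ProbeID, paste0("S", 1:4)))

# Subset to the category's probes and cluster them
category_expr <- expr[category_probes, , drop = FALSE]
hc <- hclust(dist(category_expr))
```

With the real data, `category_expr` would be the matrix I pass on to clustering, and `unmapped_ids` would show which Entrez IDs fail to map back to any probe.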
Can anyone give me guidance on how to get from the GO analysis to
clustering? I know how to cluster, but getting from EntrezIDs back to
probeIDs is my problem. Well, I think that's my problem anyway. If you
know of a better way to do it, I'd love to hear it!
Thanks in advance!
-- output of sessionInfo():
R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
 [1] GO.db_2.8.0 GOstats_2.24.0 graph_1.36.1 Category_2.24.0 limma_3.14.1 annotate_1.36.0 lumiMouseAll.db_1.18.0 org.Mm.eg.db_2.8.0
 [9] RSQLite_0.11.2 DBI_0.2-5 AnnotationDbi_1.20.3 xtable_1.7-0 lumi_2.10.0 nleqslv_1.9.4 Biobase_2.18.0 BiocGenerics_0.4.0
[17] vimcom_0.9-5 setwidth_1.0-2 lattice_0.20-10
loaded via a namespace (and not attached):
 [1] affy_1.36.0 affyio_1.26.0 AnnotationForge_1.0.2 BiocInstaller_1.8.3 colorspace_1.2-0 genefilter_1.40.0 grid_2.15.1 GSEABase_1.20.0
 [9] IRanges_1.16.4 KernSmooth_2.23-8 MASS_7.3-22 Matrix_1.0-10 methylumi_2.4.0 mgcv_1.7-22 nlme_3.1-105 parallel_2.15.1
[17] preprocessCore_1.20.0 RBGL_1.34.0 splines_2.15.1 stats4_2.15.1 survival_2.36-14 tcltk_2.15.1 tools_2.15.1 XML_3.95-0.1
[25] zlibbioc_1.4.0
