Hi everyone,
A colleague of mine has been using BioC to do his analyses on his
custom stickleback array, and asked me to check to make sure he was
doing everything correctly. I'm not sure about one part, so I'm
asking the list. The stickleback genome is not well annotated with
GO terms, but he has mapped all the stickleback microarray
probes to zebrafish EntrezGene IDs. Instead of making his own
chip-specific AnnotationDbi package, he's doing GO testing using the
zebrafish.db package from BioC. Everything seems to run fine
and there are no error messages, but I'm not sure whether there are
hidden problems.
For example, what if some of his zebrafish EntrezGene IDs are not on
the Affy zebrafish array? Would it be fine as long as they were in
the org.Dr.eg.db based data package? According to the GOstats
vignette on Hypergeometric Tests, when you set up your GOHyperGParams
object, you have to specify a chip-specific AnnotationDbi package,
but you input your selected genes and universe as (usually)
EntrezGene IDs, not probe set IDs. This makes me wonder whether
hyperGTest() is doing everything through the zebrafish.db package or
going directly to the org.Dr.eg.db package.
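To make clear why the universe matters to me, here is a minimal sketch of the over-representation test itself, written in plain Python rather than R so it stands apart from any BioC package. It is not GOstats code; the function name and all counts are made up for illustration. It only shows that the p-value depends on the universe size, so IDs silently dropped from the universe (if that happens, which I don't know) would change the result.

```python
from math import comb


def hypergeom_upper_tail(k, n_selected, n_category, n_universe):
    """P(X >= k) for a hypergeometric draw.

    n_universe: genes in the universe
    n_category: universe genes annotated to the GO term
    n_selected: genes in the selected list
    k:          selected genes annotated to the GO term
    """
    max_k = min(n_selected, n_category)
    total = comb(n_universe, n_selected)
    # Sum the probability mass for k, k+1, ..., max_k overlapping genes.
    hits = sum(
        comb(n_category, i) * comb(n_universe - n_category, n_selected - i)
        for i in range(k, max_k + 1)
    )
    return hits / total


# Hypothetical counts: same selected list and GO category, two universe sizes.
p_full = hypergeom_upper_tail(10, 100, 50, 1000)
p_reduced = hypergeom_upper_tail(10, 100, 50, 800)
print(p_full, p_reduced)
```

With everything else held fixed, the smaller universe makes the same overlap less surprising, so the two p-values differ.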
He hasn't yet checked whether all of his EntrezGene IDs are on
the Affy zebrafish array; if they are, then I guess there
is no problem in simply using the zebrafish.db package. My main
question is: why does the GOstats hyperGTest() require a chip-specific
annotation package when you have to input EntrezGene IDs (the
mapping key for the org.Xx.eg.db packages), not probe set
IDs? Could hyperGTest() be (easily) modified to directly accept the
org.Xx.eg.db packages?
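The check I mentioned is just a set comparison. Here is a minimal sketch in plain Python, with made-up IDs and a hypothetical helper name; in practice the two lists would come from his probe mapping and from the EntrezGene IDs on the Affy zebrafish array.

```python
def split_by_coverage(mapped_ids, chip_ids):
    """Split mapped EntrezGene IDs into those on the chip and those not."""
    mapped = set(mapped_ids)
    chip = set(chip_ids)
    covered = sorted(mapped & chip)   # IDs present in both lists
    missing = sorted(mapped - chip)   # IDs he uses that the chip lacks
    return covered, missing


# Hypothetical EntrezGene IDs, kept as strings.
mapped_ids = ["1001", "1002", "1003", "1004"]
chip_ids = ["1001", "1003", "2001"]

covered, missing = split_by_coverage(mapped_ids, chip_ids)
print("on the chip:", covered)
print("not on the chip:", missing)
```

If the second list comes back empty, the question of IDs missing from the array does not arise for his data.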
Thanks,
Jenny
Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu