GO annotation inconsistency
Daniel Gatti ▴ 70
@daniel-gatti-1721
Last seen 10.2 years ago
O/S: Windows XP
R: 2.3.1
Bioconductor: 1.8

I'm trying to get a list of all probes in a given GO category. In the Bioconductor annotation libraries there are mappings from GO category to probe ID and from probe ID to GO category. I'm finding that the two do not agree. Here's a sample script:

    library(hgu95av2)
    library(GO)

    # Get list of probe -> GO mappings.
    hgu95av2GO.list = as.list(hgu95av2GO)
    hgu95av2GO.list = lapply(hgu95av2GO.list, names)

    # Work with GO category 7031.
    GO.7031.probes = unique(get("GO:0007031", hgu95av2GO2ALLPROBES))
    length(GO.7031.probes)
    [1] 16

    probe2GO.7031 = hgu95av2GO.list[match(GO.7031.probes, names(hgu95av2GO.list))]
    length(grep("GO:0007031", probe2GO.7031))
    [1] 11

Note that the GO -> probe list gives me 16 probes in category 7031, while the probe -> GO list gives me 11 probes. This happens for a lot of categories. Am I missing some key concept, or is there something else going on?

Thanks,
Dan Gatti
UNC-CH
Annotation GO probe Category
@james-w-macdonald-5106
Last seen 1 day ago
United States
Hi Daniel,

Daniel Gatti wrote:
> # Work with GO category 7031.
> GO.7031.probes = unique(get("GO:0007031", hgu95av2GO2ALLPROBES))

The problem here is that you are using the wrong environment. If you look at the man page for hgu95av2GO2ALLPROBES, you will see that it maps the GO term in question _and_ all of its children to the probe IDs (you actually want the hgu95av2GO2PROBE environment). In contrast, the hgu95av2GO environment maps probe IDs only to the GO terms themselves, excluding the children. If you use the correct environment, things work out:

    > library(hgu95av2)
    > hgu95av2GO.list = as.list(hgu95av2GO)
    > hgu95av2GO.list = lapply(hgu95av2GO.list, names)
    > GO.7031.probes = unique(get("GO:0007031", hgu95av2GO2PROBE))
    > length(GO.7031.probes)
    [1] 11
    >
    > probe2GO.7031 = hgu95av2GO.list[match(GO.7031.probes,
    +     names(hgu95av2GO.list))]
    > length(grep("GO:0007031", probe2GO.7031))
    [1] 11

HTH,
Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
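The difference between a "direct" GO -> probe map and an "all probes" map can be illustrated without any annotation package. The following is only a toy sketch in base R: the term IDs (`GO:A` to `GO:D`), the probe names (`p1` to `p4`), and the helpers `descendants()` and `go2allprobes()` are all hypothetical and are not part of hgu95av2. It also assumes each term has a single parent, whereas real GO terms can have several.

```r
# Toy ontology: child term -> parent term (hypothetical IDs, not real GO data).
parent_of <- c("GO:B" = "GO:A", "GO:C" = "GO:A", "GO:D" = "GO:B")

# Direct probe -> GO annotations (the role played by a probe -> GO environment).
probe2go <- list(
  p1 = "GO:A",
  p2 = "GO:B",
  p3 = c("GO:C", "GO:D"),
  p4 = "GO:D"
)

# Invert to GO -> probes using direct annotations only
# (the analogue of a GO2PROBE-style map).
go2probe <- split(
  rep(names(probe2go), lengths(probe2go)),
  unlist(probe2go, use.names = FALSE)
)

# All terms below a given term in the toy hierarchy.
descendants <- function(term) {
  kids <- names(parent_of)[parent_of == term]
  unique(c(kids, unlist(lapply(kids, descendants))))
}

# Probes annotated to the term itself or to any of its descendants
# (the analogue of a GO2ALLPROBES-style map).
go2allprobes <- function(term) {
  terms <- c(term, descendants(term))
  unique(unlist(go2probe[terms], use.names = FALSE))
}

length(go2probe[["GO:A"]])    # probes annotated directly to GO:A
length(go2allprobes("GO:A"))  # probes annotated to GO:A or any child term
```

In this toy data the direct map holds one probe for `GO:A` while the expanded map holds four, which is the same kind of gap as the 11 versus 16 probes seen for GO:0007031 above.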