Hi,
Actually, I have two questions.
- I want to extract the genes related to a GO term, and I found two solutions:
> AnnotationDbi::select(org.At.tair.db,
+ keys = "GO:0048449",
+ keytype = "GO", columns = "TAIR") %>%
+ as_tibble()
'select()' returned 1:many mapping between keys and columns
# A tibble: 74 x 4
GO EVIDENCE ONTOLOGY TAIR
<chr> <chr> <chr> <chr>
1 GO:0048449 RCA BP AT1G08130
2 GO:0048449 RCA BP AT1G09450
3 GO:0048449 RCA BP AT1G13030
4 GO:0048449 RCA BP AT1G13980
5 GO:0048449 RCA BP AT1G20410
6 GO:0048449 RCA BP AT1G21690
7 GO:0048449 RCA BP AT1G21880
8 GO:0048449 RCA BP AT1G23000
9 GO:0048449 RCA BP AT1G26190
10 GO:0048449 RCA BP AT1G33410
# … with 64 more rows
> AnnotationDbi::select(org.At.tair.db,
+ keys = "GO:0048449",
+ keytype = "GOALL", columns = "TAIR") %>%
+ as_tibble()
'select()' returned 1:many mapping between keys and columns
# A tibble: 183 x 4
GOALL EVIDENCEALL ONTOLOGYALL TAIR
<chr> <chr> <chr> <chr>
1 GO:0048449 RCA BP AT1G01370
2 GO:0048449 RCA BP AT1G02800
3 GO:0048449 RCA BP AT1G04050
4 GO:0048449 RCA BP AT1G04760
5 GO:0048449 RCA BP AT1G05440
6 GO:0048449 RCA BP AT1G06420
7 GO:0048449 RCA BP AT1G08130
8 GO:0048449 RCA BP AT1G09450
9 GO:0048449 RCA BP AT1G10980
10 GO:0048449 RCA BP AT1G13030
# … with 173 more rows
I am wondering why GOALL gives more genes. I have seen the definition of GOALL, which says:
GOALL: GO Identifiers (includes less specific terms)
- The second question is that I ran these two select calls, and they produce very different results. Could someone explain why? I am wondering whether the first select gives me the parents of GO:0000003 and the second gives me the children of GO:0000003.
> AnnotationDbi::select(org.At.tair.db,
+ keys = "GO:0000003",
+ keytype = "GO", columns = "GOALL") %>%
+ as_tibble()
'select()' returned 1:many mapping between keys and columns
# A tibble: 2,055 x 6
GO EVIDENCE ONTOLOGY GOALL EVIDENCEALL ONTOLOGYALL
<chr> <chr> <chr> <chr> <chr> <chr>
1 GO:0000003 RCA BP GO:0000003 IMP BP
2 GO:0000003 RCA BP GO:0000003 RCA BP
3 GO:0000003 RCA BP GO:0000280 RCA BP
4 GO:0000003 RCA BP GO:0000375 RCA BP
5 GO:0000003 RCA BP GO:0000377 RCA BP
6 GO:0000003 RCA BP GO:0000398 RCA BP
7 GO:0000003 RCA BP GO:0000902 RCA BP
8 GO:0000003 RCA BP GO:0000904 RCA BP
9 GO:0000003 RCA BP GO:0002252 RCA BP
10 GO:0000003 RCA BP GO:0002376 RCA BP
# … with 2,045 more rows
> AnnotationDbi::select(org.At.tair.db,
+ keys = "GO:0000003",
+ keytype = "GOALL", columns = "GO") %>%
+ as_tibble()
'select()' returned 1:many mapping between keys and columns
# A tibble: 8,043 x 6
GOALL EVIDENCEALL ONTOLOGYALL GO EVIDENCE ONTOLOGY
<chr> <chr> <chr> <chr> <chr> <chr>
1 GO:0000003 IMP BP GO:0003700 ISS MF
2 GO:0000003 IMP BP GO:0005634 ISM CC
3 GO:0000003 IMP BP GO:0006355 TAS BP
4 GO:0000003 IMP BP GO:0009908 IMP BP
5 GO:0000003 IMP BP GO:0048366 IMP BP
6 GO:0000003 IMP BP GO:0000226 RCA BP
7 GO:0000003 IMP BP GO:0000278 RCA BP
8 GO:0000003 IMP BP GO:0000911 RCA BP
9 GO:0000003 IMP BP GO:0003725 IDA MF
10 GO:0000003 IMP BP GO:0004525 ISS MF
# … with 8,033 more rows
Best wishes
Guandong Shang
I hit enter too fast, so if you are following this via email you should come to the support site to see the entire answer.
Thanks for your reply, James.
But I am confused about this sentence:
In my opinion, it should be the ancestors instead of the offspring. After all, the definition is GOALL: GO Identifiers (includes less specific terms).
I also found something odd about GOALL in select. According to the definition, GOALL includes less specific terms, so I believe it should include the direct GO terms and their ancestors. But I find that the select function drops those GO terms that have no ancestors. Here are three results to confirm this. All I want to find is the GO terms related to AT2G17950 (direct and ancestors).
First, I use GOBPANCESTOR in GO.db, according to its definition:
Second, I use GOALL in select to get all GO terms related to AT2G17950.
Third, I use GOALL -> TAIR ID.
As you can see, the results of the second and third methods are the same:
But the first method gives 4 more GO terms than the second or third.
You can see that these 4 GO terms do not have ancestors, yet they are direct GO terms of AT2G17950.
By the way, I am sorry I did not post the session info. For this reply, I switched my Bioconductor version to 3.13. Here is the sessionInfo:
Best wishes
Guandong Shang
You are right - I mis-spoke (mis-wrote?) about GOALL. I get confused because in my mind the directionality of the GO DAG and GOALL are switched, which seems weird. But when you do a hypergeometric test, you need to know all the GO terms that a given TAIR ID maps to, and that's the direct GO term and all of its ancestors.
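To make the "direct term plus all of its ancestors" idea concrete, here is a minimal base-R sketch on a toy DAG. The term IDs (T:root, T:mid, ...), the gene names, and the helper functions are all hypothetical and only illustrate the propagation; this is not how org.At.tair.db computes GOALL internally.

```r
# Toy GO-like DAG: each term maps to its direct parents (hypothetical IDs).
parents <- list(
  "T:root"  = character(0),
  "T:mid"   = "T:root",
  "T:leaf1" = "T:mid",
  "T:leaf2" = "T:mid"
)

# Direct annotations, analogous to the GO keytype (hypothetical genes).
direct <- list(
  geneA = "T:leaf1",
  geneB = "T:leaf2",
  geneC = "T:mid"
)

# All ancestors of a term, found by walking up the DAG recursively.
ancestors <- function(term) {
  up <- parents[[term]]
  unique(c(up, unlist(lapply(up, ancestors))))
}

# GOALL-style annotations: each direct term plus all of its ancestors.
all_terms <- lapply(direct, function(terms) {
  unique(c(terms, unlist(lapply(terms, ancestors))))
})

# Genes annotated to a given term under either scheme.
genes_for <- function(annot, term) {
  names(annot)[vapply(annot, function(x) term %in% x, logical(1))]
}

genes_for(direct, "T:mid")     # "geneC"
genes_for(all_terms, "T:mid")  # "geneA" "geneB" "geneC"
```

Querying a term through the propagated annotations returns more genes than querying the direct annotations, because genes annotated to more specific terms are counted as well.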
Anyway, I think the problem with your code (I don't speak tidyverse, so I didn't really look that closely) is that you filter the ancestors on BP. I believe that your main point is that GOALL is missing some direct GO mappings, which is not true, at least for this TAIR ID:
Which indicates that all the direct GO terms are in GOALL as well. But if you filter on BP, you run into problems because the direct terms aren't all BP (like the four you say are missing).
Whereas the direct GO terms span all the ontologies.
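One way to run the check described above is sketched below, assuming org.At.tair.db is installed. The object names (direct, all_terms, bp_only) are mine, and no output is shown because it depends on the annotation release.

```r
library(org.At.tair.db)

gene <- "AT2G17950"

# Direct GO annotations for the gene, with their ontology (BP, MF or CC).
direct <- AnnotationDbi::select(org.At.tair.db,
                                keys = gene, keytype = "TAIR",
                                columns = c("GO", "ONTOLOGY"))

# Direct GO annotations plus all less specific (ancestor) terms.
all_terms <- AnnotationDbi::select(org.At.tair.db,
                                   keys = gene, keytype = "TAIR",
                                   columns = c("GOALL", "ONTOLOGYALL"))

# TRUE if every direct GO term also appears in GOALL.
all(direct$GO %in% all_terms$GOALL)

# Number of distinct direct terms in each ontology.
table(unique(direct[, c("GO", "ONTOLOGY")])$ONTOLOGY)

# Direct terms that are lost when GOALL is restricted to BP only.
bp_only <- unique(all_terms$GOALL[all_terms$ONTOLOGYALL %in% "BP"])
setdiff(unique(direct$GO), bp_only)
```

If the last call returns any terms, they should be the direct MF or CC annotations, which a BP-only ancestor lookup such as GOBPANCESTOR cannot account for.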
Thanks! I get it :).
thanks for the awesome information.