Hi all, I have a question about choosing an annotation file from the AnnotationHub web server. I want to do GO enrichment analysis for Oryza sativa. I found the 3 latest (2019-10-29) annotation files on the AnnotationHub web server at https://annotationhub.bioconductor.org/species/Oryza%20sativa :
AH75915: 35257 EntrezID genes, 346 unique GO terms
AH75916: 35574 EntrezID genes, 346 unique GO terms
AH75917: 35257 EntrezID genes, 345 unique GO terms
These are the same files you get when you use AnnotationHub in R (3.6.1). I downloaded all 3 annotation files and checked the total number of EntrezID genes and the total number of unique GO terms. The 3 files do differ in these two parameters. However, when I use the same gene list for GO enrichment, the results are mostly the same, and I don't know which criterion to base my choice on. Are more genes better, or are more unique GO terms better? I hope someone can help. Thanks in advance!
May I ask what code you used to determine the differences?
For instance, when I grab the ENTREZID column I get 35257 for all three:
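A check along these lines could look like the sketch below. It assumes the three records resolve to OrgDb objects that expose "ENTREZID" and "GO" key types. The helper names `count_unique`, `summarise_record`, and `compare_records` are made up for illustration and are not part of AnnotationHub or AnnotationDbi.

```r
# Sketch only: assumes the Bioconductor packages AnnotationHub and
# AnnotationDbi are installed and that each record is an OrgDb object.

# Count distinct, non-missing IDs in a vector.
count_unique <- function(ids) length(unique(ids[!is.na(ids)]))

# Summarise one OrgDb object: number of Entrez gene IDs and GO terms.
# The "ENTREZID" and "GO" key types are assumed to be available.
summarise_record <- function(org) {
  c(entrez = count_unique(AnnotationDbi::keys(org, keytype = "ENTREZID")),
    go     = count_unique(AnnotationDbi::keys(org, keytype = "GO")))
}

# Fetch each record from the hub (downloaded and cached on first use)
# and return one column of counts per AnnotationHub ID.
compare_records <- function(ids, hub = AnnotationHub::AnnotationHub()) {
  sapply(ids, function(id) summarise_record(hub[[id]]))
}

# Usage (needs network access):
# compare_records(c("AH75915", "AH75916", "AH75917"))
```

Running the same function over all three IDs means any difference in the counts comes from the records themselves, not from how each one was queried.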
I'll look into how the files were generated to see whether they were generated differently, but first please provide the code you used to discover the differences.
Looks pretty identical to me:
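One way to compare the underlying databases table by table is to count the rows in each table. This is a rough sketch, assuming each record is an SQLite-backed OrgDb whose connection can be obtained with `AnnotationDbi::dbconn()`. The helper `table_row_counts` is a hypothetical name used only for this example.

```r
# Sketch only: assumes the DBI package is installed and `con` is an open
# DBI connection, e.g. AnnotationDbi::dbconn(org) for an OrgDb object.

# Return a named vector with the number of rows in every table.
table_row_counts <- function(con) {
  tables <- DBI::dbListTables(con)
  vapply(tables, function(tbl) {
    # Quote the table name so unusual identifiers do not break the query.
    sql <- paste0("SELECT COUNT(*) AS n FROM ",
                  DBI::dbQuoteIdentifier(con, tbl))
    as.numeric(DBI::dbGetQuery(con, sql)$n)
  }, numeric(1))
}

# Usage (needs network access and AnnotationHub):
# hub <- AnnotationHub::AnnotationHub()
# counts <- sapply(c("AH75915", "AH75916", "AH75917"),
#                  function(id) table_row_counts(AnnotationDbi::dbconn(hub[[id]])))
```

Identical row counts do not prove that the contents are identical, so this is only a first-pass check.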
There could be some differences there, but I can't imagine the row counts would be identical for every table if the contents of those tables were different. Howeva