Hi,
James is right that GO.db is where the gene ontology data itself is stored. If you are interested in the GO to gene mappings represented in the org.Sc.sgd.db and org.Hs.eg.db packages, those come from other sources. For the org.Hs.eg.db package, those data came from here:
> org.Hs.eg.db
OrgDb object:
| DBSCHEMAVERSION: 2.1
| Db type: OrgDb
| Supporting package: AnnotationDbi
| DBSCHEMA: HUMAN_DB
| ORGANISM: Homo sapiens
| SPECIES: Human
| EGSOURCEDATE: 2015-Mar17
| EGSOURCENAME: Entrez Gene
| EGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
| CENTRALID: EG
| TAXID: 9606
| GOSOURCENAME: Gene Ontology
| GOSOURCEURL: ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest-lite/
| GOSOURCEDATE: 20150314
| GOEGSOURCEDATE: 2015-Mar17
| GOEGSOURCENAME: Entrez Gene
| GOEGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
| KEGGSOURCENAME: KEGG GENOME
| KEGGSOURCEURL: ftp://ftp.genome.jp/pub/kegg/genomes
| KEGGSOURCEDATE: 2011-Mar15
| GPSOURCENAME: UCSC Genome Bioinformatics (Homo sapiens)
| GPSOURCEURL: ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19
| GPSOURCEDATE: 2010-Mar22
| ENSOURCEDATE: 2015-Mar13
| ENSOURCENAME: Ensembl
| ENSOURCEURL: ftp://ftp.ensembl.org/pub/current_fasta
| UPSOURCENAME: Uniprot
| UPSOURCEURL: http://www.UniProt.org/
| UPSOURCEDATE: Tue Mar 17 18:48:15 2015
So basically from this directory here:
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
And more specifically, the gene to GO data for these 'eg' packages comes from this file:
gene2go.gz
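If you want to read that file outside of R, here is a minimal Python sketch. It assumes the usual tab-separated gene2go layout (tax_id, GeneID, GO_ID, Evidence, ... with a header line starting with `#`). The function names and the sample rows are made up for illustration; they are not real annotations and not what the package build scripts do.

```python
import gzip
import io
from collections import defaultdict


def read_gene2go(handle, tax_id="9606"):
    """Collect the set of GO IDs per Entrez Gene ID for one taxon."""
    gene_to_go = defaultdict(set)
    for line in handle:
        # Skip the '#tax_id ...' header line and any blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Assumed columns: 0 = tax_id, 1 = GeneID, 2 = GO_ID.
        if fields[0] == tax_id:
            gene_to_go[fields[1]].add(fields[2])
    return dict(gene_to_go)


def read_gene2go_gz(path, tax_id="9606"):
    """Same as read_gene2go, but for a local gzip-compressed copy of the file."""
    with gzip.open(path, "rt") as handle:
        return read_gene2go(handle, tax_id)


# Illustrative rows only (hypothetical gene and GO IDs).
SAMPLE = "\n".join([
    "#tax_id\tGeneID\tGO_ID\tEvidence\tQualifier\tGO_term\tPubMed\tCategory",
    "9606\t111\tGO:0000001\tIDA\t-\tterm a\t-\tProcess",
    "9606\t111\tGO:0000002\tIEA\t-\tterm b\t-\tFunction",
    "9606\t222\tGO:0000001\tTAS\t-\tterm a\t-\tProcess",
    "10090\t333\tGO:0000003\tIDA\t-\tterm c\t-\tComponent",
]) + "\n"

human = read_gene2go(io.StringIO(SAMPLE))
print(sorted(human))
```

With a downloaded copy you would call `read_gene2go_gz("gene2go.gz")` instead of using the in-memory sample. TAXID 9606 is the value shown in the metadata above.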
And for yeast the data source is also listed in the object, like so:
> org.Sc.sgd.db
OrgDb object:
| DBSCHEMAVERSION: 2.1
| Db type: OrgDb
| Supporting package: AnnotationDbi
| DBSCHEMA: YEAST_DB
| ORGANISM: Saccharomyces cerevisiae
| SPECIES: Yeast
| YGSOURCENAME: Yeast Genome
| YGSOURCEURL: http://downloads.yeastgenome.org/
| YGSOURCEDATE: 14-Mar-2015
| CENTRALID: ORF
| TAXID: 559292
| KEGGSOURCENAME: KEGG GENOME
| KEGGSOURCEURL: ftp://ftp.genome.jp/pub/kegg/genomes
| KEGGSOURCEDATE: 2011-Mar15
| GOSOURCENAME: Gene Ontology
| GOSOURCEURL: ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest-lite/
| GOSOURCEDATE: 20150314
| EGSOURCEDATE: 2015-Mar17
| EGSOURCENAME: Entrez Gene
| EGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
| ENSOURCEDATE: 2015-Mar13
| ENSOURCENAME: Ensembl
| ENSOURCEURL: ftp://ftp.ensembl.org/pub/current_fasta
| UPSOURCENAME: Uniprot
| UPSOURCEURL: http://www.UniProt.org/
| UPSOURCEDATE: Tue Mar 17 19:16:47 2015
So that GO to gene information is basically from here:
http://downloads.yeastgenome.org/
And even more specifically from this file here:
http://downloads.yeastgenome.org/curation/literature/gene_association.sgd.gz
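A gene_association file like that one is in the GO annotation (GAF) format, so it can be read with a few lines of code. Below is a rough Python sketch, assuming the standard tab-separated GAF columns (2 = DB object ID, 5 = GO ID, 7 = evidence code, 9 = aspect, 14 = date as YYYYMMDD) and comment lines starting with `!`. The helper names and the sample rows are hypothetical. The latest annotation date is returned only as a rough hint of how old a given copy is.

```python
import io
from collections import defaultdict


def read_gaf(handle):
    """Return ({object_id: {(go_id, evidence, aspect), ...}}, latest_date)."""
    annotations = defaultdict(set)
    latest_date = ""
    for line in handle:
        # GAF comment and header lines start with '!'.
        if line.startswith("!") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        annotations[f[1]].add((f[4], f[6], f[8]))
        # YYYYMMDD strings sort correctly as plain text.
        latest_date = max(latest_date, f[13])
    return dict(annotations), latest_date


def gaf_row(obj_id, go_id, evidence, aspect, date):
    """Build one 15-column GAF line for the demo (placeholder values elsewhere)."""
    cols = ["SGD", obj_id, "SYM", "", go_id, "REF:0", evidence, "",
            aspect, "", "", "gene", "taxon:559292", date, "SGD"]
    return "\t".join(cols)


# Illustrative rows only (hypothetical IDs, not real annotations).
GAF_SAMPLE = "\n".join([
    "!gaf-version: 2.0",
    gaf_row("S000000001", "GO:0000001", "IDA", "P", "20150301"),
    gaf_row("S000000001", "GO:0000002", "IEA", "F", "20150310"),
    gaf_row("S000000002", "GO:0000001", "TAS", "P", "20140505"),
]) + "\n"

gaf_annotations, gaf_latest = read_gaf(io.StringIO(GAF_SAMPLE))
print(len(gaf_annotations), gaf_latest)
```

Note that the sketch keys the result by the ID in column 2 of the file. The yeast package reports CENTRALID: ORF, so check which identifier you need before comparing the two.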
Anyhow, having said all of that, the actual files that were used for previous incarnations of these databases are probably not online anymore, as these things update all the time. But I have a copy of the originals that we used here if you really need them (although I am perplexed about what you would need them for).
So... I hope this helps you, :)
Marc
Thanks both of you.
But the directory only keeps the updated versions. I need exactly the versions that were used in GO.db_3.1.2, org.Sc.sgd.db_3.1.2, and org.Hs.eg.db_3.1.2.
Thank you Marc.
Please provide me the original copies of the three files: gene_ontology.obo, gene_association.sgd, gene_association.goa_human
Actually, I need to compare some similarity measures (protein-protein interaction) using gene ontology. I am doing so with the GO.db_3.1.2, org.Sc.sgd.db_3.1.2, and org.Hs.eg.db_3.1.2 packages. Some similarity measures are already implemented in R packages such as GOSemSim, and the current version of GOSemSim also uses GO.db_3.1.2, org.Sc.sgd.db_3.1.2, and org.Hs.eg.db_3.1.2. But other measures are implemented (by their authors) without the GO.db, org.Sc.sgd.db, and org.Hs.eg.db packages and use data downloaded directly from the gene ontology website. So to use these implementations I need the same data set that was used in GO.db_3.1.2, org.Sc.sgd.db_3.1.2, and org.Hs.eg.db_3.1.2.
Waiting for your response...