homology package question
@james-w-macdonald-5106
I may be missing something, but I don't understand the contents of the mmuhomologyLL2HGID and mmuhomologyHGID2LL environments. My understanding is that the former should take an Entrez Gene ID as a key and output a HGID number as a value, and the converse for the latter.

I am trying to compare results from an experiment using Human samples to a similar experiment using Mouse samples. It seems to me the most reasonable way to do this is to map the set of significant genes from one experiment to HGIDs and then from HGIDs to Entrez Gene IDs for the second species. However, it appears that all the keys for both of these environments are the same (and are all Entrez Gene IDs), and the values are all the same (Entrez Gene IDs as well). I see this with versions 1.15.13 as well as 1.14.2, for both mmuhomology and hsahomology packages.

> get("20863", mmuhomologyHGID2LL)
10090
20863
> get("20863", mmuhomologyLL2HGID)
10090
20863
> all.equal(unlist(as.list(mmuhomologyLL2HGID)), unlist(as.list(mmuhomologyHGID2LL)))
[1] TRUE

Am I missing something? Also, is there a better way to compare the two experiments that I am overlooking?

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.
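The two-step mapping described in the question (Entrez Gene ID to HGID in one species, then HGID to Entrez Gene ID in the other) can be sketched in base R as below. This is only an illustration under stated assumptions: the lookup vectors stand in for correctly populated LL2HGID and HGID2LL environments, the function name map_via_hgid is hypothetical, and every ID is a made-up placeholder rather than a real identifier.

```r
# Toy lookup tables. Every ID below is a made-up placeholder, not a real
# Entrez Gene or HomoloGene identifier.
mouse_ll2hgid <- c(m101 = "HG1", m102 = "HG2", m103 = "HG3")
human_hgid2ll <- c(HG1 = "h201", HG3 = "h203", HG4 = "h204")

# Map gene IDs of one species to the other through a shared homology-group
# ID. Genes without a group, or groups without a partner, come back as NA.
map_via_hgid <- function(ids, ll2hgid, hgid2ll) {
  hgid <- ll2hgid[ids]          # step 1: gene ID -> homology group
  out <- unname(hgid2ll[hgid])  # step 2: homology group -> gene ID
  names(out) <- ids
  out
}

mapped <- map_via_hgid(c("m101", "m102", "m103", "m999"),
                       mouse_ll2hgid, human_hgid2ll)
print(mapped)
```

With real data, as.list() on each environment followed by unlist() would give lookup vectors of this shape, provided the environments actually hold HGIDs.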
rgentleman ★ 5.5k
@rgentleman-7725
Looks like a bug - you might want to use Inparanoid - hopefully we will move to that some time in the next six months

James MacDonald wrote:
> I may be missing something, but I don't understand the contents of the mmuhomologyLL2HGID and mmuhomologyHGID2LL environments. My understanding is that the former should take an Entrez Gene ID as a key and output a HGID number as a value, and the converse for the latter.
>
> I am trying to compare results from an experiment using Human samples to a similar experiment using Mouse samples. It seems to me the most reasonable way to do this is to map the set of significant genes from one experiment to HGIDs and then from HGIDs to Entrez Gene IDs for the second species. However, it appears that all the keys for both of these environments are the same (and are all Entrez Gene IDs), and the values are all the same (Entrez Gene IDs as well). I see this with versions 1.15.13 as well as 1.14.2, for both mmuhomology and hsahomology packages.
>
>> get("20863", mmuhomologyHGID2LL)
> 10090
> 20863
>> get("20863", mmuhomologyLL2HGID)
> 10090
> 20863
>
>> all.equal(unlist(as.list(mmuhomologyLL2HGID)),
> unlist(as.list(mmuhomologyHGID2LL)))
> [1] TRUE
>
> Am I missing something? Also, is there a better way to compare the two experiments that I am overlooking?
>
> Best,
>
> Jim
>

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org
John Zhang ★ 2.9k
@john-zhang-6
>I think we should update the homology packages in the near future to use another
>source data because the README file on this site says:
>
>"The old HomoloGene FTP file formats (hmlg.ftp and hmlg.trip.ftp) are now
>deprecated. They will be produced for the time being, to make the
>transition to the new file formats smoother, but will be discontinued
>as of Jan. 1, 2007."

I remember they said that the XML version will be the one to use in the future. I tried to parse the XML file quite some time ago and did not get the same set of data as the old one. If this is still true, we may need to re-design/change the package.

>
>But we don't have time to make the changes for this release. Sorry...
>
>best
>
>nianhua
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

Jianhua Zhang
Department of Medical Oncology
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084
@marco-zucchelli-1987
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070418/f98af4be/attachment.pl
Nianhua Li ▴ 870
@nianhua-li-1606
Hi, James,

The source file of mmuhomology is
ftp://ftp.ncbi.nih.gov/pub/HomoloGene/current/hmlg.ftp (downloaded on 02/28/2007)
and the description is
ftp://ftp.ncbi.nih.gov/pub/HomoloGene/README-old

According to the description, the 4th and 7th columns of hmlg.ftp are Entrez Gene IDs, and the 5th and 8th columns are internal HomoloGene IDs. If you look at the hmlg.ftp file, even the current one, you will find that the internal HomoloGene ID is the same as the Entrez Gene ID in most cases. That's why mmuhomologyHGID2LL and mmuhomologyLL2HGID look identical.

I think we should update the homology packages in the near future to use a different data source, because the README file on this site says:

"The old HomoloGene FTP file formats (hmlg.ftp and hmlg.trip.ftp) are now
deprecated. They will be produced for the time being, to make the
transition to the new file formats smoother, but will be discontinued
as of Jan. 1, 2007."

But we don't have time to make the changes for this release. Sorry...

best

nianhua
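The column comparison described above (4th against 5th, 7th against 8th) could be checked along these lines. This is a rough sketch, not the package's build code: the data frame is an invented stand-in for a few parsed rows of hmlg.ftp, and the helper name fraction_identical is hypothetical. Reading the real file would need the field separator documented in README-old, which is not reproduced here.

```r
# Invented stand-in for a few parsed rows of hmlg.ftp. Only columns 4, 5,
# 7 and 8 matter here; every value is made up for illustration.
toy <- data.frame(
  V1 = "a", V2 = "b", V3 = "c",
  V4 = c("111", "222", "333"),  # Entrez Gene ID, first gene
  V5 = c("111", "222", "900"),  # internal HomoloGene ID, first gene
  V6 = "d",
  V7 = c("444", "555", "666"),  # Entrez Gene ID, second gene
  V8 = c("444", "555", "666"),  # internal HomoloGene ID, second gene
  stringsAsFactors = FALSE
)

# Fraction of rows in which the gene ID column equals the HomoloGene ID
# column; a value near 1 would explain why the two environments look alike.
fraction_identical <- function(x, gene_col, hgid_col) {
  mean(as.character(x[[gene_col]]) == as.character(x[[hgid_col]]))
}

frac_first  <- fraction_identical(toy, 4, 5)
frac_second <- fraction_identical(toy, 7, 8)
print(c(first = frac_first, second = frac_second))
```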
Hi Nianhua,

Nianhua Li wrote:
> Hi, James,
>
> The source file of mmuhomology is
> ftp://ftp.ncbi.nih.gov/pub/HomoloGene/current/hmlg.ftp (download on 02/28/2007)
> and the description is
> ftp://ftp.ncbi.nih.gov/pub/HomoloGene/README-old
>
> According to the description, the 4th and 7th column of hmlg.ftp are Entrez Gene
> ID, the 5th and 8th column are internal HomoloGene ID. If you look at the
> hmlg.ftp file, even the current one, you can find that the internal HomoloGene
> ID is the same as Entrez Gene ID for most of the case. That's why
> mmuhomologyHGID2LL and mmuhomologyLL2HGID look identical.

Odd. I wonder if they no longer even check to see if the data are correct. I checked several of the IDs, and AFAIK they really are Entrez Gene IDs, and they really are not HomoloGene IDs.

Anyway, it's really easy to get the mappings from biomaRt so that might be the direction to point people until we start using an updated source of these data.

Best,

Jim

>
> I think we should update the homology packages in the near future to use another
> source data because the README file on this site says:
>
> "The old HomoloGene FTP file formats (hmlg.ftp and hmlg.trip.ftp) are now
> deprecated. They will be produced for the time being, to make the
> transition to the new file formats smoother, but will be discontinued
> as of Jan. 1, 2007."
>
> But we don't have time to make the changes for this release. Sorry...
>
> best
>
> nianhua
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.
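For the biomaRt route mentioned above, one possible shape is a linked-dataset query with getLDS(). This is a sketch under assumptions rather than what was actually run in this thread: it needs the biomaRt package and network access, the wrapper name map_mouse_to_human is hypothetical, and the dataset and attribute names ("mmusculus_gene_ensembl", "hsapiens_gene_ensembl", "entrezgene_id") follow current Ensembl naming, which may differ between releases.

```r
# Sketch only: requires the Bioconductor package biomaRt and network access.
# Dataset and attribute names are assumptions; biomaRt::listDatasets() and
# biomaRt::listAttributes() show what a given mart actually offers.
map_mouse_to_human <- function(mouse_entrez_ids) {
  mouse <- biomaRt::useMart("ensembl", dataset = "mmusculus_gene_ensembl")
  human <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  # Linked query: filter on mouse Entrez Gene IDs and return each one
  # alongside the Entrez Gene ID(s) of its human homolog(s).
  biomaRt::getLDS(
    attributes  = "entrezgene_id",
    filters     = "entrezgene_id",
    values      = mouse_entrez_ids,
    mart        = mouse,
    attributesL = "entrezgene_id",
    martL       = human
  )
}

# Example call (not run here because it contacts the Ensembl servers):
# map_mouse_to_human("20863")
```

The result is a data frame of ID pairs, so a gene with several homologs appears on several rows and a gene with none is simply absent.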