How to get NCBI's gene annotation?
Wei Shi ★ 3.6k
@wei-shi-2183
Last seen 3 months ago
Australia/Melbourne/Olivia Newton-John …
Dear list,

The annotation package "org.Mm.eg.db" provides UCSC's annotation for mouse genes. However, this annotation can sometimes differ from NCBI's annotation. Below is an example:

library(org.Mm.eg.db)
mget("Tff1", org.Mm.egSYMBOL2EG)
$Tff1
[1] "21784"
mget("21784", org.Mm.egCHRLOC)
$`21784`
       17          5
-31298340 -143285576

Two chromosomal locations were found for "Tff1", on chromosome 17 and chromosome 5 respectively. However, this gene is located only on chromosome 17 according to the NCBI Entrez Gene database. Does anybody know of any packages or other sources that provide NCBI gene annotation? I am working on a large set of genes, and NCBI does not seem to provide downloadable files containing gene information such as chromosomal locations.

> sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32

locale:
LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252

attached base packages:
[1] tools stats graphics grDevices utils datasets methods base

other attached packages:
[1] org.Mm.eg.db_2.2.6 RSQLite_0.7-1 DBI_0.2-4 AnnotationDbi_1.4.3 Biobase_2.2.2

Thanks,
Wei
@sean-davis-490
Last seen 3 months ago
United States
On Tue, Mar 17, 2009 at 1:58 AM, Wei Shi <shi@wehi.edu.au> wrote:
> Does anybody know of any packages or other sources that provide NCBI gene
> annotation? I am working on a large set of genes, and NCBI does not seem to
> provide downloadable files containing gene information such as chromosomal
> locations.

Try here:

ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/mapview/

Sean
Hi,

the pointer should be for mouse:
ftp://ftp.ncbi.nih.gov/genomes/M_musculus/mapview/seq_gene.md.gz
or, I believe, here:
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/GENE_INFO/Mammalia/Mus_musculus.gene_info.gz

The reason the org.Mm.eg.db package gives you two locations is that it uses UCSC's alignment of the RefSeq sequence(s) of your gene. In this particular case, NM_009362 aligns with 100% identity to both chr5:143285577-143289234 and chr17:31298341-31301998. By aligning this sequence by hand using BLAT, you can see that the chr5 hit first appeared in the July 2007 assembly. This kind of information may be worth keeping in mind.

Best,
J.
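As a rough sketch of how the gene_info file mentioned above might be used from R, the snippet below reads a tab-delimited table and keeps the Entrez Gene ID, symbol, and chromosome. It assumes the layout of current gene_info files (one header line, then GeneID, Symbol, and chromosome in columns 2, 3, and 7), which should be checked against the file actually downloaded. The function name `read_gene_chromosomes` is hypothetical, and the two inline rows are a truncated stand-in built only from the IDs and chromosomes quoted in this thread, with `-` as a placeholder for every other field.

```r
# Hypothetical helper: read a gene_info-style table and keep three columns.
# Assumption: one header line, and GeneID / Symbol / chromosome sit in
# columns 2, 3 and 7. Verify this against the downloaded file.
read_gene_chromosomes <- function(con) {
  raw <- read.table(con, sep = "\t", header = FALSE, skip = 1,
                    quote = "", comment.char = "",
                    colClasses = "character")
  genes <- raw[, c(2, 3, 7)]
  names(genes) <- c("GeneID", "Symbol", "chromosome")
  genes
}

# Stand-in for the downloaded file: only the first eight columns, with "-"
# as a placeholder wherever the thread gives no value.
sample_lines <- c(
  "#tax_id\tGeneID\tSymbol\tLocusTag\tSynonyms\tdbXrefs\tchromosome\tmap_location",
  "10090\t21784\tTff1\t-\t-\t-\t17\t-",
  "10090\t74558\tGvin1\t-\t-\t-\t7\t-"
)

con <- textConnection(sample_lines)
genes <- read_gene_chromosomes(con)
close(con)

# Look up the two genes discussed in the thread.
print(genes[genes$Symbol %in% c("Tff1", "Gvin1"), ])
```

For the real file, something like `read_gene_chromosomes(gzfile("Mus_musculus.gene_info.gz"))` should work under the same column assumptions, since `read.table` accepts a connection.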
Hi Wei,

The exact same package also provides the NCBI chromosome assignments. If you use the CHR mapping like this, you will get only NCBI's annotation, and you can see how it differs from that provided by UCSC:

mget("21784", org.Mm.egCHR)

You can see where the information for each mapping comes from by looking at the man pages:

?org.Mm.egCHR
?org.Mm.egCHRLOC

Marc
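The comparison Marc describes can be sketched in base R. The lists below are typed in by hand: the CHRLOC values are copied from the output shown in this thread for Entrez Gene ID 21784, and the chromosome 17 assignment is the one the original post attributes to NCBI. With the package loaded, they would presumably come from `mget()` on `org.Mm.egCHRLOC` and `org.Mm.egCHR` instead. The helper name `split_by_ncbi_chromosome` is hypothetical.

```r
# CHRLOC-style result copied from the thread: names are chromosomes,
# values are signed start positions.
chrloc <- list("21784" = c("17" = -31298340, "5" = -143285576))

# Chromosome that the original post attributes to NCBI Entrez Gene.
chr <- list("21784" = "17")

# Hypothetical helper: for each gene, separate the locations whose chromosome
# matches the NCBI assignment from those that do not.
split_by_ncbi_chromosome <- function(chrloc, chr) {
  ids <- intersect(names(chrloc), names(chr))
  result <- lapply(ids, function(id) {
    loc <- chrloc[[id]]
    on_ncbi_chr <- names(loc) %in% chr[[id]]
    list(agree = loc[on_ncbi_chr], differ = loc[!on_ncbi_chr])
  })
  names(result) <- ids
  result
}

res <- split_by_ncbi_chromosome(chrloc, chr)
print(res[["21784"]])
```

Keeping both the agreeing and the differing locations, rather than dropping the latter, makes it easy to review which genes are affected before deciding how to handle them.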
Hi Marc:

In many cases, the extra annotation provided by UCSC is on the same chromosome as the NCBI annotation. In these cases, org.Mm.egCHR cannot be used to tell which location is from UCSC and which is from NCBI. Below is an example:

> mget("Gvin1", org.Mm.egSYMBOL2EG)
$Gvin1
[1] "74558"
> mget("74558", org.Mm.egCHR)
$`74558`
[1] "7"
> mget("74558", org.Mm.egCHRLOC)
$`74558`
         7          7
-113043632 -113300049

Gvin1's chromosomal location is -113300049 on chromosome 7 according to the NCBI Entrez Gene database.

Thanks,
Wei
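Cases like Gvin1 can at least be flagged programmatically, even though the chromosome alone cannot resolve them. The sketch below uses values copied from the output in this thread. It flattens CHRLOC-style results into one row per location, decodes each signed position into a strand and a coordinate, and lists genes that have several locations on a single chromosome. It assumes the convention described in the CHRLOC man page that a leading minus sign marks the antisense strand. The helper name `locations_table` is hypothetical.

```r
# CHRLOC-style results copied from the thread.
chrloc <- list(
  "21784" = c("17" = -31298340, "5" = -143285576),
  "74558" = c("7" = -113043632, "7" = -113300049)
)

# Hypothetical helper: one row per reported location.
# Assumption: a negative value means the antisense ("-") strand.
locations_table <- function(chrloc) {
  rows <- lapply(names(chrloc), function(id) {
    loc <- chrloc[[id]]
    pos <- unname(loc)
    data.frame(
      gene_id = id,
      chromosome = names(loc),
      strand = ifelse(pos < 0, "-", "+"),
      start = abs(pos),
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

tab <- locations_table(chrloc)

# Genes with more than one location, all on the same chromosome: these are
# the ones a chromosome-level check cannot disambiguate.
n_loc <- tapply(tab$start, tab$gene_id, length)
n_chr <- tapply(tab$chromosome, tab$gene_id, function(x) length(unique(x)))
same_chr_multi <- names(n_loc)[n_loc > 1 & n_chr == 1]

print(tab)
print(same_chr_multi)
```

Genes flagged this way would still need coordinates from an NCBI source to decide which location to keep.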
@hotz-hans-rudolf-3342
Last seen 10.2 years ago
On 3/17/09 6:58 AM, "Wei Shi" <shi at wehi.edu.au> wrote:
> Does anybody know of any packages or other sources that provide NCBI gene
> annotation? I am working on a large set of genes, and NCBI does not seem to
> provide downloadable files containing gene information such as chromosomal
> locations.

The files on "ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/" should contain all the data you want.

Regards, Hans
Hi Hotz:

Many thanks for the link. I found the file I need there.

Best regards,
Wei
Wei Shi ★ 3.6k
@wei-shi-2183
Last seen 3 months ago
Australia/Melbourne/Olivia Newton-John …
Hi Marc:

May I ask why the CHR mapping and the CHRLOC mapping use different annotations? In my opinion, it would be better to use one annotation. If multiple annotations are to be provided, either make a separate package for each or provide annotation options in the package.

Thanks,
Wei

Marc Carlson wrote:
> Hi Wei,
>
> If you read the manual pages that I mentioned in my reply, you will see
> that the CHR mapping is always an NCBI annotation and the CHRLOC mapping
> is always a UCSC annotation. So it should always be possible to tell
> what the chromosome assignments are from both sources (and whether or
> not they agree).
>
> Hope this clarifies things,
>
> Marc
Hi Wei,

The org packages are not about annotations from a single source. They are meant to provide annotations for a single organism, and many different sources are gathered and consulted when we build the annotation packages. The manual pages have always documented where all the data comes from, and metadata about the origins of the different tables is also available in the databases contained within the package.

If we were to supply data from only a single source in an annotation package, a lot of convenience would disappear for users and there would be much less data in each package. It would also mean that some simple tasks would require several packages instead of just one.

Marc