GenBank RefSeq conversion
@eleni-christodoulou-2653
Last seen 6.2 years ago
Singapore
Hello all! I was trying to convert RefSeq accession numbers to GenBank accession numbers (or the opposite). I think there must be a library that does this job automatically... Does anyone know of anything relevant? Thank you all, Eleni
@sean-davis-490
Last seen 3 months ago
United States
On Fri, May 30, 2008 at 8:53 AM, Eleni Christodoulou wrote:
> Hello all!
>
> I was trying to convert RefSeq accession numbers to GenBank accession numbers
> (or the opposite). I think that there must exist a library that does this
> job automatically... Does anyone know anything relevant to this?

Hi, Eleni. There is no direct relationship between RefSeq and GenBank numbers. A given RefSeq may or may not be represented by exactly one GenBank accession. In fact, a RefSeq may not represent any "real" sequence, but can be a composite of several "real" sequences. As an example, see here:

http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=NM_007294.2

It looks like this RefSeq is actually composed of 4 different sequences from GenBank (if I am reading the record correctly).

The only way I know to deal with this (at least in the general case) is to go through Entrez Gene (or the Ensembl equivalent of a gene) to find those accessions in GenBank and RefSeq that share a common Gene ID. You can do this using the annotation package for the organism of interest, I think. Steffen or others might be able to comment on how to do this using biomaRt.

Sean
What Sean mentioned should work to at least let you connect the dots. As an example, for human you could use the package "org.Hs.eg.db" and then use the following mappings to get what you want:

First, use "org.Hs.egACCNUM2EG" to get Entrez Gene IDs for your GenBank accessions. Then use "org.Hs.egREFSEQ" to get RefSeq IDs for those Entrez Gene IDs.

Marc
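
For anyone who wants to see the shape of that two-step lookup, here is a minimal sketch in plain Python. It does not call the Bioconductor packages. The two dictionaries are hypothetical toy stand-ins for what "org.Hs.egACCNUM2EG" and "org.Hs.egREFSEQ" provide, and every identifier in them is a made-up placeholder rather than a real accession. The function name genbank_to_refseq is also just an illustration.

```python
# Toy stand-in (hypothetical data) for an accession -> Entrez Gene ID map.
ACCNUM_TO_GENE = {
    "GB_ACC_1": ["GENE_1"],
    "GB_ACC_2": ["GENE_1"],
    "GB_ACC_3": ["GENE_2"],
}

# Toy stand-in (hypothetical data) for an Entrez Gene ID -> RefSeq map.
GENE_TO_REFSEQ = {
    "GENE_1": ["REFSEQ_A", "REFSEQ_B"],
    "GENE_2": ["REFSEQ_C"],
}


def genbank_to_refseq(accessions, acc_to_gene, gene_to_refseq):
    """Join the two maps on the shared gene ID.

    Returns a dict from each input accession to the list of RefSeq IDs
    that share a gene with it (empty list if nothing is found).
    """
    result = {}
    for acc in accessions:
        refseqs = []
        # Step 1: accession -> gene IDs.
        for gene in acc_to_gene.get(acc, []):
            # Step 2: gene ID -> RefSeq IDs, keeping order and skipping repeats.
            for refseq in gene_to_refseq.get(gene, []):
                if refseq not in refseqs:
                    refseqs.append(refseq)
        result[acc] = refseqs
    return result


mapping = genbank_to_refseq(
    ["GB_ACC_1", "GB_ACC_3", "GB_ACC_9"], ACCNUM_TO_GENE, GENE_TO_REFSEQ
)
for acc, refseqs in mapping.items():
    print(acc, refseqs)
```

Note that one accession can come back with several RefSeq IDs, and some with none. That is the point Sean made: the link goes through the gene, so it is not a one-to-one conversion.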