all human gene coordinates
Wim Kreinen ▴ 100
@wim-kreinen-5642
Last seen 10.3 years ago
Dear list,

I am completely new to Bioconductor and R, and I am looking for a tool (library) that provides the coordinates for all human genes. Does one exist?

Thanks,
Wim
@james-w-macdonald-5106
Last seen 2 days ago
United States
Hi Wim,

See the org.Hs.eg.db package.

Best,
Jim

On 12/3/2012 4:47 PM, Wim Kreinen wrote:
> I am looking for a tool (library) that provides the coordinates for all human genes. Does it exist?

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
WATSON Mick ▴ 50
@watson-mick-5575
Last seen 9.9 years ago
United Kingdom
Yes, biomaRt.
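As a rough sketch of the biomaRt route, the query below asks Ensembl for one row per gene with its chromosome, start, end, strand, and biotype. It assumes network access to the Ensembl BioMart service and the `hsapiens_gene_ensembl` dataset. Coordinates come from whatever genome build the current Ensembl release uses, which is not necessarily hg19. The variable names `ensembl`, `gene_coords`, and `coding_genes` are illustrative only.

```r
library(biomaRt)

# Connect to the Ensembl genes mart for human (requires internet access).
ensembl <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl")

# Retrieve gene identifiers together with their genomic coordinates.
gene_coords <- getBM(
  attributes = c("ensembl_gene_id", "chromosome_name",
                 "start_position", "end_position", "strand",
                 "gene_biotype"),
  mart = ensembl
)
head(gene_coords)

# The biotype column allows a simple restriction to protein-coding genes.
coding_genes <- subset(gene_coords, gene_biotype == "protein_coding")
```

Note that these are Ensembl gene IDs rather than the Entrez or UCSC identifiers used by the annotation packages mentioned elsewhere in this thread.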
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 8.4 years ago
United States
Hi Wim,

How about this:

    library(Homo.sapiens)  ## This loads both of the packages just mentioned.

    ## This lists all the things you can retrieve:
    columns(Homo.sapiens)

    ## Then you can do something like this:
    k <- keys(Homo.sapiens, keytype = "ENTREZID")
    res <- select(Homo.sapiens, keys = k,
                  columns = c("TXSTART", "TXEND"),
                  keytype = "ENTREZID")
    head(res)

That would get you the starts and ends of all known transcripts for each gene in a data.frame.

Marc
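Because `select()` returns one row per transcript, a gene with several transcripts appears on several rows. A minimal sketch of collapsing such a table to one span per gene in base R follows. The small `res` data frame here is made-up illustrative data with the same column names, not real output, and `gene_start`, `gene_end`, and `gene_span` are hypothetical names.

```r
# Hypothetical stand-in for the data.frame returned by select():
# one row per transcript, keyed by Entrez gene ID.
res <- data.frame(
  ENTREZID = c("1", "1", "2", "2", "2"),
  TXSTART  = c(100L, 150L, 900L, 950L, 920L),
  TXEND    = c(500L, 620L, 1400L, 1300L, 1500L),
  stringsAsFactors = FALSE
)

# Gene span = smallest transcript start to largest transcript end.
gene_start <- aggregate(TXSTART ~ ENTREZID, data = res, FUN = min)
gene_end   <- aggregate(TXEND ~ ENTREZID, data = res, FUN = max)

# Combine the two summaries into one row per gene.
gene_span <- merge(gene_start, gene_end, by = "ENTREZID")
print(gene_span)
```

On real data a gene's transcripts can sit on more than one chromosome or strand, so it is probably safer to also request `TXCHROM` and `TXSTRAND` and include them in the grouping.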
@steve-lianoglou-2771
Last seen 22 months ago
United States
On Mon, Dec 3, 2012 at 4:47 PM, Wim Kreinen <wkreinen at gmail.com> wrote:
> I am looking for a tool (library) that provides the coordinates for all human genes. Does it exist?

And the third option not yet mentioned is the TxDb.Hsapiens.* packages.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
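For gene-level coordinates from a TxDb package, a sketch along these lines may be enough. It assumes the hg19 UCSC knownGene package is installed, and the result is keyed by Entrez gene ID. The names `gene_ranges` and `gene_df` are illustrative.

```r
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
# If the package is missing, on current Bioconductor it can be installed with:
# BiocManager::install("TxDb.Hsapiens.UCSC.hg19.knownGene")

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# One range per gene, spanning all of its transcripts.
gene_ranges <- genes(txdb)

# Convert to a plain data.frame with seqnames, start, end, width,
# strand, and gene_id columns.
gene_df <- as.data.frame(gene_ranges)
head(gene_df)
```

By default `genes()` leaves out genes whose transcripts lie on more than one chromosome or strand, so the row count can be slightly lower than the number of gene IDs in the database.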
Wim Kreinen ▴ 100
@wim-kreinen-5642
Last seen 10.3 years ago
Thanks,

Is there a method to get all protein-coding transcripts? With your method I get microRNAs as well.

Thanks,
Wim

2012/12/5 Steve Lianoglou <mailinglist.honeypot@gmail.com>:

> Hi Wim,
>
> Please keep emails on the bioc list by hitting "reply all" -- this way you can get more (and better) help by having more eyes on your question, and others can benefit as well.
>
> So:
>
> On Wed, Dec 5, 2012 at 11:29 AM, Wim Kreinen <wkreinen@gmail.com> wrote:
> > This sounds promising. In principle I understand how it works, but how do I define keys if I want all transcripts? I defined chr1...chr22, chrX, chrY as active chromosomes via isActiveSeq.
> >
> > I tried
> >
> >     library ("TxDb.Hsapiens. UCSC.hg19.knownGenes")
> >     txdb->TxDb.Hsapiens. UCSC.hg19.knownGenes
> >     cols->c("TXCHROM", "TXSTRAND", "TXSTART", "TXEND")
> >     keys -> ? #How do I define keys if I want all transcripts?
> >     alltranscripts->select (txdb, keys=keys, cols=cols, keytype="TXID")
>
> First: what's up w/ the spaces in your "TxDb.Hsapiens.[SPACE]UCSC..."
>
> It's also ...knownGene -- not ...knownGeneS
>
> Also, a suggestion: use `<-` for assignment, and not `->` ... although the latter works, if anybody else is meant to read your code, they're likely going to be confused for a bit until they get used to your "odd" (but correct) choice of assignment direction.
>
> Anyhow -- how about:
>
>     R> library(BiocInstaller)
>     R> biocLite("TxDb.Hsapiens.UCSC.hg19.knownGene")
>     R> library("TxDb.Hsapiens.UCSC.hg19.knownGene")
>     R> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
>     R> txs <- transcripts(txdb)
>     R> head(txs)
>     GRanges with 6 ranges and 2 metadata columns:
>           seqnames         ranges strand |     tx_id     tx_name
>              <Rle>      <IRanges>  <Rle> | <integer> <character>
>       [1]     chr1 [11874, 14409]      + |         1  uc001aaa.3
>       [2]     chr1 [11874, 14409]      + |         2  uc010nxq.1
>       ...
>
> The UCSC IDs are in the tx_name column.
>
> HTH,
> -steve
Hi,

On Thu, Dec 13, 2012 at 11:20 AM, Wim Kreinen <wkreinen at gmail.com> wrote:
> Thanks,
>
> is there a method to get all protein coding transcripts. With your method I get microRNAs as well.

Here's one non-sophisticated way. The idea is to get the info for all coding exons grouped by tx_id, then filter the transcript list by ids that appear in the coding-exon list names:

    R> library("TxDb.Hsapiens.UCSC.hg19.knownGene")
    R> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
    R> txs <- transcripts(txdb)
    R> cds <- cdsBy(txdb, by = "tx")  # coding exons grouped by tx_id
    R> txs.coding <- txs[mcols(txs)$tx_id %in% names(cds)]

HTH,
-steve