biomaRt query: retrieve exon locations etc
Tim Smith ★ 1.1k
@tim-smith-1532
Last seen 10.2 years ago
Hi All,

Sorry for the naive question! I was trying to retrieve some coordinates (start and end positions) from biomaRt and I'm not sure if I'm doing things right.

Problem definition: For a gene, retrieve the 5'UTR and exon coordinates for the most common isoform of the gene.

***************
library(biomaRt)
ensembl = useMart("ensembl",dataset="hsapiens_gene_ensembl")

getAtt <- c('chromosome_name', 'start_position', 'end_position', 'strand',
            'exon_chrom_start','exon_chrom_end',
            '5_utr_start','5_utr_end','3_utr_start','3_utr_end')

elocs <- getBM(attributes=getAtt,filters="hgnc_symbol",values="ZMYM4",mart=ensembl)
print(elocs)
**************

However, this gives me the coordinates for all isoforms, and it is difficult to pick out the coordinates for the most common isoform. How can I identify the most common isoform?

Many thanks!
biomaRt biomaRt • 4.2k views
@steffen-durinck-4465
Last seen 10.2 years ago
Hi Tim,

There is no filter that returns only the canonical transcript. Something that gets close is filtering for transcripts that have a CCDS ID. Adding this filter to your query returns only one transcript:

elocs <- getBM(attributes=c(getAtt,"ensembl_transcript_id"),
               filters=c("hgnc_symbol","with_ccds"),
               values=list("ZMYM4",TRUE),mart=ensembl)

Cheers,
Steffen

On Fri, Jan 25, 2013 at 12:46 PM, Tim Smith <tim_smith_666@yahoo.com> wrote:
> How can I identify the most common isoform?
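To check how many transcripts a query like this actually returned, it can help to tabulate the exon rows per ensembl_transcript_id. The following is only a minimal sketch: the helper name summarise_transcripts is hypothetical, and the small data frame is a made-up stand-in for a getBM() result so that the example runs without a connection to Ensembl. It assumes the result contains the columns ensembl_transcript_id, exon_chrom_start and exon_chrom_end.

```r
# Toy stand-in for a getBM() result; IDs and coordinates are invented.
elocs <- data.frame(
  ensembl_transcript_id = c("TX_A", "TX_A", "TX_A", "TX_B", "TX_B"),
  exon_chrom_start      = c(100, 300, 500, 100, 300),
  exon_chrom_end        = c(200, 400, 650, 200, 350),
  stringsAsFactors      = FALSE
)

# Hypothetical helper: one row per transcript with its exon count and
# summed exon length (coordinates treated as 1-based, inclusive).
summarise_transcripts <- function(df) {
  # Drop repeated exon rows, which can appear when other attributes
  # (for example UTR coordinates) are requested in the same query.
  ex <- unique(df[, c("ensembl_transcript_id",
                      "exon_chrom_start", "exon_chrom_end")])
  exon_len <- ex$exon_chrom_end - ex$exon_chrom_start + 1
  total <- tapply(exon_len, ex$ensembl_transcript_id, sum)
  n     <- tapply(exon_len, ex$ensembl_transcript_id, length)
  data.frame(
    ensembl_transcript_id = names(total),
    n_exons               = as.integer(n),
    exonic_bp             = as.numeric(total),
    stringsAsFactors      = FALSE
  )
}

print(summarise_transcripts(elocs))
```

If the summary has more than one row, the query still returned several transcripts and a further choice is needed. Note that exon count or exonic length is only a rough way to compare transcripts; neither says which isoform is the most common one.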