Find human SNP orthologous flanking sequence
1
1
Entering edit mode
jfertaj ▴ 30
@jfertaj-8566
Last seen 2.2 years ago
United Kingdom

Hi all,

 

I would like to ask if it would be possible to retrieve orthologous sequences to a human SNP flanking sequence automatically using biomaRt or another R package.

Basically I have a list of 4 human SNPs and I have the flanking sequences (60 kb each side) for each of them. I would like to obtain the orthologous sequences for several species (not only primates)

Many thanks in advance

Juan

bioconductor biomart • 1.3k views
ADD COMMENT
1
Entering edit mode
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 8.4 years ago
United States

Hi Juan,

There are several possible answers to this depending on exactly what you are starting with and what you want to find at the end.  I will make one guess about what you wanted here but others may have other ideas.  If your starting point is a range then you could do this:

First lets assume you know where your SNP is and that you can make a GRanges object to represent that

gr <- GRanges(seqnames='chr5', IRanges(174152000, width=200), strand='+')

Then lets see which genes this region overlaps with...

library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
locateVariants(gr, txdb, AllVariants())

Notice that it overlaps with gene 4488 (entrez gene ID), so how can we look up homologs?
Well one approach is to use the inparanoid resources in the AnnotationHub:

library(AnnotationHub)
ah = AnnotationHub()
ahp = query(ah, 'Inparanoid8Db')

At this point we have 268 different species to choose from, lets look at human since that is what we started with

ahph = query(ahp, 'Homo sapiens')
human = ahph[[1]]

Now lets see what kind of keys we can use to query this resource:

head(keys(human, 'HOMO_SAPIENS'))

Those are uniprot IDs.  We can look those up using an OrgDb.  Lets get the OrgDb for human and translate the entrez gene ID 4488 into a uniprot ID:

Org.Hs = query(ah, c('OrgDb','Homo sapiens'))[[1]]
ids <- mapIds(Org.Hs, '4488', 'UNIPROT', 'ENTREZID')

Now we can look up the matching entrez gene IDs from mouse (for example) using select() like so:

select(human, ids, 'MUS_MUSCULUS', 'HOMO_SAPIENS')


Hope this helps,

Marc

ADD COMMENT

Login before adding your answer.

Traffic: 734 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6