Question

Bioinformatics researching Schizophrenia

Guest User ★ 13k

I am an undergraduate student and I have been given a project that involves bioinformatics. My supervisor researches chromosome abnormalities and gene expression, and he has been given a list of GWAS-generated SNPs that have been linked to schizophrenia. I have been asked to generate a list of candidate genes/regions that we could cross-reference with that SNP list, to see whether any of the SNPs fall in sites that could be implicated in schizophrenia, such as miRNA sites, methyltransferase genes, acetylation and deacetylation genes, etc. I was hoping you could recommend the R packages needed for (A) generating the list of potentially implicated regions and (B) cross-referencing that list against the SNP list I have received. Thank you.

miRNA • 1.6k views

updated 10.6 years ago by Valerie Obenchain ★ 6.8k • written 10.6 years ago by Guest User ★ 13k

Valerie Obenchain · Answer 1 · 2014-09-17

Hi,

To generate a list of gene (or other) regions for your SNPs you could do the following.

1. get SNP locations from a SNPlocs package

(Assuming you only have SNP IDs and not locations.)

All pre-built Bioconductor annotations are listed here:

http://www.bioconductor.org/packages/devel/BiocViews.html#___AnnotationData

Search for 'SNPloc' and choose the package built against the same genome build as your SNPs. See the package's man page for examples of how to extract SNPs into a GRanges object using their rs IDs.
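As a rough sketch of step 1, the lookup could look like the following. The package name SNPlocs.Hsapiens.dbSNP144.GRCh37 is only an assumption (pick whichever SNPlocs package matches your genome build), and the two rs IDs are placeholders to be replaced by your own GWAS list.

```r
library(BSgenome)   # provides snpsById()
# Assumption: the GWAS SNPs are on GRCh37/hg19; swap in the package for your build.
library(SNPlocs.Hsapiens.dbSNP144.GRCh37)

snp_db <- SNPlocs.Hsapiens.dbSNP144.GRCh37

# Placeholder rs IDs, for illustration only; use your own list here.
rsids <- c("rs1625579", "rs12807809")

# Look up positions; IDs missing from this dbSNP build are silently dropped.
snp_pos <- snpsById(snp_db, rsids, ifnotfound = "drop")

# Coerce to a plain GRanges for downstream overlap operations.
snp_gr <- as(snp_pos, "GRanges")
snp_gr
```

Chromosome naming in SNPlocs packages may differ from the naming in a UCSC-based TxDb (for example "1" versus "chr1"), so the names may need harmonising, e.g. with seqlevelsStyle(), before any overlap.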

2. get (or make) TxDb package for regions of interest

Search the annotation site for 'TxDb'. These packages contain gene models from various resources and genome builds (both are indicated in the package names). If you don't see a compatible TxDb you can create your own with a function from GenomicFeatures.

library(GenomicFeatures)
?makeTxDb   # named makeTranscriptDb in older GenomicFeatures releases

Once you have the TxDb you can extract genes, exons, UTRs or other regions. I'll use the known gene table from UCSC:

library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

There are several extractors, see ?transcriptsBy:

genes <- transcriptsBy(txdb, "gene")
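For reference, a few of the other extractors could be used along these lines. This is only a sketch: the variable names are mine, and the promoter window (2000 bp upstream, 200 bp downstream) is an arbitrary choice, not a recommendation.

```r
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# One range per gene (gene IDs in this TxDb are Entrez Gene IDs).
gene_gr <- genes(txdb)

# All annotated exons as a flat GRanges.
exon_gr <- exons(txdb)

# 3' UTRs grouped by transcript; possibly of interest for miRNA target sites.
utr3_grl <- threeUTRsByTranscript(txdb, use.names = TRUE)

# Promoter windows around each transcript start (window size is an assumption).
prom_gr <- promoters(txdb, upstream = 2000, downstream = 200)
```

Which of these is the right "region" depends on the question: 3' UTRs are a natural choice for miRNA binding sites, whereas whole-gene ranges suit a list of methyltransferase or acetylation genes.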

3. overlap SNP locations with regions from TxDb

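There are several options for overlaps. You could use findOverlaps() (the generic is defined in IRanges, with methods for GRanges in GenomicRanges), which reports which SNPs fall in which regions, or locateVariants() from VariantAnnotation, which classifies each SNP by its position relative to the gene model (coding, intron, UTR, promoter, intergenic, etc.). Alternatively you could use the biomaRt package to retrieve annotation for the SNPs directly from Ensembl.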
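A minimal sketch of the findOverlaps() route, using toy data: every coordinate, ID and region name below is made up for illustration, and in practice the SNP GRanges would come from step 1 and the regions from step 2.

```r
library(GenomicRanges)

# Toy SNP positions (made-up coordinates and IDs).
snps <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(150, 900, 300), width = 1),
  rsid     = c("rsA", "rsB", "rsC")
)

# Toy regions of interest (made-up names).
regions <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges   = IRanges(start = c(100, 250), end = c(200, 400)),
  region   = c("geneX", "mirY")
)

# Each hit pairs the index of a SNP (query) with the index of a region (subject).
hits <- findOverlaps(snps, regions)

# Turn the hit indices into a readable SNP-to-region table.
result <- data.frame(
  rsid   = snps$rsid[queryHits(hits)],
  region = regions$region[subjectHits(hits)],
  stringsAsFactors = FALSE
)
result
```

Both objects must use the same chromosome naming and genome build for the overlap to be meaningful; SNPs that hit nothing (rsB here) simply do not appear in the table.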

Hopefully this is enough to get you going. It would be helpful to know what information you have in the 'list' of GWAS generated SNPs. If you run into problems please show an example of what you've tried so we can give a more specific answer.

Valerie

--
Valerie Obenchain
Program in Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109

Email: vobencha@fhcrc.org
Phone: (206) 667-3158