help with analysis of genotyping data from Illumina HumanOmni5-4v1_B chip
@abhishek-pratap-4927
Last seen 8.5 years ago
United States
Hi Guys,

We have recently obtained precalled genotype data from our collaborators, generated on the Illumina Human Omni5 array chip (HumanOmni5-4v1_B). The genotypes have already been called using Illumina's GenomeStudio.

Being new to array-based genotyping data (coming from the sequencing arena), I would like to know the following.

1. What QC can be done on these genotype data files (200 samples) to ascertain their quality and filter out the low-quality calls?

2. Does Bioconductor have an annotation package for this chip, HumanOmni5-4v1_B? I was able to find "humanomni5quadv1bCrlmm" but am not sure whether it would give me the annotation on loci / SNPs.

3. Is there an existing slick way to create VCF files from these 200 genotype files? Our goal is to summarize the information in a single VCF across all the samples, tagging the low-quality ones.

Many thanks!
-Abhi
SNP Annotation • 2.1k views
@stephanie-m-gogarten-5121
Last seen 4 months ago
University of Washington
Hi Abhi,

1. The GWASTools package was designed for QC of precalled array data. See the "Data Cleaning" vignette for a recommended workflow. You might also want to look at Laurie et al 2010 in Genetic Epidemiology (10.1002/gepi.20516), as the vignette implements the QC methods described therein.

2. I usually get the annotation file from Illumina (it would probably be called HumanOmni5-4v1_B.csv). Your collaborators may have this file, or you could register with Illumina's website to download it. It has rsID, chromosome, position, alleles, and probe sequences.

3. I don't know of a good way at the moment, but "export GWASTools objects as VCF" is going on my to-do list. I recently used the un-slick way of PLINK file -> load in PLINK/SEQ -> export VCF. You might also try creating a VariantAnnotation object from your data and using the writeVcf method.

Stephanie
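To make points 1 and 3 a little more concrete, here is a minimal sketch in plain Python. It is not GWASTools, PLINK/SEQ, or VariantAnnotation code. It computes one common QC metric (per-sample missing call rate) and writes a single multi-sample VCF, listing the samples that exceed a threshold in header lines. Everything specific in it is an assumption made for illustration: the sample names, the SNP records, the 5% cutoff, the `##lowQualitySample` header key, and above all the premise that each call is already on the reference strand with known REF and ALT alleles, which is usually not true of raw Illumina exports.

```python
import io

def missing_call_rate(calls):
    """Fraction of no-calls (None) in one sample's genotype calls."""
    return sum(1 for c in calls if c is None) / len(calls)

def to_vcf_gt(call, ref, alt):
    """Map an allele-pair call such as 'AG' to a VCF GT string.

    Assumes the call uses only the REF and ALT alleles, on the reference strand.
    """
    if call is None:
        return "./."
    index = {ref: "0", alt: "1"}
    return "/".join(sorted(index[allele] for allele in call))

def write_vcf(snps, genotypes, out, max_missing=0.05):
    """Write all samples to one VCF and return the flagged sample names.

    snps:      list of (chrom, pos, rsid, ref, alt) tuples
    genotypes: dict mapping sample name to a list of calls, one per SNP
    """
    samples = sorted(genotypes)
    low_quality = [s for s in samples
                   if missing_call_rate(genotypes[s]) > max_missing]
    out.write("##fileformat=VCFv4.2\n")
    out.write('##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n')
    for s in low_quality:
        # Hypothetical header key used here to tag low-quality samples.
        out.write(f"##lowQualitySample={s}\n")
    fixed = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]
    out.write("\t".join(fixed + samples) + "\n")
    for i, (chrom, pos, rsid, ref, alt) in enumerate(snps):
        gts = [to_vcf_gt(genotypes[s][i], ref, alt) for s in samples]
        out.write("\t".join([chrom, str(pos), rsid, ref, alt, ".", ".", ".", "GT"] + gts) + "\n")
    return low_quality

# Made-up example data: three SNPs, two samples, None marks a no-call.
example_snps = [("1", 1000, "rs1", "A", "G"),
                ("1", 2000, "rs2", "C", "T"),
                ("2", 500, "rs3", "G", "A")]
example_genotypes = {"S1": ["AA", "CT", "GG"],
                     "S2": ["AG", None, None]}

buffer = io.StringIO()
flagged = write_vcf(example_snps, example_genotypes, buffer)
print(buffer.getvalue())
print("flagged samples:", flagged)
```

For real data the strand and REF/ALT assignment would have to come from the Illumina annotation file and a reference genome, and a full QC pass would cover much more than missing call rate, so treat this only as a picture of the data flow.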
Thanks a lot, Stephanie, for your quick response. This was very useful info. I will follow up with package-specific questions if any.

Cheers!
-Abhi