Please help: SNP Frequency Analysis Package in Bioconductor
@javerjung-sandhu-5043
Dear Sir or Madam,

I am interested in using Bioconductor to analyze and compare some data I’ve recently acquired and was wondering if you could help direct me to an appropriate analysis package.

I have two sets of data, from two separate groups of AML patients. One set is of 96 patients, the other set is of 178 patients, and I have sequencing data for both groups. For a few genes of interest in the first group’s dataset, it looked like a number of SNPs were found at a higher than normal frequency among the 96 patients. What I would like to do is find out whether the frequency of these SNPs is the same in the dataset of 178 patients. I just need a way of analyzing the frequency of SNPs in a large set of sequences.

Thanks very much,
Jung
@vincent-j-carey-jr-4
On Thu, Mar 22, 2012 at 5:36 PM, Javerjung Sandhu <jsandhu@bcgsc.ca> wrote:
> [...]
> What I would like to do is find out if the frequency of these SNPs is the same in the dataset of 178 patients. I just need a way of analyzing the frequency of SNPs in a large set of sequences.

This question is not very clear. If you have already done SNP calling on your sequences (using samtools or some other external resource), I suppose it would be typical to have results in VCF format. These can be parsed using the VariantAnnotation package, and you can then tabulate variants and do some kind of categorical analysis to compare populations downstream. Some of the relevant computations for variant tabulation on the basis of VCF are given in the vignette of the cgdv17 experimental data package. There are no high-level functions that I know of for doing population comparisons, so you should involve an experienced statistical geneticist if possible.

If you have not done SNP calling, and your sequence data are in SAM or BAM format, you could use the pileup manipulations of Rsamtools to enumerate and tabulate variants. Examples of such manipulations in the domain of transcript variants can be found in the vignette of the ggtut experimental data package.
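To make "tabulate variants" concrete, here is a minimal, language-neutral sketch of the idea, written in plain Python rather than with VariantAnnotation. It assumes the SNP calls are already in VCF with a `GT` entry in the FORMAT column; the inline records, the sample names `P1` to `P3`, and the function name `tabulate_alt_alleles` are all hypothetical and only illustrate the counting step. Multiallelic sites are skipped for simplicity, which a real analysis may not want.

```python
# Hypothetical three-sample VCF text; real input would come from a variant caller.
VCF_TEXT = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tP1\tP2\tP3
chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\t0/0
chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/0\t0/1\t./.
"""


def tabulate_alt_alleles(vcf_text):
    """Map (chrom, pos, ref, alt) -> (alt allele count, called allele count).

    Only biallelic sites are tabulated; missing calls ('.') are not counted.
    """
    table = {}
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip meta-information and the column header
        fields = line.split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        if "," in alt:
            continue  # assumption: ignore multiallelic sites in this sketch
        gt_index = fields[8].split(":").index("GT")
        alt_count = 0
        called = 0
        for sample in fields[9:]:
            genotype = sample.split(":")[gt_index]
            # Phased (|) and unphased (/) genotypes are treated alike.
            for allele in genotype.replace("|", "/").split("/"):
                if allele == ".":
                    continue
                called += 1
                if allele == "1":
                    alt_count += 1
        table[(chrom, pos, ref, alt)] = (alt_count, called)
    return table


if __name__ == "__main__":
    for site, (alt_count, called) in sorted(tabulate_alt_alleles(VCF_TEXT).items()):
        print(site, alt_count, called, alt_count / called)
```

Running the same tabulation separately on each cohort's VCF would give per-site allele counts for the 96-patient and 178-patient groups, which is the input a downstream categorical comparison needs.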
_______________________________________________
Bioconductor mailing list
Bioconductor@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
On Thu, Mar 22, 2012 at 10:39 PM, Vincent Carey <stvjc at channing.harvard.edu> wrote:
> [...]
> If you have already done SNP calling with your sequences (using samtools or some other external resource) I suppose it would be typical to have results in VCF format. This can be parsed using VariantAnnotation package, and you can tabulate variants and do some kind of categorical analysis to compare populations downstream. Some of the relevant computations for variant tabulation on the basis of VCF are given in the cgdv17 experimental data package vignette. There are no high level functions that I know of for doing population comparisons, so you should involve an experienced statistical geneticist if possible.

I will second Vince's comment about involving a statistical geneticist; it is really easy to do these types of analyses incorrectly. As for "how to do it" in R, you could take a look at:

http://www.genabel.org/tutorials/ABEL-tutorial

Outside R, you might take a look at plink. Samtools also has rudimentary association testing capabilities.

Sean
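As a rough illustration of the simplest kind of categorical comparison discussed in this thread, the sketch below runs a two-sided Fisher's exact test on a 2x2 table of allele counts (alternate versus reference allele, cohort 1 versus cohort 2). It is not what GenABEL, plink, or samtools do internally, and it ignores population structure, multiple testing, and genotype-calling error, which is exactly where a statistical geneticist is needed. The function name and the counts in the usage section are hypothetical. In R, the built-in `fisher.test` performs the same test on a 2x2 matrix.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts: 96 patients give 192 alleles, 178 give 356.
    alt_1, alt_2 = 30, 25
    p_value = fisher_exact_two_sided(alt_1, 192 - alt_1, alt_2, 356 - alt_2)
    print(f"two-sided p-value: {p_value:.4g}")
```

One p-value per SNP is only a starting point; testing many SNPs this way calls for a multiple-testing correction at the very least.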