Why is assocTestMM running with only a fraction of SNPs in genotype data?
naglemi ▴ 10
@naglemi-16588
Last seen 5.8 years ago

I'm trying to run assocTestMM (GENESIS) with 4.4M SNPs. However, it runs with only a small fraction of SNPs. I've read the R documentation for assocTestMM and don't see any type of filter settings. I checked my genotype data prior to running assocTestMM and made sure the correct number of SNPs are in this dataset. Please see below the inconsistency between how many SNPs are in the dataset and how many are used by assocTestMM. How can I run this package with all of the SNPs in the dataset?

Below you can see that the object geno has 4,428,726 SNPs:

> geno
File: gatk_882_WG_genotypes_biallelic_snps_VQSR_0.05maf_bis_pruned_tobed_0.1geno.gds (953.8M)
+    [  ] *
|--+ sample.id   { Int32 882 ZIP_ra(38.1%), 1.3K }
|--+ snp.id   { Str8 4428726 ZIP_ra(20.1%), 12.4M }
|--+ snp.position   { Int32 4428726 ZIP_ra(45.2%), 7.6M }
|--+ snp.chromosome   { UInt8 4428726 ZIP_ra(0.10%), 4.3K } *
|--+ snp.allele   { Str8 4428726 ZIP_ra(14.6%), 2.5M }
|--+ genotype   { Bit2 882x4428726, 931.3M } *
\--+ sample.annot   [ data.frame ] *
   |--+ sex   { Str8 882 ZIP_ra(3.06%), 34B }
   \--+ phenotype   { Int32 882 ZIP_ra(1.13%), 47B }

However, when I try to run GWAS only a small fraction of SNPs are included. Why is this? How can I include the whole set? 

> genoData <- GenotypeData(geno, scanAnnot = scanAnnotTRANS)
> assoc <- assocTestMM(genoData = genoData, nullMMobj = LMMnullmod, chr=19,test = "Wald")
Running analysis with 586 Samples and 143191 SNPs
Beginning Calculations...
Tags: LMM, GWAS, GENESIS, assocTestMM
mconomos ▴ 70
@mconomos-7819
Last seen 5.0 years ago
University of Washington, Seattle, WA, …
Hello,

In the code you provided, it looks like you are specifying chr = 19 in the call to assocTestMM. This results in only the variants on chromosome 19 being tested. Please remove that argument, and chr will default to NULL, which runs all variants.

Best,
Matt
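To make the effect of chr concrete, here is a minimal base-R sketch of the behavior described in the answer. It does not call GENESIS and is not the package's implementation: `select_snps` is a hypothetical helper and `snp_chromosome` holds made-up values standing in for the snp.chromosome node of the GDS file.

```r
# Toy stand-in for the snp.chromosome node of a GDS file (made-up values).
snp_chromosome <- c(1L, 1L, 2L, 19L, 19L, 19L, 2L, 1L)

# Hypothetical helper mimicking the described semantics of `chr`:
# NULL keeps every SNP; otherwise only SNPs on the listed chromosome(s) are kept.
select_snps <- function(snp_chromosome, chr = NULL) {
  if (is.null(chr)) {
    return(seq_along(snp_chromosome))
  }
  which(snp_chromosome %in% chr)
}

# With chr = 19, only the chromosome 19 SNPs are selected.
n_chr19 <- length(select_snps(snp_chromosome, chr = 19))

# With chr left at its default of NULL, every SNP is selected.
n_all <- length(select_snps(snp_chromosome))

# Per-chromosome SNP counts, handy for checking a reported SNP total.
per_chr <- table(snp_chromosome)
print(per_chr)
```

Applied to the call in the question, dropping the argument gives assocTestMM(genoData = genoData, nullMMobj = LMMnullmod, test = "Wald"), which should then report the full SNP set rather than the chromosome 19 subset.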
Thanks for the quick response! I misunderstood that option and thought it asked for the number of chromosomes. After removing the option chr, I'm able to run the whole SNP set. 

Hi Matt,
Sorry to revive this old thread. I'd like to use ivar.return.betaCov from assocTestMM, but that function is now defunct and has been replaced by assocTestSingle, which doesn't provide an option to return betaCov. Is there any way I can extract the beta covariance?

There's not a way to do it on an entire Iterator object right now. However, if you extract a matrix of genotypes, you can run the internal function GENESIS:::testGenoSingleVar, which has the option to set GxE.return.cov=TRUE. https://github.com/UW-GAC/GENESIS/blob/master/R/testGeno.R#L10
