This is not really a question but more of a warning to other users.
I performed a regression analysis using the assocTestRegression
function under three different models (dominant, recessive, additive).
My data set contains ~3 million markers, filtered so that only SNPs
with MAF >= 10% are included. Please note that this filter was applied
to cases and controls combined as one data set (i.e. I did not apply
the filter to cases and controls separately).
When I examined the association results under the recessive model, I
noticed very large beta estimates (8-9). When I looked at the genotype
counts, I realised that this was because some SNPs show perfect
separation. In other words, the AA genotype has a count of 0 in cases
and a count of 170 in controls, which leads to inflated estimates.
I was surprised to find that the function neither throws a warning for
this nor drops the SNPs where this occurs.
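The zero cell described above can be illustrated with a small sketch. It is not part of GWASTools: the function name `log_odds_ratio` and all counts other than the 0 cases / 170 controls for AA are hypothetical. When one cell of the 2x2 table is empty, the log odds ratio has no finite value, and an iterative logistic fit typically stops at a large finite estimate instead.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio for a 2x2 genotype-by-status table.

    a: cases with the genotype      b: controls with the genotype
    c: cases without the genotype   d: controls without the genotype

    Returns +inf or -inf when a zero cell makes the estimate unbounded,
    and nan when the table is too degenerate to define a direction.
    """
    num = a * d
    den = b * c
    if num == 0 and den == 0:
        return float("nan")
    if den == 0:
        return float("inf")
    if num == 0:
        return float("-inf")
    return math.log(num / den)


# AA counts from the post: 0 in cases, 170 in controls.
# The non-AA counts (500 cases, 330 controls) are made up for illustration.
separated = log_odds_ratio(0, 170, 500, 330)

# Same table with a handful of AA cases (also made up): finite estimate.
finite = log_odds_ratio(5, 170, 495, 330)
```

Here `separated` is negative infinity, while `finite` is an ordinary negative number, so a simple zero-cell check on the collapsed table is enough to spot the affected SNPs.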
Regards,
Danica
-- output of sessionInfo():
--
Sent via the guest posting facility at bioconductor.org.
Hi Danica,
assocTestRegression will return an error code for SNPs that are
monomorphic in either cases or controls, but it seems that you have
found a case that we did not test for.
I consulted with Matt Conomos, who wrote this function, and he said
the following:
Since AA has a count of 0 in cases in the example given, and an error
was not returned, I would assume that both AB and BB are non-zero in
cases, but it would be nice to confirm this. Also, it would be nice to
know which allele is the minor allele (the function returns this),
since a recessive model is being fit. If the A allele is the minor
allele, then the recessive model collapses the AB and BB classes, and
this could lead to the separability issue. I may need to add in a
check for this when fitting dominant or recessive models.
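A check of the kind Matt describes might look like the sketch below. This is an assumption about what such a check could do, not the GWASTools implementation: the helper names `collapse` and `has_empty_cell` are hypothetical, and every count except AA = 0 in cases and AA = 170 in controls is invented. It collapses the three genotype classes according to the model and the minor allele, then flags any empty cell in the resulting 2x2 table.

```python
def collapse(counts, minor_allele, model):
    """Collapse AA/AB/BB counts into (exposed, unexposed) for one group.

    recessive: exposed = homozygous for the minor allele
    dominant:  exposed = at least one copy of the minor allele
    """
    if minor_allele not in ("A", "B"):
        raise ValueError("minor_allele must be 'A' or 'B'")
    hom_minor = "AA" if minor_allele == "A" else "BB"
    hom_major = "BB" if minor_allele == "A" else "AA"
    if model == "recessive":
        return counts[hom_minor], counts["AB"] + counts[hom_major]
    if model == "dominant":
        return counts[hom_minor] + counts["AB"], counts[hom_major]
    raise ValueError("model must be 'recessive' or 'dominant'")


def has_empty_cell(case_counts, control_counts, minor_allele, model):
    """True if the collapsed 2x2 table has a zero cell (separation risk)."""
    cells = collapse(case_counts, minor_allele, model) + collapse(
        control_counts, minor_allele, model
    )
    return 0 in cells


# AA counts are from the post; AB and BB counts are hypothetical.
cases = {"AA": 0, "AB": 200, "BB": 300}
controls = {"AA": 170, "AB": 180, "BB": 150}

recessive_a = has_empty_cell(cases, controls, "A", "recessive")
dominant_a = has_empty_cell(cases, controls, "A", "dominant")
recessive_b = has_empty_cell(cases, controls, "B", "recessive")
```

With these counts only the recessive model with A as the minor allele is flagged, which is why knowing the minor allele matters for diagnosing the problem.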
Could you please provide the full output of assocTestRegression for
the SNPs where you see this problem? Also, include the output of
sessionInfo() so we know which version of GWASTools you are using.
Stephanie
On 9/4/14, 3:56 AM, Danica [guest] wrote: