I have a very noisy data set that I am having trouble thresholding to get
useful genes, so I am resorting to using missing values. (Yes, I could
impute, but my n is not huge and consists of four non-distinct groups). My
question is:

How does limma handle missing data? It doesn't mind NA's coming in, but
does it correct for the different df at each gene given the number of NA's?

Thanks,
Sean
Hi Sean,
>I have a very noisy data set that I am having trouble thresholding to get
>useful genes, so I am resorting to using missing values. (Yes, I could
>impute, but my n is not huge and consists of four non-distinct groups). My
>question is:
>
>How does limma handle missing data?
>
When fitting gene-wise linear models in limma using lmFit(), missing
values are removed from the response vector of log-ratios, and the
corresponding rows of the design matrix are removed, before the linear
model is fitted to each gene.
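
To make the row-removal step concrete, here is a minimal base-R sketch of the same idea. It is not limma's own code: the toy matrix y, the two-group design and the name df_residual are all made up for illustration, and lm.fit() stands in for the gene-wise fit.

```r
# Toy data: 3 genes (rows) x 6 arrays (columns); values are illustrative only.
set.seed(1)
y <- matrix(rnorm(18), nrow = 3,
            dimnames = list(paste0("gene", 1:3), NULL))
y["gene2", 1] <- NA          # one missing value
y["gene3", c(2, 5)] <- NA    # two missing values

# Hypothetical two-group design: intercept plus a group effect.
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# For each gene, drop the NA observations and the matching design rows,
# then fit by least squares and record the residual degrees of freedom.
df_residual <- apply(y, 1, function(obs) {
  keep <- !is.na(obs)
  fit <- lm.fit(design[keep, , drop = FALSE], obs[keep])
  fit$df.residual
})

print(df_residual)
# gene1 has 6 observations, gene2 has 5, gene3 has 4; with 2 coefficients
# the residual df are 4, 3 and 2 respectively.

# Assumption: with limma installed, lmFit(y, design)$df.residual should
# report the same per-gene values.
```

Each gene keeps only its own non-missing arrays, so a gene with more NA's is fitted on fewer rows of the design matrix.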
>It doesn't mind NA's coming in, but does it correct for the different df
>at each gene given the number of NA's?
>
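So yes, the degrees of freedom will depend on the number of missing
values for each gene: the residual df for a gene is the number of
non-missing observations minus the number of coefficients that can be
estimated for it.

Best wishes,
Matt Ritchie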
>Thanks,
>Sean
>