Hello,
This is a more theoretical question about a potential use of DESeq2.
As I understand it, DESeq2 fits a negative binomial model to the counts of each gene, with a dispersion parameter alpha. The estimate of alpha is shrunk towards a log-normal prior whose parameters come from other genes with similar mean expression values (except for outlier genes with very high observed dispersion, which are not shrunk).
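To make the role of alpha concrete, here is a minimal sketch of the mean-variance relationship I have in mind, Var = mu + alpha * mu^2, under the usual negative binomial parameterisation. The function name `nb_variance` is my own, purely for illustration, and is not part of DESeq2.

```python
def nb_variance(mu, alpha):
    """Variance of a negative binomial count with mean mu and dispersion alpha.

    Assumes the parameterisation Var = mu + alpha * mu^2, so alpha = 0
    reduces to the Poisson case (variance equal to the mean).
    """
    return mu + alpha * mu ** 2


# Hypothetical gene with a mean count of 100
print(nb_variance(100, 0.0))   # Poisson limit: variance equals the mean
print(nb_variance(100, 0.01))  # a small dispersion already doubles the variance
```

So for well-expressed genes, even a small alpha makes the counts clearly more variable than Poisson.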
It has been shown on many occasions that the negative binomial is a suitable model for gene expression counts when the replicate measurements for a gene are at least moderately independent (e.g. n livers from n different mice).
Now consider the problem of pseudoreplication: for example, n measurements derived from the same passage of the same cell line in each of two conditions (e.g. not drug treated vs. drug treated). For the most part we expect the cells of that cell line to be genomically homogeneous. Prior to drug treatment, the cells are distributed onto different plates, generating a set of pseudoreplicates, which are grown for k days (where k < 4) and drugged separately.
Would DESeq2 still be a suitable tool for this experimental design, to test for differential expression of genes between the two conditions across the set of pseudoreplicates? More precisely, I would expect the counts for the pseudoreplicates to be close to Poisson (though I do not know for certain) because there is minimal biological variation between pseudoreplicates. Would DESeq2 overestimate the dispersion parameter for this design, i.e. would the estimated dispersion be substantially greater than zero?
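To illustrate what I mean by "close to Poisson", here is a rough sketch of a per-gene method-of-moments dispersion estimate, solving Var = mu + alpha * mu^2 for alpha. This is only a crude check and not DESeq2's estimator, which uses a likelihood fit with shrinkage; the function name and the counts below are made up for illustration.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Crude method-of-moments estimate of alpha for one gene.

    Solves Var = mu + alpha * mu^2 for alpha using the sample mean and
    sample variance, and clips negative values to zero (Poisson-like).
    Assumes the counts are already comparable, i.e. equal library sizes.
    """
    mu = mean(counts)
    var = variance(counts)  # sample variance, n - 1 denominator
    return max((var - mu) / mu ** 2, 0.0)


# Hypothetical counts for one gene across five plates (made-up numbers)
tight_plates = [100, 110, 90, 105, 95]   # spread no larger than Poisson
noisy_plates = [100, 160, 60, 130, 50]   # clearly overdispersed

print(moment_dispersion(tight_plates))
print(moment_dispersion(noisy_plates))
```

With the first set of counts the estimate is clipped to zero, which is the behaviour I would expect from pseudoreplicates; the second set gives an alpha of about 0.2.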
Best wishes, Thomas