Iddo Ben-dov
hi,
in both edgeR and DESeq2, estimation of dispersion precedes negative
binomial GLM fitting.
my question is, can I use a design formula when estimating dispersion
which is different from the formula used for GLM fitting?
specifically, I would like to use a simplified design when estimating
dispersion and a full design for GLM fitting.
my motivation is that, with the full design, dispersion estimation is
too demanding for my computer and takes too long.
my dataset includes 400 mRNA-seq profiles (~22,000 genes). there are
100 controls and 100 cases, and each individual was sampled twice:
before and after the intervention.
thus, the full design is:
~ group*intervention + individual:group (blocking factor)
as I mentioned, estimation of dispersion with the above design is not
practical, and I thus would like to simplify to:
~ group*intervention
and introduce the 'individual' blocking factor only for NB GLM
fitting.
is this statistically valid?
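to give a sense of scale, here is a rough sketch (in python, just for the arithmetic) of how many coefficients each design would have. it assumes treatment coding and that individuals are numbered 1..100 within each group, with the first individual in each group taken as the baseline; the level names ("case", "after") and the function name are made up for illustration, and the exact columns R would produce depend on how the factors are coded.

```python
def design_columns(n_per_group=100, with_individual=False):
    """List coefficient names for a treatment-coded design.

    Assumption: individuals are nested within group and numbered
    1..n_per_group in each group; individual 1 is the baseline.
    """
    # Columns of ~ group*intervention
    cols = [
        "Intercept",
        "group_case",
        "intervention_after",
        "group_case:intervention_after",
    ]
    if with_individual:
        # Columns of individual:group, dropping one baseline per group
        for group in ("control", "case"):
            for i in range(2, n_per_group + 1):
                cols.append(f"group_{group}:individual_{i}")
    return cols


n_samples = 2 * 100 * 2  # 2 groups x 100 individuals x 2 time points
simple = design_columns()
full = design_columns(with_individual=True)

print("simple design coefficients:", len(simple))
print("full design coefficients:", len(full))
print("residual df under full design:", n_samples - len(full))
```

under these assumptions the simplified design has 4 coefficients per gene and the full design has 202, which leaves 198 residual degrees of freedom from the 400 samples.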
appreciate any help,
iddo