Hi all,
I have (what I think is) a fairly interesting model system that I'm a little unsure how best to analyze, and what tools to use. I'm looking for any advice/ideas/suggestions on techniques/tools that might be applicable. I'll briefly outline the system first:
- give rats drug at time 0
- 50% of rats get sick at time 3 weeks
- the key event that decides whether a rat gets sick happens within 24 hours
- no known way of predicting which rat will get sick
So you can see the problem: if you sacrifice the rats and run arrays at 24 hours, you don't know which rats will get sick and which ones won't.
Our collaborators ran a bunch of Affymetrix arrays on rats at an early time-point. When I take this data, normalize it (e.g. RMA) and remove relatively invariant genes (e.g. require CV > 0.5), I can cluster the animals into two nice groups that look quite different on a heatmap under hierarchical clustering. Even better, one of those two groups looks much like (and clusters with) the control animals.
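For concreteness, here is a rough sketch of the kind of pipeline I mean. The CV cutoff, the choice to compute CV on the un-logged scale, and the assumption that the CEL files sit in the working directory are all just illustrative choices, not a recommendation:

    ## Minimal sketch of the filtering/clustering step described above.
    library(affy)

    # RMA-normalize the CEL files in the working directory (log2 scale)
    eset <- rma(ReadAffy())
    x    <- exprs(eset)

    # Coefficient of variation per probeset, computed on the natural scale
    cv    <- apply(2^x, 1, function(v) sd(v) / mean(v))
    x.var <- x[cv > 0.5, ]

    # Hierarchical clustering of the animals on the variable probesets
    hc <- hclust(dist(t(x.var)), method = "average")
    plot(hc)

    # Heatmap of the same probesets, clustered both ways
    heatmap(x.var, distfun = dist, hclustfun = hclust)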
So my question: is there a better way of doing this analysis? In particular, is there a way of fitting a model to each gene that would help me distinguish the extent to which each gene is involved in the separation, versus which genes are consistent across all rats or simply randomly perturbed?
My thought on model-fitting was to randomly assign each rat to one of the two classes (resistant or sensitive), then run the model-fitting. I would repeat this for all (well, at least many) permutations and then use some measure of goodness-of-fit for the model (residuals?) to select the best classification.
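To make that concrete, here is roughly what I have in mind, assuming the filtered log2 matrix x.var from the sketch above (genes in rows, rats in columns). The scoring rule (total residual sum of squares from a per-gene two-group fit), the 50/50 split, and the number of permutations are all arbitrary choices on my part:

    ## Rough sketch of the permutation idea.
    set.seed(1)
    n.rats  <- ncol(x.var)
    n.perms <- 1000

    score.labels <- function(labels) {
      # For a two-group split, the per-gene residual SS of a one-way
      # model is the within-group sum of squares; sum it over genes.
      g1 <- x.var[, labels == 1, drop = FALSE]
      g2 <- x.var[, labels == 2, drop = FALSE]
      ss <- function(m) sum(apply(m, 1, function(v) sum((v - mean(v))^2)))
      ss(g1) + ss(g2)
    }

    # Try many random ~50/50 assignments and keep the best-fitting one
    best.score  <- Inf
    best.labels <- NULL
    for (i in seq_len(n.perms)) {
      labels <- sample(rep(1:2, length.out = n.rats))
      s <- score.labels(labels)
      if (s < best.score) {
        best.score  <- s
        best.labels <- labels
      }
    }
    best.labels

I realize that minimizing the pooled within-group sum of squares like this is essentially a constrained two-group k-means, which may be why the hierarchical clustering already finds a similar split.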
Does that seem reasonable? Any other thoughts or ideas are very much
appreciated.
Paul