Hi all,
I have (what I think is) a fairly interesting model system that I'm a little unsure how best to analyze, and what tools to use. I'm looking for any advice/ideas/suggestions on techniques/tools that might be applicable. I'll briefly outline the system first:
- give rats drug at time 0
- 50% of rats get sick at time 3 weeks
- the key event that decides whether a rat gets sick happens within 24 hours
- no known way of predicting which rat will get sick
So you can see the problem: if you sacrifice the rats and run arrays at 24 hours, you don't know which rats will get sick and which ones won't.
Our collaborators ran a bunch of Affymetrix arrays on rats at an early time-point. When I take this data, normalize it (e.g. RMA) and remove relatively invariant genes (e.g. require CV > 0.5), I can cluster the animals into two nice groups that look quite different on a heatmap under hierarchical clustering. Even better, one of those two groups looks much like (and clusters with) the control animals.
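For concreteness, here is a rough sketch of the kind of pipeline I mean. The CV cutoff, the choice to compute CV on the un-logged scale, and the assumption that the CEL files sit in the working directory are all just illustrative choices, not a recommendation:

    ## Minimal sketch of the filtering/clustering step described above.
    library(affy)

    # RMA-normalize the CEL files in the working directory (log2 scale)
    eset <- rma(ReadAffy())
    x    <- exprs(eset)

    # Coefficient of variation per probeset, computed on the natural scale
    cv    <- apply(2^x, 1, function(v) sd(v) / mean(v))
    x.var <- x[cv > 0.5, ]

    # Hierarchical clustering of the animals on the variable probesets
    hc <- hclust(dist(t(x.var)), method = "average")
    plot(hc)

    # Heatmap of the same probesets, clustered both ways
    heatmap(x.var, distfun = dist, hclustfun = hclust)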
So my question: is there a better way of doing this analysis? In particular, is there a way of fitting a model to each gene that would help me distinguish the extent to which each gene is involved in the separation, versus which genes are consistent across all rats or simply randomly perturbed?
My thought on model-fitting was to randomly assign each rat to one of the two classes (resistant or sensitive), then run the model-fitting. I would repeat this for all (well, at least many) permutations and then use some measure of goodness-of-fit for the model (residuals?) to select the best classification.
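To make that concrete, here is roughly what I have in mind, assuming the filtered log2 matrix x.var from the sketch above (genes in rows, rats in columns). The scoring rule (total residual sum of squares from a per-gene two-group fit), the 50/50 split, and the number of permutations are all arbitrary choices on my part:

    ## Rough sketch of the permutation idea.
    set.seed(1)
    n.rats  <- ncol(x.var)
    n.perms <- 1000

    score.labels <- function(labels) {
      # For a two-group split, the per-gene residual SS of a one-way
      # model is the within-group sum of squares; sum it over genes.
      g1 <- x.var[, labels == 1, drop = FALSE]
      g2 <- x.var[, labels == 2, drop = FALSE]
      ss <- function(m) sum(apply(m, 1, function(v) sum((v - mean(v))^2)))
      ss(g1) + ss(g2)
    }

    # Try many random ~50/50 assignments and keep the best-fitting one
    best.score  <- Inf
    best.labels <- NULL
    for (i in seq_len(n.perms)) {
      labels <- sample(rep(1:2, length.out = n.rats))
      s <- score.labels(labels)
      if (s < best.score) {
        best.score  <- s
        best.labels <- labels
      }
    }
    best.labels

I realize that minimizing the pooled within-group sum of squares like this is essentially a constrained two-group k-means, which may be why the hierarchical clustering already finds a similar split.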
Does that seem reasonable? Any other thoughts or ideas are very much
appreciated.
Paul