Question

fix

Chad Shaw ▴ 80

Stephen:

> I agree with some of what you say Chad, the problem is that most
> multivariate methods are built on top of the marginal tests. For instance
> machine learning methods are based on gene subsets for each of k cross
> validations.

Right. I recognize that gene selection is a central component of many sequential data analysis schemes: "at stage 1" pick a set of genes which (by a selection scheme) show regulation in the array experiments, then at stage 2 you do something with that. My comment is still that this is a bad approach. I'm guilty of it, too. We are focusing on the trees instead of the ecosystem, and if we had better covariate info/knowledge of gene-connectedness we wouldn't be doing this. Moreover, if what you are doing at stage 2-k is based on 'binning' of genes, then a low frequency of false positives at stage 1 will matter less, and so will slightly sub-optimal single-gene power.

> Use of the appropriate test (fold/T/F/cyber-T/etc..) for subset
> selection is IMHO the most important choice.

Yes I agree. It's just that the fixation on this topic to the exclusion of what seem to be scientifically relevant other topics is both maddening and disheartening.

CAS

21.0 years ago Chad Shaw ▴ 80