Chad Shaw
Stephen:
> I agree with some of what you say, Chad; the problem is that most
> multivariate methods are built on top of the marginal tests. For
> instance, machine learning methods are based on gene subsets for each
> of k cross-validations.
Right. I recognize that gene selection is a central component of many
sequential data analysis schemes -- "at stage 1" pick a set of genes
which (by a selection scheme) show regulation in the array
experiments -- then at stage 2 you do something with that.
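
To make the two-stage scheme concrete, here is a minimal sketch of stage 1 only. It assumes a per-gene Welch t-statistic as the selection rule and keeps the top-ranked genes; the data, the gene names, and the helpers `welch_t` and `select_genes` are all hypothetical, and any of the other marginal tests could be swapped in.

```python
import random
import statistics


def welch_t(a, b):
    """Welch t-statistic comparing two groups of expression values."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def select_genes(expr, labels, top_n):
    """Stage 1: rank genes by |t| and return the names of the top_n."""
    scores = {}
    for gene, values in expr.items():
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = abs(welch_t(group0, group1))
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical toy data: 6 arrays (3 per group), 20 genes,
# of which the first 2 are shifted in group 1.
rng = random.Random(0)
labels = [0, 0, 0, 1, 1, 1]
expr = {}
for i in range(20):
    shift = 3.0 if i < 2 else 0.0
    expr[f"gene{i}"] = [rng.gauss(shift * lab, 1.0) for lab in labels]

selected = select_genes(expr, labels, top_n=5)
# Stage 2 (clustering, a classifier, ...) would take `selected` as input.
print(selected)
```

Whatever stage 2 does, it only ever sees the genes that survive this marginal filter.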
My comment is still that this is a bad approach. I'm guilty of it,
too. We are focusing on the trees instead of the ecosystem -- and if we
had better covariate info / knowledge of gene-connectedness we wouldn't
be doing this.
Moreover, if what you are doing at stages 2-k is based on 'binning' of
genes, then a low frequency of false positives at stage 1 will matter
less, and so will slightly sub-optimal single-gene power.
> Use of the appropriate test (fold/T/F/cyber-T/etc.) for subset
> selection is IMHO the most important choice.
>
>
Yes, I agree. It's just that the fixation on this topic to the
exclusion of what seem to be other scientifically relevant topics is
both maddening and disheartening.
CAS