Hello,
I am a bit puzzled about what you actually want to ask.
James Anderson wrote:
> For a binary classification problem in microarray data, suppose you do
> random subsampling classification (each time splitting the data into 80%
> training and 20% test with stratification (preserving the ratio in
> each class), repeated many times). When you get some results, one thing
> you would normally look at is how significantly different your
> results are from what you would get by chance; that's why people do a
> label permutation test. My question is: is the final result of the
> label permutation test for accuracy equal to the proportion of the
> larger class? (Say there are 80 normal vs. 20 disease; is the mean
> accuracy of the label permutation test equal to 80/(80+20) as long as you
> repeat enough times?) Is this independent of the classifier?
>
The mean accuracy of your classifier after label permutation, presumably
in a cross-validation setting, depends very much on your classifier, so
it is not classifier-independent. What you should compare it with is the
accuracy of the naive classifier "assign every sample to the larger
class", which is 80% in your case.
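
To make this concrete, here is a minimal sketch in plain Python. Everything
in it is an assumption made for illustration: the single synthetic feature,
the helper names, and the use of 1-nearest-neighbour as a stand-in for
"your classifier". With 80 vs. 20 samples and stratified 80/20 splits, the
naive rule scores 16/20 = 0.8 on every split, whereas a classifier whose
prediction is effectively a random draw from the permuted training labels
is expected to score about 0.8*0.8 + 0.2*0.2 = 0.68.

```python
import random

def stratified_split(labels, test_frac, rng):
    """Return (train_idx, test_idx), preserving the class ratio in each part."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test += idx[:n_test]
        train += idx[n_test:]
    return train, test

def majority_classifier(train_x, train_y, test_x):
    """Naive rule: assign every test sample to the larger training class."""
    top = max(set(train_y), key=train_y.count)
    return [top] * len(test_x)

def one_nn_classifier(train_x, train_y, test_x):
    """1-nearest-neighbour on a single feature (illustration only)."""
    preds = []
    for x in test_x:
        nearest = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
        preds.append(train_y[nearest])
    return preds

def permuted_accuracy(classifier, x, y, n_rep, rng):
    """Mean test accuracy over repeated 80/20 splits with shuffled labels."""
    total = 0.0
    for _ in range(n_rep):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break any link between features and labels
        train, test = stratified_split(y_perm, 0.2, rng)
        preds = classifier([x[i] for i in train],
                           [y_perm[i] for i in train],
                           [x[i] for i in test])
        total += sum(p == y_perm[i] for p, i in zip(preds, test)) / len(test)
    return total / n_rep

rng = random.Random(0)
y = [0] * 80 + [1] * 20               # 80 normal vs. 20 disease
x = [rng.gauss(0.0, 1.0) for _ in y]  # synthetic feature, assumed for the demo

acc_majority = permuted_accuracy(majority_classifier, x, y, 500, rng)
acc_1nn = permuted_accuracy(one_nn_classifier, x, y, 500, rng)
print(acc_majority, acc_1nn)
```

So the permuted-label mean is not automatically 80/(80+20); it only equals
that value for classifiers that end up always predicting the larger class.
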
A good reason for label permutation in your case is that you want to
assess the classifier's generalizability: one can always construct a
classifier that has an accuracy of 100% on the training data but
performs badly on independent test data. That is one reason why people
combine label permutation with cross-validation: the classifier's mean
accuracy in a cross-validation setting gives a better estimate of its
accuracy on test data than the training accuracy does. (You have to make
sure that you do not use any aspect of the set-aside test data for
training the classifier, though.) An even better estimate of your
classifier's performance, however, would be its accuracy on a completely
independent test data set. Cross-validation on your training data could
then be used to select parameters of your classifier, if needed.
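
The workflow in the last two sentences could be sketched as follows. This is
only an outline under stated assumptions: the data are synthetic, the
classifier is a toy k-nearest-neighbour on one feature, and all names
(knn_predict, cv_accuracy, the candidate values of k) are hypothetical. The
test part is split off first, k is chosen by cross-validation on the
remaining data only, and the test part is used once at the end. For brevity
the split here is a plain shuffle rather than a stratified one.

```python
import random

def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest training points (one feature)."""
    order = sorted(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    votes = [train_y[j] for j in order[:k]]
    return max(set(votes), key=votes.count)

def accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test samples that k-NN labels correctly."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)

def cv_accuracy(xs, ys, k, n_folds):
    """Mean accuracy over n_folds folds; each fold is held out exactly once."""
    scores = []
    for f in range(n_folds):
        tr = [i for i in range(len(xs)) if i % n_folds != f]
        held = [i for i in range(len(xs)) if i % n_folds == f]
        scores.append(accuracy([xs[i] for i in tr], [ys[i] for i in tr],
                               [xs[i] for i in held], [ys[i] for i in held], k))
    return sum(scores) / n_folds

rng = random.Random(1)
# Synthetic data: class 1 is shifted by 1.5 relative to class 0.
data = [(rng.gauss(1.5 * c, 1.0), c) for c in [0] * 80 + [1] * 20]
rng.shuffle(data)
dev, test = data[:80], data[80:]  # the test part is set aside until the end
dev_x, dev_y = [d[0] for d in dev], [d[1] for d in dev]
test_x, test_y = [d[0] for d in test], [d[1] for d in test]

# Choose k using cross-validation on the development data only.
candidates = [1, 3, 5, 7]  # odd values avoid tied votes with two classes
best_k = max(candidates, key=lambda k: cv_accuracy(dev_x, dev_y, k, 5))

# Single final evaluation on data that played no part in choosing k.
test_acc = accuracy(dev_x, dev_y, test_x, test_y, best_k)
print(best_k, test_acc)
```
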
Hope this helps.
Regards,
Joern
> Thanks a lot!
>
> James
>
>
> ---------------------------------
> Building a website is a piece of cake.
>
well, classification sometimes isn't.