question regarding .632plus error rate estimator in ipred package


James Anderson ▴ 820

@james-anderson-1641

Last seen 10.2 years ago



updated 17.1 years ago by Kuhn, Max ▴ 70 • written 17.1 years ago by James Anderson ▴ 820


Kuhn, Max ▴ 70

@kuhn-max-1170

Last seen 10.2 years ago

James,

I think that there is some confusion here:

> there is .632plus estimator, but seems that this estimator
> does not have feature selection built in

The .632 estimator is a method of evaluating model performance from a training set (using the bootstrap). It knows nothing about the model. Feature selection methods happen either as wrappers around the model or, for some models, as built-in qualities of the model (e.g. rpart or nearest shrunken centroids).

Functionally, feature selection has nothing to do with resampling estimators of model quality. In practice, it is more complicated. You should take great care when estimating performance on a training set when you are using a feature selection algorithm. You should read:

www.pnas.org/cgi/content/abstract/99/10/6562
bioinformatics.oxfordjournals.org/cgi/content/abstract/btm344v1

and the references therein.

Max

-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of James Anderson
Sent: Tuesday, October 30, 2007 5:29 PM
To: bioconductor
Subject: [BioC] question regarding .632plus error rate estimator in ipred package

Sorry to bother those who are not interested.

In the ipred package, there is .632plus estimator, but seems that this estimator does not have feature selection built in. If that is the case, I am wondering how this can be applied to microarray data, since feature selection is a must for microarrays. If feature selection is done on the entire dataset and .632plus is performed afterwards, there will be some bias in the leave-one-out bootstrap part. I think the other estimators are the same, in the sense that they are applied to the dataset without performing feature selection. Is my understanding correct?
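The caution in the reply about combining feature selection with resampling can be illustrated with a small simulation. What follows is only a rough sketch in Python with NumPy, not ipred code. Everything in it is an assumption made for illustration: the pure-noise data, the mean-difference feature ranking, the nearest-centroid classifier, and the function names. It also uses a plain out-of-bag bootstrap error as a simplified stand-in for the leave-one-out bootstrap component of .632plus. It compares selecting features once on the whole dataset with redoing the selection inside every bootstrap resample.

```python
import numpy as np


def top_features(X, y, k):
    """Indices of the k features with the largest absolute difference in
    class means, scaled by the overall per-feature standard deviation."""
    diff = X[y == 0].mean(axis=0) - X[y == 1].mean(axis=0)
    sd = X.std(axis=0) + 1e-12  # guard against division by zero
    return np.argsort(-np.abs(diff / sd))[:k]


def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test row to the class whose training centroid is closer."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_test - c0) ** 2).sum(axis=1)
    d1 = ((X_test - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)


def oob_bootstrap_error(X, y, k, n_boot, rng, select_inside):
    """Mean out-of-bag error over bootstrap resamples.

    select_inside=False: features are chosen once from ALL samples (biased).
    select_inside=True:  features are re-chosen from each resample only.
    """
    n = len(y)
    fixed = None if select_inside else top_features(X, y, k)
    errors = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # draw with replacement
        oob = np.setdiff1d(np.arange(n), idx)  # samples left out of the draw
        if oob.size == 0 or np.unique(y[idx]).size < 2:
            continue  # skip degenerate resamples
        feats = top_features(X[idx], y[idx], k) if select_inside else fixed
        pred = nearest_centroid_predict(
            X[idx][:, feats], y[idx], X[oob][:, feats]
        )
        errors.append(np.mean(pred != y[oob]))
    return float(np.mean(errors))


# Pure noise: the labels carry no information, so an honest error
# estimate should sit near 0.5.
rng = np.random.default_rng(0)
n, p, k = 40, 1000, 10
X = rng.normal(size=(n, p))
y = np.repeat([0, 1], n // 2)

biased = oob_bootstrap_error(X, y, k, 100, rng, select_inside=False)
honest = oob_bootstrap_error(X, y, k, 100, rng, select_inside=True)
print("selection on full data, then bootstrap:", biased)
print("selection inside each resample:        ", honest)
```

On noise data like this, the first number should come out clearly optimistic, because the out-of-bag samples already influenced which features were kept. The second should stay close to chance. The same idea carries over to a resampling function that accepts a user-supplied model: wrap the selection step and the classifier together so that both are refit on every resample.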

17.1 years ago • Kuhn, Max ▴ 70
