cross validation / bootstrap after classification
@heike-pospisil-1097
Hello Bioconductors,

I used a t-test and/or SAM to find significant genes describing the differences between two phenotypic classes on hgu133plus2 chips. The resulting heatmaps show a promising clustering.

Now I would like to confirm these clusters and estimate the robustness of the clustering by cross-validation and/or bootstrapping (*). I have two questions:

1) Is there an appropriate package and/or source for performing cross-validation and/or bootstrapping?

2) What is the right measure for rating the quality of such a clustering? So far, I have only looked at the cluster plots (**) and judged by eye whether a clustering was good or bad.

Thanks in advance for any suggestions.

Best wishes,
Heike

* with varying subsets of chips
** heatmap(exprs(sub), Colv=as.dendrogram(hclust(dist(t(exprs(sub)), method="euclidean"), method="complete")))
@sean-davis-490
On 11/4/05 7:40 AM, "Heike Pospisil" <pospisil at zbh.uni-hamburg.de> wrote:

> Hello Bioconductors,
>
> I used a t-test and/or SAM to find significant genes describing the
> differences between two phenotypic classes on hgu133plus2 chips. The
> resulting heatmaps show a promising clustering.
>
> Now I would like to confirm these clusters and estimate the robustness
> of the clustering by cross-validation and/or bootstrapping (*). I have
> two questions:
>
> 1) Is there an appropriate package and/or source for performing
> cross-validation and/or bootstrapping?
>
> 2) What is the right measure for rating the quality of such a clustering?
> So far, I have only looked at the cluster plots (**) and judged by eye
> whether a clustering was good or bad.

Heike,

If I understand what you did, I think there is a major problem with your logic. You are using the genes from a SUPERVISED analysis to do your clustering. There SHOULD be clustering, and the strength of that clustering is already measured by the number of significant genes from your SAM analysis.

In other words, you told SAM to find genes that divide your two groups, and then asked hierarchical clustering for its best guess at the grouping given those genes. Of course you will get back a clustering very close to the groups that you gave SAM (if, indeed, there is any difference between the two groups). So there is no point in determining the significance of the heatmap clustering: it no longer represents an unsupervised analysis.

Hope that helps a bit.

Sean
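The circularity described above can be checked with a small simulation. The following is only a minimal sketch in base R, under the assumption of pure-noise data with arbitrary class labels; it does not use SAM, and the helper names (welch_t, cluster_chips, agreement) are hypothetical. Genes are ranked by a per-gene Welch t-statistic, the top 50 are kept, and the chips are clustered with the same Euclidean/complete-linkage settings as in the question.

```r
set.seed(1)
n_genes <- 2000
n_chips <- 20

# Pure noise: by construction no gene truly differs between the classes.
x <- matrix(rnorm(n_genes * n_chips), nrow = n_genes)
labels <- rep(c("A", "B"), each = n_chips / 2)

# Per-gene Welch t-statistic comparing class A with class B.
welch_t <- function(x, labels) {
  a <- x[, labels == "A", drop = FALSE]
  b <- x[, labels == "B", drop = FALSE]
  (rowMeans(a) - rowMeans(b)) /
    sqrt(apply(a, 1, var) / ncol(a) + apply(b, 1, var) / ncol(b))
}

# Supervised step: keep the 50 genes with the largest |t|.
top <- order(abs(welch_t(x, labels)), decreasing = TRUE)[1:50]

# Unsupervised step: cluster the chips and cut the tree into two groups.
cluster_chips <- function(m) {
  cutree(hclust(dist(t(m), method = "euclidean"), method = "complete"), k = 2)
}

# Fraction of chips whose cluster matches their label, up to label swapping.
agreement <- function(cl, labels) {
  acc <- mean((cl == 1) == (labels == "A"))
  max(acc, 1 - acc)
}

cl_selected <- cluster_chips(x[top, ])
cl_all <- cluster_chips(x)
print(agreement(cl_selected, labels))  # clustering on the selected genes
print(agreement(cl_all, labels))       # clustering on all genes
```

Because the data contain no real class difference, any agreement between the clusters and the labels on the selected genes would come from the selection step alone, which is the reason the heatmap cannot serve as independent confirmation.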
@kevin-coombes-1459
Hi,

I have a package (ClassDiscovery) at
http://bioinformatics.mdanderson.org/Software/OOMPA
that includes classes for PerturbationClusterTest and BootstrapClusterTest.

Richard Simon's book (Design and Analysis of DNA Microarray Experiments) includes a section on assessing the validity of clusters.

Of course, clusters arising from a supervised selection of genes aren't meaningful anyway....

-- Kevin
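For readers who want to see the general idea behind a bootstrap test of cluster stability before trying the package, here is a minimal base-R sketch. It is not the ClassDiscovery API: the function name boot_agreement is hypothetical, and resampling genes (rows) with replacement is an assumption made for this illustration. For every pair of chips it records how often the two land in the same cluster across bootstrap replicates.

```r
# Bootstrap co-clustering sketch (base R only; not the ClassDiscovery API).
# x: genes-by-chips matrix, k: number of clusters, n_boot: replicates.
boot_agreement <- function(x, k = 2, n_boot = 100) {
  n_chips <- ncol(x)
  agree <- matrix(0, n_chips, n_chips)
  for (b in seq_len(n_boot)) {
    # Resample genes with replacement, then re-cluster the chips.
    rows <- sample(nrow(x), replace = TRUE)
    d <- dist(t(x[rows, , drop = FALSE]), method = "euclidean")
    cl <- cutree(hclust(d, method = "complete"), k = k)
    # Count the chip pairs that share a cluster in this replicate.
    agree <- agree + outer(cl, cl, "==")
  }
  agree / n_boot
}

set.seed(2)
# Toy data: 200 genes x 12 chips; chips 1-6 are shifted in the first 40 genes.
x <- matrix(rnorm(200 * 12), nrow = 200)
x[1:40, 1:6] <- x[1:40, 1:6] + 2

agree <- boot_agreement(x, k = 2, n_boot = 50)
print(round(agree[1:3, c(1, 2, 7, 8)], 2))
```

Entries near 1 mark pairs of chips that are almost always clustered together, entries near 0 mark pairs that are almost always separated, and intermediate values point to unstable assignments. This only makes sense when the genes entering the clustering were not chosen using the class labels.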
@heike-pospisil-1097
Hello,

> I have a package (ClassDiscovery) at
> http://bioinformatics.mdanderson.org/Software/OOMPA
> that includes classes for
> PerturbationClusterTest
> and
> BootstrapClusterTest.
>
> Richard Simon's book (Design and Analysis of DNA Microarray Experiments)
> includes a section on assessing the validity of clusters.

Thanks for these hints.

> Of course, clusters arising from a supervised selection of genes aren't
> meaningful anyway....

I see, my explanation was not precise enough. I use the t-test to get a subset of genes and then cluster them. Now I would like to determine how robust this gene selection is when the set of chips is varied at random.

Sorry for the confusion, and thanks for your help. I will try your package soon.

Best wishes,
Heike
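One way to look at the robustness of the gene selection itself, as described above, is to repeat the t-test on random subsets of chips and record how often each gene is selected. The sketch below is a hedged illustration in base R on simulated data. The function names (row_welch_t, selection_frequency) and the parameters n_top, n_rep and frac are hypothetical choices, not something prescribed in this thread.

```r
# Per-gene Welch t-statistic for two genes-by-chips matrices a and b.
row_welch_t <- function(a, b) {
  va <- rowSums((a - rowMeans(a))^2) / (ncol(a) - 1)
  vb <- rowSums((b - rowMeans(b))^2) / (ncol(b) - 1)
  (rowMeans(a) - rowMeans(b)) / sqrt(va / ncol(a) + vb / ncol(b))
}

# Fraction of chip subsamples in which each gene ranks among the n_top genes.
selection_frequency <- function(x, labels, n_top = 20, n_rep = 100, frac = 0.8) {
  classes <- unique(labels)
  stopifnot(length(classes) == 2)
  counts <- numeric(nrow(x))
  for (r in seq_len(n_rep)) {
    # Stratified subsample: draw a fraction `frac` of the chips in each class.
    keep <- unlist(lapply(classes, function(cl) {
      idx <- which(labels == cl)
      idx[sample.int(length(idx), size = floor(frac * length(idx)))]
    }))
    x_sub <- x[, keep, drop = FALSE]
    lab_sub <- labels[keep]
    tt <- row_welch_t(x_sub[, lab_sub == classes[1], drop = FALSE],
                      x_sub[, lab_sub == classes[2], drop = FALSE])
    top <- order(abs(tt), decreasing = TRUE)[seq_len(n_top)]
    counts[top] <- counts[top] + 1
  }
  counts / n_rep
}

set.seed(3)
# Toy data: 500 genes x 20 chips; genes 1-20 are shifted in class A.
x <- matrix(rnorm(500 * 20), nrow = 500)
labels <- rep(c("A", "B"), each = 10)
x[1:20, labels == "A"] <- x[1:20, labels == "A"] + 3

freq <- selection_frequency(x, labels, n_top = 20, n_rep = 50)
print(summary(freq[1:20]))    # genes with a simulated class difference
print(summary(freq[21:500]))  # noise genes
```

Genes with a selection frequency close to 1 are chosen regardless of which chips are left out, while genes that appear only occasionally depend on the particular subset of chips. On real data the same function could be applied to exprs(sub) and the phenotype labels, with the caveat that the thresholds used here are arbitrary.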