Question

ConsensusClusterPlus - predict clusters for new cases?

philipp24 ▴ 30

@philipp24-8672

Last seen 8.6 years ago

Germany

Dear all,

I have 4000 (continuous) predictor variables in a set of 150 patients. First, I want to identify the variables that are associated with survival. To do this, I use the MTP (multiple testing procedures) function from multtest (http://svitsrv25.epfl.ch/R-doc/library/multtest/html/MTP.html) with the t-statistic for tests of regression coefficients in Cox proportional hazards survival models. This analysis identifies 60 variables that are significantly associated with survival. On these 60 variables I then perform unsupervised k-medoids clustering with the ConsensusClusterPlus package (https://www.bioconductor.org/packages/release/bioc/html/ConsensusClusterPlus.html), which identifies 3 clusters as the optimal solution based on the CDF curve and the progression graph:

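library(Biobase)               # provides exprs()
library(ConsensusClusterPlus)

consClust = ConsensusClusterPlus(exprs(exampleSet), maxK = 10, reps = 1000, pItem = 0.8, pFeature = 1, title = "example", distance = "manhattan", clusterAlg = "pam", verbose = FALSE, writeTable = TRUE)

# consClust[[3]] holds the k = 3 solution; consensusClass is the cluster label of each patient
consClustList = matrix(c(consClust[[3]][["consensusClass"]]), ncol = 1)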

This works fine, and consClustList tells me which of the three clusters each of the 150 patients belongs to.

Let's assume that I have another set of 50 patients (the validation set) and I want to predict which of the three clusters identified in the training set (n=150) each of these patients belongs to. How can I achieve this?
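To make the task concrete, here is a minimal sketch of one possible way to frame it: nearest-medoid assignment with the same Manhattan distance used for the clustering. This is not a ConsensusClusterPlus feature (as far as I can tell, the package has no predict method), and the medoids computed here are derived from the final consensus classes rather than taken from any single PAM run. All data and helper names (`make_set`, `find_medoid`, `assign_cluster`, `train_class`) are hypothetical stand-ins; in practice `train` would be the 60-variable training matrix and `train_class` would be `consClust[[3]][["consensusClass"]]`. It also assumes the new patients are measured on the same variables, in the same row order and on the same scale.

```r
set.seed(1)

# Simulated stand-in data: rows = variables, columns = patients (as in exprs()).
n_feat  <- 60
centers <- c(-3, 0, 3)   # three well-separated groups, for illustration only

make_set <- function(sizes, prefix) {
  m <- do.call(cbind, Map(function(mu, n) {
    matrix(rnorm(n_feat * n, mean = mu), nrow = n_feat)
  }, centers, sizes))
  colnames(m) <- paste0(prefix, seq_len(ncol(m)))
  m
}

train <- make_set(c(50, 50, 50), "train")   # 150 training patients
valid <- make_set(c(17, 17, 16), "new")     # 50 validation patients

# Stand-in for consClust[[3]][["consensusClass"]] (named integer vector).
train_class <- rep(1:3, each = 50)
names(train_class) <- colnames(train)

# Medoid of a cluster: the patient with the smallest summed Manhattan
# distance to all other patients in the same cluster.
find_medoid <- function(x) {
  d <- as.matrix(dist(t(x), method = "manhattan"))
  colnames(x)[which.min(rowSums(d))]
}

medoid_ids <- sapply(split(colnames(train), train_class),
                     function(ids) find_medoid(train[, ids, drop = FALSE]))
medoids <- train[, medoid_ids, drop = FALSE]
cluster_labels <- as.integer(names(medoid_ids))

# Assign each new patient to the cluster whose medoid is closest.
assign_cluster <- function(newx, medoids, labels) {
  d <- vapply(seq_len(ncol(medoids)),
              function(k) colSums(abs(newx - medoids[, k])),
              numeric(ncol(newx)))
  d <- matrix(d, nrow = ncol(newx))   # patients x clusters, also when n = 1
  pred <- labels[apply(d, 1, which.min)]
  names(pred) <- colnames(newx)
  pred
}

pred_class <- assign_cluster(valid, medoids, cluster_labels)
table(pred_class)
```

An alternative framing would be to treat the training cluster labels as classes and fit a supervised classifier on the 150 patients; whether either approach is appropriate here is exactly what I am unsure about.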

Thanks in advance for your help!

consensusclusterplus clustering • 2.3k views

9.5 years ago philipp24 ▴ 30