Question: Clustering with Diana


Anthony Bosco ▴ 500

@anthony-bosco-517

Last seen 10.2 years ago

Hi,

could someone please help with Diana clustering and visualisation?

I would like to do 1-way (genes only) and 2-way (genes and samples) clustering and visualise the result as a heatmap or in Treeview software.

regards,
Anthony

--
Anthony Bosco - PhD Student
Institute for Child Health Research
(Company Limited by Guarantee ACN 009 278 755)
Subiaco, Western Australia, 6008
Ph 61 8 9489 , Fax 61 8 9489 7700
email anthonyb@ichr.uwa.edu.au

Clustering • 2.4k views

updated 19.9 years ago by Christopher Wilkinson ▴ 140 • written 19.9 years ago by Anthony Bosco ▴ 500

score 0 · Answer 1 · 2004-12-16

I recommend that you start with the documentation and examples for the 'heatmap' function. It draws a heatmap of your underlying data, augmented by optional dendrograms showing the clusterings of your rows (could be genes) and columns (could be arrays). In its default mode it performs the clustering for you, and I believe it uses agglomerative hierarchical clustering. Start there and get comfortable.

Then you can do something more advanced, namely request a non-default clustering method. You may be able to do this using the 'hclustfun' argument of 'heatmap' (i.e. specify 'diana' there), or you could perform your clustering in advance, using 'diana' if you like, and pass those results as dendrograms to 'heatmap'.

Can't help you with Treeview.

Good luck,
Jenny
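The two routes suggested above might look roughly like this in R. This is only a sketch on simulated data: the matrix `x`, the wrapper name `dianaAsHclust` and the seed are made up for illustration, and it assumes the `cluster` package (which provides `diana`) is installed. Because `heatmap` calls `hclustfun` on a distance object and then converts the result to a dendrogram, the sketch wraps `diana` so that it returns an `hclust` object instead of passing `diana` in directly.

```r
library(cluster)  # provides diana()

set.seed(1)
# Simulated stand-in for a genes x arrays matrix (illustrative only)
x <- matrix(rnorm(20 * 6), nrow = 20,
            dimnames = list(paste0("gene", 1:20), paste0("array", 1:6)))

# Route 1: let heatmap() do the clustering, swapping in diana via hclustfun.
# heatmap() calls hclustfun(distfun(x)), so the wrapper receives a "dist" object.
dianaAsHclust <- function(d) as.hclust(diana(d))
hm1 <- heatmap(x, hclustfun = dianaAsHclust)

# Route 2: cluster in advance, then hand the dendrograms to heatmap().
rowDend <- as.dendrogram(as.hclust(diana(x)))
colDend <- as.dendrogram(as.hclust(diana(t(x))))
hm2 <- heatmap(x, Rowv = rowDend, Colv = colDend)
```

With route 1, `heatmap` still reorders the branches by row and column means, so the leaf order can differ from route 2 even though the underlying tree is the same. Route 2 draws the dendrograms exactly as supplied.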

score 0 · Answer 2 · 2004-12-16

I've never used Treeview but I have used heatmap and diana. The way I've used diana is to first save the diana object, convert it to a dendrogram, and define clusters by cutting it at a certain height. I've used the diana algorithm both with and without the dissimilarity matrix. I've copied some code I have and modified the names of objects to hopefully be a bit clearer.

    library(cluster)   # provides diana()

    ## on raw data (matrix of Mvalues, rows = genes, cols = arrays)
    Mvalues <- matrix(0, nrow=100, ncol=9)
    rownames(Mvalues) <- 301:400   # 100 names for 100 rows
    colnames(Mvalues) <- c("a","b","c","d","e","f","g","h","i")
    for (i in 1:3) Mvalues[,i] <- rnorm(100)
    for (i in 4:6) Mvalues[,i] <- rnorm(100, mean=2, sd=0.5)
    for (i in 7:9) Mvalues[,i] <- rnorm(100, mean=-1, sd=0.7)

    dianaGenes <- diana(Mvalues)
    ## or using a precomputed dissimilarity matrix:
    ## dianaGenes <- diana(dissMatrix, diss=TRUE, keep.diss=FALSE)
    dianaDend <- as.dendrogram(as.hclust(dianaGenes))
    dianaDendOrder <- order.dendrogram(dianaDend)

    ## My rownames is index.name. I reorder it based on the new order
    clusteredGeneNames <- rownames(Mvalues)[dianaDendOrder]

    ## To select the colours use
    low <- col2rgb("green")/255
    high <- col2rgb("red")/255
    heatmapCol <- rgb(seq(low[1], high[1], length.out=123),
                      seq(low[2], high[2], length.out=123),
                      seq(low[3], high[3], length.out=123))
    ## personally I don't much like the red/green system, and prefer heat.colors
    heatmapCol <- heat.colors(123)

    ## If you are just clustering on genes you can colour the arrays
    ## eg say you had 3 groups of 3
    colColours <- c(rep("green",3), rep("red",3), rep("blue",3))

    ## you can also define clusters by cutting the dendrogram and colouring these:
    dianaClusters.h2 <- cut(dianaDend, h=2)
    nClusters <- length(dianaClusters.h2$lower)
    dianaClusters <- numeric(length=nrow(Mvalues))
    for (i in 1:nClusters)
        dianaClusters[order.dendrogram(dianaClusters.h2$lower[[i]])] <- i

    ## now colour the rows based on clusters.
    ## I like distinct colours between clusters
    rowColChoices <- character(nClusters)
    nClusters.2 <- ceiling(nClusters/2)
    nClusters.2.min <- floor(nClusters/2)
    rowColChoices[seq_len(nClusters.2)*2-1] <- rainbow(nClusters.2, start=0, end=2/6)
    rowColChoices[seq_len(nClusters.2.min)*2] <- rev(rainbow(nClusters.2.min, start=3/6, end=5/6))
    rowCols <- rowColChoices[dianaClusters]

    ## or randomly assign colours
    rowColChoices <- rainbow(nClusters)[sample(nClusters, nClusters)]
    rowCols <- rowColChoices[dianaClusters]

    ## To cluster just on genes.
    ## heatmap() reorders labRow itself, so pass the names in their original
    ## order; clusteredGeneNames is only needed if you want the ordered list.
    heatmap(Mvalues, Rowv=dianaDend, Colv=NA, scale="row",
            labRow=rownames(Mvalues), cexRow=.2, col=heatmapCol,
            ColSideColors=colColours, RowSideColors=rowCols)

To cluster on genes and arrays, I think you just replace Colv with a dendrogram object based on clustering over the columns:

    dianaArrays <- diana(t(Mvalues))
    dianaDendArrays <- as.dendrogram(as.hclust(dianaArrays))
    # call heatmap with Colv=dianaDendArrays and drop ColSideColors
    heatmap(Mvalues, Rowv=dianaDend, Colv=dianaDendArrays, scale="row",
            labRow=rownames(Mvalues), cexRow=.2, col=heatmapCol,
            RowSideColors=rowCols)

Cheers
Chris

Dr Chris Wilkinson
Senior Research Officer (Bioinformatics) | ARC Research Associate
Child Health Research Institute (CHRI) | Microarray Analysis Group
7th floor, Clarence Rieger Building | Room 121
Women's and Children's Hospital | School of Mathematical Sciences
72 King William Rd, North Adelaide, 5006 | The University of Adelaide, 5005
Math's Office (Room 121) Ph: 8303 3714
CHRI Office (CR2 52A) Ph: 8161 6363
Christopher.Wilkinson@adelaide.edu.au
http://mag.maths.adelaide.edu.au/crwilkinson.html
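As a possible shortcut for the "define clusters by cutting at a certain height" step, `cutree` from base R can be applied to the `hclust` version of the diana result. It returns one cluster number per gene in the original row order, which can be used directly to build `RowSideColors`. The sketch below uses simulated data; the names `expr`, `geneTree`, `byHeight` and `byCount`, and the cut values, are made up for illustration and are not taken from the code above.

```r
library(cluster)  # provides diana()

set.seed(42)
# Simulated stand-in for a genes x arrays matrix (illustrative only)
expr <- matrix(rnorm(50 * 6), nrow = 50,
               dimnames = list(paste0("g", 1:50), paste0("a", 1:6)))

geneTree <- as.hclust(diana(expr))

# Cut at a chosen height, or ask for a fixed number of clusters
byHeight <- cutree(geneTree, h = 3)
byCount  <- cutree(geneTree, k = 4)

# One colour per cluster, in original row order, ready for RowSideColors
rowCols <- rainbow(max(byCount))[byCount]

heatmap(expr, Rowv = as.dendrogram(geneTree), Colv = NA,
        scale = "row", RowSideColors = rowCols)
```

Choosing `k` rather than `h` can be easier when the height scale of the tree is not known in advance, since the heights depend on the dissimilarity measure and the spread of the data.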