Entering edit mode
Narayanan, Manikandan NIH/NIAID [E]
▴
60
@narayanan-manikandan-nihniaid-e-5829
Last seen 9.2 years ago
United States
Hi,
I may have stumbled upon an error in the branch length calculation
of the ctc package's hc2Newick function, which converts an hclust
object to Newick format. However I am not confident yet to report it
as a bug and hence this email to see if you agree with my
interpretations below.
> hc = hclust(d = as.dist(matrix(c(0, 2, 4, 4, 2, 0, 4, 4, 4, 4, 0, 2,
4, 4, 2, 0), 4, 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))))
> hc2Newick(hc)
[1] "((A:1,B:1):0,(C:1,D:1):0);"
Shouldn't the correct output be [1] "((A:1,B:1):1,(C:1,D:1):1);" so
that the branch lengths and hclust heights agree??
Some more information on hclust heights (easiest to view via
plot(hc)):
> hc$labels
[1] "A" "B" "C" "D"
> hc$height
[1] 2 2 4
> hc$merge
[,1] [,2]
[1,] -1 -2
[2,] -3 -4
[3,] 1 2
If my interpretations are correct, then the hc2Newick() function code
can be fixed by removing all "dist" calculation statements and putting
a single "dist" calculation statement like so:
j <- hc$merge[i, 1] #original code
k <- hc$merge[i, 2] #original code
+ dist <- hc$height[i] - (ifelse(j < 0, 0, hc$height[j]) +
ifelse(k < 0, 0, hc$height[k]))/2 #new line
I'm using R 2.15 and packages: "[1] ctc_1.32.0 amap_0.8-7
Biobase_2.18.0 BiocGenerics_0.4.0".
Thanks,
Mani
PS: Apologies if this email was posted twice, as I wasn't sure if I
posted correctly the first time I tried.