Question

Monocle plot_pseudotime_heatmap : Error in cutree(tree, cutree_n) : elements of 'k' must be between 1 and 3

0

Entering edit mode

A.H • 0

@ah-13492

Last seen 7.6 years ago

I immediately admit that I am an expert in R, or scripting in general for that matter, so I hope this is an easy question:
I do not know how to include a decent example dataset for this so I try to explain the problem

I am analyzing single-cell RNA-seq data, using the Monocle (Bioconductor) package. This package allows you to order cells in pseudotime based on their transcriptome, thereby providing insight into transcriptome dynamics during development in pseudotime. I am trying to use the command plot_pseudotime_heatmap to generate a heatmap for a fitted linear model on the gene-expression pattern of a number of disease-markers in pseudotime.

Though for some genesets I get the following error when plotting the heatmap:
Error in cutree(tree, cutree_n) : elements of 'k' must be between 1 and 3.

I cannot figure out what k is (I thought the number of cluster to be formed, but why does this need to be between 1 and 3...), or which setting to change to overcome this error:

Default settings for plot_pseudotime_heatmap():

plot_pseudotime_heatmap(cds_subset, cluster_rows = TRUE, hclust_method = "ward.D2", num_clusters = 6, hmcols = NULL, add_annotation_row = NULL, add_annotation_col = NULL, show_rownames = FALSE, use_gene_short_name = TRUE, norm_method = c("vstExprs", "log"), scale_max = 3, scale_min = -3, trend_formula = "~sm.ns(Pseudotime, df=3)", return_heatmap = FALSE, cores = 1)

As far as I understand, in this package, the command plot_pseudotime_heatmap, first orders the cells according to their pseudotime in 'development', next fits a linear model to the expression pattern per gene (with lm.fit()), and finally transforming this data into a heatmap (using a variant of pheatmap() ), where the pseudotime is split over 100 pseudotime-bins.

I have the difficulty of working with a very rare cellpopulation, causing some for the disease-marker genes to be expressed in only 1 or a few cells and with only a few reads per gene (cut-off for expressed-genes is >=1read in >=1 cell). I understand that trying to fit a linear model on these genes will be impossible, and that this gene will get excluded from the heatmap and I will get the error: <simpleError in lm.fit(X.vlm, y = z.vlm, ...): NA/NaN/Inf in 'y'>. But why can I sometimes continue with the remaining genes, but sometimes still get the above error on the remaining genes??

Many thanks for any help I can get!

R monocle • 3.1k views

ADD COMMENT • link 7.6 years ago A.H • 0