I have a sample dataset derived from single-cell RNA sequencing with 1800 samples. Some genes have only a few counts. Using WGCNA, I can compute modules and define the module membership of each gene for each module. I want to find the minimum number of counts at which a gene can be safely assigned to one (or two) modules.
Would the following be valid? Identify low-count genes (e.g., counts below 5) by computing max(counts) for each gene in the original dataset and selecting the gene names. Create subsets containing 80% of the original dataset's counts and compute the module membership in each subset. Then compare module labels between subsets for groups of genes (counts below 5, counts between 5 and 10, and so on) and select the group for which the module label doesn't change. Is there a more statistically valid way to compute module membership preservation for genes?
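To make the comparison step concrete, here is a minimal Python sketch of one way it could be done. It assumes the module labels from each subset have already been computed (for example with WGCNA) and exported as one integer vector per subset, with genes in the same order. Because module labels are generally arbitrary between independent runs, the sketch does not compare labels directly; instead it scores each gene by how consistently it is grouped with, or separated from, every other gene across subsets. The function names `coclustering_stability` and `stability_by_count_bin`, the bin edges, and the toy data are all hypothetical and only for illustration.

```python
import numpy as np

def coclustering_stability(label_runs):
    """Per-gene stability of module assignment across subsets.

    label_runs: list of 1-D integer arrays, one per subset, giving a module
    label per gene (same gene order in every array). Labels do not need to
    match between subsets; only the grouping of genes matters.
    Returns an array with values in [0.5, 1]; 1 means the gene is always
    grouped with the same partners.
    """
    runs = [np.asarray(r) for r in label_runs]
    n_genes = runs[0].shape[0]
    together = np.zeros((n_genes, n_genes))
    for r in runs:
        # 1 where two genes share a module in this subset, 0 otherwise
        together += (r[:, None] == r[None, :])
    p = together / len(runs)            # fraction of subsets with the pair together
    consistency = np.maximum(p, 1 - p)  # high if always together or always apart
    np.fill_diagonal(consistency, np.nan)  # ignore each gene paired with itself
    return np.nanmean(consistency, axis=1)

def stability_by_count_bin(max_counts, stability, edges=(0, 5, 10, 20, np.inf)):
    """Average stability of genes grouped by their maximum count.

    edges are assumed bin boundaries; a gene falls in [edges[i], edges[i+1]).
    """
    max_counts = np.asarray(max_counts)
    stability = np.asarray(stability)
    bin_index = np.digitize(max_counts, edges[1:-1])
    result = {}
    for b in range(len(edges) - 1):
        mask = bin_index == b
        if mask.any():
            result[(edges[b], edges[b + 1])] = float(stability[mask].mean())
    return result

# Toy data (invented): 6 genes, module labels from 3 subsets.
# The second subset has the same grouping as the first with swapped label names;
# in the third subset, the gene at index 2 moves to the other module.
label_runs = [
    [1, 1, 1, 2, 2, 2],
    [2, 2, 2, 1, 1, 1],
    [1, 1, 2, 2, 2, 2],
]
max_counts = [50, 30, 3, 12, 8, 4]  # max(counts) per gene, also invented

stability = coclustering_stability(label_runs)
per_bin = stability_by_count_bin(max_counts, stability)
print(stability)
print(per_bin)
```

In this sketch the count threshold would be read off as the lowest bin whose average stability is acceptably high. Note that genes WGCNA leaves unassigned (the grey module) are treated like any other label here, which may or may not be what is wanted.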