Question

MCLUST: finding the best number of Gaussian components to fit my data

travascio.andrea91 • 0

Hi everybody, my question is the following.

I have a sample of galaxy radial velocities in a galaxy cluster (unfortunately the sample size is only N=18; I know N<20 is not ideal), and I would like to know how many Gaussians best fit the distribution of my data [G can take the values 1 to 3]. After that, I want to know the best-fit Gaussian parameters. I expect G=3 to be the best number of components, but I need a number (I guess the log-likelihood) that quantifies the significance of this case. Many people use MCLUST (an R package) to model data as a Gaussian finite mixture. I read that it can find the optimal number of components (through a hierarchical clustering approach) and the corresponding classification.

I tried the following approaches:

1) Mclust with only these parameters:

> modClust = Mclust(dataset,G=1:3,modelsName="V")
fitting ...
|==============================================================| 100%
> summary(modClust)
----------------------------------------------------
Gaussian finite mixture model fitted by EM algorithm
----------------------------------------------------
Mclust X (univariate normal) model with 1 component: 
 log.likelihood  n df      BIC      ICL
      -157.1966 18  2 -320.174 -320.174
Clustering table:
 1 
18

However, it returns G=1 as the best value, although I know it should be 3.
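For reference, the BIC in the summary above can be reproduced from the reported log-likelihood, n and df, using mclust's convention that a larger BIC is better. The second line is only a sketch under the assumption that the univariate unequal-variance model ("V") has df = 3G - 1 parameters (G means, G variances, G - 1 mixing proportions); it estimates how much the log-likelihood would have to improve for G=3 to beat G=1 with 18 points.

```latex
\mathrm{BIC} = 2\ln\hat{L} - \mathrm{df}\,\ln n
             = 2(-157.1966) - 2\ln 18
             \approx -314.393 - 5.781 = -320.174

% Assumption: model "V" with G = 3 has df = 3G - 1 = 8
\mathrm{BIC}_{G=3} > \mathrm{BIC}_{G=1}
\iff 2\,\Delta\ln\hat{L} > (8 - 2)\ln 18 \approx 17.34
\iff \Delta\ln\hat{L} > 8.67
```

With so few data points the penalty term is large relative to the likelihood gain, which may be why a single component is selected.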

2) I thought an alternative would be to run a for loop in which I vary G and compare the log-likelihood values.

Do you have any advice? What am I doing wrong? What am I not considering?

Thanks in advance for the help,
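The loop described in 2) could be sketched roughly as follows. This is only an illustration: `velocities` is simulated placeholder data (not the real sample), and the names `results` and `best` are made up for the example. Note that the Mclust argument is spelled `modelNames`; a misspelled argument name is, as far as I can tell, silently ignored. Because the log-likelihood can only stay the same or grow as components are added, the sketch also records df and BIC so that fits with different G can be compared on a penalised scale.

```r
# Sketch only: 'velocities' is simulated placeholder data, not the real sample.
library(mclust)

set.seed(1)
velocities <- c(rnorm(6, 9000, 150),   # three hypothetical velocity groups
                rnorm(6, 10500, 150),
                rnorm(6, 12000, 150))

# One row per candidate number of components
results <- data.frame(G = 1:3, loglik = NA_real_, df = NA_real_, BIC = NA_real_)

for (g in results$G) {
  # Fit a univariate mixture with unequal variances ("V") and exactly g components
  fit <- Mclust(velocities, G = g, modelNames = "V", verbose = FALSE)
  if (is.null(fit)) next               # Mclust returns NULL if the fit fails
  results$loglik[g] <- fit$loglik
  results$df[g]     <- fit$df
  results$BIC[g]    <- fit$bic         # mclust convention: larger is better
}

print(results)

# Number of components preferred by BIC (failed fits stay NA and are ignored)
best <- results$G[which.max(results$BIC)]
```

Comparing the raw `loglik` column alone would always favour the largest G, so the `BIC` column (or a likelihood-ratio test) is the more meaningful number to report.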

Andrea

software error microarray probe annotation • 1.4k views

ADD COMMENT • link updated 6.2 years ago by Michael Love 43k • written 6.2 years ago by travascio.andrea91 • 0

I'm removing the "DESeq2" tag as I can't see any relevance to DESeq2.

ADD REPLY • link 6.2 years ago Michael Love 43k