Hi all,
I have recently been analyzing array data in which several groups of
samples represent different disease subtypes. Here is what I did: I
identified the significant genes, extracted their expression levels,
and performed a cluster analysis using the gplots package in
Bioconductor/R. The problem is that the cluster analysis did not group
the samples well according to disease subtype. I assumed this was a
question of supervised versus unsupervised clustering. From what I
have read online, though, that does not seem quite right, because
supervised analysis appears to describe how to classify new samples
based on previous data. Then there is also the concept of
semi-supervised analysis, and at this point I am confused.
Are analysis methods such as PAM, SOM, and k-means supervised or
semi-supervised clustering? Could anyone take the time to clarify the
difference between supervised, semi-supervised, and unsupervised
analysis for me, and recommend any Bioconductor packages that might
help me group the samples according to disease subtype?
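
To make the distinction I am asking about concrete, here is a minimal
sketch in plain Python (standard library only). It is not my actual
analysis: the toy expression values, the subtype labels "A"/"B", and
all function names are hypothetical, made up for illustration. As far
as I understand it, k-means sees only the expression values, while a
nearest-centroid classifier is trained on the known subtype labels and
is then used to assign a new sample.

```python
import random


def mean_point(points):
    """Component-wise mean of a list of equal-length numeric vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(samples, k, n_iter=20, seed=0):
    """Unsupervised: uses only the expression values, never the labels."""
    rng = random.Random(seed)
    centroids = rng.sample(samples, k)  # start from k distinct samples
    assignment = [0] * len(samples)
    for _ in range(n_iter):
        # Assign every sample to its closest centroid.
        assignment = [
            min(range(k), key=lambda c: sq_dist(s, centroids[c]))
            for s in samples
        ]
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [s for s, a in zip(samples, assignment) if a == c]
            if members:
                centroids[c] = mean_point(members)
    return assignment


def nearest_centroid_fit(samples, labels):
    """Supervised: the known subtype labels define one centroid each."""
    return {
        lab: mean_point([s for s, l in zip(samples, labels) if l == lab])
        for lab in sorted(set(labels))
    }


def nearest_centroid_predict(model, sample):
    """Return the label whose centroid is closest to the sample."""
    return min(model, key=lambda lab: sq_dist(sample, model[lab]))


# Hypothetical toy data: 6 samples x 2 "genes", two disease subtypes.
samples = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],
           [5.0, 5.2], [5.1, 4.8], [4.9, 5.0]]
labels = ["A", "A", "A", "B", "B", "B"]

# Unsupervised: cluster ids are arbitrary and carry no subtype meaning.
clusters = kmeans(samples, k=2)

# Supervised: train on labeled samples, then classify a new sample.
model = nearest_centroid_fit(samples, labels)
predicted = nearest_centroid_predict(model, [4.7, 5.3])

print("k-means cluster ids:", clusters)
print("predicted subtype for new sample:", predicted)
```

In this toy case the two groups are far apart, so the clusters happen
to line up with the subtypes. With my real data they do not, which is
what prompted the question.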
I like programming and have a biology/medicine background, with
relatively limited bioinformatics experience. Any interpretation is
welcome.
Thanks!
Wenhuo Hu
Park lab
Memorial Sloan Kettering Cancer Center
Zuckerman Research Building
408 East 69th Street
Room ZRC-527
New York, NY 10065
Phone 646-888-3220
huw@mskcc.org