Clustering like samples with a lot of data

gaiusjaugustus • 0

University of Arizona

I have a very large dataframe/matrix with >100 samples and >50,000 datapoints for each sample. I'd like to cluster the samples to get an idea of which have similar patterns. The data looks like this:

Row.names  Region1  Region2  ...  RegionN
P1         3        3        ...  2
P2         4        4        ...  2
P3         4        4        ...  2

What I'd like is a graphical representation of the values in the cells, with the samples clustered so that similar individuals end up next to each other. Then I'd like to do it again, clustering both the samples and the regions. I've tried using this as a guide, but R has been choking. I assume there's a package out there that can do this from a table/matrix of data.

Any help would be appreciated.

clustering

9.0 years ago gaiusjaugustus • 0

What do you mean by choking? If it is too slow, you may need to reduce the dimensionality of your data first, for example by variance filtering (keeping only the features that vary most across samples). I use the aheatmap function from the NMF package. It is very good, but again, it won't handle a very large number of features.
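As a rough illustration of the variance-filtering idea, here is a minimal sketch in base R. It uses `hclust()` and `heatmap()` from base R rather than `aheatmap`, and everything specific in it is an assumption: the simulated matrix `mat` stands in for the real data (samples in rows, regions in columns), the cutoff `n_keep = 1000` is arbitrary, and Euclidean distance with average linkage is just one reasonable choice.

```r
# Simulated stand-in for the real data (an assumption, not the poster's data):
# 100 samples in rows, 50,000 regions in columns, small integer values.
set.seed(1)
n_samples <- 100
n_regions <- 50000
mat <- matrix(sample(0:4, n_samples * n_regions, replace = TRUE),
              nrow = n_samples,
              dimnames = list(paste0("P", seq_len(n_samples)),
                              paste0("Region", seq_len(n_regions))))

# Variance filtering: keep only the n_keep regions that vary most across samples.
n_keep <- 1000  # arbitrary cutoff, tune to what your machine can handle
region_var <- apply(mat, 2, var)
keep <- order(region_var, decreasing = TRUE)[seq_len(n_keep)]
mat_small <- mat[, keep]

# Hierarchical clustering of samples (rows) and of regions (columns).
sample_clust <- hclust(dist(mat_small), method = "average")
region_clust <- hclust(dist(t(mat_small)), method = "average")

# Heatmap with only the samples clustered (Colv = NA leaves the regions unordered).
heatmap(mat_small, Rowv = as.dendrogram(sample_clust), Colv = NA,
        scale = "none")

# Heatmap with both the samples and the regions clustered.
heatmap(mat_small, Rowv = as.dendrogram(sample_clust),
        Colv = as.dendrogram(region_clust), scale = "none")
```

Filtering before clustering matters most for the second plot: clustering the regions needs a distance matrix over all retained columns, which is not feasible for 50,000 regions on most machines but is cheap for 1,000.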

9.0 years ago chris86 ▴ 420
