# Clustering Data

Functions for clustering data.

```
function kmeanscluster (x : numerical, k : index)
```

```
  Cluster the rows of x into k clusters

The function uses the algorithm proposed by Lloyd, which was also
used by Steinhaus and MacQueen. The algorithm starts with a random
partition of the rows into clusters. It then computes the mean of
each cluster and assigns each point to the cluster with the closest
mean. These two steps are repeated until no assignment changes.

The function also works for multi-dimensional x. The means are then
vector means, and the distance of a point to a mean is the Euclidean
distance.

x : rows containing the data points
k : number of clusters that should be used

Returns j : indices of the clusters the rows should belong to.
```
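
The procedure described above (random partition, cluster means, reassignment to the closest mean, repeat until nothing changes) can be sketched in a few lines. The following is a minimal illustration in plain Python, not the implementation behind `kmeanscluster`. The name `kmeans_cluster`, the `seed` and `max_iter` parameters, the 0-based cluster indices, and the rule that an emptied cluster keeps its previous mean are all assumptions made for this sketch.

```python
import random

def kmeans_cluster(points, k, seed=0, max_iter=100):
    """Lloyd-style k-means sketch: returns a cluster index 0..k-1 per row."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random starting partition; i % k guarantees every cluster starts
    # non-empty (assumes n >= k).
    labels = [i % k for i in range(n)]
    rng.shuffle(labels)
    means = [[0.0] * dim for _ in range(k)]
    for _ in range(max_iter):
        # Update step: each mean becomes the vector mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # assumption: an emptied cluster keeps its old mean
                means[c] = [sum(col) / len(members) for col in zip(*members)]
        # Assignment step: closest mean by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, means[c])))
            for p in points
        ]
        if new_labels == labels:  # no changes, so the loop stops
            break
        labels = new_labels
    return labels

# Two well-separated groups of three points each.
pts = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
print(kmeans_cluster(pts, 2))
```

Which group receives index 0 and which receives index 1 depends on the random starting partition, so only the grouping itself is meaningful.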
```
function similaritycluster (S : numerical, k : index)
```

```
  Cluster data depending on the similarity matrix S

This clustering uses the first k eigenvalues of S, and clusters
the entries of their eigenvectors.

S : similarity matrix (symmetric)
k : number of clusters

Returns j : indices of the clusters the rows should belong to.
```
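
One possible reading of this description, stated here as an assumption and not as the function's actual implementation: take "first k" to mean the k largest eigenvalues of the symmetric matrix S, and attach to each row r of S the point formed by the r-th entries of the corresponding eigenvectors. The symbols below (the eigenvectors, the eigenvalues, and the embedded points y) are introduced only for this illustration.

```latex
S v_i = \lambda_i v_i, \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n, \qquad
y_r = \bigl( (v_1)_r, (v_2)_r, \dots, (v_k)_r \bigr) \in \mathbb{R}^k,
\quad r = 1, \dots, n.
```

Under this reading, the n points y_r would then be grouped into k clusters, and the cluster index of y_r would be the index returned for row r.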
```
function eigencluster (x : numerical, k : index)
```

```
  Cluster the rows of x into k clusters

This algorithm builds the similarity matrix S, whose entries are
the Euclidean distances between pairs of rows of x. It then calls
the function similaritycluster() to cluster the rows based on this
matrix.

x : rows containing the data points
k : number of clusters that should be used

Returns j : indices of the clusters the rows should belong to.
```
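
The matrix S described above can be illustrated with a short sketch in plain Python. This is only an illustration of the pairwise distance matrix, not the code behind `eigencluster`, and the helper name `distance_matrix` is hypothetical.

```python
import math

def distance_matrix(rows):
    """Pairwise Euclidean distances between rows; the result is symmetric."""
    return [[math.dist(p, q) for q in rows] for p in rows]

# Three points on a line through the origin, 5 units apart.
S = distance_matrix([[0, 0], [3, 4], [6, 8]])
for row in S:
    print(row)
```

The diagonal of the result is zero, since each row has distance zero to itself, and the matrix is symmetric, which matches the requirement that `similaritycluster` receives a symmetric matrix.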

