# mala::home

Davide “+mala” Eynard’s website

5 Jun 2012

## Octave clustering demo part 0: introduction and setup

As promised to my PAMI students, I have prepared a couple of demos to help get a better grasp of how different clustering algorithms work. Hoping they might be useful to somebody else (and sure that this way I will not lose them so easily ;-)), I have decided to post more information about them here.

All the demos are packed in a single archive, which can be downloaded from the course page together with slides and other material (or, if you only want the demo file, follow this direct link). Once unpacked, you will find the following:

• data/*.mat (some example datasets)
• accuracy.m (calculates clustering accuracy)
• kMeansDemo*.m (k-means demos)
• L2_distance.m (calculates L2 distance between two vectors/matrices)
• laplacian.m (builds the Laplacian used by spectral clustering)
• myKmeans.m (performs k-means clustering)
• plotClustering.m (given clustered data and ground truth, plots clustering results)
• repKmeans.m (automatically repeats k-means ntimes and keeps the result with lowest SSE)
• spectralDemo.m (spectral clustering demo)
• SSE.m (given clustered data and centroids, calculates total SSE)
• uAveragedAccuracy.m (another way to calculate clustering accuracy)
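To give an idea of what a helper like SSE.m computes, here is a minimal sketch of the sum of squared errors for a given clustering. It is not the code shipped with the demos: the function name `sseSketch`, its argument order, and the one-point-per-row data layout are all assumptions made for illustration.

```octave
1;  % keeps Octave from treating this file as a function file

% Hypothetical helper, NOT the SSE.m from the demo archive.
% X      : n-by-d data matrix, one point per row (assumed layout)
% labels : n-by-1 vector of cluster indices in 1..k
% C      : k-by-d matrix of centroids, one centroid per row
function total = sseSketch(X, labels, C)
  diffs = X - C(labels, :);     % each point minus its assigned centroid
  total = sum(diffs(:) .^ 2);   % sum of squared Euclidean distances
end

% Toy data: two clusters of two points each
X = [0 0; 0 2; 10 0; 10 2];
labels = [1; 1; 2; 2];
C = [0 1; 10 1];
total = sseSketch(X, labels, C);
```

A lower total means tighter clusters, which is why repeating k-means from different random starts and keeping the run with the lowest SSE (as repKmeans.m is described to do) is a common way to avoid poor local optima.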

To run the code just open Octave, cd into the spectralClustering directory, and call the different functions. The demo files are executed in the following ways:

```
kMeansDemo1(dataset);
% - or -
spectralDemo(dataset, nn, t);
```

where dataset is one of the files in the data directory, nn is the number of nearest neighbors used to build the adjacency matrix, and t is the parameter of the Gaussian kernel used to compute similarities in the weighted adjacency matrix (0 enables auto-tuning, which usually works well enough). For example:

```
kMeansDemo1('./data/blobs03.mat');
% - or -
spectralDemo('./data/circles02.mat', 10, 0);
```
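To make the roles of nn and t more concrete, the following is a rough sketch of how a weighted nearest-neighbor adjacency matrix can be built. It is not taken from the demo code: the function name `knnGaussianAdjacency` is hypothetical, the kernel form exp(-d^2 / (2 t^2)) is an assumption about how t enters the similarity, and the auto-tuning behavior for t = 0 is not reproduced (the sketch expects t > 0).

```octave
1;  % keeps Octave from treating this file as a function file

% Hypothetical helper, NOT the code used by spectralDemo.
% X  : n-by-d data matrix, one point per row (assumed layout)
% nn : number of nearest neighbors kept for each point (nn < n)
% t  : Gaussian kernel width, must be > 0 (auto-tuning is not sketched here)
function W = knnGaussianAdjacency(X, nn, t)
  n = size(X, 1);
  sq = sum(X .^ 2, 2);
  D2 = max(sq + sq' - 2 * (X * X'), 0);  % pairwise squared distances
  D2(1:n+1:end) = Inf;                   % a point is not its own neighbor
  [~, idx] = sort(D2, 2);                % neighbors sorted by distance, per row
  W = zeros(n);
  for i = 1:n
    nbrs = idx(i, 1:nn);
    W(i, nbrs) = exp(-D2(i, nbrs) / (2 * t^2));  % assumed kernel form
  end
  W = max(W, W');  % symmetrize: keep an edge if either endpoint selected it
end

% Toy data: two well-separated groups of three points
X = [0 0; 1 0; 0 1; 10 10; 11 10; 10 11];
W = knnGaussianAdjacency(X, 2, 1);
```

With a small nn the graph stays sparse and the two groups share no edges; increasing nn eventually connects them, which is one reason the choice of nn matters. An unnormalized graph Laplacian could then be formed as `diag(sum(W, 2)) - W`, although laplacian.m may well use a different (for example normalized) variant.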

Good values of nn for these examples range between 10 and 40. I will let you experiment to find which ones work better for each dataset ;-). Now feel free to play with the demos or read the following tutorials: