Algorithms_in_C  1.0.0
Set of algorithms implemented in C.
kohonen_som_trace.c File Reference

Kohonen self-organizing map (data tracing)

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

Macros

#define _USE_MATH_DEFINES
 required for MS Visual C
 
#define max(a, b)   (((a) > (b)) ? (a) : (b))
 shorthand for maximum value
 
#define min(a, b)   (((a) < (b)) ? (a) : (b))
 shorthand for minimum value
 

Functions

double _random (double a, double b)
 Helper function to generate a random number in a given interval.
 
int save_nd_data (const char *fname, double **X, int num_points, int num_features)
 Save a given n-dimensional data matrix to file.
 
void kohonen_get_min_1d (double const *X, int N, double *val, int *idx)
 Get minimum value and index of the value in a vector.
 
void kohonen_update_weights (double const *x, double *const *W, double *D, int num_out, int num_features, double alpha, int R)
 Update weights of the SOM using Kohonen algorithm.
 
void kohonen_som_tracer (double **X, double *const *W, int num_samples, int num_features, int num_out, double alpha_min)
 Apply incremental algorithm with updating neighborhood and learning rates on all samples in the given dataset.
 
void test_circle (double *const *data, int N)
 Creates a random set of points distributed near the circumference of a circle.
 
void test1 ()
 Test that creates a random set of points distributed near the circumference of a circle and trains an SOM that finds that circular pattern.
 
void test_lamniscate (double *const *data, int N)
 Creates a random set of points distributed near the locus of the Lemniscate of Gerono.
 
void test2 ()
 Test that creates a random set of points distributed near the locus of the Lemniscate of Gerono and trains an SOM that finds that pattern.
 
void test_3d_classes (double *const *data, int N)
 Creates a random set of points distributed in four clusters in 3D space.
 
void test3 ()
 Test that creates a random set of points distributed in four clusters in 3D space and trains an SOM on them.
 
double get_clock_diff (clock_t start_t, clock_t end_t)
 Convert clock cycle difference to time in seconds.
 
int main (int argc, char **argv)
 Main function.
 

Detailed Description

Kohonen self-organizing map (data tracing)

This example implements a self-organizing map (SOM) algorithm. The algorithm creates a connected network of weights that closely follows the given data points. This creates a chain of nodes that resembles the given input shape.
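
As a rough illustration of the idea described above, the following minimal sketch performs one training step on a 1-D chain of nodes: it finds the node closest to a sample, then pulls that node and its neighbours within R positions towards the sample. It is not the implementation in kohonen_som_trace.c; the names som_step, NUM_OUT and NUM_FEAT are hypothetical and the fixed-size array is only for brevity.

```c
#include <assert.h>

#define NUM_OUT 5  /* hypothetical: number of nodes in the chain */
#define NUM_FEAT 2 /* hypothetical: number of features per sample */

/* One training step on a chain of nodes.
 * Returns the index of the best matching (closest) node. */
int som_step(const double *x, double W[NUM_OUT][NUM_FEAT], double alpha, int R)
{
    assert(R >= 0 && alpha >= 0.0 && alpha <= 1.0);

    int best = 0;
    double best_d = -1.0;

    /* 1. find the node with the smallest squared Euclidean distance to x */
    for (int j = 0; j < NUM_OUT; j++)
    {
        double d = 0.0;
        for (int k = 0; k < NUM_FEAT; k++)
            d += (W[j][k] - x[k]) * (W[j][k] - x[k]);
        if (best_d < 0.0 || d < best_d)
        {
            best_d = d;
            best = j;
        }
    }

    /* 2. move the winner and its chain neighbours a fraction alpha towards x */
    int from = (best - R < 0) ? 0 : best - R;
    int to = (best + R > NUM_OUT - 1) ? NUM_OUT - 1 : best + R;
    for (int j = from; j <= to; j++)
        for (int k = 0; k < NUM_FEAT; k++)
            W[j][k] += alpha * (x[k] - W[j][k]);

    return best;
}
```

For example, with the chain initialised to (0,0), (1,0), (2,0), (3,0), (4,0) and the sample (2,1), calling som_step(x, W, 0.5, 1) returns 2 and moves nodes 1 to 3 to (1.5,0.5), (2,0.5) and (2.5,0.5). Repeating such steps over all samples while shrinking alpha and R is what makes the chain settle onto the shape of the data.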

Author
Krishna Vedala
See also
kohonen_som_topology.c

Function Documentation

◆ get_clock_diff()

double get_clock_diff ( clock_t  start_t,
clock_t  end_t 
)

Convert clock cycle difference to time in seconds.

Parameters
[in]  start_t  start clock
[in]  end_t    end clock
Returns
time difference in seconds
{
    return (double)(end_t - start_t) / (double)CLOCKS_PER_SEC;
}
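
A minimal usage sketch follows, assuming the conversion shown above. The helper time_call and the workload busy_work are hypothetical names introduced only for illustration. Note that clock() reports processor time used by the program, not wall-clock time.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Same conversion as documented above: clock ticks to seconds. */
double get_clock_diff(clock_t start_t, clock_t end_t)
{
    return (double)(end_t - start_t) / (double)CLOCKS_PER_SEC;
}

/* Hypothetical workload, used only so that there is something to time. */
void busy_work(void)
{
    volatile double acc = 0.0;
    for (int i = 0; i < 1000000; i++) acc += i * 0.5;
}

/* Hypothetical helper: run fn once and return the CPU seconds it consumed. */
double time_call(void (*fn)(void))
{
    assert(fn != NULL);
    clock_t start = clock();
    fn();
    clock_t end = clock();
    return get_clock_diff(start, end);
}
```

Calling time_call(busy_work) returns the elapsed processor time in seconds; the exact value depends on the machine.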

◆ test1()

void test1 ( )

Test that creates a random set of points distributed near the circumference of a circle and trains an SOM that finds that circular pattern.

The following CSV files are created to validate the execution:

  • test1.csv: random test sample points with a circular pattern
  • w11.csv: initial random map
  • w12.csv: trained SOM map
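
The files listed above hold one point per row with comma-separated coordinates. The sketch below shows one way such a file could be written; write_csv is a hypothetical stand-in and is not the save_nd_data function of this file, whose body is not shown here. The "%.4g" number format is likewise an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical CSV writer: one row per point, values separated by commas.
 * Returns 0 on success and -1 if the file cannot be opened. */
int write_csv(const char *fname, double *const *X, int num_points,
              int num_features)
{
    assert(fname != NULL && X != NULL);

    FILE *fp = fopen(fname, "w");
    if (!fp) return -1;

    for (int i = 0; i < num_points; i++)
    {
        for (int j = 0; j < num_features; j++)
        {
            fprintf(fp, "%.4g", X[i][j]);
            if (j < num_features - 1) fputc(',', fp);  /* no trailing comma */
        }
        fputc('\n', fp);
    }

    fclose(fp);
    return 0;
}
```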

The outputs can be readily plotted in gnuplot using the following snippet

set datafile separator ','
plot "test1.csv" title "original", \
"w11.csv" title "w1", \
"w12.csv" title "w2"

[Figure: sample execution output]

{
    int j, N = 500;
    int features = 2;
    int num_out = 50;

    // 2D space, hence size = number of rows * 2
    double **X = (double **)malloc(N * sizeof(double *));

    // number of clusters nodes * 2
    double **W = (double **)malloc(num_out * sizeof(double *));

    for (int i = 0; i < max(num_out, N); i++)  // loop till max(N, num_out)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            W[i] = (double *)malloc(features * sizeof(double));
#ifdef _OPENMP
#pragma omp for
#endif
            // preallocate with random initial weights
            for (j = 0; j < features; j++) W[i][j] = _random(-1, 1);
        }
    }

    test_circle(X, N);  // create test data around circumference of a circle
    save_nd_data("test1.csv", X, N, features);  // save test data points
    save_nd_data("w11.csv", W, num_out,
                 features);  // save initial random weights
    kohonen_som_tracer(X, W, N, features, num_out, 0.1);  // train the SOM
    save_nd_data("w12.csv", W, num_out,
                 features);  // save the resultant weights

    for (int i = 0; i < max(num_out, N); i++)
    {
        if (i < N)
            free(X[i]);
        if (i < num_out)
            free(W[i]);
    }
    free(X);  // also release the arrays of row pointers
    free(W);
}
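
The allocation and cleanup pattern used by the tests can be factored into a pair of helpers. The sketch below is one possible shape for them; alloc_matrix and free_matrix are hypothetical names that do not exist in this file. Each row needs its own free call, followed by one more for the array of row pointers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a rows x cols matrix of doubles, zero-filled.
 * Returns NULL on failure, after releasing any rows already allocated. */
double **alloc_matrix(int rows, int cols)
{
    assert(rows > 0 && cols > 0);

    double **M = (double **)malloc(rows * sizeof(double *));
    if (!M) return NULL;

    for (int i = 0; i < rows; i++)
    {
        M[i] = (double *)calloc(cols, sizeof(double));
        if (!M[i])
        {
            while (i-- > 0) free(M[i]);  /* undo the rows allocated so far */
            free(M);
            return NULL;
        }
    }
    return M;
}

/* Hypothetical helper: release every row, then the array of row pointers. */
void free_matrix(double **M, int rows)
{
    if (!M) return;
    for (int i = 0; i < rows; i++) free(M[i]);
    free(M);
}
```

With these helpers, a test would call alloc_matrix(N, features) and alloc_matrix(num_out, features) at the start and free_matrix on both at the end, which also makes an allocation failure detectable before the data is used.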

◆ test2()

void test2 ( )

Test that creates a random set of points distributed near the locus of the Lemniscate of Gerono and trains an SOM that finds that pattern.

The following CSV files are created to validate the execution:

  • test2.csv: random test sample points with a lemniscate pattern
  • w21.csv: initial random map
  • w22.csv: trained SOM map

The outputs can be readily plotted in gnuplot using the following snippet

set datafile separator ','
plot "test2.csv" title "original", \
"w21.csv" title "w1", \
"w22.csv" title "w2"

[Figure: sample execution output]

{
    int j, N = 500;
    int features = 2;
    int num_out = 20;
    double **X = (double **)malloc(N * sizeof(double *));
    double **W = (double **)malloc(num_out * sizeof(double *));
    for (int i = 0; i < max(num_out, N); i++)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            W[i] = (double *)malloc(features * sizeof(double));

#ifdef _OPENMP
#pragma omp for
#endif
            // preallocate with random initial weights
            for (j = 0; j < features; j++) W[i][j] = _random(-1, 1);
        }
    }

    test_lamniscate(X, N);  // create test data around the lemniscate
    save_nd_data("test2.csv", X, N, features);  // save test data points
    save_nd_data("w21.csv", W, num_out,
                 features);  // save initial random weights
    kohonen_som_tracer(X, W, N, features, num_out, 0.01);  // train the SOM
    save_nd_data("w22.csv", W, num_out,
                 features);  // save the resultant weights

    for (int i = 0; i < max(num_out, N); i++)
    {
        if (i < N)
            free(X[i]);
        if (i < num_out)
            free(W[i]);
    }
    free(X);
    free(W);
}

◆ test3()

void test3 ( )

Test that creates a random set of points distributed in four clusters in 3D space and trains an SOM on them.

The following CSV files are created to validate the execution:

  • test3.csv: random test sample points in four 3D clusters
  • w31.csv: initial random map
  • w32.csv: trained SOM map

The outputs can be readily plotted in gnuplot using the following snippet

set datafile separator ','
plot "test3.csv" title "original", \
"w31.csv" title "w1", \
"w32.csv" title "w2"

[Figure: sample execution output]

{
    int j, N = 200;
    int features = 3;
    int num_out = 20;
    double **X = (double **)malloc(N * sizeof(double *));
    double **W = (double **)malloc(num_out * sizeof(double *));
    for (int i = 0; i < max(num_out, N); i++)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            W[i] = (double *)malloc(features * sizeof(double));

#ifdef _OPENMP
#pragma omp for
#endif
            // preallocate with random initial weights
            for (j = 0; j < features; j++) W[i][j] = _random(-1, 1);
        }
    }

    test_3d_classes(X, N);  // create test data in four 3D clusters
    save_nd_data("test3.csv", X, N, features);  // save test data points
    save_nd_data("w31.csv", W, num_out,
                 features);  // save initial random weights
    kohonen_som_tracer(X, W, N, features, num_out, 0.01);  // train the SOM
    save_nd_data("w32.csv", W, num_out,
                 features);  // save the resultant weights

    for (int i = 0; i < max(num_out, N); i++)
    {
        if (i < N)
            free(X[i]);
        if (i < num_out)
            free(W[i]);
    }
    free(X);
    free(W);
}

◆ test_3d_classes()

void test_3d_classes ( double *const *  data,
int  N 
)

Creates a random set of points distributed in four clusters in 3D space with centroids at the points:

  • \((0.5, 0.5, 0.5)\)
  • \((0.5, -0.5, -0.5)\)
  • \((-0.5, 0.5, 0.5)\)
  • \((-0.5, -0.5, -0.5)\)
Parameters
[out]  data  matrix to store data in
[in]   N     number of points required
{
    const double R = 0.1;  // radius of cluster
    int i;
    const int num_classes = 4;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 1
        {.5, -.5, -.5},  // centre of class 2
        {-.5, .5, .5},   // centre of class 3
        {-.5, -.5, -.5}  // centre of class 4 (comma restored before last value)
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        int class =
            rand() % num_classes;  // select a random class for the point

        // create random coordinates (x,y,z) around the centre of the class
        data[i][0] = _random(centres[class][0] - R, centres[class][0] + R);
        data[i][1] = _random(centres[class][1] - R, centres[class][1] + R);
        data[i][2] = _random(centres[class][2] - R, centres[class][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[i][j] = _random(centres[class][j] - R, centres[class][j] + R);
        */
    }
}

◆ test_circle()

void test_circle ( double *const *  data,
int  N 
)

Creates a random set of points distributed near the circumference of a circle.

The generating function is

\begin{eqnarray*} R &=& 0.75,\quad \delta r = 0.3\\ r &\in& [R-\delta r, R+\delta r)\\ \theta &\in& [0, 2\pi)\\ x &=& r\cos\theta\\ y &=& r\sin\theta \end{eqnarray*}

Parameters
[out]  data  matrix to store data in
[in]   N     number of points required
{
    const double R = 0.75, dr = 0.3;
    double a_t = 0., b_t = 2. * M_PI;   // theta random between 0 and 2*pi
    double a_r = R - dr, b_r = R + dr;  // radius random between R-dr and R+dr
    int i;

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        double r = _random(a_r, b_r);      // random radius
        double theta = _random(a_t, b_t);  // random theta
        data[i][0] = r * cos(theta);       // convert from polar to cartesian
        data[i][1] = r * sin(theta);
    }
}

◆ test_lamniscate()

void test_lamniscate ( double *const *  data,
int  N 
)

Creates a random set of points distributed near the locus of the Lemniscate of Gerono.

\begin{eqnarray*} \delta r &=& 0.2\\ \delta x &\in& [-\delta r, \delta r)\\ \delta y &\in& [-\delta r, \delta r)\\ \theta &\in& [0, \pi)\\ x &=& \delta x + \cos\theta\\ y &=& \delta y + \frac{\sin(2\theta)}{2} \end{eqnarray*}
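
As a quick check that is not part of the original documentation: ignoring the noise terms \(\delta x\) and \(\delta y\), and using \(\sin(2\theta)/2 = \sin\theta\cos\theta\), the parametric form satisfies the implicit equation of the Lemniscate of Gerono, \(x^4 = x^2 - y^2\):

\begin{eqnarray*} x^2 - y^2 &=& \cos^2\theta - \sin^2\theta\cos^2\theta\\ &=& \cos^2\theta\,(1 - \sin^2\theta)\\ &=& \cos^4\theta\\ &=& x^4 \end{eqnarray*}

Since \(\sin\theta \ge 0\) on \([0, \pi)\), \(y\) has the same sign as \(x\) there, so this range of \(\theta\) covers only the branch \(y = x\sqrt{1 - x^2}\); the full figure-eight would need \(\theta \in [0, 2\pi)\).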

Parameters
[out]  data  matrix to store data in
[in]   N     number of points required
{
    const double dr = 0.2;
    int i;

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        double dx = _random(-dr, dr);     // random change in x
        double dy = _random(-dr, dr);     // random change in y
        double theta = _random(0, M_PI);  // random theta
        data[i][0] = dx + cos(theta);     // parametric form of the lemniscate
        data[i][1] = dy + sin(2. * theta) / 2.;
    }
}