Algorithms_in_C  1.0.0
Set of algorithms implemented in C.
kohonen_som_topology.c File Reference

Kohonen self-organizing map (topological map)

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

Data Structures

struct  kohonen_array_3d
 to store info regarding 3D arrays
 

Macros

#define _USE_MATH_DEFINES
 required for MS Visual C
 
#define max(a, b)   (((a) > (b)) ? (a) : (b))
 shorthand for maximum value
 
#define min(a, b)   (((a) < (b)) ? (a) : (b))
 shorthand for minimum value
 

Functions

double * kohonen_data_3d (const struct kohonen_array_3d *arr, int x, int y, int z)
 Function that returns the pointer to the (x, y, z)-th location in the linear 3D array.
 
double _random (double a, double b)
 Helper function to generate a random number in a given interval.
 
int save_2d_data (const char *fname, double **X, int num_points, int num_features)
 Save a given n-dimensional data matrix to file.
 
int save_u_matrix (const char *fname, struct kohonen_array_3d *W)
 Create the distance matrix or U-matrix from the trained weights and save to disk.
 
void get_min_2d (double **X, int N, double *val, int *x_idx, int *y_idx)
 Get minimum value and index of the value in a matrix.
 
double kohonen_update_weights (const double *X, struct kohonen_array_3d *W, double **D, int num_out, int num_features, double alpha, int R)
 Update weights of the SOM using Kohonen algorithm.
 
void kohonen_som (double **X, struct kohonen_array_3d *W, int num_samples, int num_features, int num_out, double alpha_min)
 Apply incremental algorithm with updating neighborhood and learning rates on all samples in the given dataset.
 
void test_2d_classes (double *const *data, int N)
 Creates a random set of points distributed in four clusters in 2D space with centroids at the points.
 
void test1 ()
 Test that creates a random set of points distributed in four clusters in 2D space and trains an SOM that finds the topological pattern.
 
void test_3d_classes1 (double *const *data, int N)
 Creates a random set of points distributed in four clusters in 3D space with centroids at the points.
 
void test2 ()
 Test that creates a random set of points distributed in four clusters in 3D space and trains an SOM that finds the topological pattern.
 
void test_3d_classes2 (double *const *data, int N)
 Creates a random set of points distributed in eight clusters in 3D space with centroids at the points.
 
void test3 ()
 Test that creates a random set of points distributed in eight clusters in 3D space and trains an SOM that finds the topological pattern.
 
double get_clock_diff (clock_t start_t, clock_t end_t)
 Convert clock cycle difference to time in seconds.
 
int main (int argc, char **argv)
 Main function.
 

Detailed Description

Kohonen self organizing map (topological map)

This example implements a powerful unsupervised learning algorithm called a self-organizing map. The algorithm creates a connected network of weights that closely follows the given data points. The result is a topological map of the data, i.e., it preserves the relationships between data points from a much higher-dimensional space in an equivalent 2-dimensional map.

[Figure: trained topological maps for the test cases in the program]
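As a rough illustration of how such a network of weights comes to follow the data, the sketch below performs a single training step: it finds the node whose weight vector is closest to a sample (the best matching unit) and pulls that node and its grid neighbours toward the sample. The function name `som_step`, the flat row-major weight layout, and the square neighbourhood with a uniform update are assumptions made for this sketch; they are not taken from kohonen_update_weights().

```c
#include <float.h>
#include <stddef.h>

/* Hypothetical helper, not part of kohonen_som_topology.c.
 * W holds rows * cols weight vectors of length dim, stored row-major.
 * Finds the best matching unit (BMU) for sample x, moves every node within
 * `radius` grid cells of the BMU toward x by a fraction alpha, and reports
 * the BMU grid position through bmu_r and bmu_c. */
void som_step(double *W, int rows, int cols, int dim, const double *x,
              double alpha, int radius, int *bmu_r, int *bmu_c)
{
    double best = DBL_MAX;
    *bmu_r = 0;
    *bmu_c = 0;

    /* 1. search for the node with the smallest squared distance to x */
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
        {
            const double *w = W + ((size_t)r * cols + c) * dim;
            double d2 = 0.0;
            for (int k = 0; k < dim; k++)
                d2 += (w[k] - x[k]) * (w[k] - x[k]);
            if (d2 < best)
            {
                best = d2;
                *bmu_r = r;
                *bmu_c = c;
            }
        }

    /* 2. update the BMU and its neighbours, clipped at the grid border */
    for (int r = *bmu_r - radius; r <= *bmu_r + radius; r++)
    {
        if (r < 0 || r >= rows)
            continue;
        for (int c = *bmu_c - radius; c <= *bmu_c + radius; c++)
        {
            if (c < 0 || c >= cols)
                continue;
            double *w = W + ((size_t)r * cols + c) * dim;
            for (int k = 0; k < dim; k++)
                w[k] += alpha * (x[k] - w[k]);
        }
    }
}
```

Repeating this step over all samples while shrinking `alpha` and `radius` is the general idea behind incremental SOM training; the schedule used by kohonen_som() may differ.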

Author
Krishna Vedala
Warning
The MSVC 2019 compiler generates code that does not execute as expected. However, the MinGW, Clang for GCC and Clang for MSVC compilers on Windows perform as expected. Any insights and suggestions should be directed to the author.
See also
kohonen_som_trace.c

Function Documentation

◆ get_clock_diff()

double get_clock_diff ( clock_t  start_t,
clock_t  end_t 
)

Convert clock cycle difference to time in seconds.

Parameters
[in]  start_t  start clock
[in]  end_t    end clock
Returns
time difference in seconds
{
    return (double)(end_t - start_t) / (double)CLOCKS_PER_SEC;
}
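A minimal usage sketch for this conversion follows. The wrapper `time_call` and the name `clock_diff_seconds` are hypothetical and exist only to keep the example self-contained. Note that the standard `clock()` function reports processor time used by the program, not wall-clock time.

```c
#include <time.h>

/* Same conversion as get_clock_diff(), repeated here so that the sketch
 * compiles on its own */
double clock_diff_seconds(clock_t start_t, clock_t end_t)
{
    return (double)(end_t - start_t) / (double)CLOCKS_PER_SEC;
}

/* Hypothetical helper: processor time in seconds consumed by one call to
 * fn() */
double time_call(void (*fn)(void))
{
    clock_t start_t = clock();
    fn();
    clock_t end_t = clock();
    return clock_diff_seconds(start_t, end_t);
}
```

A caller could pass a test routine, for example `time_call(test1)`, and print the returned value with `printf("%.4g sec\n", ...)`.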

◆ test1()

void test1 ( )

Test that creates a random set of points distributed in four clusters in 2D space and trains an SOM that finds the topological pattern.

The following CSV files are created to validate the execution:

  • test1.csv: random test sample points in four clusters
  • w11.csv: initial random U-matrix
  • w12.csv: trained SOM U-matrix
{
    int j, N = 300;
    int features = 2;
    int num_out = 30;  // map (image) size - num_out x num_out

    // 2D space, hence size = number of rows * 2
    double **X = (double **)malloc(N * sizeof(double *));

    // cluster nodes in 'x' * cluster nodes in 'y' * 2
    struct kohonen_array_3d W;
    W.dim1 = num_out;
    W.dim2 = num_out;
    W.dim3 = features;
    W.data = (double *)malloc(num_out * num_out * features *
                              sizeof(double));  // contiguous weight storage

    for (int i = 0; i < max(num_out, N); i++)  // loop till max(N, num_out)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            for (int k = 0; k < num_out; k++)
            {
#ifdef _OPENMP
#pragma omp for
#endif
                // preallocate with random initial weights
                for (j = 0; j < features; j++)
                {
                    double *w = kohonen_data_3d(&W, i, k, j);
                    w[0] = _random(-5, 5);
                }
            }
        }
    }

    test_2d_classes(X, N);  // create test data in four 2D clusters
    save_2d_data("test1.csv", X, N, features);       // save test data points
    save_u_matrix("w11.csv", &W);                    // save initial random weights
    kohonen_som(X, &W, N, features, num_out, 1e-4);  // train the SOM
    save_u_matrix("w12.csv", &W);                    // save the resultant weights

    for (int i = 0; i < N; i++) free(X[i]);
    free(X);
    free(W.data);
}
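The test stores the weight grid W as one contiguous block of dim1 * dim2 * dim3 doubles and reaches individual elements through kohonen_data_3d(). The sketch below shows one way such an accessor can compute its offset. It assumes row-major ordering (the last index varies fastest), which this page does not confirm, and it uses the hypothetical names `grid3d` and `grid3d_at` to avoid implying that this is the library's implementation.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct kohonen_array_3d */
struct grid3d
{
    int dim1, dim2, dim3; /* lengths of the three dimensions */
    double *data;         /* dim1 * dim2 * dim3 contiguous values */
};

/* Pointer to element (x, y, z), assuming row-major order: z varies fastest,
 * then y, then x. Indices are expected to be non-negative and in range. */
double *grid3d_at(const struct grid3d *arr, int x, int y, int z)
{
    size_t offset = ((size_t)x * arr->dim2 + y) * arr->dim3 + z;
    return arr->data + offset;
}
```

With this layout, the `features` values of one map node sit next to each other in memory, so a loop over the last index walks a single weight vector.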

◆ test2()

void test2 ( )

Test that creates a random set of points distributed in four clusters in 3D space and trains an SOM that finds the topological pattern.

The following CSV files are created to validate the execution:

  • test2.csv: random test sample points
  • w21.csv: initial random U-matrix
  • w22.csv: trained SOM U-matrix
{
    int j, N = 500;
    int features = 3;
    int num_out = 30;  // map (image) size - num_out x num_out

    // 3D space, hence size = number of rows * 3
    double **X = (double **)malloc(N * sizeof(double *));

    // cluster nodes in 'x' * cluster nodes in 'y' * 3
    struct kohonen_array_3d W;
    W.dim1 = num_out;
    W.dim2 = num_out;
    W.dim3 = features;
    W.data = (double *)malloc(num_out * num_out * features *
                              sizeof(double));  // contiguous weight storage

    for (int i = 0; i < max(num_out, N); i++)  // loop till max(N, num_out)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            for (int k = 0; k < num_out; k++)
            {
#ifdef _OPENMP
#pragma omp for
#endif
                for (j = 0; j < features; j++)
                {  // preallocate with random initial weights
                    double *w = kohonen_data_3d(&W, i, k, j);
                    w[0] = _random(-5, 5);
                }
            }
        }
    }

    test_3d_classes1(X, N);  // create test data
    save_2d_data("test2.csv", X, N, features);       // save test data points
    save_u_matrix("w21.csv", &W);                    // save initial random weights
    kohonen_som(X, &W, N, features, num_out, 1e-4);  // train the SOM
    save_u_matrix("w22.csv", &W);                    // save the resultant weights

    for (int i = 0; i < N; i++) free(X[i]);
    free(X);
    free(W.data);
}

◆ test3()

void test3 ( )

Test that creates a random set of points distributed in eight clusters in 3D space and trains an SOM that finds the topological pattern.

The following CSV files are created to validate the execution:

  • test3.csv: random test sample points
  • w31.csv: initial random U-matrix
  • w32.csv: trained SOM U-matrix
{
    int j, N = 500;
    int features = 3;
    int num_out = 30;
    double **X = (double **)malloc(N * sizeof(double *));

    // cluster nodes in 'x' * cluster nodes in 'y' * 3
    struct kohonen_array_3d W;
    W.dim1 = num_out;
    W.dim2 = num_out;
    W.dim3 = features;
    W.data = (double *)malloc(num_out * num_out * features *
                              sizeof(double));  // contiguous weight storage

    for (int i = 0; i < max(num_out, N); i++)  // loop till max(N, num_out)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            for (int k = 0; k < num_out; k++)
            {
#ifdef _OPENMP
#pragma omp for
#endif
                // preallocate with random initial weights
                for (j = 0; j < features; j++)
                {
                    double *w = kohonen_data_3d(&W, i, k, j);
                    w[0] = _random(-5, 5);
                }
            }
        }
    }

    test_3d_classes2(X, N);  // create test data in eight 3D clusters
    save_2d_data("test3.csv", X, N, features);       // save test data points
    save_u_matrix("w31.csv", &W);                    // save initial random weights
    kohonen_som(X, &W, N, features, num_out, 0.01);  // train the SOM
    save_u_matrix("w32.csv", &W);                    // save the resultant weights

    for (int i = 0; i < N; i++) free(X[i]);
    free(X);
    free(W.data);
}

◆ test_2d_classes()

void test_2d_classes ( double *const *  data,
int  N 
)

Creates a random set of points distributed in four clusters in 2D space with centroids at the points.

  • \((0.5, 0.5)\)
  • \((0.5, -0.5)\)
  • \((-0.5, 0.5)\)
  • \((-0.5, -0.5)\)
Parameters
[out]  data  matrix to store data in
[in]   N     number of points required
{
    const double R = 0.3;  // radius of cluster
    int i;
    const int num_classes = 4;
    const double centres[][2] = {
        // centres of each class cluster
        {.5, .5},   // centre of class 1
        {.5, -.5},  // centre of class 2
        {-.5, .5},  // centre of class 3
        {-.5, -.5}  // centre of class 4
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        int class =
            rand() % num_classes;  // select a random class for the point

        // create random coordinates (x,y) around the centre of the class
        data[i][0] = _random(centres[class][0] - R, centres[class][0] + R);
        data[i][1] = _random(centres[class][1] - R, centres[class][1] + R);

        /* The following can also be used
        for (int j = 0; j < 2; j++)
            data[i][j] = _random(centres[class][j] - R, centres[class][j] + R);
        */
    }
}
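Because each coordinate is drawn independently from an interval of half-width R around the centre, every cluster produced this way fills a square (a cube in 3D), not a disc. The sketch below restates the sampling loop in a self-contained form. The names `uniform` and `sample_clusters_2d` are hypothetical, and `uniform` is only an assumed stand-in for `_random()`, whose body is not shown on this page.

```c
#include <stdlib.h>

/* Hypothetical stand-in for _random(): uniform value in [a, b] built on
 * rand() */
double uniform(double a, double b)
{
    return a + (b - a) * ((double)rand() / (double)RAND_MAX);
}

/* Fill data[i][0..1] with points drawn from a square of half-width R around
 * a randomly chosen centre */
void sample_clusters_2d(double *const *data, int N,
                        const double centres[][2], int num_classes, double R)
{
    for (int i = 0; i < N; i++)
    {
        int c = rand() % num_classes; /* pick a cluster for this point */
        for (int j = 0; j < 2; j++)
            data[i][j] = uniform(centres[c][j] - R, centres[c][j] + R);
    }
}
```

With centres at (±0.5, ±0.5) and R = 0.3, every coordinate has an absolute value between 0.2 and 0.8, so the four clusters never overlap.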

◆ test_3d_classes1()

void test_3d_classes1 ( double *const *  data,
int  N 
)

Creates a random set of points distributed in four clusters in 3D space with centroids at the points.

  • \((0.5, 0.5, 0.5)\)
  • \((0.5, -0.5, -0.5)\)
  • \((-0.5, 0.5, 0.5)\)
  • \((-0.5, -0.5, -0.5)\)
Parameters
[out]  data  matrix to store data in
[in]   N     number of points required
{
    const double R = 0.2;  // radius of cluster
    int i;
    const int num_classes = 4;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 1
        {.5, -.5, -.5},  // centre of class 2
        {-.5, .5, .5},   // centre of class 3
        {-.5, -.5, -.5}  // centre of class 4 (comma restored before the last value)
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        int class =
            rand() % num_classes;  // select a random class for the point

        // create random coordinates (x,y,z) around the centre of the class
        data[i][0] = _random(centres[class][0] - R, centres[class][0] + R);
        data[i][1] = _random(centres[class][1] - R, centres[class][1] + R);
        data[i][2] = _random(centres[class][2] - R, centres[class][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[i][j] = _random(centres[class][j] - R, centres[class][j] + R);
        */
    }
}

◆ test_3d_classes2()

void test_3d_classes2 ( double *const *  data,
int  N 
)

Creates a random set of points distributed in eight clusters in 3D space with centroids at the points.

  • \((0.5, 0.5, 0.5)\)
  • \((0.5, 0.5, -0.5)\)
  • \((0.5, -0.5, 0.5)\)
  • \((0.5, -0.5, -0.5)\)
  • \((-0.5, 0.5, 0.5)\)
  • \((-0.5, 0.5, -0.5)\)
  • \((-0.5, -0.5, 0.5)\)
  • \((-0.5, -0.5, -0.5)\)
Parameters
[out]  data  matrix to store data in
[in]   N     number of points required
{
    const double R = 0.2;  // radius of cluster
    int i;
    const int num_classes = 8;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 1
        {.5, .5, -.5},   // centre of class 2
        {.5, -.5, .5},   // centre of class 3
        {.5, -.5, -.5},  // centre of class 4
        {-.5, .5, .5},   // centre of class 5
        {-.5, .5, -.5},  // centre of class 6
        {-.5, -.5, .5},  // centre of class 7
        {-.5, -.5, -.5}  // centre of class 8
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        int class =
            rand() % num_classes;  // select a random class for the point

        // create random coordinates (x,y,z) around the centre of the class
        data[i][0] = _random(centres[class][0] - R, centres[class][0] + R);
        data[i][1] = _random(centres[class][1] - R, centres[class][1] + R);
        data[i][2] = _random(centres[class][2] - R, centres[class][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[i][j] = _random(centres[class][j] - R, centres[class][j] + R);
        */
    }
}
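The eight centres in the table above are the corners of a cube with side 1 centred at the origin, listed in binary counting order. As an alternative to the literal table, they can be derived from the bits of the class index. The helper name `cube_corner` is hypothetical and this is only a sketch of that observation, not code from the file.

```c
/* Hypothetical helper: centre of class c (0..7) as a corner of the cube
 * [-0.5, 0.5]^3. Bit 2 of c selects the sign of x, bit 1 of y and bit 0
 * of z; a set bit means -0.5. This reproduces the order of the table. */
void cube_corner(int c, double centre[3])
{
    for (int j = 0; j < 3; j++)
        centre[j] = ((c >> (2 - j)) & 1) ? -0.5 : 0.5;
}
```

The explicit table is easier to read and to edit by hand; the bit-based form is mainly useful if the number of dimensions is meant to vary.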