Algorithms_in_C 1.0.0
Set of algorithms implemented in C.
kohonen_som_trace.c File Reference

Kohonen self organizing map (data tracing)

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <omp.h>

Macros

#define _USE_MATH_DEFINES
 required for MS Visual C
 
#define max(a, b)   (((a) > (b)) ? (a) : (b))
 shorthand for maximum value
 
#define min(a, b)   (((a) < (b)) ? (a) : (b))
 shorthand for minimum value
 

Functions

double _random (double a, double b)
 Helper function to generate a random number in a given interval.
 
int save_nd_data (const char *fname, double **X, int num_points, int num_features)
 Save a given n-dimensional data matrix to file.
 
void kohonen_get_min_1d (double const *X, int N, double *val, int *idx)
 Get minimum value and index of the value in a vector.
 
void kohonen_update_weights (double const *x, double *const *W, double *D, int num_out, int num_features, double alpha, int R)
 Update weights of the SOM using Kohonen algorithm.
 
void kohonen_som_tracer (double **X, double *const *W, int num_samples, int num_features, int num_out, double alpha_min)
 Apply incremental algorithm with updating neighborhood and learning rates on all samples in the given dataset.
 
void test_circle (double *const *data, int N)
 Creates a random set of points distributed near the circumference of a circle.
 
void test1 ()
 Test that creates a random set of points distributed near the circumference of a circle and trains an SOM that finds that circular pattern.
 
void test_lamniscate (double *const *data, int N)
 Creates a random set of points distributed near the locus of the lemniscate of Gerono.
 
void test2 ()
 Test that creates a random set of points distributed near the locus of the lemniscate of Gerono and trains an SOM that finds that pattern.
 
void test_3d_classes (double *const *data, int N)
 Creates a random set of points distributed in four clusters in 3D space.
 
void test3 ()
 Test that creates a random set of points distributed in four clusters in 3D space and trains an SOM on them.
 
double get_clock_diff (clock_t start_t, clock_t end_t)
 Convert clock cycle difference to time in seconds.
 
int main (int argc, char **argv)
 Main function.
 

Detailed Description

Kohonen self organizing map (data tracing)

This example implements a self organizing map (SOM) trained with the Kohonen algorithm. The algorithm fits a connected chain of weight vectors (nodes) to the given data points, so that the trained chain traces the shape of the input data.
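
The body of kohonen_update_weights() is not reproduced on this page, so the following is only a minimal sketch of what a single training step on a chain of nodes can look like. It assumes a squared Euclidean distance and a hard neighbourhood window of R nodes on either side of the winner; the name som_chain_step is hypothetical and does not appear in the file.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper (not the file's kohonen_update_weights): performs one
 * training step of a 1D chain of num_out nodes on a single sample x and
 * returns the index of the best matching node. */
int som_chain_step(const double *x, double **W, int num_out, int num_features,
                   double alpha, int R)
{
    assert(num_out > 0 && num_features > 0 && R >= 0);

    /* Step 1: find the node whose weights are closest to x
     * (assumed metric: squared Euclidean distance). */
    int best = 0;
    double best_d = DBL_MAX;
    for (int j = 0; j < num_out; j++)
    {
        double d = 0.0;
        for (int k = 0; k < num_features; k++)
        {
            double diff = W[j][k] - x[k];
            d += diff * diff;
        }
        if (d < best_d)
        {
            best_d = d;
            best = j;
        }
    }

    /* Step 2: pull the winner and its chain neighbours within R towards x. */
    int from = (best - R < 0) ? 0 : best - R;
    int to = (best + R >= num_out) ? num_out - 1 : best + R;
    for (int j = from; j <= to; j++)
        for (int k = 0; k < num_features; k++)
            W[j][k] += alpha * (x[k] - W[j][k]);

    return best;
}
```

Calling such a step repeatedly over all samples, while shrinking alpha and R, is the general idea behind the decreasing learning rate and neighbourhood mentioned for kohonen_som_tracer(); the exact schedule used by the file is not shown here.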

Author
Krishna Vedala
See also
kohonen_som_topology.c

Function Documentation

◆ get_clock_diff()

double get_clock_diff ( clock_t  start_t,
clock_t  end_t 
)

Convert clock cycle difference to time in seconds.

Parameters
[in]  start_t  start clock
[in]  end_t    end clock
Returns
time difference in seconds
{
    return (double)(end_t - start_t) / (double)CLOCKS_PER_SEC;
}
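
As a usage illustration, the conversion above can be wrapped in a small timing helper. This is a sketch only: time_call and demo_workload are hypothetical names introduced here, and get_clock_diff() is repeated so the block compiles on its own.

```c
#include <assert.h>
#include <time.h>

/* Same conversion as get_clock_diff() documented above. */
double get_clock_diff(clock_t start_t, clock_t end_t)
{
    return (double)(end_t - start_t) / (double)CLOCKS_PER_SEC;
}

/* Hypothetical helper: processor time, in seconds, spent in one call to fn. */
double time_call(void (*fn)(void))
{
    assert(fn != NULL);
    clock_t start = clock();
    fn();
    return get_clock_diff(start, clock());
}

/* Example workload used only for demonstration. */
void demo_workload(void)
{
    volatile double acc = 0.0;
    for (int i = 0; i < 1000000; i++) acc += i * 0.5;
}
```

Note that clock() reports processor time rather than wall-clock time. On implementations where that time is accumulated over all threads, an OpenMP build may report more seconds than actually elapsed; omp_get_wtime() is the OpenMP routine for wall-clock measurements.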

◆ main()

int main ( int  argc,
char **  argv 
)

Main function.

{
#ifdef _OPENMP
    printf("Using OpenMP based parallelization\n");
#else
    printf("NOT using OpenMP based parallelization\n");
#endif
    clock_t start_clk = clock();
    test1();
    clock_t end_clk = clock();
    printf("Test 1 completed in %.4g sec\n",
           get_clock_diff(start_clk, end_clk));
    start_clk = clock();
    test2();
    end_clk = clock();
    printf("Test 2 completed in %.4g sec\n",
           get_clock_diff(start_clk, end_clk));
    start_clk = clock();
    test3();
    end_clk = clock();
    printf("Test 3 completed in %.4g sec\n",
           get_clock_diff(start_clk, end_clk));
    printf(
        "(Note: Calculated times include: creating test sets, training "
        "model and writing files to disk.)\n\n");
    return 0;
}

◆ test1()

void test1 ( )

Test that creates a random set of points distributed near the circumference of a circle and trains an SOM that finds that circular pattern.

The following CSV files are created to validate the execution:

  • test1.csv: random test sample points arranged in a circular pattern
  • w11.csv: initial random map
  • w12.csv: trained SOM map

The outputs can be readily plotted in gnuplot using the following snippet:

set datafile separator ','
plot "test1.csv" title "original", \
     "w11.csv" title "w1", \
     "w12.csv" title "w2"

[Figure: sample execution output]

{
    int j, N = 500;
    int features = 2;
    int num_out = 50;

    // N row pointers, one per 2D sample
    double **X = (double **)malloc(N * sizeof(double *));

    // num_out row pointers, one per SOM node
    double **W = (double **)malloc(num_out * sizeof(double *));

    for (int i = 0; i < max(num_out, N); i++)  // loop till max(N, num_out)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            W[i] = (double *)malloc(features * sizeof(double));
#ifdef _OPENMP
#pragma omp for
#endif
            // preallocate with random initial weights
            for (j = 0; j < features; j++) W[i][j] = _random(-1, 1);
        }
    }

    test_circle(X, N);  // create test data around circumference of a circle
    save_nd_data("test1.csv", X, N, features);  // save test data points
    save_nd_data("w11.csv", W, num_out,
                 features);  // save initial random weights
    kohonen_som_tracer(X, W, N, features, num_out, 0.1);  // train the SOM
    save_nd_data("w12.csv", W, num_out,
                 features);  // save the resultant weights

    for (int i = 0; i < max(num_out, N); i++)
    {
        if (i < N)
            free(X[i]);
        if (i < num_out)
            free(W[i]);
    }
    free(X);  // also release the row-pointer arrays, as test2 and test3 do
    free(W);
}
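
The three tests share the same allocate, fill and free pattern for X and W. As a sketch of one way to factor that out, the helpers below allocate and release a matrix stored as an array of row pointers. The names alloc_matrix and free_matrix are hypothetical and are not part of the file; unlike the listing above, the sketch also checks the allocation results.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a rows x cols matrix of doubles as an array
 * of row pointers, zero-initialised. Returns NULL if any allocation fails. */
double **alloc_matrix(int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    double **M = (double **)malloc((size_t)rows * sizeof(double *));
    if (M == NULL)
        return NULL;
    for (int i = 0; i < rows; i++)
    {
        M[i] = (double *)calloc((size_t)cols, sizeof(double));
        if (M[i] == NULL)
        {
            /* roll back the rows allocated so far */
            while (i-- > 0) free(M[i]);
            free(M);
            return NULL;
        }
    }
    return M;
}

/* Hypothetical helper: release every row, then the row-pointer array. */
void free_matrix(double **M, int rows)
{
    if (M == NULL)
        return;
    for (int i = 0; i < rows; i++) free(M[i]);
    free(M);
}
```

With helpers like these, a test would call alloc_matrix(N, features) for the samples and alloc_matrix(num_out, features) for the weights, and pair each with one free_matrix() call, which makes it harder to forget the final free of the pointer array.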

◆ test2()

void test2 ( )

Test that creates a random set of points distributed near the locus of the lemniscate of Gerono and trains an SOM that finds that pattern.

The following CSV files are created to validate the execution:

  • test2.csv: random test sample points arranged in a lemniscate pattern
  • w21.csv: initial random map
  • w22.csv: trained SOM map

The outputs can be readily plotted in gnuplot using the following snippet:

set datafile separator ','
plot "test2.csv" title "original", \
     "w21.csv" title "w1", \
     "w22.csv" title "w2"

[Figure: sample execution output]

{
    int j, N = 500;
    int features = 2;
    int num_out = 20;
    double **X = (double **)malloc(N * sizeof(double *));
    double **W = (double **)malloc(num_out * sizeof(double *));
    for (int i = 0; i < max(num_out, N); i++)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            W[i] = (double *)malloc(features * sizeof(double));

#ifdef _OPENMP
#pragma omp for
#endif
            // preallocate with random initial weights
            for (j = 0; j < features; j++) W[i][j] = _random(-1, 1);
        }
    }

    test_lamniscate(X, N);  // create test data around the lemniscate
    save_nd_data("test2.csv", X, N, features);  // save test data points
    save_nd_data("w21.csv", W, num_out,
                 features);  // save initial random weights
    kohonen_som_tracer(X, W, N, features, num_out, 0.01);  // train the SOM
    save_nd_data("w22.csv", W, num_out,
                 features);  // save the resultant weights

    for (int i = 0; i < max(num_out, N); i++)
    {
        if (i < N)
            free(X[i]);
        if (i < num_out)
            free(W[i]);
    }
    free(X);
    free(W);
}

◆ test3()

void test3 ( )

Test that creates a random set of points distributed in four clusters in 3D space and trains an SOM on them.

The following CSV files are created to validate the execution:

  • test3.csv: random test sample points grouped in four 3D clusters
  • w31.csv: initial random map
  • w32.csv: trained SOM map

The outputs can be readily plotted in gnuplot using the following snippet:

set datafile separator ','
plot "test3.csv" title "original", \
     "w31.csv" title "w1", \
     "w32.csv" title "w2"

[Figure: sample execution output]

{
    int j, N = 200;
    int features = 3;
    int num_out = 20;
    double **X = (double **)malloc(N * sizeof(double *));
    double **W = (double **)malloc(num_out * sizeof(double *));
    for (int i = 0; i < max(num_out, N); i++)
    {
        if (i < N)  // only add new arrays if i < N
            X[i] = (double *)malloc(features * sizeof(double));
        if (i < num_out)  // only add new arrays if i < num_out
        {
            W[i] = (double *)malloc(features * sizeof(double));

#ifdef _OPENMP
#pragma omp for
#endif
            // preallocate with random initial weights
            for (j = 0; j < features; j++) W[i][j] = _random(-1, 1);
        }
    }

    test_3d_classes(X, N);  // create test data in four 3D clusters
    save_nd_data("test3.csv", X, N, features);  // save test data points
    save_nd_data("w31.csv", W, num_out,
                 features);  // save initial random weights
    kohonen_som_tracer(X, W, N, features, num_out, 0.01);  // train the SOM
    save_nd_data("w32.csv", W, num_out,
                 features);  // save the resultant weights

    for (int i = 0; i < max(num_out, N); i++)
    {
        if (i < N)
            free(X[i]);
        if (i < num_out)
            free(W[i]);
    }
    free(X);
    free(W);
}

◆ test_3d_classes()

void test_3d_classes ( double *const *  data,
int  N 
)

Creates a random set of points distributed in four clusters in 3D space with centroids at the following points:

  • \((0.5, 0.5, 0.5)\)
  • \((0.5, -0.5, -0.5)\)
  • \((-0.5, 0.5, 0.5)\)
  • \((-0.5, -0.5, -0.5)\)
Parameters
[out]  data  matrix to store data in
[in]   N     number of points required
{
    const double R = 0.1;  // radius of cluster
    int i;
    const int num_classes = 4;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 1
        {.5, -.5, -.5},  // centre of class 2
        {-.5, .5, .5},   // centre of class 3
        {-.5, -.5, -.5}  // centre of class 4
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        int class =
            rand() % num_classes;  // select a random class for the point

        // create random coordinates (x,y,z) around the centre of the class
        data[i][0] = _random(centres[class][0] - R, centres[class][0] + R);
        data[i][1] = _random(centres[class][1] - R, centres[class][1] + R);
        data[i][2] = _random(centres[class][2] - R, centres[class][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[i][j] = _random(centres[class][j] - R, centres[class][j] + R);
        */
    }
}

◆ test_circle()

void test_circle ( double *const *  data,
int  N 
)

Creates a random set of points distributed near the circumference of a circle.

The generating function is

\begin{eqnarray*} R &=& 0.75\\ \delta r &=& 0.3\\ r &\in& [R-\delta r, R+\delta r)\\ \theta &\in& [0, 2\pi)\\ x &=& r\cos\theta\\ y &=& r\sin\theta \end{eqnarray*}

Parameters
[out]  data  matrix to store data in
[in]   N     number of points required
{
    const double R = 0.75, dr = 0.3;
    double a_t = 0., b_t = 2.0 * M_PI;  // theta random between 0 and 2*pi
    double a_r = R - dr, b_r = R + dr;  // radius random between R-dr and R+dr
    int i;

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        double r = _random(a_r, b_r);      // random radius
        double theta = _random(a_t, b_t);  // random theta
        data[i][0] = r * cos(theta);       // convert from polar to cartesian
        data[i][1] = r * sin(theta);
    }
}

◆ test_lamniscate()

void test_lamniscate ( double *const *  data,
int  N 
)

Creates a random set of points distributed near the locus of the lemniscate of Gerono.

\begin{eqnarray*} \delta r &=& 0.2\\ \delta x &\in& [-\delta r, \delta r)\\ \delta y &\in& [-\delta r, \delta r)\\ \theta &\in& [0, \pi)\\ x &=& \delta x + \cos\theta\\ y &=& \delta y + \frac{\sin(2\theta)}{2} \end{eqnarray*}

Parameters
[out]  data  matrix to store data in
[in]   N     number of points required
{
    const double dr = 0.2;
    int i;

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++)
    {
        double dx = _random(-dr, dr);     // random change in x
        double dy = _random(-dr, dr);     // random change in y
        double theta = _random(0, M_PI);  // random theta
        data[i][0] = dx + cos(theta);     // parametric x plus noise
        data[i][1] = dy + sin(2. * theta) / 2.0;  // parametric y plus noise
    }
}