Algorithms_in_C
1.0.0
Set of algorithms implemented in C.
Kohonen self organizing map (data tracing)
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
Macros | |
#define | _USE_MATH_DEFINES |
#define | max(a, b) (((a) > (b)) ? (a) : (b)) |
#define | min(a, b) (((a) < (b)) ? (a) : (b)) |
Functions | |
double | _random (double a, double b) |
int | save_nd_data (const char *fname, double **X, int num_points, int num_features) |
void | get_min_1d (double const *X, int N, double *val, int *idx) |
void | update_weights (double const *x, double *const *W, double *D, int num_out, int num_features, double alpha, int R) |
void | kohonen_som_tracer (double **X, double *const *W, int num_samples, int num_features, int num_out, double alpha_min) |
void | test_circle (double *const *data, int N) |
void | test1 () |
void | test_lamniscate (double *const *data, int N) |
void | test2 () |
void | test_3d_classes (double *const *data, int N) |
void | test3 () |
double | get_clock_diff (clock_t start_t, clock_t end_t) |
int | main (int argc, char **argv) |
Kohonen self organizing map (data tracing)
This example implements a self organizing map algorithm. The algorithm creates a connected network of weights that closely follows the given data points. This creates a chain of nodes that resembles the given input shape.
#define _USE_MATH_DEFINES
required for MS Visual C
#define max(a, b) (((a) > (b)) ? (a) : (b))
shorthand for maximum value
#define min(a, b) (((a) < (b)) ? (a) : (b))
shorthand for minimum value
double _random(double a, double b)
Helper function to generate a random number in a given interval.
Steps:
r1 = rand() % 100: gets a random number between 0 and 99
r2 = r1 / 100: converts random number to be between 0 and 0.99
\[ y = (b - a) \times \frac{\text{(random number between 0 and RAND_MAX)} \; \text{mod}\; 100}{100} + a \]
[in] | a | lower limit |
[in] | b | upper limit |
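The steps above can be sketched in C as follows. This is a minimal illustration rather than the file's exact implementation, and the name `random_in_range` is hypothetical. Note that the `% 100` step limits the result to 100 distinct values and means the upper limit `b` itself is never returned.

```c
#include <stdlib.h>

/* Hypothetical helper that mirrors the documented steps of _random(). */
double random_in_range(double a, double b)
{
    int r1 = rand() % 100;      /* integer between 0 and 99 */
    double r2 = r1 / 100.0;     /* fraction between 0.00 and 0.99 */
    return (b - a) * r2 + a;    /* scale and shift into [a, b) */
}
```

Call `srand()` once beforehand if a different sequence is wanted on each run.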
double get_clock_diff(clock_t start_t, clock_t end_t)
void get_min_1d(double const *X, int N, double *val, int *idx)
Get minimum value and index of the value in a vector
[in] | X | vector to search |
[in] | N | number of points in the vector |
[out] | val | minimum value found |
[out] | idx | index where minimum value was found |
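A linear scan is enough to satisfy the interface described above. The sketch below uses the hypothetical name `find_min_1d`, assumes `N >= 1`, and returns the first index when several entries share the minimum; the tie-breaking rule is an assumption, since the documentation does not specify one.

```c
/* Hypothetical linear-scan counterpart of get_min_1d(); assumes N >= 1. */
void find_min_1d(const double *X, int N, double *val, int *idx)
{
    *val = X[0];
    *idx = 0;
    for (int i = 1; i < N; i++)
    {
        if (X[i] < *val) /* strict '<' keeps the first of equal minima */
        {
            *val = X[i];
            *idx = i;
        }
    }
}
```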
void kohonen_som_tracer(double **X, double *const *W, int num_samples, int num_features, int num_out, double alpha_min)
Apply incremental algorithm with updating neighborhood and learning rates on all samples in the given dataset.
[in] | X | data set |
[in,out] | W | weights matrix |
[in] | D | temporary vector to store distances |
[in] | num_samples | number of samples in the data set |
[in] | num_features | number of features per input sample |
[in] | num_out | number of output points |
[in] | alpha_min | terminal value of alpha |
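The interplay between the learning rate and the neighborhood range can be sketched by walking the schedule on its own. Everything numeric here is an assumption made for illustration: the start value of 1 for alpha, the linear decrement `alpha_step`, the initial range of `num_out / 4`, and the rule of shrinking the range on every 10th epoch. The name `som_schedule_sketch` is hypothetical, and `alpha_step` must be positive for the loop to terminate.

```c
/* Hypothetical schedule walker: returns the number of training epochs and
 * stores the neighborhood range left after the last epoch in *R_final.
 * Start values, alpha_step and the shrink interval are all assumptions. */
int som_schedule_sketch(int num_out, double alpha_min, double alpha_step,
                        int *R_final)
{
    int R = num_out / 4; /* assumed initial neighborhood range */
    int epoch = 0;

    for (double alpha = 1.0; alpha > alpha_min; alpha -= alpha_step, epoch++)
    {
        /* One pass over the data set would go here, e.g. for each sample x:
         *     update_weights(x, W, D, num_out, num_features, alpha, R);   */

        /* assumed rule: shrink the neighborhood on every 10th epoch */
        if (epoch % 10 == 0 && R > 1)
            R--;
    }

    *R_final = R;
    return epoch;
}
```

With `num_out = 20`, `alpha_min = 0.4` and `alpha_step = 0.25`, the loop runs for alpha = 1.0, 0.75 and 0.5, so three epochs are performed and the range drops once, from 5 to 4.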
int main(int argc, char **argv)
int save_nd_data(const char *fname, double **X, int num_points, int num_features)
Save a given n-dimensional data matrix to file.
[in] | fname | filename to save in (gets overwritten without confirmation) |
[in] | X | matrix to save |
[in] | num_points | rows in the matrix = number of points |
[in] | num_features | columns in the matrix = dimensions of points |
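A writer with this interface might look like the sketch below, which emits one comma-separated row per point. The name `save_csv_sketch`, the `%g` number format, and the return convention (0 on success, -1 if the file cannot be opened) are assumptions; the documentation above does not state what the return value means.

```c
#include <stdio.h>

/* Hypothetical CSV writer in the spirit of save_nd_data().
 * Returns 0 on success and -1 if the file cannot be opened (assumed). */
int save_csv_sketch(const char *fname, double **X, int num_points,
                    int num_features)
{
    FILE *fp = fopen(fname, "w"); /* "w" truncates an existing file */
    if (fp == NULL)
        return -1;

    for (int i = 0; i < num_points; i++)
    {
        for (int j = 0; j < num_features; j++)
        {
            /* comma between columns, newline after the last column */
            fprintf(fp, "%g%s", X[i][j], j + 1 < num_features ? "," : "\n");
        }
    }

    fclose(fp);
    return 0;
}
```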
void test1()
Test that creates a random set of points distributed near the circumference of a circle and trains an SOM that finds that circular pattern. The following CSV files are created to validate the execution:
test1.csv: random test sample points with a circular pattern
w11.csv: initial random map
w12.csv: trained SOM map
The outputs can be readily plotted in gnuplot.
void test2()
Test that creates a random set of points distributed near the locus of the lemniscate of Gerono and trains an SOM that finds that pattern. The following CSV files are created to validate the execution:
test2.csv: random test sample points with a lemniscate pattern
w21.csv: initial random map
w22.csv: trained SOM map
The outputs can be readily plotted in gnuplot.
void test3()
Test that creates a random set of points distributed in four clusters in 3D space. The following CSV files are created to validate the execution:
test3.csv: random test sample points forming clusters in 3D space
w31.csv: initial random map
w32.csv: trained SOM map
The outputs can be readily plotted in gnuplot.
void test_3d_classes(double *const *data, int N)
Creates a random set of points distributed in four clusters in 3D space with centroids at the points
[out] | data | matrix to store data in |
[in] | N | number of points required |
void test_circle(double *const *data, int N)
Creates a random set of points distributed near the circumference of a circle. The generating function is
\begin{eqnarray*} r &\in& [1-\delta r, 1+\delta r)\\ \theta &\in& [0, 2\pi)\\ x &=& r\cos\theta\\ y &=& r\sin\theta \end{eqnarray*}
[out] | data | matrix to store data in |
[in] | N | number of points required |
void test_lamniscate(double *const *data, int N)
Creates a random set of points distributed near the locus of the lemniscate of Gerono.
\begin{eqnarray*} \delta r &=& 0.2\\ \delta x &\in& [-\delta r, \delta r)\\ \delta y &\in& [-\delta r, \delta r)\\ \theta &\in& [0, \pi)\\ x &=& \delta x + \cos\theta\\ y &=& \delta y + \frac{\sin(2\theta)}{2} \end{eqnarray*}
[out] | data | matrix to store data in |
[in] | N | number of points required |
void update_weights(double const *x, double *const *W, double *D, int num_out, int num_features, double alpha, int R)
Update weights of the SOM using Kohonen algorithm
[in] | x | data point |
[in,out] | W | weights matrix |
[in,out] | D | temporary vector to store distances |
[in] | num_out | number of output points |
[in] | num_features | number of features per input sample |
[in] | alpha | learning rate \(0<\alpha\le1\) |
[in] | R | neighborhood range |
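One common form of the Kohonen update for a chain of nodes is sketched below, under stated assumptions: the neighborhood is taken to be the nodes whose index lies within `R` of the winning node, every node in it receives the same learning rate, and squared Euclidean distance is used because it selects the same winner as the plain Euclidean distance without needing `sqrt`. The name `som_update_sketch` is hypothetical and the actual `update_weights` may differ in these details.

```c
/* Hypothetical sketch of one Kohonen update for a 1-D chain of nodes. */
void som_update_sketch(const double *x, double *const *W, double *D,
                       int num_out, int num_features, double alpha, int R)
{
    /* Step 1: squared Euclidean distance from x to every node. */
    for (int j = 0; j < num_out; j++)
    {
        D[j] = 0.0;
        for (int k = 0; k < num_features; k++)
        {
            double diff = W[j][k] - x[k];
            D[j] += diff * diff;
        }
    }

    /* Step 2: index of the closest node (the winner). */
    int best = 0;
    for (int j = 1; j < num_out; j++)
        if (D[j] < D[best])
            best = j;

    /* Step 3: pull the winner and its neighbors within R toward x. */
    int from = best - R < 0 ? 0 : best - R;
    int to = best + R + 1 > num_out ? num_out : best + R + 1;
    for (int j = from; j < to; j++)
        for (int k = 0; k < num_features; k++)
            W[j][k] += alpha * (x[k] - W[j][k]);
}
```

For example, with one-dimensional nodes at 0, 1 and 5, a sample at 2, `alpha = 0.5` and `R = 0`, only the node at 1 moves, ending at 1.5. Increasing `R` to 1 makes its two neighbors move toward the sample as well.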