SSOM User Guide

The SSOM class implements a Self-Organizing Map (SOM), a type of artificial neural network used for unsupervised learning. This manual explains each method and argument in the class.

Initialization of SSOM class

som = SSOM(grid_size=10, num_iterations=100, max_learning_rate=0.1, 
     learning_decay=0.25, neighborhood_decay=0.5, metric="euclidean", 
     grid_shape="2D", log_interval=None)

Arguments:

grid_size
- Specifies the size of the SOM grid.
- For grid_shape="1D", it is an integer (number of nodes in the 1D grid).
- For grid_shape="2D", it is a tuple (rows, cols) specifying the grid dimensions.
num_iterations
- Total number of iterations during training.
- Determines how many steps the SOM will take to adjust to the data.
max_learning_rate
- The initial learning rate for updating the weights of the SOM.
- The learning rate decreases over iterations based on the learning_decay.
learning_decay
- Controls how quickly the learning rate decreases over time.
- Higher values mean faster decay.
neighborhood_decay
- Controls how quickly the neighborhood radius decreases over time.
- The radius affects which nodes in the SOM are influenced by a given data point.
metric
- Specifies the distance metric used to find the Best Matching Unit (BMU):
  - "euclidean": Uses Euclidean distance.
  - "ssim": Uses Structural Similarity Index (SSIM) for image data.
grid_shape
- "1D": Creates a 1D SOM grid (linear topology).
- "2D": Creates a 2D SOM grid (grid topology).
log_interval
- Controls how frequently the training progress is logged and recorded.
- If not specified, it is automatically set to ensure no more than 100 logs are saved.

Example

Check test/ for more examples how to run SSOM

Assume we have sample data (which are two-dimensional data)

# Generate synthetic data
import numpy as np
x = np.random.normal((-0.5, -0.5), 0.2, (200, 2))
x = np.append(x, np.random.normal((-0.25, 1), 0.2, (100, 2)), axis=0)
x = np.append(x, np.random.normal((0.5, 0.5), 0.2, (150, 2)), axis=0)
input_data = x

Initialize SSOM class (2-D map topology)

from ssom import SSOM  # Assuming SSOM is imported correctly
# Initialize a 2D SOM with 10x10 grid
som = SSOM(grid_size=(10, 10), num_iterations=1000, max_learning_rate=0.5, metric="euclidean")

Initialize SSOM class (1-D ring topology)

# Initialize the SSOM with a 1D grid of size 20
som = SSOM(grid_size=20, num_iterations=5000, grid_shape="1D")

`Train SSOM with input data`

som.train(input_data)

`train()`

Trains the SOM on the input data (input_data).
Training process is to adjusts the weights of the SOM nodes to match the input data distribution.
input_data
A NumPy array of shape (samples, dimensions). Each row is a data sample.

Important outputs of `SSOM`

The som_grid and sample_bmu_mapping are critical attributes for understanding the result of the SOM training:

som_grid: Represents the learned weights of the SOM nodes and can be visualized to interpret the clustering.
sample_bmu_mapping: Links each input data point to its nearest SOM node, showing how the SOM clusters the data.

`som_grid`

Description

The som_grid attribute represents the weights of the SOM nodes. It stores the learned representation of the input data.
Its shape depends on the topology of the SOM:
- For grid_shape="1D", it is a 2D array of shape (grid_size, num_input_dims).
- For grid_shape="2D", it is a 3D array of shape (grid_rows, grid_cols, num_input_dims).

Usage

After training, the som_grid contains the trained SOM nodes, which approximate the distribution of the input data.

Example

# Access the trained SOM grid
print("Trained SOM Grid:")
print(som.som_grid)

`sample_bmu_mapping`

Description

The sample_bmu_mapping is a list of tuples, each containing:
1. The index of a sample in the input data.
2. The index of the Best Matching Unit (BMU) for that sample.
This mapping shows how each input sample is clustered within the SOM grid.

Usage

Use this attribute to understand the clustering of the input data points within the SOM.

Example

# Access the sample-to-BMU mapping
print("Sample to BMU Mapping:")
for sample_index, bmu_index in som.sample_bmu_mapping:
    print(f"Sample {sample_index} maps to BMU at index {bmu_index}")

Other Methods

1. `_initialize_grid_indices()`

grid_indices = self._initialize_grid_indices()

Description

Initializes the indices of the SOM grid.
For grid_shape="2D", creates a (rows, cols, 2) array where each element stores its row and column index.

Example

# Create 2D grid indices for a 5x5 SOM
som = SSOM(grid_size=(5, 5), grid_shape="2D")
grid_indices = som._initialize_grid_indices()

2. `_initialize_grid()`

self._initialize_grid(num_input_dims)

Description

Randomly initializes the weights of the SOM grid.
For grid_shape="1D", creates a (grid_size, num_input_dims) array.
For grid_shape="2D", creates a (rows, cols, num_input_dims) array.

Arguments

num_input_dims
- The number of dimensions in the input data (e.g., 3 for RGB).

Example

# Initialize a 1D SOM grid with 10 nodes and 3 input dimensions
som = SSOM(grid_size=10, grid_shape="1D")
som._initialize_grid(3)

3. `ssim()`

self.ssim(x, y, C1=0.01**2, C2=0.03**2)

Description

Computes the Structural Similarity Index (SSIM) between two data points.
Useful for comparing image-like data.

Arguments

x, y
Input arrays to compare.
C1, C2
Small constants to stabilize SSIM computation.

Example

# Compare two RGB patches
ssim_value = som.ssim(patch1, patch2)

4. `find_bmu()`

bmu_index = self.find_bmu(input_patch)

Description

Finds the Best Matching Unit (BMU) for a given input patch based on the specified metric ("euclidean" or "ssim").

Arguments

input_patch
A data point from the input dataset.

Returns

The index of the BMU:
- For grid_shape="1D", an integer index.
- For grid_shape="2D", a tuple (row, col).

Example

# Find BMU for an input point
bmu = som.find_bmu(np.array([0.5, 0.5]))

Example

# Train the SOM with 2D input data
data = np.random.rand(100, 2)
som.train(data)

Attributes for Logging

The following attributes store historical data during training:

som_grid_history
A list of SOM grid states logged at each log_interval.
neighborhood_influence_history
A list of neighborhood influence values for each iteration.
learning_rate_history
A list of learning rates at each log_interval.
bmu_history
A list of BMU indices for selected data points at each log_interval.
step_history
A list of iteration steps corresponding to the logs.
selected_data_point_history
A list of input data points selected at each log_interval.

Example

# Access learning rate history after training
print(som.learning_rate_history)

Visualization Example

You can visualize the SOM training progress and final results using matplotlib.

# Plot learning rate decay
plt.plot(som.step_history, som.learning_rate_history)
plt.title("Learning Rate Decay")
plt.xlabel("Iteration Step")
plt.ylabel("Learning Rate")
plt.grid()
plt.show()

Files

UserGuide.md

Latest commit

History

UserGuide.md

File metadata and controls

SSOM User Guide

Initialization of SSOM class

Arguments:

Example

Train SSOM with input data

train()

Important outputs of SSOM

som_grid

Description

Usage

Example

sample_bmu_mapping

Description

Usage

Example

Other Methods

1. _initialize_grid_indices()

Description

Example

2. _initialize_grid()

Description

Arguments

Example

3. ssim()

Description

Arguments

Example

4. find_bmu()

Description

Arguments

Returns

Example

Example

Attributes for Logging

Example

Visualization Example

`Train SSOM with input data`

`train()`

Important outputs of `SSOM`

`som_grid`

`sample_bmu_mapping`

1. `_initialize_grid_indices()`

2. `_initialize_grid()`

3. `ssim()`

4. `find_bmu()`