The Self-Organizing Map (SOM) is a non-linear mapping technique that gives a 2D representation
of a set of points from a multidimensional space defined by a large
series of molecular descriptors [1]. Each point of this set is assigned to a
SOM node, which is characterized by N weights (one per descriptor), each varying between
0 and 1 (Figure 1). Training a SOM consists of rearranging the layer nodes
by gradually adjusting their weights. A first hyperspace point is selected,
and the distance between it and the weight vector of each node of the SOM layer is calculated.
The node with the shortest distance is called the "winner", and the hyperspace point
is "projected" onto this node of the map. The weights of the winning node and
its neighbors are then modified according to the equation:
$$w_{ij}(t+1) = w_{ij}(t) + \alpha(t)\,\gamma(t,r)\,\bigl[x_j - w_{ij}(t)\bigr] \qquad (1)$$
where x_j is component j of the input vector x;
w_ij is the weight of node i for descriptor j; t and α(t)
are the iteration number and the learning rate, respectively; γ(t,r) is the
triangular neighborhood function, which depends on the iteration number and on the distance r
between node i and the winning node.
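As an illustration, one training step (winner search followed by the update of equation (1)) can be sketched in plain Python. This is a minimal sketch, not a reference implementation: the names `find_winner` and `update_step`, the dictionary layout of the map, the use of the Chebyshev distance as the grid distance r, and the neighborhood values passed as `gamma` are all assumptions made for the example.

```python
import math

def find_winner(weights, x):
    """Return the grid position of the node whose weight vector is closest to x."""
    best, best_dist = None, math.inf
    for pos, w in weights.items():
        d = math.dist(w, x)  # Euclidean distance in descriptor space
        if d < best_dist:
            best, best_dist = pos, d
    return best

def update_step(weights, x, alpha, gamma):
    """Apply equation (1) once, in place, for the input vector x.

    weights: dict mapping (row, col) -> list of N weights
    alpha:   learning rate alpha(t) for the current iteration
    gamma:   callable r -> neighborhood value for grid distance r
    """
    winner = find_winner(weights, x)
    for pos, w in weights.items():
        # Grid distance r between node i and the winner
        # (Chebyshev distance, an assumption of this sketch).
        r = max(abs(pos[0] - winner[0]), abs(pos[1] - winner[1]))
        g = gamma(r)
        for j in range(len(w)):
            w[j] += alpha * g * (x[j] - w[j])
    return winner

# Hypothetical 2x2 map with N = 2 descriptors scaled to [0, 1].
weights = {(0, 0): [0.0, 0.0], (0, 1): [0.0, 1.0],
           (1, 0): [1.0, 0.0], (1, 1): [1.0, 1.0]}
winner = update_step(weights, [0.9, 0.8], alpha=0.5,
                     gamma=lambda r: 1.0 if r == 0 else 0.5)
```

In this example the winner is node (1, 1), which moves halfway towards the input vector, while its neighbors move by a quarter of their distance to it.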
Figure 1. Simplified representation of a SOM projection.
The learning rate α(t) decreases linearly during the training process
from α(0) to zero. The triangular function γ(t,r) acts on the whole
map, and its value decreases in discrete steps as the distance and the number
of iterations increase.
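The two schedules can be sketched as follows. The linear form of α(t) follows directly from the description; the exact shape of γ(t,r) is not given in the text, so the triangular function below (with a width that starts at the whole map and shrinks linearly with t) is only one plausible choice. The names `learning_rate`, `triangular_neighborhood`, `t_max` and `r_max` are hypothetical.

```python
def learning_rate(t, t_max, alpha0):
    """Linear decrease from alpha0 at t = 0 to zero at t = t_max."""
    return alpha0 * (1.0 - t / t_max)

def triangular_neighborhood(t, r, t_max, r_max):
    """Triangular function of the grid distance r, narrowing as t grows.

    Assumed form: at t = 0 the triangle is wide enough to reach the
    farthest node (distance r_max); at t = t_max only the winner is
    updated. Because r is an integer grid distance, the values fall in
    discrete steps.
    """
    width = 1.0 + r_max * (1.0 - t / t_max)  # never below 1
    return max(0.0, 1.0 - r / width)

# Example: a map whose largest grid distance is 4, trained for 100 iterations.
start = [triangular_neighborhood(0, r, 100, 4) for r in range(5)]
end = [triangular_neighborhood(100, r, 100, 4) for r in range(5)]
```

With these assumptions, every node receives a non-zero update at the start of training, and only the winner is updated at the end.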
The same procedure is repeated in turn for all the hyperspace vectors, so that
each point is associated with a node in the SOM layer. Points that are
close in the descriptor hyperspace remain close in the SOM layer, occupying the
same node or neighboring ones. When a SOM is applied to a chemical data set,
the map can therefore reveal similar compounds, provided the Euclidean distance is accepted
as a similarity measure.
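To illustrate how a trained map groups similar compounds, the sketch below projects each point onto its winning node and lists the compounds that share a node. The map weights, the compound names and descriptor values, and the function name `project` are all invented for the example.

```python
import math
from collections import defaultdict

def project(weights, points):
    """Map each named point to its winning node; return node -> list of names."""
    occupancy = defaultdict(list)
    for name, x in points.items():
        # Winner = node with the smallest Euclidean distance to the point.
        winner = min(weights, key=lambda pos: math.dist(weights[pos], x))
        occupancy[winner].append(name)
    return dict(occupancy)

# Hypothetical trained 1x3 map and three compounds with two scaled descriptors.
weights = {(0, 0): [0.1, 0.1], (0, 1): [0.5, 0.5], (0, 2): [0.9, 0.9]}
compounds = {"A": [0.12, 0.08], "B": [0.15, 0.10], "C": [0.88, 0.93]}
occupancy = project(weights, compounds)
```

Here compounds A and B land on the same node, while C is projected onto a distant one, which is how the map flags A and B as similar under the Euclidean measure.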
References
1. T. Kohonen, Self-Organizing Maps, Springer-Verlag, Berlin (Germany), 2001.