In the context of supervised database classification,
the Fuzzy Partition algorithm [1] allows to generate fuzzy rules
from numerical data by developing two steps:
- partitioning a working space into fuzzy subspaces;
- defining a fuzzy rule for each fuzzy subspace.
Let us assume that the working space is a N-dimension hyperspace defined by
N molecular descriptors, each dimension i can be partitioned into L intervals Iij,
where j represents an interval in the partition selected.
Indicating with P(x1, x2, ... xn)
a molecular vector in the hyperspace, a rule for
a subspace Sk, derived by combining N intervals Iij, is defined by [2]:
if x1 is associated with µ1k(x1)
and x2 is associated with µ2k(x2)...
and xN is associated with µNk(xN)
=> the score of the activity O for P is OkP, (1)
where xi represents the value of the ith descriptor for the molecule P,
µik is the membership function related to the descriptor i for
the subspace k, and OkP is the biochemical activity value related
to the subspace Sk.
The "and" of the fuzzy rule is generally represented by the Min operator [3] and
the membership functions can be defined by triangular, trapezoidal or gaussian shapes [2,4,5].
These techniques of rule generation are very simple as all the fuzzy rules can
be formulated by linguistic labels. But their performances in the database classification
depend on the choice of the partition selected. Generally, a coarse partition leads
to a generalist system but also to a model where prediction results are too approximate;
a fine partition leads to a precise model of classification but also to a non-generalist system.
To overcome this drawback, fuzzy classification methods were proposed [6], which simultaneously
use several fuzzy partitions of different sizes in a single fuzzy rule-based classification
system. Then, the relation (1) becomes:
if x1 is associated with µ1km(x1) and x2
with associated to µ2km(x2)... and xN is associated
with µNkm(xN) => the score of the activity
O for P is OkmP, (2)
where m > 2 and represents the amount of fuzzy subspaces according to the
partition on the axes.
This approach allows to obtain a good compromise between generalist and specialist systems,
improving the classification performances, as showed by the satisfactory results obtained
in a previous paper studying olfactory series [7].
But this algorithm does not allow to solve the second big problem related to fuzzy
partition that is due to the very high number of fuzzy subspaces generated when a large
set of molecular descriptors is considered. Then, a new algorithm, named Adaptive Fuzzy
Partition (AFP) and derived from several studies concerning fuzzy and non-fuzzy
fields [8-11], was implemented, in which the hyperspace partition was adapted to the
training data by dividing dynamically and separately the axes associated with the
N molecular descriptors.
In a first phase, the global descriptor hyperspace is considered and cut into
two subspaces where the fuzzy rules are derived. These two subspaces are divided
step by step into smaller subspaces until certain conditions are satisfied, namely:
- the number of molecular vectors, within a subspace,
attains a minimum threshold number;
- the difference between two generated subspaces
is negligible in terms of chemical activities
represented;
- the number of subspaces exceeds a maximum threshold
number.
The aim of the algorithm is to select the descriptor and the cut position
which allow to get the maximal difference between the two fuzzy rule scores
generated by the new subspaces. The score is determined by the weighted average
of the chemical activity values in an active subspace A and in its neighboring
subspaces. If the number of trial cuts per descriptor is defined by N_cut,
the number of trial partitions equals (N_cut + 1)N. Only the best cut is
selected to subdivide the original subspace. In Figure 1, for example, three cuts
per axis are tested from the original descriptor space with 2 dimensions.
As cut x1 is the best, two subspaces are generated and considered
to be further divided. Then, cut y3 is selected, but the procedure
evaluates useful partitioning only in subspace S2; finally, three subspaces
are built.
Figure 1. Example of adaptive partition on a two-dimensional space.
Three cuts for each axis are evaluated and three subspaces are generated.
The fuzzy rules are generated by the relation (1), but the membership functions,
defined by trapezoidal shapes, are based on the boundaries of the subspaces.
If the width of a subspace Sk on the ith dimension, after each cut,
is represented by wi, the p and q parameters defining the shape of
the trapezoid (Figure 2) are calculated by
where the two parameters λi and Vi vary so that
p > 1 and q < 1. If p = 1 and q = 1,
the membership function becomes a rectangle. All the rules created during the fuzzy
procedure are considered to establish the model between descriptor hyperspace
and biochemical activities.
Figure 2. Representation of a trapezoidal membership function µ(x),
defined on the descriptor i.
The degree of membership of the subspace Sk can be represented by
|
|
(3) |
M is the number of molecular vectors in a given subspace, N is the total
number of descriptors, µik(Xi)Pi
is the fuzzy membership function related to the descriptor
i for the molecular vector Pj, and APj is the experimental activity
of the compound Pj. A classic procedure of centroid defuzzification [12]
is implemented to determine the chemical activity of a new test molecule. All the subspaces k
are considered and the general formula to compute the degree of membership of
the activity O for a generic molecule Pj is
|
|
(4) |
where N_subsp represents the total number of subspaces.
References
- Y. Lin, G.J. Cunningham, Building a Fuzzy System from Input-Output Data, J. Intell. Fuzzy Syst. (1994) 2, 243-250.
- M. Sugeno, T. Yasakawa, A fuzzy-logic-based approach to qualitative modeling, IEEE T. Fuzzy Syst. (1993) 1, 7-31.
- D. Dubois, H. Prade, An introduction to possibilistic and fuzzy logics, in: G. Shafer, J. Pearl (Eds.), Readings in Uncertain Reasoning, Morgan Kaufmann, San Francisco, 1990, pp. 742-761.
- M. Ichino, A Nonparametric Multiclass Pattern Classificatier, IEEE T. Syst. Man Cy. (1979) 6, 345-352.
- B. Fritzke, Fast learning with incremental radial basis function networks, Neural Process. Lett. (1994) 1, 2-5.
- H. Ishibuchi, K. Nozaki, H. Tanaka, Distributed representation of fuzzy rules and its application to pattern classification, Fuzzy Sets Syst. (1992) 52, 21-32.
- K. Audouze, F. Ros, M. Pintore, J.R. Chretien, Prediction of odours of aliphatic alcohols and carbonylated compounds using fuzzy partition and self organising maps (SOM), Analusis (2000) 28, 625-632.
- Y. Lin, G.A. Cunningham III, S.V. Coggeshall, Using Fuzzy Partitions to Create Fuzzy Systems for Input Output data and Set the Initial Weights in a Fuzzy Neural Networks, IEEE T. Fuzzy Syst. (1997) 5, 614-621.
- W. Pedrycz, Fuzzy Sets in Pattern Recognition: Methodology and Methods, Pattern Recogn. (1990) 23, 121-146.
- B.D. Ripley, Statistical aspects of neural networks, in: O.E. Barndorff-Nielsen, J.L. Jensen, W.S. Kendall (Eds.), Networks and Chaos: Statistical and Probabilistic Aspects, Chapman and Hall, London, 1993, pp. 40-123.
- P.A. Chou, Optimal Partitioning for Classification and Regression Trees, IEEE T. Pattern Anal. Mach. Intel. (1991) 13, 340-354.
- M.M. Gupta, J. Qi, Theory of T-norms and fuzzy inference methods, Fuzzy Sets Syst. (1991) 40, 431-450
|