A fundamental hypothesis of the proposed technique for automatic hand gesture recognition is that gestures may be modeled as sequences of a finite number of "canonical" postures of the hand. Each posture is associated with a state of a probabilistic finite state machine, in particular of a Hidden Markov Model (HMM). Each gesture is identified with an HMM with an appropriate number of states and transition probabilities. The recognition problem becomes that of estimating the number of states and identifying the parameters of the model from the observation sequence. The time trajectory of the state estimates describes the estimated gesture.
The equations of an HMM with discrete states and continuous-range observations are

$$
\begin{aligned}
X_{k+1} &= A X_k + V_{k+1}, \\
y_{k+1} &= C X_k + \operatorname{diag}(\Sigma X_k)\, w_{k+1},
\end{aligned}
$$

in which $X_k$ is a finite-state Markov chain, $V_{k+1}$ is the associated martingale increment, and $y_k$ are vector real-valued observations. The components of $y_k$ are the means of the size functions. The noise $w_k$ is a sequence of vectors of $N(0,1)$ independent and identically distributed random variables. The model is specified by the transition probability matrix $A$ and by the matrices $C$ and $\Sigma$ of appropriate dimensions. Without loss of generality the $N$ states can be identified with the set of the unit vectors $\{e_1, \dots, e_N\}$ in $\mathbb{R}^N$.

The observation equations are the sum of a component directly determined by the state and a Gaussian noise term with variance determined by the elements of $\Sigma$. When $X_k = e_j$ the first term of this sum becomes exactly the $j$-th column of $C$, hence this column is a symbol of the corresponding state in the observation space. In other words, this quantity is the mathematical description of what we called the canonical posture associated with the considered state. It is a posture model.
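To make the generative model concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes that states are stored as integer indices rather than unit vectors, that column `i` of `A` holds the transition probabilities out of state `i`, and that the noise matrix (called `S` here, a hypothetical name) holds standard deviations. The function name `simulate_hmm` is likewise illustrative.

```python
import numpy as np

def simulate_hmm(A, C, S, n_steps, x0=0, seed=0):
    """Sample states and observations from a unit-vector-state HMM.

    A : (N, N) transition matrix, A[j, i] = P(next = j | current = i),
        so every column sums to 1.
    C : (M, N) matrix whose j-th column is the posture model of state j.
    S : (M, N) matrix of noise standard deviations (assumed layout).
    Returns (states, obs); obs[k] is emitted by state states[k].
    """
    rng = np.random.default_rng(seed)
    N = A.shape[0]
    M = C.shape[0]
    states = np.empty(n_steps, dtype=int)
    obs = np.empty((n_steps, M))
    x = x0
    for k in range(n_steps):
        states[k] = x
        # Observation: posture model of the current state plus scaled N(0,1) noise.
        obs[k] = C[:, x] + S[:, x] * rng.standard_normal(M)
        # Next state is drawn from column x of A.
        x = rng.choice(N, p=A[:, x])
    return states, obs
```

With all entries of `S` set to zero, every observation coincides with a column of `C`, which is the sense in which each column acts as a posture model.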
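Well-known algorithms [1] generate the sequence of state estimates by measuring, at any time $k$, the probabilistic distance between the current observation and each posture model, according to the following expression

$$
q_{k+1} = \sum_{i=1}^{N} \langle q_k, e_i \rangle \, \Gamma^i(y_{k+1}) \, a_i ,
$$

where $q_k$ is the unnormalized estimate of the state, $\langle q_k, e_i \rangle$ is its $i$-th component, $\Gamma^i(y_{k+1})$ is, except for a scale factor, the Gaussian density of the current observation given the $i$-th posture model, and $a_i$ is the $i$-th column of $A$.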
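The recursive estimator can be sketched as follows. This is an illustrative version under stated assumptions, not the algorithm of [1] verbatim: the noise matrix `S` (a hypothetical name) is taken to hold strictly positive standard deviations, the per-model likelihood is a product of independent Gaussians, factors common to all states are dropped, and the estimate is renormalized at each step for numerical stability.

```python
import numpy as np

def filter_states(A, C, S, obs, q0=None):
    """Recursive state estimation for a unit-vector-state HMM.

    Each step weights component i of the current estimate by the Gaussian
    likelihood of the new observation under posture model i, then mixes
    the columns of A with those weights.
    estimates[k] is the most probable state index after obs[k] is absorbed.
    """
    obs = np.asarray(obs, dtype=float)
    N = A.shape[0]
    q = np.full(N, 1.0 / N) if q0 is None else np.asarray(q0, dtype=float)
    estimates = np.empty(len(obs), dtype=int)
    for k, y in enumerate(obs):
        # Log-density of y under each posture model (columns of C and S).
        z = (y[:, None] - C) / S
        log_gamma = -0.5 * np.sum(z**2, axis=0) - np.sum(np.log(S), axis=0)
        gamma = np.exp(log_gamma - log_gamma.max())  # common rescaling only
        # Sum over i of q_i * gamma_i * (i-th column of A).
        q = A @ (q * gamma)
        q /= q.sum()  # normalization; does not change the argmax
        estimates[k] = int(np.argmax(q))
    return estimates
```

Subtracting the largest log-likelihood before exponentiating only changes the scale factor shared by all states, so the ranking of the states is unaffected.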
According to our hypothesis, we expect the observations to cluster around the posture models. This is confirmed in real image sequences, as can be seen in Figure 5.
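One simple way to inspect such clustering numerically is to assign each observation to its nearest posture model and look at the resulting distances. The sketch below is only an illustration: the noise-scaled Euclidean distance, the function name `nearest_posture`, and the matrix name `S` are assumptions, not quantities taken from the paper.

```python
import numpy as np

def nearest_posture(C, S, obs):
    """Closest posture model for each observation, and the distance to it.

    C   : (M, N) matrix of posture models (one per column).
    S   : (M, N) matrix of positive noise scales (assumed layout).
    obs : (K, M) array of observations, one per row.
    """
    obs = np.asarray(obs, dtype=float)
    # diff[k, :, i] = (obs[k] - C[:, i]) / S[:, i]
    diff = (obs[:, :, None] - C[None, :, :]) / S[None, :, :]
    d = np.sqrt(np.sum(diff**2, axis=1))  # shape (K, N)
    labels = np.argmin(d, axis=1)
    return labels, d[np.arange(len(obs)), labels]
```

Small distances concentrated on a few labels would be consistent with the expected clustering; widely spread distances would suggest that the posture models or the number of states need revisiting.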
Adrian F Clark