Performance Analysis of Pursuit-Evasion Game-Based Guidance Laws

International Journal of Aeronautical and Space Sciences.
2010.
Jun,
11(2):
110-117

- Published : June 15, 2010

Download

PDF

e-PUB

PubReader

PPT

Export by style

Share

Article

Metrics

Cited by

TagCloud

We propose guidance laws based on a pursuit-evasion game. The game solutions are obtained from a pursuit-evasion game solver developed by the authors. We introduce a direct method to solve planar pursuit-evasion games with control variable constraints in which the game solution is sought by iteration of the update and correction steps. The initial value of the game solution is used for guidance of the evader and the pursuer, and then the pursuit-evasion game is solved again at the next time step. In this respect, the proposed guidance laws are similar to the approach of model predictive control. The proposed guidance method is compared to proportional navigation guidance for a pursuit-evasion scenario in which the evader always tries to maximize the capture time. The capture sets of the two guidance methods are demonstrated.
The admissible control inputs
u
_{p}
(
t
) and
u
_{e}
(
t
) are assumed to be piecewise-continuous functions subject to the following constraints:
Our pursuit-evasion problem is written as
subject to the constraints in Eq. (2) and the final constraint,
To develop a numerical algorithm, we used a direct approach based on parameterization of the control inputs. The control inputs of both players are separated into the following form:
where
u
_{p,k }
and
u
_{e,k}
, which are assumed to be constant in the k-th interval, should satisfy the constraints in Eq. (2) as follows:
The solution technique proposed in (Tahk et al., 1998b) is composed of two procedures: update and correction. The update procedure determines δ
u
_{e}
, a small perturbation of
u
_{e}
, in the direction of maximizing
t
_{f}
^{ *}
. It also includes an algorithm for the computation of a new
u
_{p}
^{*}
. Instead of finding the minimum time solution, this algorithm computes δ
u
_{p}
, the smallest variation of
u
_{p}
^{*}
, to satisfy the final constraint (Eq. 4). If
u
_{p}
* is far from the minimum time solution, the correction procedure executes and provides an adequate tuning of
u
_{p}
^{*}
to satisfy the optimality within the error bounds defined by the user. The aforementioned procedures, illustrated in
Fig. 1
, are implemented repeatedly until there is no improvement in
t
_{f}
^{*}
. In (Kim et al., 2006), the update and correction procedures are called step 1 and step 2 to avoid confusion of terminologies.
u
_{e}
and δ
u
_{p}
, are said to be admissible if they satisfy following conditions:
The capture condition is expressed as
where
P
and
E
are defined as
and
v
_{p}
and
v
_{e}
are the sensitivities of
r
_{p}
(
t f
) and
r
_{e}
(
t f
), respectively, to the perturbation of
t
_{f}
. We define
v
_{r}
as follows:
Flowchart of the algorithm in (Kim et al., 2006).
Then, Eq. (8) can be written as
Let
n
_{1}
and
n
_{2}
denote two orthonormal unit vectors in the two-dimensional space. We define
n
_{1}
as
Now,
P , E
, and (특수문자)
r
_{f}
are expressed as
where
p
_{1}
,
p
_{2}
,
e
_{1}
, and
e
_{2}
are
N
'1 column vectors and
d
_{1}
and
d
_{2}
are scalars. For a given δ
u
_{e}
, let δ
u
_{p}
^{o}
, which is orthogonal to the null space of
p
^{T}
_{2}
, be the minimum norm solution of the combination of Eqs. (11) and (13), which is expressed as1
If δ
u
_{p}
^{o}
is admissible, thenδ
u
_{p}
^{*}
=δ
u
_{p}
^{o}
. For
u
_{p}
=
u
_{p}
^{*}
and
t
_{f}
=
t
_{f}
^{*}
, δ
u
_{p}
satisfying the capture condition can be written as
where δ
u
_{p}
^{n}
is the vector in the null space of
p
^{T}
_{2}
, which can be chosen freely without affecting the capture condition. After some manipulations of Eqs. (14) and (15), we obtain
where
The evader's control input is then updated as follows:
where j denotes the iteration number, and subscript k is kth element of the separated evader's control variables.
When the deviation of
u
_{p}
and
t
_{f}
from the minimum norm solution grows too much, an optimal δ
u
_{p}
^{n}
should be found to correct
u
_{p}
and
t
_{f}
. δ
u
_{p}
^{n}
, an undefined term of Eq. (16), is calculated as follows. We write δ
u
_{p}
^{n}
as
where δ
u
_{p}
^{v}
is the amount of constraint violation of
u
_{p}
+δ
u
_{p}
^{o}
+δ
u
_{p'}
^{n}
, and δ
u
_{p}
^{r}
is used to make δ
u
_{p}
^{n}
which belongs to the null space of
p
^{T}
_{2}
. Given δ
u
_{p}
^{v}
, δ
u
_{p}
^{r}
can be expressed as
where δ
u
_{p}
^{ro}
belongs to the null space of , and δ
u
_{p}
^{rn}
is to be determined in an optimal way.
For minimizing (수식삽입), we find the δ
u
_{p}
^{ro}
that minimizes the norm of -δ
u
_{p}
^{v}
+δ
u
_{p}
^{ro}
. The norm of δ
u
_{p}
^{n}
=-δ
u
_{p}
^{v}
+δ
u
_{p}
^{ro}
is minimized by
After calculating δ
u
_{p}
^{n}
, each component of is examined to check whether or not any violations of control input constraints occur. As soon as δ δ
u
_{p}
^{n}
is determined, δ
u
_{p}
is updated using the following relationship:
Once
u
_{e}
and
u
_{p}
are
u
_{p}
dated,
t
_{ f }
is
u
_{p}
dated using
In Eqs. 22) and (23),
i
denotes the iteration number.
u
_{p}
+δ
u
_{p}
is assumed to close to the optimal control variable for the case of small perturbations of both players' control variables. If the pursuer's control input
u
_{p}
is far from the optimum to minimize a payoff function, we optimize the pursuer's control variable during the correction procedure.
Assume that the pursuer's control input is subject to variation, but the evader's control input is fixed; that is, δ
u
_{e}
= 0 . The first variation necessary condition for optimality is given by for all admissible δ
u
_{p}
. To inspect the optimality condition at a later time, we define
I
_{S}
as the set of saturated parameters. It is represented as
where
K
_{S}
and
s(i)
denote the number of elements in
I
_{S}
, and its
i
th element. We construct a
N
×
K
_{s}
matrix
S
as
The purs
u
_{e}
r's control variable is restricted to constraints, and a saturated control variable can be perturbed only in one direction, as shown by
Then, let
q
_{1}
denotes the component of
p
_{1}
normal to
p
_{2}
, which is given by
Hence, the Kuhn-Tucker condition is written as
where m
_{k}
> 0 for all . We also derive the following necessary conditions for optimality:
For the correction procedure, we set dδ
u
_{e}
= 0 as previously mentioned. Then, we substitute all elements of
q
_{1}
that do not satisfy the necessary conditions for optimality by zero to obtain (특수문자), and we compute (특수문자), the component of (특수문자) normal to (특수문자):
We choose d
u
_{p}
^{rn}
as
to reduce
t
_{f}
because is the null space of
p
_{2}
^{T}
. We compute δ
u
_{p}
^{rn}
and the total correction of d
u
_{p}
given by the following relations, starting from δ
u
_{p}
^{o}
=δ
u
_{p}
^{v}
=δ
u
_{p}
^{ro}
, as follows:
If there is any violation of the control input constraint, we use the
u
_{p}
date procedure of
u
_{p}
as explained in step 1.
t
using the initial value of the game solution after performing step 2. Then, state variables changed by integration are considered as initial conditions. The termination condition is the same as the proposed algorithm. The algorithm as a guidance law is described in
Fig. 2
.
With integration, this modified algorithm changes control inputs. First, control input elements of both players are got rid of the control input sets after an integration and the others are replaced previous elements of each control inputs. The reason is that the first control input elements of the pursuer and the evader are used during integration, and both players' positions are updated by integration. By reason of integration with δ
t
, both players have new initial conditions. After executing integration and replacement (movement) of control inputs, the control input elements of both players are improved during step 1 as previously mentioned. In this respect, the proposed guidance law is similar to the approach of model predictive control (MPC). A schematic diagram of the movement of control input elements is shown in
Fig. 3
. In addition, we reduced the integration time interval δ
t
when
t
_{f}
/
N
was smaller than the specific integration time interval δ
t
. In this case, time interval δ
t
was reduced to
t
_{f}
/
N
. If δ
t
is used consistently, pursuit-evasion game solutions are calculated with the small time interval compared to δ
t
. In this case, integration with δ
t
can result in a large miss distance because the dynamics of both players integrate with a much longer time interval than with a discrete time interval of their control inputs near the final time. Thus, we should reduce the integration time interval, which guarantees that integration time interval δ
t
is close to
t
_{f}
/
N
. If the integration time interval contin
u
_{e}
s to decrease, δ
t
becomes infinitesimal and the iteration executes infinitely. To prevent this, the integration time interval is fixed the specific value near the final time and the iteration number of the solver is limited. Applying this technique, we can improve the interception precision at the final stage. This guidance law is called the differential game guidance law.
Flow chart of the algorithm after modification (Kim et al., 2006).
Movement of the control input elements (Kim et al., 2006).
In these equations,
x
_{e}
and
y
_{e}
denote the position in Cartesian coordinates, γ
_{e}
is the flight path angle,
v
_{e}
is the speed,
u
_{e}
is the non-dimensional control input variable,
R
_{e}
is the minimum turn radius, and
c
is the drag coefficient. If the drag coefficient is zero, the evader does not produce a drag force and its speed does not decrease.
The equations of motion of the pursuer, which are analogous to those of the evader, are as follows:
where
R
_{p}
is the minimum turn radius, andx
x
_{pr}
,
y
_{pr}
,
γ
_{pr}
,
v
_{pr}
, and
u
_{p}
are analogous to the evader model. The c-3ptnstant
a
, which is composed of the zero-lift-drag coefficient
C
_{D0}
and maximum lift coefficient
C
_{Lmax}
, is defined as
The constant
b
consists of the maximum lift coefficient and the induced drag factor
K
:
The engagement scenario for the numerical example is shown in
Fig. 5
. The evader is initially 6,724.97 m apart from the pursuer along the x-axis, and moves with
v
_{e}
(0) = 300
m
/
s
and (0) = 131.86°. The pursuer initially moves with
v
_{p}
(0) = 920.83
m
/
s
and (0) = 85.36°. The parameters used in this engagement model are taken from (Guelman et al., 1988) as
R
_{e}
= 600
m
,
R
_{p}
= 1515.15
m
,
a
= 0.0875,
b
= 0.4. For a realistic evader, the drag coefficient was chosen as 0.4.
The differential game guidance law parameters were chosen as
N
= 50 and ε
_{1}
= 0.002
m
. The algorithm proposed in (Tahk et al., 1998a, b) includes the parameter ε
_{1}
, but the differential game guidance law does not define it. The reason is that the proposed guidance law; i.e., the differential game guidance law, does not require it because of the difference in the execution control between step 1 and step 2, as mentioned in the previous section.
The trajectories of the pursuer and the evader are shown in
Fig. 6
. The pursuer intercepts the evader at 18.565769 seconds. The time histories of the control input of both players are illustrated in
Fig. 7
. The figure shows that an initial control input of the evader is a zero. The figure also shows that the control inputs of both players are constrained by Eq. (2).
Pursu _{e}r and evader engagement geometry (Kim et al., 2006).
Intercept scenario (Kim et al., 2006).
To compare the performance of the differential game guidance law with other guidance laws, proportional navigation guidance (PNG) was adopted as a guidance law of the pursuer. In spite of using PNG, we assumed that the evader knows the maneuvers of the pursuer which are optimal. In this case, the pursuer's control inputs are calculated from
where
N
' denotes the effective navigation ratio, and
V
_{c}
and are the closing velocity and line-of-sight rate, respectively. From
a
_{cmd}
, the control input of the purs
u
_{e}
r is determined as follows:
It is also constrained by the first expression in Eq. (2). Using PNG as a guidance law, the termination condition is replaced by the miss distance. If the miss distance is smaller than 1 m, we believe that the purs
u
_{e}
r intercepts the evader.
Trajectories of the pursuer using PNG as a guidance law, and of the evader, are given in
Fig. 8
. The pursuer captured the evader at 23.339497 seconds. Time histories of both control inputs are shown in
Fig. 9
, and are similar to
Fig. 7
. The pursuer's initial control input is a few different between the differential game guidance and PNG. This is because a control input of the pursuer using the proposed guidance law is initialized by PNG, and then it is improved by step 2.
Trajectories of the pursuer and evader.
Time histories of the control input of both players.
Table 1
is a summary of the simulation results. The intercept time of the differential game guidance law is shorter than the time using PNG. In the case of using PNG, a computation time is approximately three times longer than that using pursuit-evasion game solution. The number of iterations needed to capture the evader is nearly two times greater than that of the proposed guidance law.
Trajectories of the pursuer using PNG and the evader.
Time histories of the control input of the pursuer using PNG and the evader.
Summary of simulation results
To compare the interception performance of both methods, which are both players using the differential game guidance law and the pursuer using PNG, the capture set was calculated. For simplicity, we considered only the initial condition for the previous example except that the initial flight path angles of the pursuer and the evader varied from 0° to 20° and from 0° to 350°, respectively. The reason why we selected the initial flight path angle ranges of the pursuer is that seeker's gimbals used typical guided missiles adopt lock-on type. The conventional field of view of a seeker is limited from 2° to 5°, but the field of view of seeker-adopted lock-on type gimbals is larger than 15°. The field of view of gimbaled seekers, however, is not much larger than 15°. Thus, the initial flight path angle ranges of the pursuer were chosen as 0° to 20°.
Figures 10
and
11
show the capture sets of the proposed guidance law, and the case using PNG as a guidance law of the pursuer. In these figures, the initial conditions marked with (특수문자) are those for which the pursuer can capture the evader within a finite time. The other symbols indicate the termination of the program without a capture. In
Fig. 10
, the symbol ★ indicates a circumstance for which the pursuer cannot capture the evader under the termination condition of the differential game guidance law, but is able to intercept the evader under the termination condition when using PNG as a guidance law of the pursuer. The symbol × implies that the pursuer is not able to intercept the evader under both termination conditions. In
Fig. 11
, the symbol × indicates that the miss distance is larger than 1 meter, so interception by the pursuer does not occur. From
Figs. 10
and
11
, we know that the capture set of the proposed guidance law is larger than one using PNG as a guidance law of the pursuer.
The capture set of the proposed guidance law.
The capture set of the case using proportional navigation guidance.

1. Introduction

The pursuit-evasion game is a kind of differential game demonstrated by Isaacs (1967). It is an important class of two-player zero-sum differential games with perfect information and is useful in military applications, especially in missile guidance applications. In the game, the pursuer and the evader try to minimize and maximize the intercept time or the miss-distance as a payoff function. Among previous studies, Guelman et al. (1988) studied a simple case of pursuit-evasion game that can be solved without addressing a two-point boundary value problem. Breitner et al. (1993) proved that a multiple shooting method is able to precisely solve practical pursuit-evasion games subject to state constraints. In addition to these studies, a variety of parameter optimization methods such as those described in (Hargraves and Paris, 1987) can be extended to solve realistic pursuit-evasion games. In this study, we propose a direct method for solving pursuit-evasion games (Tahk et al., 1998a, b).
We first summarize the algorithm proposed in (Tahk et al., 1998a, b) for the reader's convenience. The algorithm is a direct method based on discretization of control inputs for solving pursuit-evasion games for which the intercept time is the payoff function of the game. Every iteration of the algorithm has two features: update and correction. The update step improves the evader's control to maximize the intercept time, and modifies the pursuer's control to satisfy the terminal condition. After applying the update step several times, the correction step is used to minimize the pursuer's flight time.
Note that the solution of the pursuit-evasion game can be used for guidance of the pursuer and the evader if the solution is obtained in real time. For this purpose, the iteration number of the solver is limited to reduce the computation time, at the cost of solution accuracy (Kim et al., 2006). In this study, the work of Kim et al. (2006) was refined for better numerical efficiency. Also, the capture set of the proposed guidance method is compared with that of proportional navigation to confirm that the proposed differential game guidance laws provide a smaller no-escape envelope.
2. Problem Formulation

The problem is a two-dimensional pursuit-evasion game with the final time as a payoff function. The dynamics of the pursuer and the evader are expressed as follows:
Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

- 2.1 Step 1 (update algorithm)

We assume that a capture within a finite time is guaranteed. Any perturbations of both players' control input, δ
Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

- 2.2 Step 2 (correction algorithm)

In step 1, the pursuer's control input is determined to satisfy the capture condition, and
Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

3. Constructing the Proposed Guidance Law

In this section, we explain the method we used to construct the guidance law proposed in (Kim et al., 2006). The algorithm introduced in the preceding section consists of two steps. This algorithm initializes both players' control inputs, and then checks the optimality conditions. If the pursuer's control input and final time do not satisfy the optimality conditions, step 2 is performed; otherwise, step 1 is executed. The evader tries to maximize the final time or to avoid capture by the pursuer, and the pursuer tries to intercept the evader. After performing step 1, we examine whether a difference in the positions of the players at the final time is within the error bound. If it is within the error bound, a variation of the evader's position at the final time is examined. If the variation is within the error bound, the algorithm is terminated and gives optimal trajectories. Otherwise, step 1 is repeated. If the difference in the positions of the players at the final time is not within the error bound, the optimality conditions of the pursuer's control input and the final time are checked again.
The initializations of both players' control inputs are the same as the algorithm proposed in (Tahk et al., 1998a, b). In this case, the algorithm is started with step 2. At the proposed algorithm, step 2 is executed when a violation of the optimality condition has occurred and step 1 is repeated when a variation of the evader's position at the final time is not within the error bound. To construct a guidance law, this procedure is changed to execute step 2 once and step 1 ten times. The dynamics of both players are integrated with specific integration time interval δ
Lager Image

Lager Image

4. Numerical Simulation

The differential game guidance law proposed in (Kim et al., 2006) wasapplied to the planar pursuit-evasion game problem illustrated in
Fig. 4
. The equations of motion of the evader are described as
Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Lager Image

Summary of simulation results

Lager Image

Lager Image

Lager Image

5. Conclusions

We carried out a performance analysis of the proposed guidance law based on pursuit-evasion game solutions that were sought by the pursuit-evasion game solver. The derivation of the proposed pursuit-evasion game solver was described, and construction of the proposed guidance law was explained in detail. The guidance law proposed in (Kim et al., 2006) consists of two procedures, update and correction, called step 1 and step 2. Control inputs of the pursuer and the evader were
initialized in the differential game guidance law, and then improved by step 1 and 2. Step 2 executes first that this procedure provides a way to optimize the pursuer trajectory when it deviates excessively from the optimal trajectory. Then, the dynamics of both players were integrated with a specific time interval. Step 1 performs that control inputs of the evader were updated to maximize the increment in the capture time, while the pursuer tried to minimize it. This procedure iterated until satisfying the termination condition. The differential game guidance law was applied to solve a numerical example. For a comparison of the interception performance, PNG was adopted to a guidance law of the pursuer. Simulation results of the engagement scenario were tabulated for each case--the differential game guidance law, and using PNG as the guidance law of the pursuer. Capture sets of both cases were calculated for performance analysis. Based on our numerical example simulation results, we know that the differential game guidance law provided a smaller no-escape envelop than PNG.
Breitner M. H
,
Pesch H. J
,
Grimm W
1993
Complex differential games of pursuit-evasion type with state constraints part 1: Necessary conditions for optimal open-loop strategies
Journal of Optimization Theory and Applications
78
419 -
441
** DOI : 10.1007/BF00939876**

Guelman M
,
Shinar J
,
Green A
1988
Qualitative study of a planar pursuit-evasion game in atmosphere
Proceedings of the AIAA Guidance Navigation and Control Conference
Minneapolis MI
88 -
4158

Hargraves C. R
,
Paris S. W
1987
Direct trajectory optimization using nonlinear programming and collocation
Journal of Guidance Control and Dynamics
10
338 -
342
** DOI : 10.2514/3.20223**

Isaacs R
1967
Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit Control and Optimization
2 printing ed
Wiley
New York NY

Kim Y. S
,
Tahk M. J
,
Ryu H
2006
A guidance law based on pursuit-evasion game solutions
KSAS-JSASS Joint International Symposium on Aerospace Engineering
Busan Korea

Tahk M. J
,
Ryu H
,
Kim J. G
1998a
An iterative numerical method for class of quantitative pursuit-evasion games
Proceeding of AIAA Guidance Navigation and Control Conference
Boston MA
175 -
182

Tahk M. J
,
Ryu H
,
Kim J. G
,
Rhee I. S
1998b
A gradient-based direct method for complex pursuit-evasion games
Proceedings of the 8th International Symposium on Dynamic Games
Vaals Netherlands
579 -
582

Citing 'Performance Analysis of Pursuit-Evasion Game-Based Guidance Laws
'

@article{ HGJHC0_2010_v11n2_110}
,title={Performance Analysis of Pursuit-Evasion Game-Based Guidance Laws}
,volume={2}
, url={http://dx.doi.org/10.5139/IJASS.2010.11.2.0110}, DOI={10.5139/IJASS.2010.11.2.0110}
, number= {2}
, journal={International Journal of Aeronautical and Space Sciences}
, publisher={The Korean Society for Aeronautical & Space Sciences}
, author={Kim, Young-Sam
and
Kim, Tae-Hun
and
Tahk, Min-Jea}
, year={2010}
, month={Jun}