Composite Stackelberg Strategy for Singularly Perturbed Bilinear Quadratic Systems

Ning Bin; Chengke Zhang; Huainian Zhu; Zan Mo

doi:10.1515/JSSI-2015-0154


Published/Copyright: April 25, 2015


From the journal Journal of Systems Science and Information, Volume 3, Issue 2

Abstract

For singularly perturbed bilinear quadratic problems, this paper decomposes the full-order system into two subsystems, one on a slow time scale and one on a fast time scale. A fixed-point iterative algorithm is used to solve the cross-coupled algebraic Riccati equations, which yields the equilibrium strategies of the two subsystems and, from them, the composite strategy of the original full-order system. The composite strategy is proved to form an o(ε) (near) Stackelberg equilibrium, and a numerical example of the algorithm is presented at the end.

Keywords: singularly perturbed; bilinear quadratic system; Stackelberg equilibrium

1 Introduction

Dynamic game theory has been studied widely over the past decades, and the non-cooperative game theory of linear quadratic systems has received particular attention. For example, Simaan and Cruz obtained the open-loop Stackelberg strategy in non-zero sum games[1]; Basar and Olsder summarized the non-cooperative game theory of linear quadratic systems[2]; Medanic developed necessary conditions for closed-loop Stackelberg strategies in linear quadratic problems and presented an algorithm for the numerical solution of two-level Stackelberg problems[3]; Mizukami and Xu investigated the linear quadratic closed-loop Stackelberg game for descriptor systems and constructed the incentive strategies[4].

For singularly perturbed systems, Khalil and Kokotovic discussed the well-posedness of singularly perturbed Nash games and illustrated how the feedback information available to the players affects the well-posedness of the game[5]; Xu and Mizukami presented a unified approach to the composite approximation of the full-order linear feedback saddle-point solution[6]. Mukaidani proposed a new algorithm for solving the cross-coupled algebraic Riccati equations of singularly perturbed Nash games[7], applied the algorithm to the linear quadratic infinite-horizon Nash game for general multiparameter singularly perturbed systems[8], studied the computation of linear closed-loop Stackelberg strategies with a small singular perturbation parameter[9], and investigated the linear closed-loop Stackelberg strategy of singularly perturbed stochastic systems with state-dependent noise[10].

However, game theory for singularly perturbed bilinear systems has seldom been discussed, even though such systems are a natural description of many practical systems, such as the neutron level control problem in a fission reactor, DC motors, and induction motor drives[11]. In financial engineering, the Black-Scholes option pricing model, Aoki's two-sector macroeconomic growth model, and Chander and Tokao's nonlinear input-output models can all be extended to singularly perturbed bilinear models[12–15].

The paper is organized as follows. Section 2 states the problem of the differential Stackelberg equilibrium strategy for a singularly perturbed bilinear time-invariant system. Sections 3 and 4 are concerned with the decomposition of the full-order system into two subsystems and with the composite strategy of the original full-order system. A simple numerical example is solved in Section 5. Section 6 contains the conclusion.

2 Problem Statement

Consider a time-invariant singularly perturbed bilinear system:

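$$\begin{bmatrix}\dot x_1(t)\\ \varepsilon\dot x_2(t)\end{bmatrix}=\begin{bmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{bmatrix}\begin{bmatrix}x_1(t)\\x_2(t)\end{bmatrix}+\begin{bmatrix}B_{11}\\B_{21}\end{bmatrix}u(t)+\begin{bmatrix}B_{12}\\B_{22}\end{bmatrix}v(t)+\left\{\begin{bmatrix}x_1(t)\\x_2(t)\end{bmatrix}\begin{bmatrix}M_s\\M_f\end{bmatrix}\right\}u(t)+\left\{\begin{bmatrix}x_1(t)\\x_2(t)\end{bmatrix}\begin{bmatrix}N_s\\N_f\end{bmatrix}\right\}v(t)\tag{1}$$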

with initial condition

$$\begin{bmatrix}x_1(0)\\x_2(0)\end{bmatrix}=\begin{bmatrix}x_{10}\\x_{20}\end{bmatrix}$$

where x₁(t) ∈ R^n₁ and x₂(t) ∈ R^n₂ are the slow and fast state variables, respectively, x(t) = [x₁(t), x₂(t)]^T ∈ Rⁿ is the state vector with n₁ + n₂ = n, u(t) ∈ R^m and v(t) ∈ R^l are the control inputs of Player 1 and Player 2, respectively, the small singular perturbation parameter ε > 0 represents small time constants, inertias, masses, etc., and A₁₁, A₁₂, A₂₁, A₂₂, B₁₁, B₁₂, B₂₁, B₂₂, M_s, M_f, N_s, N_f are constant matrices of appropriate dimensions, with

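$$\left\{\begin{bmatrix}x_1(t)\\x_2(t)\end{bmatrix}\begin{bmatrix}M_s\\M_f\end{bmatrix}\right\}=\sum_{j=1}^{n_1}x_{1j}\begin{bmatrix}M_{sj}\\M_{fj}\end{bmatrix}+\sum_{j=n_1+1}^{n_1+n_2}x_{2j}\begin{bmatrix}M_{sj}\\M_{fj}\end{bmatrix}$$

$$\left\{\begin{bmatrix}x_1(t)\\x_2(t)\end{bmatrix}\begin{bmatrix}N_s\\N_f\end{bmatrix}\right\}=\sum_{j=1}^{n_1}x_{1j}\begin{bmatrix}N_{sj}\\N_{fj}\end{bmatrix}+\sum_{j=n_1+1}^{n_1+n_2}x_{2j}\begin{bmatrix}N_{sj}\\N_{fj}\end{bmatrix}$$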
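The bilinear term is a state-weighted sum of constant matrices, so for a frozen state it simply shifts the input matrix. The following minimal sketch evaluates that sum with NumPy. The function name `state_dependent_input` and the storage of the blocks as a three-dimensional array `M` are assumptions made for illustration; the numerical values are taken from the example of Section 5, where each block is a unit column vector.

```python
import numpy as np

def state_dependent_input(B, M, x):
    """Return B + sum_j x[j] * M[j], the input matrix frozen at state x.

    B : (n, m) constant input matrix, e.g. [B11; B21]
    M : (n, n, m) stack; M[j] is the block [M_sj; M_fj] that multiplies x_j
    x : (n,) full state [x1; x2]
    """
    B = np.asarray(B, dtype=float)
    M = np.asarray(M, dtype=float)
    x = np.asarray(x, dtype=float)
    # tensordot contracts the state index j: sum_j x[j] * M[j]
    return B + np.tensordot(x, M, axes=(0, 0))

# Data of the Section 5 example: n = 4, m = 1, M_j = j-th unit column vector.
B1 = np.array([[0.0], [0.0], [0.0], [1.0]])   # [B11; B21]
M = np.eye(4).reshape(4, 4, 1)                # M[j] is e_j stored as a column
x0 = np.array([1.0, 1.0, 1.0, 1.0])           # [x10; x20]

B1_tilde = state_dependent_input(B1, M, x0)
print(B1_tilde.ravel())
```

With unit-vector blocks the sum collapses to the state itself, so the frozen input matrix is the constant part plus x.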

The cost function for each player is defined by

$$J_i(u,v)=\frac12\int_0^\infty\left[x^T(t)Q_ix(t)+u^T(t)R_{ii}u(t)+v^T(t)R_{ij}v(t)\right]dt\tag{2}$$

where

$$R_{ii}>0,\quad R_{ij}>0,\quad i,j=1,2,\ i\neq j,\quad Q_i=\begin{bmatrix}Q_{i11}&Q_{i12}\\Q_{i12}^T&Q_{i22}\end{bmatrix}$$

It is assumed that the decision-maker denoted by Player 1 is the leader and Player 2 is the follower. Under the assumption that both players employ strategies u := u(x, t), v := v(x, t), a strategy set (u^*, v^*) is called a Stackelberg strategy if, for any admissible strategy set (u, v), the following conditions hold[10]:

$$J_1(u^*,v^*)\le J_1(u,v^0(u)),\quad\forall u\in R^m\tag{3}$$

where

$$J_2(u,v^0(u))=\min_v J_2(u,v)$$

and

$$v^*=v^0(u^*)$$

3 Decomposition of Slow and Fast Systems

Let

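$$\begin{bmatrix}\tilde B_{11}(x)\\\tilde B_{21}(x)\end{bmatrix}=\begin{bmatrix}B_{11}\\B_{21}\end{bmatrix}+\left\{\begin{bmatrix}x_1\\x_2\end{bmatrix}\begin{bmatrix}M_s\\M_f\end{bmatrix}\right\},\quad\begin{bmatrix}\tilde B_{12}(x)\\\tilde B_{22}(x)\end{bmatrix}=\begin{bmatrix}B_{12}\\B_{22}\end{bmatrix}+\left\{\begin{bmatrix}x_1\\x_2\end{bmatrix}\begin{bmatrix}N_s\\N_f\end{bmatrix}\right\}$$

$$\tilde B_{11}=\tilde B_{11}(x),\quad\tilde B_{21}=\tilde B_{21}(x),\quad\tilde B_{12}=\tilde B_{12}(x),\quad\tilde B_{22}=\tilde B_{22}(x)\tag{4}$$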

then (1) can be written as:

$$\dot x_1=A_{11}x_1+A_{12}x_2+\tilde B_{11}u+\tilde B_{12}v\tag{5a}$$

$$\varepsilon\dot x_2=A_{21}x_1+A_{22}x_2+\tilde B_{21}u+\tilde B_{22}v\tag{5b}$$

Neglecting the fast modes is equivalent to assuming that they are infinitely fast, that is, letting ε = 0. Without the fast modes, the system (5) reduces to

$$\dot x_1=A_{11}x_1+A_{12}x_2+\tilde B_{11}u+\tilde B_{12}v\tag{6a}$$

$$0=A_{21}x_1+A_{22}x_2+\tilde B_{21}u+\tilde B_{22}v\tag{6b}$$

Assuming that A₂₂ is nonsingular, we have

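$$\dot x_{1s}=A_0x_{1s}+\tilde B_{01}u_s+\tilde B_{02}v_s,\quad x_{1s}(0)=x_{10}\tag{7a}$$

$$x_{2s}=-A_{22}^{-1}\left(A_{21}x_{1s}+\tilde B_{21}u_s+\tilde B_{22}v_s\right)\tag{7b}$$

where

$$A_0=A_{11}-A_{12}A_{22}^{-1}A_{21},\quad\tilde B_{01}=\tilde B_{11}-A_{12}A_{22}^{-1}\tilde B_{21},\quad\tilde B_{02}=\tilde B_{12}-A_{12}A_{22}^{-1}\tilde B_{22}.$$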
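The reduction from (6) to (7) can be checked numerically. The sketch below is illustrative only: the function name `slow_subsystem` is hypothetical, the system blocks are those of the Section 5 example, and the state-dependent input matrices are frozen at x = 0 (an assumption made here so that they reduce to the constant B blocks). It verifies that the quasi-steady state (7b) satisfies the algebraic constraint (6b) and that (6a) and (7a) give the same slow dynamics.

```python
import numpy as np

def slow_subsystem(A11, A12, A21, A22, Bt11, Bt21, Bt12, Bt22):
    """Quasi-steady-state reduction: returns A0, Bt01, Bt02 as defined after (7)."""
    A0 = A11 - A12 @ np.linalg.solve(A22, A21)
    Bt01 = Bt11 - A12 @ np.linalg.solve(A22, Bt21)
    Bt02 = Bt12 - A12 @ np.linalg.solve(A22, Bt22)
    return A0, Bt01, Bt02

# System blocks of the Section 5 example.
A11 = np.array([[0.0, 0.4], [0.0, 0.0]])
A12 = np.array([[0.0, 0.0], [0.345, 0.0]])
A21 = np.array([[0.0, -0.524], [0.0, 0.0]])
A22 = np.array([[-0.465, 0.262], [0.0, -1.0]])
# ASSUMPTION: input matrices frozen at x = 0, i.e. equal to the constant B blocks.
Bt11 = np.zeros((2, 1)); Bt21 = np.array([[0.0], [1.0]])
Bt12 = np.zeros((2, 1)); Bt22 = np.array([[0.2], [1.0]])

A0, Bt01, Bt02 = slow_subsystem(A11, A12, A21, A22, Bt11, Bt21, Bt12, Bt22)

# Arbitrary test point (hypothetical values, not from the paper).
x1 = np.array([1.0, -2.0]); u = np.array([0.5]); v = np.array([-0.3])

x2s = -np.linalg.solve(A22, A21 @ x1 + Bt21 @ u + Bt22 @ v)       # (7b)
residual_6b = A21 @ x1 + A22 @ x2s + Bt21 @ u + Bt22 @ v          # should vanish
rhs_6a = A11 @ x1 + A12 @ x2s + Bt11 @ u + Bt12 @ v               # (6a)
rhs_7a = A0 @ x1 + Bt01 @ u + Bt02 @ v                            # (7a)
print(np.abs(residual_6b).max(), np.abs(rhs_6a - rhs_7a).max())
```

Using `np.linalg.solve` instead of forming the inverse of A₂₂ explicitly is a numerical design choice, not something prescribed by the paper.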

Then we can obtain the quadratic cost function for the slow subsystem

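$$J_{is}=\frac12\int_0^\infty\left(x_{1s}^TQ_{i0}x_{1s}+2x_{1s}^TD_{i1}u_s+2x_{1s}^TD_{i2}v_s+2u_s^TD_{i3}v_s+u_s^TR_{i1s}u_s+v_s^TR_{i2s}v_s\right)dt\tag{8}$$

where

$$\begin{aligned}
Q_{i0}&=Q_{i11}+A_{21}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}A_{21}, & D_{i1}&=A_{21}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}\tilde B_{21},\\
D_{i2}&=A_{21}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}\tilde B_{22}, & D_{i3}&=\tilde B_{21}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}\tilde B_{22},\\
R_{i1s}&=R_{i1}+\tilde B_{21}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}\tilde B_{21}, & R_{i2s}&=R_{i2}+\tilde B_{22}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}\tilde B_{22}.
\end{aligned}$$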
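The coefficients of (8) can be traced back to the substitution of (7b) into the state penalty. The expansion below is a sketch under two assumptions that are not stated in the original: the cross-weight Q_{i12} is taken to be zero (no Q_{i12} term survives in (8)), and W_i is a shorthand introduced here for the symmetric matrix A_{22}^{-T} Q_{i22} A_{22}^{-1}.

```latex
\begin{aligned}
x_{2s}^T Q_{i22}\, x_{2s}
 &= \left(A_{21}x_{1s}+\tilde B_{21}u_s+\tilde B_{22}v_s\right)^T W_i
    \left(A_{21}x_{1s}+\tilde B_{21}u_s+\tilde B_{22}v_s\right),
    \qquad W_i := A_{22}^{-T}Q_{i22}A_{22}^{-1} \\
 &= x_{1s}^T A_{21}^T W_i A_{21}\, x_{1s}
    + 2x_{1s}^T \underbrace{A_{21}^T W_i \tilde B_{21}}_{D_{i1}} u_s
    + 2x_{1s}^T \underbrace{A_{21}^T W_i \tilde B_{22}}_{D_{i2}} v_s \\
 &\quad + 2u_s^T \underbrace{\tilde B_{21}^T W_i \tilde B_{22}}_{D_{i3}} v_s
    + u_s^T \tilde B_{21}^T W_i \tilde B_{21}\, u_s
    + v_s^T \tilde B_{22}^T W_i \tilde B_{22}\, v_s .
\end{aligned}
```

Adding the remaining terms x_{1s}^T Q_{i11} x_{1s}, u_s^T R_{i1} u_s and v_s^T R_{i2} v_s of the original cost then gives the integrand of (8), with Q_{i0}, R_{i1s} and R_{i2s} collecting the quadratic terms in x_{1s}, u_s and v_s.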

Theorem 1

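Suppose that the following cross-coupled algebraic Riccati equations have solutions $p_{1s}$ and $p_{2s}$:

$$p_{1s}\left(A-S_{1s}p_{1s}-S_{2s}p_{2s}\right)+\left(A-S_{1s}p_{1s}-S_{2s}p_{2s}\right)^Tp_{1s}+p_{1s}S_{1s}p_{1s}+Q_1=0\tag{9a}$$

$$p_{2s}\left(A-S_{1s}p_{1s}-S_{2s}p_{2s}\right)+\left(A-S_{1s}p_{1s}-S_{2s}p_{2s}\right)^Tp_{2s}+p_{2s}S_{2s}p_{2s}+Q_2=0\tag{9b}$$

where

$$A=A_0+\tilde B_{01}T_{11}+\tilde B_{02}T_{21},\quad S_{1s}=\tfrac12\tilde B_{02}T_{22}-\tilde B_{01}T_{12},\quad S_{2s}=\tfrac12\tilde B_{01}T_{13}-\tilde B_{02}T_{23},$$

$$Q_1=Q_{10}+D_{11}T_{11}+D_{12}T_{21},\quad Q_2=Q_{20}+D_{21}T_{11}+D_{22}T_{21}.$$

Then the Stackelberg equilibrium solution $(u_s^*,v_s^*)$ of the slow subsystem is given by

$$u_s^*=\left(T_{11}+T_{12}p_{1s}-T_{13}p_{2s}\right)x_{1s}\tag{10a}$$

$$v_s^*=\left(T_{21}-T_{22}p_{1s}+T_{23}p_{2s}\right)x_{1s}\tag{10b}$$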

Proof

The Hamiltonian H_is corresponding to the system (7) and performance index (8) is

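$$H_{is}=\frac12\left(x_{1s}^TQ_{i0}x_{1s}+2x_{1s}^TD_{i1}u_s+2x_{1s}^TD_{i2}v_s+2u_s^TD_{i3}v_s+u_s^TR_{i1s}u_s+v_s^TR_{i2s}v_s\right)+\lambda_i^T\left(A_0x_{1s}+\tilde B_{01}u_s+\tilde B_{02}v_s\right)\tag{11}$$

where λ_i ∈ R^{n₁ × 1} is the Lagrangian multiplier.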

Given arbitrary u_s, the corresponding v_s is obtained by minimizing J_2s with respect to v_s. Then, the optimal control is given by

$$v_s=-R_{22s}^{-1}\left(D_{22}^Tx_{1s}+D_{23}^Tu_s+\tilde B_{02}^T\lambda_2\right)$$

Then the cost J_1s can be obtained, and we can further obtain

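$$\begin{aligned}
u_s={}&\left(-R_{11s}+2D_{13}R_{22s}^{-1}D_{23}^T-D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{23}^T\right)^{-1}\\
&\times\Big[\left(D_{11}^T-D_{23}R_{22s}^{-T}D_{12}^T-D_{13}R_{22s}^{-1}D_{22}^T+D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{22}^T\right)x_{1s}\\
&\quad+\left(\tilde B_{01}^T-D_{23}R_{22s}^{-T}\tilde B_{02}^T\right)\lambda_1+\left(D_{23}R_{22s}^{-T}R_{12s}-D_{13}\right)R_{22s}^{-1}\tilde B_{02}^T\lambda_2\Big]\\
={}&T_{11}x_{1s}+T_{12}\lambda_1-T_{13}\lambda_2
\end{aligned}\tag{12a}$$

then

$$\begin{aligned}
v_s&=-R_{22s}^{-1}\left(D_{22}^Tx_{1s}+D_{23}^Tu_s+\tilde B_{02}^T\lambda_2\right)\\
&=-R_{22s}^{-1}D_{22}^Tx_{1s}-R_{22s}^{-1}D_{23}^Tu_s-R_{22s}^{-1}\tilde B_{02}^T\lambda_2\\
&=-R_{22s}^{-1}D_{22}^Tx_{1s}-R_{22s}^{-1}D_{23}^T\left(T_{11}x_{1s}+T_{12}\lambda_1-T_{13}\lambda_2\right)-R_{22s}^{-1}\tilde B_{02}^T\lambda_2\\
&=\left(-R_{22s}^{-1}D_{22}^T-R_{22s}^{-1}D_{23}^TT_{11}\right)x_{1s}-R_{22s}^{-1}D_{23}^TT_{12}\lambda_1+\left(R_{22s}^{-1}D_{23}^TT_{13}-R_{22s}^{-1}\tilde B_{02}^T\right)\lambda_2\\
&=T_{21}x_{1s}-T_{22}\lambda_1+T_{23}\lambda_2
\end{aligned}\tag{12b}$$

where

$$\begin{aligned}
T_{11}&=\left(-R_{11s}+2D_{13}R_{22s}^{-1}D_{23}^T-D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{23}^T\right)^{-1}\left(D_{11}^T-D_{23}R_{22s}^{-T}D_{12}^T-D_{13}R_{22s}^{-1}D_{22}^T+D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{22}^T\right)\\
T_{12}&=\left(-R_{11s}+2D_{13}R_{22s}^{-1}D_{23}^T-D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{23}^T\right)^{-1}\left(\tilde B_{01}^T-D_{23}R_{22s}^{-T}\tilde B_{02}^T\right)\\
T_{13}&=\left(-R_{11s}+2D_{13}R_{22s}^{-1}D_{23}^T-D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{23}^T\right)^{-1}\left(D_{13}-D_{23}R_{22s}^{-T}R_{12s}\right)R_{22s}^{-1}\tilde B_{02}^T\\
T_{21}&=-R_{22s}^{-1}D_{22}^T-R_{22s}^{-1}D_{23}^TT_{11}\\
T_{22}&=R_{22s}^{-1}D_{23}^TT_{12}\\
T_{23}&=R_{22s}^{-1}D_{23}^TT_{13}-R_{22s}^{-1}\tilde B_{02}^T
\end{aligned}$$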

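The costate equations are

$$-\dot\lambda_1=Q_{10}x_{1s}+D_{11}u_s+D_{12}v_s+A_0^T\lambda_1,\quad-\dot\lambda_2=Q_{20}x_{1s}+D_{21}u_s+D_{22}v_s+A_0^T\lambda_2.$$

Letting λ₁ = p_1s x_1s and λ₂ = p_2s x_1s, (9a) and (9b) can be derived, respectively. This is the desired result. □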

In [8], Mukaidani proposed a fixed-point iterative algorithm for solving cross-coupled algebraic Riccati equations (9).

Assumption 1

The triplets $(A_0,\tilde B_{01},Q_1)$ and $(A_0,\tilde B_{02},Q_2)$ are stabilizable and detectable.

Under Assumption 1, positive semidefinite solutions of the cross-coupled algebraic Riccati equations (9) exist. They are obtained by performing the fixed-point algorithm:

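$$p_{1s}^{(n+1)}\left(A-S_{1s}p_{1s}^{(n)}-S_{2s}p_{2s}^{(n)}\right)+\left(A-S_{1s}p_{1s}^{(n)}-S_{2s}p_{2s}^{(n)}\right)^Tp_{1s}^{(n+1)}+Q_1+p_{1s}^{(n)T}S_{1s}p_{1s}^{(n)}=0\tag{13a}$$

$$p_{2s}^{(n+1)}\left(A-S_{1s}p_{1s}^{(n)}-S_{2s}p_{2s}^{(n)}\right)+\left(A-S_{1s}p_{1s}^{(n)}-S_{2s}p_{2s}^{(n)}\right)^Tp_{2s}^{(n+1)}+Q_2+p_{2s}^{(n)T}S_{2s}p_{2s}^{(n)}=0\tag{13b}$$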

n = 0, 1, 2, ⋯

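where $p_{1s}^{(0)}$, $p_{2s}^{(0)}$ are the solutions of the following algebraic Riccati equations:

$$p_{1s}^{(0)}A+A^Tp_{1s}^{(0)}+Q_1-p_{1s}^{(0)T}S_{1s}p_{1s}^{(0)}=0\tag{14a}$$

$$p_{2s}^{(0)}\left(A-S_{1s}p_{1s}^{(0)}\right)+\left(A-S_{1s}p_{1s}^{(0)}\right)^Tp_{2s}^{(0)}+Q_2-p_{2s}^{(0)T}S_{2s}p_{2s}^{(0)}=0\tag{14b}$$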

The proof can be seen in [8].
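Each sweep of (13) freezes the closed-loop matrix at the previous iterates, so every update is a linear Lyapunov equation. The sketch below illustrates that structure with NumPy only. It is not the implementation of [8]: the function names are hypothetical, the data are small illustrative matrices rather than quantities derived from the paper, and the iteration starts from zero matrices instead of the Riccati solutions (14), which is an assumption that is reasonable here only because the test matrix A is already stable.

```python
import numpy as np

def solve_lyapunov(Ac, W):
    """Solve P @ Ac + Ac.T @ P + W = 0 for P using Kronecker products."""
    n = Ac.shape[0]
    I = np.eye(n)
    # With row-major vec: vec(P Ac) = kron(I, Ac.T) vec(P),
    #                     vec(Ac.T P) = kron(Ac.T, I) vec(P).
    K = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    return np.linalg.solve(K, -W.reshape(-1)).reshape(n, n)

def fixed_point_care(A, S1, S2, Q1, Q2, iters=100):
    """Fixed-point sweeps of the form (13a)-(13b)."""
    n = A.shape[0]
    P1 = np.zeros((n, n))   # ASSUMPTION: zero start instead of solving (14)
    P2 = np.zeros((n, n))
    for _ in range(iters):
        Ac = A - S1 @ P1 - S2 @ P2                          # frozen closed loop
        P1_next = solve_lyapunov(Ac, Q1 + P1.T @ S1 @ P1)   # (13a)
        P2_next = solve_lyapunov(Ac, Q2 + P2.T @ S2 @ P2)   # (13b)
        P1, P2 = P1_next, P2_next
    return P1, P2

def care_residuals(A, S1, S2, Q1, Q2, P1, P2):
    """Left-hand sides of (9a) and (9b); both vanish at a solution."""
    Ac = A - S1 @ P1 - S2 @ P2
    r1 = P1 @ Ac + Ac.T @ P1 + P1 @ S1 @ P1 + Q1
    r2 = P2 @ Ac + Ac.T @ P2 + P2 @ S2 @ P2 + Q2
    return r1, r2

# Illustrative data (hypothetical, chosen so that A is stable and S is small).
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
S1 = 0.1 * np.eye(2)
S2 = 0.1 * np.eye(2)
Q1 = np.eye(2)
Q2 = 2.0 * np.eye(2)

P1, P2 = fixed_point_care(A, S1, S2, Q1, Q2)
r1, r2 = care_residuals(A, S1, S2, Q1, Q2, P1, P2)
print(np.abs(r1).max(), np.abs(r2).max())
```

The Kronecker-product solver is adequate for small dimensions; for larger systems a dedicated Lyapunov solver would be the more usual choice.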

In the fast subsystem, we assume that the slow variables are constant in the boundary layer. Redefining the fast variables x_2f = x₂ − x_2s, and the fast controls u_f = u − u_s, v_f = v − v_s, the fast subsystem is formulated as:

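$$\dot x_{2f}=\frac1\varepsilon A_{22}x_{2f}+\frac1\varepsilon\tilde B_{21}u_f+\frac1\varepsilon\tilde B_{22}v_f,\quad x_{2f}(0)=x_{20}-x_{2s}(0)\tag{15}$$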

Then we can obtain the quadratic cost function for the fast subsystem

$$J_{if}=\frac12\int_0^\infty\left(x_{2f}^TQ_{i22}x_{2f}+u_f^TR_{i1}u_f+v_f^TR_{i2}v_f\right)dt\tag{16}$$

Assumption 2

The triplets $(A_{22},\tilde B_{21},Q_{122})$ and $(A_{22},\tilde B_{22},Q_{222})$ are stabilizable and detectable.

Theorem 2

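Under Assumption 2, suppose that the following cross-coupled algebraic Riccati equations have solutions $p_{1f}$ and $p_{2f}$:

$$p_{1f}\left(A_{22}-S_{1f}p_{1f}-S_{2f}p_{2f}\right)+\left(A_{22}-S_{1f}p_{1f}-S_{2f}p_{2f}\right)^Tp_{1f}+p_{1f}S_{1f}p_{1f}+Q_{122}=0\tag{17a}$$

$$p_{2f}\left(A_{22}-S_{1f}p_{1f}-S_{2f}p_{2f}\right)+\left(A_{22}-S_{1f}p_{1f}-S_{2f}p_{2f}\right)^Tp_{2f}+p_{2f}S_{2f}p_{2f}+Q_{222}=0\tag{17b}$$

where $S_{1f}=\tfrac12\tilde B_{21}R_{11}^{-1}\tilde B_{21}^T$ and $S_{2f}=\tfrac12\tilde B_{22}R_{22}^{-1}\tilde B_{22}^T$.

Then the Stackelberg equilibrium solution $(u_f^*,v_f^*)$ of the fast subsystem is given by

$$u_f^*=-R_{11}^{-1}\tilde B_{21}^Tp_{1f}x_{2f}\tag{18a}$$

$$v_f^*=-R_{22}^{-1}\tilde B_{22}^Tp_{2f}x_{2f}\tag{18b}$$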

Proof

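Proceeding as in the proof of Theorem 1, the follower's strategy of the fast subsystem is

$$v_f^*=-R_{22}^{-1}\tilde B_{22}^T\lambda_{2f}=-R_{22}^{-1}\tilde B_{22}^Tp_{2f}x_{2f}$$

then

$$\begin{aligned}
H_{1f}&=\frac12\left(x_{2f}^TQ_{122}x_{2f}+u_f^TR_{11}u_f+v_f^TR_{12}v_f\right)+\lambda_{1f}^T\left(A_{22}x_{2f}+\tilde B_{21}u_f+\tilde B_{22}v_f\right)\\
&=\frac12\left(x_{2f}^TQ_{122}x_{2f}+u_f^TR_{11}u_f\right)+\lambda_{1f}^T\left(A_{22}x_{2f}+\tilde B_{21}u_f\right)+\frac12\lambda_{2f}^T\tilde B_{22}R_{22}^{-T}R_{12}R_{22}^{-1}\tilde B_{22}^T\lambda_{2f}-\lambda_{1f}^T\tilde B_{22}R_{22}^{-1}\tilde B_{22}^T\lambda_{2f}
\end{aligned}$$

where λ_if ∈ R^{n₂ × 1} is the Lagrangian multiplier. Then

$$u_f^*=-R_{11}^{-1}\tilde B_{21}^T\lambda_{1f}=-R_{11}^{-1}\tilde B_{21}^Tp_{1f}x_{2f}$$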

where p_1f, p_2f satisfy the cross-coupled algebraic Riccati equations (17). □

Similarly, under Assumption 2, the positive semidefinite solutions of cross-coupled algebraic Riccati equations (17) exist, and can be obtained by performing the fixed-point algorithm:

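$$p_{1f}^{(n+1)}\left(A_{22}-S_{1f}p_{1f}^{(n)}-S_{2f}p_{2f}^{(n)}\right)+\left(A_{22}-S_{1f}p_{1f}^{(n)}-S_{2f}p_{2f}^{(n)}\right)^Tp_{1f}^{(n+1)}+Q_{122}+p_{1f}^{(n)T}S_{1f}p_{1f}^{(n)}=0\tag{19a}$$

$$p_{2f}^{(n+1)}\left(A_{22}-S_{1f}p_{1f}^{(n)}-S_{2f}p_{2f}^{(n)}\right)+\left(A_{22}-S_{1f}p_{1f}^{(n)}-S_{2f}p_{2f}^{(n)}\right)^Tp_{2f}^{(n+1)}+Q_{222}+p_{2f}^{(n)T}S_{2f}p_{2f}^{(n)}=0\tag{19b}$$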

n = 0, 1, 2, 3…

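where $p_{1f}^{(0)}$, $p_{2f}^{(0)}$ are the solutions of the following algebraic Riccati equations:

$$p_{1f}^{(0)}A_{22}+A_{22}^Tp_{1f}^{(0)}+Q_{122}-p_{1f}^{(0)T}S_{1f}p_{1f}^{(0)}=0\tag{20a}$$

$$p_{2f}^{(0)}\left(A_{22}-S_{1f}p_{1f}^{(0)}\right)+\left(A_{22}-S_{1f}p_{1f}^{(0)}\right)^Tp_{2f}^{(0)}+Q_{222}-p_{2f}^{(0)T}S_{2f}p_{2f}^{(0)}=0\tag{20b}$$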

4 Composite Strategy

The composite Stackelberg strategy pair of the full-order singularly perturbed system (1) is constructed as follows[16]:

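$$u_c=u_s^*+u_f^*=\left(T_{11}+T_{12}p_{1s}-T_{13}p_{2s}\right)x_{1s}-R_{11}^{-1}\tilde B_{21}^Tp_{1f}x_{2f}\tag{21a}$$

$$v_c=v_s^*+v_f^*=\left(T_{21}-T_{22}p_{1s}+T_{23}p_{2s}\right)x_{1s}-R_{22}^{-1}\tilde B_{22}^Tp_{2f}x_{2f}\tag{21b}$$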

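Replacing $x_{1s}$ by $x_1$ and $x_{2s}+x_{2f}$ by $x_2$, and using $x_{2s}=-A_{22}^{-1}\left(A_{21}x_{1s}+\tilde B_{21}u_s+\tilde B_{22}v_s\right)$, we obtain

$$u_c=G_1x_1+G_2x_2\tag{22a}$$

$$v_c=G_3x_1+G_4x_2\tag{22b}$$

where

$$\begin{aligned}
G_1&=T_{11}+T_{12}p_{1s}-T_{13}p_{2s}-R_{11}^{-1}\tilde B_{21}^Tp_{1f}A_{22}^{-1}\left[A_{21}+\tilde B_{21}\left(T_{11}+T_{12}p_{1s}-T_{13}p_{2s}\right)+\tilde B_{22}\left(T_{21}-T_{22}p_{1s}+T_{23}p_{2s}\right)\right]\\
G_2&=-R_{11}^{-1}\tilde B_{21}^Tp_{1f}\\
G_3&=T_{21}-T_{22}p_{1s}+T_{23}p_{2s}-R_{22}^{-1}\tilde B_{22}^Tp_{2f}A_{22}^{-1}\left[A_{21}+\tilde B_{21}\left(T_{11}+T_{12}p_{1s}-T_{13}p_{2s}\right)+\tilde B_{22}\left(T_{21}-T_{22}p_{1s}+T_{23}p_{2s}\right)\right]\\
G_4&=-R_{22}^{-1}\tilde B_{22}^Tp_{2f}
\end{aligned}$$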
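The passage from (21) to (22) is pure algebra and can be checked numerically. In the sketch below, `composite_gains` is a hypothetical helper; `Ks_u` and `Ks_v` stand for the slow gains T₁₁ + T₁₂p_1s − T₁₃p_2s and T₂₁ − T₂₂p_1s + T₂₃p_2s. All gain and Riccati matrices are random placeholders (an assumption for illustration, not solutions of (9) or (17)), since the identity being tested holds for any values.

```python
import numpy as np

def composite_gains(Ks_u, Ks_v, p1f, p2f, A21, A22, Bt21, Bt22, R11, R22):
    """Gains G1..G4 of (22), given the slow gains and the fast Riccati matrices."""
    G2 = -np.linalg.solve(R11, Bt21.T @ p1f)     # fast gain of u, cf. (18a)
    G4 = -np.linalg.solve(R22, Bt22.T @ p2f)     # fast gain of v, cf. (18b)
    # x2s = -X @ x1 with X = A22^{-1} [A21 + Bt21 Ks_u + Bt22 Ks_v], from (7b)
    X = np.linalg.solve(A22, A21 + Bt21 @ Ks_u + Bt22 @ Ks_v)
    G1 = Ks_u + G2 @ X
    G3 = Ks_v + G4 @ X
    return G1, G2, G3, G4

rng = np.random.default_rng(0)
A21 = np.array([[0.0, -0.524], [0.0, 0.0]])      # Section 5 data
A22 = np.array([[-0.465, 0.262], [0.0, -1.0]])
Bt21 = np.array([[0.0], [1.0]])                  # frozen at x = 0 (assumption)
Bt22 = np.array([[0.2], [1.0]])
R11 = np.eye(1); R22 = np.eye(1)                 # placeholder weights
Ks_u = rng.standard_normal((1, 2))               # placeholder slow gains
Ks_v = rng.standard_normal((1, 2))
p1f = rng.standard_normal((2, 2))                # placeholder fast matrices
p2f = rng.standard_normal((2, 2))

G1, G2, G3, G4 = composite_gains(Ks_u, Ks_v, p1f, p2f,
                                 A21, A22, Bt21, Bt22, R11, R22)

# Compare (22) with the slow-plus-fast construction (21) at a random state.
x1 = rng.standard_normal(2); x2 = rng.standard_normal(2)
us = Ks_u @ x1; vs = Ks_v @ x1
x2s = -np.linalg.solve(A22, A21 @ x1 + Bt21 @ us + Bt22 @ vs)
x2f = x2 - x2s
uc_21 = us + G2 @ x2f
vc_21 = vs + G4 @ x2f
uc_22 = G1 @ x1 + G2 @ x2
vc_22 = G3 @ x1 + G4 @ x2
print(np.abs(uc_21 - uc_22).max(), np.abs(vc_21 - vc_22).max())
```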

Theorem 3

The composite strategy pair constitutes an o(ε) (near) Stackelberg equilibrium of the full-order game, that is,

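$$x_1(t)=x_{1s}(t)+o(\varepsilon)\tag{23a}$$

$$x_2(t)=-A_{22}^{-1}\left(A_{21}+G_0\right)x_{1s}(t)+x_{2f}(t)+o(\varepsilon)\tag{23b}$$

$$u^*(t)=u_c(t)+o(\varepsilon)\tag{23c}$$

$$v^*(t)=v_c(t)+o(\varepsilon)\tag{23d}$$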

Proof

The feedback system (5) can be written as

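$$\begin{bmatrix}\dot x_1\\\varepsilon\dot x_2\end{bmatrix}=\begin{bmatrix}A_{11}+\tilde B_{11}G_1+\tilde B_{12}G_3&A_{12}+\tilde B_{11}G_2+\tilde B_{12}G_4\\A_{21}+\tilde B_{21}G_1+\tilde B_{22}G_3&A_{22}+\tilde B_{21}G_2+\tilde B_{22}G_4\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}\tag{24}$$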

Introducing the Chang transformation and its inverse

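$$T=\begin{bmatrix}I_1-\varepsilon HL&-\varepsilon H\\L&I_2\end{bmatrix},\quad T^{-1}=\begin{bmatrix}I_1&\varepsilon H\\-L&I_2-\varepsilon LH\end{bmatrix}\tag{25}$$

while the transformation equations are given by

$$\begin{aligned}
\varepsilon LA_{11}+A_{21}-\left(\varepsilon LA_{12}+A_{22}\right)L&=0\\
A_{12}+\varepsilon\left(A_{11}-A_{12}L\right)H-H\left(\varepsilon LA_{12}+A_{22}\right)&=0
\end{aligned}\tag{26}$$

we get

$$TST^{-1}=\begin{bmatrix}S_1&0\\0&S_2\end{bmatrix}\tag{27}$$

where S is the system matrix of (24),

$$\begin{aligned}
S_1={}&\left(A_{11}+\tilde B_{11}G_1+\tilde B_{12}G_3\right)-\left(A_{12}+\tilde B_{11}G_2+\tilde B_{12}G_4\right)L-\varepsilon H\left(A_{21}+\tilde B_{21}G_1+\tilde B_{22}G_3\right)\\
&-\varepsilon HL\left(A_{11}+\tilde B_{11}G_1+\tilde B_{12}G_3\right)+\varepsilon HL\left(A_{12}+\tilde B_{11}G_2+\tilde B_{12}G_4\right)L+\varepsilon H\left(A_{22}+\tilde B_{21}G_2+\tilde B_{22}G_4\right)L\\
S_2={}&\left(A_{22}+\tilde B_{21}G_2+\tilde B_{22}G_4\right)+L\left(A_{12}+\tilde B_{11}G_2+\tilde B_{12}G_4\right)+L\left(A_{11}+\tilde B_{11}G_1+\tilde B_{12}G_3\right)\varepsilon H\\
&+\left(A_{21}+\tilde B_{21}G_1+\tilde B_{22}G_3\right)\varepsilon H-L\left(A_{12}+\tilde B_{11}G_2+\tilde B_{12}G_4\right)\varepsilon LH-\left(A_{22}+\tilde B_{21}G_2+\tilde B_{22}G_4\right)\varepsilon LH
\end{aligned}$$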

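If $\left(A_{22}+\tilde B_{21}G_2+\tilde B_{22}G_4\right)+L\left(A_{12}+\tilde B_{11}G_2+\tilde B_{12}G_4\right)$ is stable, the solution of (24) is approximated for all finite t ≥ 0 by

$$x_1(t)=\exp\left[\left(A_{11}+\tilde B_{11}G_1+\tilde B_{12}G_3-A_{12}L-\tilde B_{11}G_2L-\tilde B_{12}G_4L\right)t\right]x_{1s}(0)+o(\varepsilon)\tag{28a}$$

$$\begin{aligned}
x_2(t)={}&-A_{22}^{-1}\left(A_{21}+G_0\right)\exp\left[\left(A_{11}+\tilde B_{11}G_1+\tilde B_{12}G_3-A_{12}L-\tilde B_{11}G_2L-\tilde B_{12}G_4L\right)t\right]x_{1s}(0)\\
&+\exp\left[\left(A_{22}+\tilde B_{21}G_2+\tilde B_{22}G_4+LA_{12}+L\tilde B_{11}G_2+L\tilde B_{12}G_4\right)t/\varepsilon\right]x_{2f}(0)+o(\varepsilon)
\end{aligned}\tag{28b}$$

where x_1s(0), x_2f(0) are given by (7a), (15). If in addition $\left(A_{11}+\tilde B_{11}G_1+\tilde B_{12}G_3\right)-\left(A_{12}+\tilde B_{11}G_2+\tilde B_{12}G_4\right)L$ is also stable, (28) holds for all t ∈ [0, ∞). Then (23) follows directly from (28), (7) and (15). □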
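The decoupling used in the proof can be illustrated numerically. The sketch below is a sanity check under stated assumptions, not part of the original argument: it uses the open-loop blocks of the Section 5 example in place of the closed-loop blocks of (24), picks ε = 0.1 (the paper does not state a value), solves the transformation equations (26) by simple fixed-point iteration, applies T to the dynamics written with the 1/ε scaling on the fast rows, and takes the lower-right block of the inverse as I₂ − εLH so that T T⁻¹ = I. The function name `chang_LH` is hypothetical.

```python
import numpy as np

def chang_LH(A11, A12, A21, A22, eps, iters=200):
    """Fixed-point iteration for L and H in (26); assumes small eps, A22 nonsingular."""
    L = np.linalg.solve(A22, A21)
    for _ in range(iters):
        # Rearranged first equation: A22 L = A21 + eps * L (A11 - A12 L)
        L = np.linalg.solve(A22, A21 + eps * L @ (A11 - A12 @ L))
    F_inv = np.linalg.inv(A22 + eps * L @ A12)
    H = A12 @ F_inv
    for _ in range(iters):
        # Rearranged second equation: H (A22 + eps L A12) = A12 + eps (A11 - A12 L) H
        H = (A12 + eps * (A11 - A12 @ L) @ H) @ F_inv
    return L, H

A11 = np.array([[0.0, 0.4], [0.0, 0.0]])
A12 = np.array([[0.0, 0.0], [0.345, 0.0]])
A21 = np.array([[0.0, -0.524], [0.0, 0.0]])
A22 = np.array([[-0.465, 0.262], [0.0, -1.0]])
eps = 0.1                                   # ASSUMPTION: not given in the paper

L, H = chang_LH(A11, A12, A21, A22, eps)
n1, n2 = A11.shape[0], A22.shape[0]
I1, I2 = np.eye(n1), np.eye(n2)

T = np.block([[I1 - eps * H @ L, -eps * H], [L, I2]])
T_inv = np.block([[I1, eps * H], [-L, I2 - eps * L @ H]])
# Dynamics of [x1; x2] with the fast rows divided by eps.
A_full = np.block([[A11, A12], [A21 / eps, A22 / eps]])

D = T @ A_full @ T_inv
off_diag = max(np.abs(D[:n1, n1:]).max(), np.abs(D[n1:, :n1]).max())
slow_block_error = np.abs(D[:n1, :n1] - (A11 - A12 @ L)).max()
print(off_diag, slow_block_error)
```

After convergence the upper-left block equals A₁₁ − A₁₂L and the lower-right block equals (A₂₂ + εLA₁₂)/ε, which are the slow and fast matrices whose stability is required in the proof.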

5 A Numerical Example

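To demonstrate the proposed decomposition method, we solve a simple numerical example. The matrices are chosen arbitrarily and are given by

$$A_{11}=\begin{bmatrix}0&0.4\\0&0\end{bmatrix},\quad A_{12}=\begin{bmatrix}0&0\\0.345&0\end{bmatrix},\quad A_{21}=\begin{bmatrix}0&-0.524\\0&0\end{bmatrix},\quad A_{22}=\begin{bmatrix}-0.465&0.262\\0&-1\end{bmatrix},$$

$$B_{11}=\begin{bmatrix}0\\0\end{bmatrix},\quad B_{12}=\begin{bmatrix}0\\0\end{bmatrix},\quad B_{21}=\begin{bmatrix}0\\1\end{bmatrix},\quad B_{22}=\begin{bmatrix}0.2\\1\end{bmatrix},$$

$$M_1=N_1=\begin{bmatrix}1\\0\\0\\0\end{bmatrix},\quad M_2=N_2=\begin{bmatrix}0\\1\\0\\0\end{bmatrix},\quad M_3=N_3=\begin{bmatrix}0\\0\\1\\0\end{bmatrix},\quad M_4=N_4=\begin{bmatrix}0\\0\\0\\1\end{bmatrix}$$

and the quadratic cost functions

$$J_1(u,v)=\frac12\int_0^\infty\left(x^TQ_1x+u^2+2v^2\right)dt,\quad J_2(u,v)=\frac12\int_0^\infty\left(x^TQ_2x+2u^2+v^2\right)dt$$

where

$$Q_1=\mathrm{diag}\{1,0,1,0\},\quad Q_2=\mathrm{diag}\{1,0,1,0\},\quad x_{10}=x_{20}=\begin{bmatrix}1\\1\end{bmatrix}.$$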

The simulation result is presented in Figure 1.

Figure 1. Simulation curves of the composite Stackelberg strategy (u_c, v_c)

6 Conclusions

Many real systems, arising in motor drives, robust control, multi-sector input-output analysis and option pricing, have the structure of singularly perturbed bilinear control systems. In this paper, we have studied Stackelberg games for singularly perturbed bilinear systems. The full-order system is decomposed into two subsystems, one on a slow time scale and one on a fast time scale. A fixed-point iterative algorithm is used to solve the cross-coupled algebraic Riccati equations, which yields the equilibrium strategies of the two subsystems and, from them, the composite strategy of the original full-order system. It has been proved that the composite strategy forms an o(ε) (near) Stackelberg equilibrium, and a numerical example has illustrated the algorithm. The results of this paper could be applied to practical problems in industrial engineering and financial engineering.

Supported by the National Natural Science Fund of China (71171061), the 2014 Natural Science Fund of Guangdong Province (Non-cooperative Game Theory of Singularly Perturbed Markov System), the Philosophy and Social Science "Twelfth Five-Year" Plan Project of Guangdong Province (GD14YGL01), and the 2014 Guangzhou Philosophy and Social Science Project (14Q21).

References

[1] Simaan M, Cruz Jr J B. On the Stackelberg strategy in non-zero sum games. Journal of Optimization Theory and Applications, 1973, 11(5): 533–555. doi:10.1007/BF00935665

[2] Basar T, Olsder G T. Dynamic non-cooperative game theory. Academic Press, New York, 1991.

[3] Medanic J. Closed-loop Stackelberg strategies in linear-quadratic problems. IEEE Transactions on Automatic Control, 1978, 23(4): 632–637. doi:10.1109/TAC.1978.1101788

[4] Mizukami K, Xu H. Closed-loop Stackelberg strategies for linear-quadratic descriptor systems. Journal of Optimization Theory and Applications, 1992, 74: 151–170. doi:10.1007/BF00939897

[5] Khalil H K, Kokotovic P V. Feedback and well-posedness of singularly perturbed Nash games. IEEE Transactions on Automatic Control, 1979, 24(5): 699–708. doi:10.1109/CDC.1978.268107

[6] Xu H, Mizukami K. Infinite-horizon differential games of singularly perturbed systems: A unified approach. Automatica, 1997, 33(2): 273–276. doi:10.1016/S0005-1098(96)00173-2

[7] Mukaidani H, Xu H, Mizukami K. A new algorithm for solving cross-coupled algebraic Riccati equations of singularly perturbed Nash games. Proceedings of the 39th IEEE Conference on Decision and Control, Sydney, Australia, 2000: 3648–3653.

[8] Mukaidani H. A computational efficient numerical algorithm for solving cross-coupled algebraic Riccati equation and its application to multimodeling systems. Proceedings of the 2006 American Control Conference, Minneapolis, Minnesota, USA, 2006: 725–730. doi:10.1109/ACC.2006.1655442

[9] Mukaidani H. Efficient numerical procedures for solving closed-loop Stackelberg strategies with small singular perturbation parameter. Applied Mathematics and Computation, 2007, 188: 1173–1183. doi:10.1016/j.amc.2006.10.068

[10] Mukaidani H, Unno M, Yamamoto T. Stackelberg strategies for singularly perturbed stochastic systems. Proceedings of the 2013 European Control Conference, Zurich, Switzerland, 2013: 730–735. doi:10.23919/ECC.2013.6669247

[11] Aganovic Z, Gajic Z. Linear optimal control of bilinear systems with applications to singularly perturbations and weak coupling. Springer-Verlag, London, 1995. doi:10.1007/3-540-19976-4

[12] Mceneaney W. A robust control framework for option pricing. Mathematics of Operations Research, 1997, 22(1): 203–221. doi:10.1287/moor.22.1.202

[13] Aoki M. Some examples of dynamic bilinear models in economics. Lecture Notes in Economics and Mathematical Systems, Springer-Verlag, Berlin Heidelberg, 1975: 163–169. doi:10.1007/978-3-642-47457-6_9

[14] Chander P. The nonlinear input output model. Journal of Economic Theory, 1983, 30: 219–229. doi:10.1016/0022-0531(83)90105-9

[15] Tokao F. Nonlinear Leontief model in abstract spaces. Journal of Mathematical Economics, 1986, 15: 151–156. doi:10.1016/0304-4068(86)90006-6

[16] Kim B S, Lim M T. Composite control for singularly perturbed bilinear systems via successive Galerkin approximation. IEEE Proceedings of Control Theory, 2003, 150(5): 483–488. doi:10.1049/ip-cta:20030814

Received: 2014-4-22

Accepted: 2014-9-1

Published Online: 2015-4-25
