
Composite Stackelberg Strategy for Singularly Perturbed Bilinear Quadratic Systems

Ning Bin, Chengke Zhang, Huainian Zhu and Zan Mo
Published/Copyright: April 25, 2015

Abstract

Based on singularly perturbed bilinear quadratic problems, this paper decomposes the full-order system into two subsystems, one on a slow time scale and one on a fast time scale. A fixed-point iterative algorithm is used to solve the cross-coupled algebraic Riccati equations, which yields the equilibrium strategies of the two subsystems and, from them, the composite strategy of the original full-order system. It is proved that this composite strategy forms an o(ε) (near) Stackelberg equilibrium, and a numerical example illustrates the algorithm.

1 Introduction

Dynamic game theory has been studied widely over the past decades, and the non-cooperative game theory of linear quadratic systems has received particular attention. For example, Simaan and Cruz obtained the open-loop Stackelberg strategy in non-zero sum games[1]; Basar and Olsder summarized non-cooperative game theory for linear quadratic systems[2]; Medanic developed necessary conditions for closed-loop Stackelberg strategies in linear quadratic problems and presented an algorithm for the numerical solution of two-level Stackelberg problems[3]; Mizukami and Xu investigated the linear quadratic closed-loop Stackelberg game for descriptor systems and constructed incentive strategies[4]. For singularly perturbed systems, Khalil and Kokotovic discussed the well-posedness of singularly perturbed Nash games and illustrated how the feedback information available to the players affects it[5]; Xu and Mizukami presented a unified approach to the composite approximation of the full-order linear feedback saddle-point solution[6]; Mukaidani et al. proposed a new algorithm for solving the cross-coupled algebraic Riccati equations of singularly perturbed Nash games[7]; Mukaidani further applied the algorithm to the linear quadratic infinite-horizon Nash game for general multiparameter singularly perturbed systems[8], studied the computation of linear closed-loop Stackelberg strategies with a small singular perturbation parameter[9], and investigated the linear closed-loop Stackelberg strategy of singularly perturbed stochastic systems with state-dependent noise[10].

However, game theory for singularly perturbed bilinear systems is seldom discussed, even though such systems are a natural description of many practical systems, such as the neutron level control problem in a fission reactor, DC motors and induction motor drives[11]. In financial engineering, the Black-Scholes option pricing model, Aoki's two-sector macroeconomic growth model, and Chander and Tokao's nonlinear input-output model can all be extended to singularly perturbed bilinear models[12–15].

This paper is organized as follows. In Section 2, the differential Stackelberg equilibrium problem for a singularly perturbed bilinear time-invariant system is stated. Sections 3 and 4 are concerned with the decomposition of the full-order system into two subsystems and with the composite strategy of the original full-order system. A simple numerical example is solved in Section 5. Section 6 contains the conclusion.

2 Problem Statement

Consider a time-invariant singularly perturbed bilinear system:

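$$
\begin{bmatrix}\dot{x}_1(t)\\ \varepsilon\dot{x}_2(t)\end{bmatrix}
= \begin{bmatrix}A_{11} & A_{12}\\ A_{21} & A_{22}\end{bmatrix}\begin{bmatrix}x_1(t)\\ x_2(t)\end{bmatrix}
+ \begin{bmatrix}B_{11}\\ B_{21}\end{bmatrix}u(t)
+ \begin{bmatrix}B_{12}\\ B_{22}\end{bmatrix}v(t)
+ \left\{\begin{bmatrix}x_1(t)\\ x_2(t)\end{bmatrix}\begin{bmatrix}M_s\\ M_f\end{bmatrix}\right\}u(t)
+ \left\{\begin{bmatrix}x_1(t)\\ x_2(t)\end{bmatrix}\begin{bmatrix}N_s\\ N_f\end{bmatrix}\right\}v(t)
\tag{1}
$$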

with initial condition

$$
\begin{bmatrix}x_1(0)\\ x_2(0)\end{bmatrix} = \begin{bmatrix}x_{10}\\ x_{20}\end{bmatrix}
$$

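where $x_1(t) \in \mathbb{R}^{n_1}$ and $x_2(t) \in \mathbb{R}^{n_2}$ are the slow and fast state variables, respectively; $x(t) = [x_1(t), x_2(t)]^T \in \mathbb{R}^n$ is the state vector with $n_1 + n_2 = n$; $u(t) \in \mathbb{R}^m$ and $v(t) \in \mathbb{R}^l$ are the control inputs of Player 1 and Player 2, respectively; the small singular perturbation parameter $\varepsilon > 0$ represents small time constants, inertias, masses, etc.; and $A_{11}$, $A_{12}$, $A_{21}$, $A_{22}$, $B_{11}$, $B_{12}$, $B_{21}$, $B_{22}$, $M_s$, $M_f$, $N_s$, $N_f$ are constant matrices of appropriate dimensions, with

$$
\left\{\begin{bmatrix}x_1(t)\\ x_2(t)\end{bmatrix}\begin{bmatrix}M_s\\ M_f\end{bmatrix}\right\}
= \sum_{j=1}^{n_1} x_{1j}\begin{bmatrix}M_{sj}\\ M_{fj}\end{bmatrix}
+ \sum_{j=n_1+1}^{n_1+n_2} x_{2j}\begin{bmatrix}M_{sj}\\ M_{fj}\end{bmatrix},
\qquad
\left\{\begin{bmatrix}x_1(t)\\ x_2(t)\end{bmatrix}\begin{bmatrix}N_s\\ N_f\end{bmatrix}\right\}
= \sum_{j=1}^{n_1} x_{1j}\begin{bmatrix}N_{sj}\\ N_{fj}\end{bmatrix}
+ \sum_{j=n_1+1}^{n_1+n_2} x_{2j}\begin{bmatrix}N_{sj}\\ N_{fj}\end{bmatrix}
$$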
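The bilinear term is a state-weighted sum of constant matrices, so the effective input matrix at a given state is the constant part plus that sum. The following is a minimal sketch of this evaluation in plain Python; the function name `bilinear_input_matrix` and the numerical data are hypothetical and chosen only for illustration.

```python
def bilinear_input_matrix(B, M_list, x):
    """Return B + sum_j x[j] * M_list[j] as a list-of-lists matrix.

    B      : constant input matrix (n rows, m columns)
    M_list : one n-by-m matrix M_j per state component x_j
    x      : state vector as a flat list of length n
    """
    if len(M_list) != len(x):
        raise ValueError("need one matrix M_j per state component x_j")
    rows, cols = len(B), len(B[0])
    B_tilde = [[B[r][c] for c in range(cols)] for r in range(rows)]
    for x_j, M_j in zip(x, M_list):
        for r in range(rows):
            for c in range(cols):
                B_tilde[r][c] += x_j * M_j[r][c]
    return B_tilde


# Hypothetical data: n = 2 states and one scalar input (m = 1).
B = [[0.0], [1.0]]
M_list = [
    [[1.0], [0.0]],  # M_1, weighted by x_1
    [[0.0], [1.0]],  # M_2, weighted by x_2
]
x = [0.5, -2.0]
B_tilde = bilinear_input_matrix(B, M_list, x)
```

At a frozen state the system is linear in the controls, which is what the decomposition in Section 3 relies on.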

The cost function for each player is defined by

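$$
J_i(u,v) = \frac{1}{2}\int_0^\infty \left[x^T(t)Q_i x(t) + u^T(t)R_{i1}u(t) + v^T(t)R_{i2}v(t)\right]dt \tag{2}
$$

where

$$
R_{ii} > 0,\quad R_{ij} > 0,\quad i,j = 1,2,\ i \neq j,\quad Q_i = \begin{bmatrix}Q_{i11} & Q_{i12}\\ Q_{i12}^T & Q_{i22}\end{bmatrix}
$$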

It is assumed that the decision-maker denoted by Player 1 is the leader, and Player 2 is the follower. Under the assumption that both players employ strategies u := u(x, t), v := v(x, t), a strategy set (u*, v*) is called a Stackelberg strategy if for any admissible strategy set (u, v), the following conditions hold[10].

$$
J_1(u^*, v^*) \le J_1(u, v^0(u)),\quad \forall u \in \mathbb{R}^m \tag{3}
$$

where

$$
J_2(u, v^0(u)) = \min_v J_2(u, v)
$$

and

$$
v^* = v^0(u^*)
$$
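The leader-follower ordering in (3) can be illustrated on a static toy problem, which is far simpler than the differential game treated in this paper. In the sketch below the two quadratic costs `J1` and `J2`, the search interval and the helper `argmin_convex` are all hypothetical. The follower's reaction $v^0(u)$ is available in closed form, and the leader then minimizes $J_1(u, v^0(u))$ numerically.

```python
def argmin_convex(f, lo, hi, iters=200):
    """Ternary search for the minimiser of a convex scalar function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def J1(u, v):
    """Leader's cost (hypothetical)."""
    return (u - 1.0) ** 2 + v ** 2


def J2(u, v):
    """Follower's cost (hypothetical)."""
    return v ** 2 + (v - u) ** 2


def follower_response(u):
    """v0(u) = argmin_v J2(u, v); from dJ2/dv = 4v - 2u = 0."""
    return 0.5 * u


# The leader anticipates the follower's reaction and minimises J1(u, v0(u)).
u_star = argmin_convex(lambda u: J1(u, follower_response(u)), -10.0, 10.0)
v_star = follower_response(u_star)
```

For these costs $J_1(u, v^0(u)) = (u-1)^2 + u^2/4$, so the search settles near $u^* = 0.8$ and $v^* = 0.4$.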

3 Decomposition of Slow and Fast Systems

Let

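$$
\begin{bmatrix}\tilde{B}_{11}(x)\\ \tilde{B}_{21}(x)\end{bmatrix}
= \begin{bmatrix}B_{11}\\ B_{21}\end{bmatrix} + \left\{\begin{bmatrix}x_1\\ x_2\end{bmatrix}\begin{bmatrix}M_s\\ M_f\end{bmatrix}\right\},
\qquad
\begin{bmatrix}\tilde{B}_{12}(x)\\ \tilde{B}_{22}(x)\end{bmatrix}
= \begin{bmatrix}B_{12}\\ B_{22}\end{bmatrix} + \left\{\begin{bmatrix}x_1\\ x_2\end{bmatrix}\begin{bmatrix}N_s\\ N_f\end{bmatrix}\right\},
$$

$$
\tilde{B}_{11} = \tilde{B}_{11}(x),\quad \tilde{B}_{21} = \tilde{B}_{21}(x),\quad \tilde{B}_{12} = \tilde{B}_{12}(x),\quad \tilde{B}_{22} = \tilde{B}_{22}(x) \tag{4}
$$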

then (1) can be written as:

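$$
\dot{x}_1 = A_{11}x_1 + A_{12}x_2 + \tilde{B}_{11}u + \tilde{B}_{12}v \tag{5a}
$$

$$
\varepsilon\dot{x}_2 = A_{21}x_1 + A_{22}x_2 + \tilde{B}_{21}u + \tilde{B}_{22}v \tag{5b}
$$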

Neglecting the fast modes is equivalent to assuming that they are infinitely fast, that is, letting ε = 0. Without the fast modes, system (5) reduces to

$$
\dot{x}_1 = A_{11}x_1 + A_{12}x_2 + \tilde{B}_{11}u + \tilde{B}_{12}v \tag{6a}
$$

$$
0 = A_{21}x_1 + A_{22}x_2 + \tilde{B}_{21}u + \tilde{B}_{22}v \tag{6b}
$$

Assuming that A22 is nonsingular, we have

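$$
\dot{x}_{1s} = A_0x_{1s} + \tilde{B}_{01}u_s + \tilde{B}_{02}v_s,\quad x_{1s}(0) = x_{10} \tag{7a}
$$

$$
x_{2s} = -A_{22}^{-1}\left(A_{21}x_{1s} + \tilde{B}_{21}u_s + \tilde{B}_{22}v_s\right) \tag{7b}
$$

where $A_0 = A_{11} - A_{12}A_{22}^{-1}A_{21}$, $\tilde{B}_{01} = \tilde{B}_{11} - A_{12}A_{22}^{-1}\tilde{B}_{21}$, $\tilde{B}_{02} = \tilde{B}_{12} - A_{12}A_{22}^{-1}\tilde{B}_{22}$.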
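The reduction to the slow subsystem is a Schur-complement computation on the block matrices. The sketch below evaluates $A_0$ and one reduced input matrix in plain Python for $2 \times 2$ blocks. The helper names are hypothetical, the numerical blocks are illustrative values whose signs are assumed, and `B1`, `B2` stand for $\tilde{B}_{11}$, $\tilde{B}_{21}$ evaluated at a frozen state.

```python
def matmul(X, Y):
    """Product of two list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def matsub(X, Y):
    """Entrywise difference X - Y."""
    return [[X[i][j] - Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]


def inv2(X):
    """Inverse of a 2x2 matrix; raises if it is (numerically) singular."""
    (a, b), (c, d) = X
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("A22 must be nonsingular")
    return [[d / det, -b / det], [-c / det, a / det]]


def slow_reduction(A11, A12, A21, A22, B1, B2):
    """Return (A0, B0) with A0 = A11 - A12 A22^{-1} A21, B0 = B1 - A12 A22^{-1} B2."""
    A12_A22inv = matmul(A12, inv2(A22))
    A0 = matsub(A11, matmul(A12_A22inv, A21))
    B0 = matsub(B1, matmul(A12_A22inv, B2))
    return A0, B0


# Illustrative blocks (signs are an assumption of this sketch).
A11 = [[0.0, 0.4], [0.0, 0.0]]
A12 = [[0.0, 0.0], [0.345, 0.0]]
A21 = [[0.0, -0.524], [0.0, 0.0]]
A22 = [[-0.465, 0.262], [0.0, -1.0]]
B1 = [[0.0], [0.0]]
B2 = [[0.0], [1.0]]
A0, B0 = slow_reduction(A11, A12, A21, A22, B1, B2)
```

The same function applied to $(\tilde{B}_{12}, \tilde{B}_{22})$ would give $\tilde{B}_{02}$.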

Then we can obtain the quadratic cost function for the slow subsystem

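$$
J_{is} = \frac{1}{2}\int_0^\infty \left(x_{1s}^TQ_{i0}x_{1s} + 2x_{1s}^TD_{i1}u_s + 2x_{1s}^TD_{i2}v_s + 2u_s^TD_{i3}v_s + u_s^TR_{i1s}u_s + v_s^TR_{i2s}v_s\right)dt \tag{8}
$$

where

$$
\begin{aligned}
Q_{i0} &= Q_{i11} + A_{21}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}A_{21}, &
D_{i1} &= A_{21}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}\tilde{B}_{21}, \\
D_{i2} &= A_{21}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}\tilde{B}_{22}, &
D_{i3} &= \tilde{B}_{21}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}\tilde{B}_{22}, \\
R_{i1s} &= R_{i1} + \tilde{B}_{21}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}\tilde{B}_{21}, &
R_{i2s} &= R_{i2} + \tilde{B}_{22}^TA_{22}^{-T}Q_{i22}A_{22}^{-1}\tilde{B}_{22}.
\end{aligned}
$$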

Theorem 1

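Suppose that the following cross-coupled algebraic Riccati equations have solutions $p_{1s}$ and $p_{2s}$:

$$
p_{1s}\left(A - S_{1s}p_{1s} - S_{2s}p_{2s}\right) + \left(A - S_{1s}p_{1s} - S_{2s}p_{2s}\right)^Tp_{1s} + p_{1s}S_{1s}p_{1s} + Q_1 = 0 \tag{9a}
$$

$$
p_{2s}\left(A - S_{1s}p_{1s} - S_{2s}p_{2s}\right) + \left(A - S_{1s}p_{1s} - S_{2s}p_{2s}\right)^Tp_{2s} + p_{2s}S_{2s}p_{2s} + Q_2 = 0 \tag{9b}
$$

where

$$
A = A_0 + \tilde{B}_{01}T_{11} + \tilde{B}_{02}T_{21},\quad
S_{1s} = \tfrac{1}{2}\tilde{B}_{02}T_{22} - \tilde{B}_{01}T_{12},\quad
S_{2s} = \tfrac{1}{2}\tilde{B}_{01}T_{13} - \tilde{B}_{02}T_{23},
$$

$$
Q_1 = Q_{10} + D_{11}T_{11} + D_{12}T_{21},\quad Q_2 = Q_{20} + D_{21}T_{11} + D_{22}T_{21}.
$$

Then the Stackelberg equilibrium solution $(u_s, v_s)$ of the slow subsystem is given by

$$
u_s = \left(T_{11} + T_{12}p_{1s} - T_{13}p_{2s}\right)x_{1s} \tag{10a}
$$

$$
v_s = \left(T_{21} - T_{22}p_{1s} + T_{23}p_{2s}\right)x_{1s} \tag{10b}
$$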

Proof

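The Hamiltonian $H_{is}$ corresponding to the system (7) and the performance index (8) is

$$
H_{is} = \frac{1}{2}\left(x_{1s}^TQ_{i0}x_{1s} + 2x_{1s}^TD_{i1}u_s + 2x_{1s}^TD_{i2}v_s + 2u_s^TD_{i3}v_s + u_s^TR_{i1s}u_s + v_s^TR_{i2s}v_s\right) + \lambda_i^T\left(A_0x_{1s} + \tilde{B}_{01}u_s + \tilde{B}_{02}v_s\right) \tag{11}
$$

where $\lambda_i \in \mathbb{R}^{n_1 \times 1}$ is the Lagrangian multiplier.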

Given an arbitrary $u_s$, the corresponding $v_s$ is obtained by minimizing $J_{2s}$ with respect to $v_s$. The follower's optimal control is then given by

$$
v_s = -R_{22s}^{-1}\left(D_{22}^Tx_{1s} + D_{23}^Tu_s + \tilde{B}_{02}^T\lambda_2\right)
$$

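Then the cost $J_{1s}$ can be obtained, and we can further obtain

$$
\begin{aligned}
u_s &= -\left(R_{11s} + 2D_{13}R_{22s}^{-1}D_{23}^T - D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{23}^T\right)^{-1}
\Big[\left(D_{11}^T - D_{23}R_{22s}^{-T}D_{12}^T - D_{13}R_{22s}^{-1}D_{22}^T + D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{22}^T\right)x_{1s} \\
&\qquad + \left(\tilde{B}_{01}^T - D_{23}R_{22s}^{-T}\tilde{B}_{02}^T\right)\lambda_1
+ \left(D_{23}R_{22s}^{-T}R_{12s} - D_{13}\right)R_{22s}^{-1}\tilde{B}_{02}^T\lambda_2\Big] \\
&= T_{11}x_{1s} + T_{12}\lambda_1 - T_{13}\lambda_2
\end{aligned}
\tag{12a}
$$

then

$$
\begin{aligned}
v_s &= -R_{22s}^{-1}\left(D_{22}^Tx_{1s} + D_{23}^Tu_s + \tilde{B}_{02}^T\lambda_2\right) \\
&= -R_{22s}^{-1}D_{22}^Tx_{1s} - R_{22s}^{-1}D_{23}^Tu_s - R_{22s}^{-1}\tilde{B}_{02}^T\lambda_2 \\
&= -R_{22s}^{-1}D_{22}^Tx_{1s} - R_{22s}^{-1}D_{23}^T\left(T_{11}x_{1s} + T_{12}\lambda_1 - T_{13}\lambda_2\right) - R_{22s}^{-1}\tilde{B}_{02}^T\lambda_2 \\
&= \left(-R_{22s}^{-1}D_{22}^T - R_{22s}^{-1}D_{23}^TT_{11}\right)x_{1s} - R_{22s}^{-1}D_{23}^TT_{12}\lambda_1 + \left(R_{22s}^{-1}D_{23}^TT_{13} - R_{22s}^{-1}\tilde{B}_{02}^T\right)\lambda_2 \\
&= T_{21}x_{1s} - T_{22}\lambda_1 + T_{23}\lambda_2
\end{aligned}
\tag{12b}
$$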

where

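$$
\begin{aligned}
T_{11} &= -\left(R_{11s} + 2D_{13}R_{22s}^{-1}D_{23}^T - D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{23}^T\right)^{-1}\left(D_{11}^T - D_{23}R_{22s}^{-T}D_{12}^T - D_{13}R_{22s}^{-1}D_{22}^T + D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{22}^T\right) \\
T_{12} &= -\left(R_{11s} + 2D_{13}R_{22s}^{-1}D_{23}^T - D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{23}^T\right)^{-1}\left(\tilde{B}_{01}^T - D_{23}R_{22s}^{-T}\tilde{B}_{02}^T\right) \\
T_{13} &= -\left(R_{11s} + 2D_{13}R_{22s}^{-1}D_{23}^T - D_{23}R_{22s}^{-T}R_{12s}R_{22s}^{-1}D_{23}^T\right)^{-1}\left(D_{13} - D_{23}R_{22s}^{-T}R_{12s}\right)R_{22s}^{-1}\tilde{B}_{02}^T \\
T_{21} &= -R_{22s}^{-1}D_{22}^T - R_{22s}^{-1}D_{23}^TT_{11} \\
T_{22} &= R_{22s}^{-1}D_{23}^TT_{12} \\
T_{23} &= R_{22s}^{-1}D_{23}^TT_{13} - R_{22s}^{-1}\tilde{B}_{02}^T
\end{aligned}
$$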

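With the costate equations $-\dot{\lambda}_1 = Q_{10}x_{1s} + D_{11}u_s + D_{12}v_s + A_0^T\lambda_1$ and $-\dot{\lambda}_2 = Q_{20}x_{1s} + D_{21}u_s + D_{22}v_s + A_0^T\lambda_2$, letting $\lambda_1 = p_{1s}x_{1s}$ and $\lambda_2 = p_{2s}x_{1s}$ yields (9a) and (9b), respectively. This is the desired result. □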

In [8], Mukaidani proposed a fixed-point iterative algorithm for solving cross-coupled algebraic Riccati equations (9).

Assumption 1

The triplets $(A_0, \tilde{B}_{01}, Q_1)$ and $(A_0, \tilde{B}_{02}, Q_2)$ are stabilizable and detectable.

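Under Assumption 1, positive semidefinite solutions of the cross-coupled algebraic Riccati equations (9) exist. They can be obtained by the fixed-point iteration

$$
p_{1s}^{(n+1)}\left(A - S_{1s}p_{1s}^{(n)} - S_{2s}p_{2s}^{(n)}\right) + \left(A - S_{1s}p_{1s}^{(n)} - S_{2s}p_{2s}^{(n)}\right)^Tp_{1s}^{(n+1)} + Q_1 + p_{1s}^{(n)T}S_{1s}p_{1s}^{(n)} = 0 \tag{13a}
$$

$$
p_{2s}^{(n+1)}\left(A - S_{1s}p_{1s}^{(n)} - S_{2s}p_{2s}^{(n)}\right) + \left(A - S_{1s}p_{1s}^{(n)} - S_{2s}p_{2s}^{(n)}\right)^Tp_{2s}^{(n+1)} + Q_2 + p_{2s}^{(n)T}S_{2s}p_{2s}^{(n)} = 0 \tag{13b}
$$

for $n = 0, 1, 2, \cdots$,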

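where $p_{1s}^{(0)}$ and $p_{2s}^{(0)}$ are the solutions of the following algebraic Riccati equations:

$$
p_{1s}^{(0)}A + A^Tp_{1s}^{(0)} + Q_1 - p_{1s}^{(0)T}S_{1s}p_{1s}^{(0)} = 0 \tag{14a}
$$

$$
p_{2s}^{(0)}\left(A - S_{1s}p_{1s}^{(0)}\right) + \left(A - S_{1s}p_{1s}^{(0)}\right)^Tp_{2s}^{(0)} + Q_2 - p_{2s}^{(0)T}S_{2s}p_{2s}^{(0)} = 0 \tag{14b}
$$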

The proof can be found in [8].
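To make the structure of the iteration concrete, the following is a minimal scalar sketch of (13) and (14), in which every matrix is replaced by a real number so that each Lyapunov-type step has a closed-form solution. The function name, the tolerance and the data $a = -1$, $s_1 = s_2 = q_1 = q_2 = 1$ are hypothetical. The matrix case would need a Lyapunov solver at each step and is not covered here.

```python
import math


def solve_ccare_scalar(a, s1, s2, q1, q2, tol=1e-12, max_iter=200):
    """Scalar fixed-point iteration for the coupled equations
         2*p1*(a - s1*p1 - s2*p2) + s1*p1**2 + q1 = 0
         2*p2*(a - s1*p1 - s2*p2) + s2*p2**2 + q2 = 0
    Returns (p1, p2, iterations)."""
    # Initial guesses: scalar versions of (14a) and (14b), positive roots.
    p1 = (a + math.sqrt(a * a + s1 * q1)) / s1
    a_c = a - s1 * p1
    p2 = (a_c + math.sqrt(a_c * a_c + s2 * q2)) / s2

    for n in range(max_iter):
        a_cl = a - s1 * p1 - s2 * p2  # closed-loop coefficient at step n
        if a_cl >= 0.0:
            raise RuntimeError("closed loop is not stable; step undefined")
        # Scalar versions of (13a), (13b): 2*p_new*a_cl + q + s*p_old**2 = 0
        p1_new = -(q1 + s1 * p1 * p1) / (2.0 * a_cl)
        p2_new = -(q2 + s2 * p2 * p2) / (2.0 * a_cl)
        converged = max(abs(p1_new - p1), abs(p2_new - p2)) < tol
        p1, p2 = p1_new, p2_new
        if converged:
            return p1, p2, n + 1
    raise RuntimeError("fixed-point iteration did not converge")


p1, p2, n_iter = solve_ccare_scalar(a=-1.0, s1=1.0, s2=1.0, q1=1.0, q2=1.0)
residual_1 = 2.0 * p1 * (-1.0 - p1 - p2) + p1 * p1 + 1.0
residual_2 = 2.0 * p2 * (-1.0 - p1 - p2) + p2 * p2 + 1.0
```

For this symmetric data both equations reduce to $3p^2 + 2p - 1 = 0$, whose positive root is $p = 1/3$, so the iterates can be checked by hand.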

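In the fast subsystem, we assume that the slow variables are constant in the boundary layer. Redefining the fast variables as $x_{2f} = x_2 - x_{2s}$ and the fast controls as $u_f = u - u_s$, $v_f = v - v_s$, the fast subsystem is formulated as

$$
\dot{x}_{2f} = \frac{1}{\varepsilon}A_{22}x_{2f} + \frac{1}{\varepsilon}\tilde{B}_{21}u_f + \frac{1}{\varepsilon}\tilde{B}_{22}v_f,\quad x_{2f}(0) = x_{20} - x_{2s}(0) \tag{15}
$$

Then we can obtain the quadratic cost function for the fast subsystem

$$
J_{if} = \frac{1}{2}\int_0^\infty \left(x_{2f}^TQ_{i22}x_{2f} + u_f^TR_{i1}u_f + v_f^TR_{i2}v_f\right)dt \tag{16}
$$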

Assumption 2

The triplets $(A_{22}, \tilde{B}_{21}, Q_{122})$ and $(A_{22}, \tilde{B}_{22}, Q_{222})$ are stabilizable and detectable.

Theorem 2

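Under Assumption 2, suppose that the following cross-coupled algebraic Riccati equations have solutions $p_{1f}$ and $p_{2f}$:

$$
p_{1f}\left(A_{22} - S_{1f}p_{1f} - S_{2f}p_{2f}\right) + \left(A_{22} - S_{1f}p_{1f} - S_{2f}p_{2f}\right)^Tp_{1f} + p_{1f}S_{1f}p_{1f} + Q_{122} = 0 \tag{17a}
$$

$$
p_{2f}\left(A_{22} - S_{1f}p_{1f} - S_{2f}p_{2f}\right) + \left(A_{22} - S_{1f}p_{1f} - S_{2f}p_{2f}\right)^Tp_{2f} + p_{2f}S_{2f}p_{2f} + Q_{222} = 0 \tag{17b}
$$

where $S_{1f} = \tfrac{1}{2}\tilde{B}_{21}R_{11}^{-1}\tilde{B}_{21}^T$ and $S_{2f} = \tfrac{1}{2}\tilde{B}_{22}R_{22}^{-1}\tilde{B}_{22}^T$.

Then the Stackelberg equilibrium solution $(u_f, v_f)$ of the fast subsystem is given by

$$
u_f = -R_{11}^{-1}\tilde{B}_{21}^Tp_{1f}x_{2f} \tag{18a}
$$

$$
v_f = -R_{22}^{-1}\tilde{B}_{22}^Tp_{2f}x_{2f} \tag{18b}
$$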

Proof

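As in the proof of Theorem 1, the follower's strategy of the fast subsystem is

$$
v_f = -R_{22}^{-1}\tilde{B}_{22}^Tp_{2f}x_{2f}
$$

Then, with $\lambda_{2f} = p_{2f}x_{2f}$, the leader's Hamiltonian is

$$
\begin{aligned}
H_{1f} &= \frac{1}{2}\left(x_{2f}^TQ_{122}x_{2f} + u_f^TR_{11}u_f + v_f^TR_{12}v_f\right) + \lambda_{1f}^T\left(A_{22}x_{2f} + \tilde{B}_{21}u_f + \tilde{B}_{22}v_f\right) \\
&= \frac{1}{2}\left(x_{2f}^TQ_{122}x_{2f} + u_f^TR_{11}u_f\right) + \lambda_{1f}^T\left(A_{22}x_{2f} + \tilde{B}_{21}u_f\right) + \frac{1}{2}\lambda_{2f}^T\tilde{B}_{22}R_{22}^{-T}R_{12}R_{22}^{-1}\tilde{B}_{22}^T\lambda_{2f} - \lambda_{1f}^T\tilde{B}_{22}R_{22}^{-1}\tilde{B}_{22}^T\lambda_{2f}
\end{aligned}
$$

where $\lambda_{if} \in \mathbb{R}^{n_2 \times 1}$ is the Lagrangian multiplier. Then

$$
u_f = -R_{11}^{-1}\tilde{B}_{21}^T\lambda_{1f} = -R_{11}^{-1}\tilde{B}_{21}^Tp_{1f}x_{2f}
$$

where $p_{1f}$, $p_{2f}$ satisfy the cross-coupled algebraic Riccati equations (17). □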

Similarly, under Assumption 2, the positive semidefinite solutions of cross-coupled algebraic Riccati equations (17) exist, and can be obtained by performing the fixed-point algorithm:

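$$
p_{1f}^{(n+1)}\left(A_{22} - S_{1f}p_{1f}^{(n)} - S_{2f}p_{2f}^{(n)}\right) + \left(A_{22} - S_{1f}p_{1f}^{(n)} - S_{2f}p_{2f}^{(n)}\right)^Tp_{1f}^{(n+1)} + Q_{122} + p_{1f}^{(n)T}S_{1f}p_{1f}^{(n)} = 0 \tag{19a}
$$

$$
p_{2f}^{(n+1)}\left(A_{22} - S_{1f}p_{1f}^{(n)} - S_{2f}p_{2f}^{(n)}\right) + \left(A_{22} - S_{1f}p_{1f}^{(n)} - S_{2f}p_{2f}^{(n)}\right)^Tp_{2f}^{(n+1)} + Q_{222} + p_{2f}^{(n)T}S_{2f}p_{2f}^{(n)} = 0 \tag{19b}
$$

for $n = 0, 1, 2, \cdots$,

where $p_{1f}^{(0)}$ and $p_{2f}^{(0)}$ are the solutions of the following algebraic Riccati equations:

$$
p_{1f}^{(0)}A_{22} + A_{22}^Tp_{1f}^{(0)} + Q_{122} - p_{1f}^{(0)T}S_{1f}p_{1f}^{(0)} = 0 \tag{20a}
$$

$$
p_{2f}^{(0)}\left(A_{22} - S_{1f}p_{1f}^{(0)}\right) + \left(A_{22} - S_{1f}p_{1f}^{(0)}\right)^Tp_{2f}^{(0)} + Q_{222} - p_{2f}^{(0)T}S_{2f}p_{2f}^{(0)} = 0 \tag{20b}
$$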

4 Composite Strategy

The composite Stackelberg strategy pair of the full-order singularly perturbed system (1) is constructed as follows[16]:

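$$
u_c = u_s + u_f = \left(T_{11} + T_{12}p_{1s} - T_{13}p_{2s}\right)x_{1s} - R_{11}^{-1}\tilde{B}_{21}^Tp_{1f}x_{2f} \tag{21a}
$$

$$
v_c = v_s + v_f = \left(T_{21} - T_{22}p_{1s} + T_{23}p_{2s}\right)x_{1s} - R_{22}^{-1}\tilde{B}_{22}^Tp_{2f}x_{2f} \tag{21b}
$$

Replacing $x_{1s}$ by $x_1$ and $x_{2s} + x_{2f}$ by $x_2$, and using $x_{2s} = -A_{22}^{-1}\left(A_{21}x_{1s} + \tilde{B}_{21}u_s + \tilde{B}_{22}v_s\right)$, we obtain

$$
u_c = G_1x_1 + G_2x_2 \tag{22a}
$$

$$
v_c = G_3x_1 + G_4x_2 \tag{22b}
$$

where

$$
\begin{aligned}
G_1 &= T_{11} + T_{12}p_{1s} - T_{13}p_{2s} - R_{11}^{-1}\tilde{B}_{21}^Tp_{1f}A_{22}^{-1}\left[A_{21} + \tilde{B}_{21}\left(T_{11} + T_{12}p_{1s} - T_{13}p_{2s}\right) + \tilde{B}_{22}\left(T_{21} - T_{22}p_{1s} + T_{23}p_{2s}\right)\right] \\
G_2 &= -R_{11}^{-1}\tilde{B}_{21}^Tp_{1f} \\
G_3 &= T_{21} - T_{22}p_{1s} + T_{23}p_{2s} - R_{22}^{-1}\tilde{B}_{22}^Tp_{2f}A_{22}^{-1}\left[A_{21} + \tilde{B}_{21}\left(T_{11} + T_{12}p_{1s} - T_{13}p_{2s}\right) + \tilde{B}_{22}\left(T_{21} - T_{22}p_{1s} + T_{23}p_{2s}\right)\right] \\
G_4 &= -R_{22}^{-1}\tilde{B}_{22}^Tp_{2f}
\end{aligned}
$$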
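The gains $G_1, \dots, G_4$ follow from eliminating $x_{2s}$ and $x_{2f}$ in favour of $x_1$ and $x_2$. The scalar sketch below mirrors that substitution; the function name and all numbers are hypothetical. The arguments `k_us`, `k_vs` stand for the slow gains in (10), and `k_uf`, `k_vf` for the fast gains in (18), which already carry their minus signs.

```python
def composite_gains(a21, a22, b21, b22, k_us, k_vs, k_uf, k_vf):
    """Scalar analogue of (22): return (G1, G2, G3, G4).

    Slow strategies: u_s = k_us * x1, v_s = k_vs * x1.
    Fast strategies: u_f = k_uf * x2f, v_f = k_vf * x2f, with x2f = x2 - x2s.
    """
    # Quasi-steady state x2s = x2s_gain * x1, cf. (7b).
    x2s_gain = -(a21 + b21 * k_us + b22 * k_vs) / a22
    G1 = k_us - k_uf * x2s_gain
    G2 = k_uf
    G3 = k_vs - k_vf * x2s_gain
    G4 = k_vf
    return G1, G2, G3, G4


# Hypothetical scalar data.
params = dict(a21=1.0, a22=-2.0, b21=1.0, b22=0.5,
              k_us=-0.3, k_vs=-0.2, k_uf=-0.4, k_vf=-0.1)
G1, G2, G3, G4 = composite_gains(**params)

# Cross-check at one state: slow-plus-fast form versus the composite form.
x1, x2 = 0.7, -1.3
x2s = -(params["a21"] + params["b21"] * params["k_us"]
        + params["b22"] * params["k_vs"]) / params["a22"] * x1
u_split = params["k_us"] * x1 + params["k_uf"] * (x2 - x2s)
u_composite = G1 * x1 + G2 * x2
```

Both expressions for the leader's control agree, which is the content of (22a) in this scalar setting.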

Theorem 3

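The composite strategy pair constitutes an o(ε) (near) Stackelberg equilibrium of the full-order game, that is,

$$
x_1(t) = x_{1s}(t) + o(\varepsilon) \tag{23a}
$$

$$
x_2(t) = -A_{22}^{-1}\left(A_{21} + G_0\right)x_{1s}(t) + x_{2f}(t) + o(\varepsilon) \tag{23b}
$$

$$
u(t) = u_c(t) + o(\varepsilon) \tag{23c}
$$

$$
v(t) = v_c(t) + o(\varepsilon) \tag{23d}
$$

where, comparing with (7b) and (10), $G_0 = \tilde{B}_{21}\left(T_{11} + T_{12}p_{1s} - T_{13}p_{2s}\right) + \tilde{B}_{22}\left(T_{21} - T_{22}p_{1s} + T_{23}p_{2s}\right)$.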

Proof

The feedback system (5) can be written as

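$$
\begin{bmatrix}\dot{x}_1\\ \varepsilon\dot{x}_2\end{bmatrix}
= \begin{bmatrix}
A_{11} + \tilde{B}_{11}G_1 + \tilde{B}_{12}G_3 & A_{12} + \tilde{B}_{11}G_2 + \tilde{B}_{12}G_4 \\
A_{21} + \tilde{B}_{21}G_1 + \tilde{B}_{22}G_3 & A_{22} + \tilde{B}_{21}G_2 + \tilde{B}_{22}G_4
\end{bmatrix}
\begin{bmatrix}x_1\\ x_2\end{bmatrix}
\tag{24}
$$

Introducing the Chang transformation and its inverse

$$
T = \begin{bmatrix}I_1 - \varepsilon HL & -\varepsilon H\\ L & I_2\end{bmatrix},\qquad
T^{-1} = \begin{bmatrix}I_1 & \varepsilon H\\ -L & I_2 - \varepsilon LH\end{bmatrix}
\tag{25}
$$

while the transformation equations are given by

$$
\begin{aligned}
\varepsilon LA_{11} + A_{21} - \left(\varepsilon LA_{12} + A_{22}\right)L &= 0 \\
A_{12} + \varepsilon\left(A_{11} - A_{12}L\right)H - H\left(\varepsilon LA_{12} + A_{22}\right) &= 0
\end{aligned}
\tag{26}
$$

we get

$$
TST^{-1} = \begin{bmatrix}S_1 & 0\\ 0 & S_2\end{bmatrix} \tag{27}
$$

where $S$ is the system matrix of (24),

$$
\begin{aligned}
S_1 &= \left(A_{11} + \tilde{B}_{11}G_1 + \tilde{B}_{12}G_3\right) - \left(A_{12} + \tilde{B}_{11}G_2 + \tilde{B}_{12}G_4\right)L - \varepsilon H\left(A_{21} + \tilde{B}_{21}G_1 + \tilde{B}_{22}G_3\right) \\
&\quad - \varepsilon HL\left(A_{11} + \tilde{B}_{11}G_1 + \tilde{B}_{12}G_3\right) + \varepsilon HL\left(A_{12} + \tilde{B}_{11}G_2 + \tilde{B}_{12}G_4\right)L + \varepsilon H\left(A_{22} + \tilde{B}_{21}G_2 + \tilde{B}_{22}G_4\right)L \\
S_2 &= \left(A_{22} + \tilde{B}_{21}G_2 + \tilde{B}_{22}G_4\right) + L\left(A_{12} + \tilde{B}_{11}G_2 + \tilde{B}_{12}G_4\right) + L\left(A_{11} + \tilde{B}_{11}G_1 + \tilde{B}_{12}G_3\right)\varepsilon H \\
&\quad + \left(A_{21} + \tilde{B}_{21}G_1 + \tilde{B}_{22}G_3\right)\varepsilon H - L\left(A_{12} + \tilde{B}_{11}G_2 + \tilde{B}_{12}G_4\right)\varepsilon LH - \left(A_{22} + \tilde{B}_{21}G_2 + \tilde{B}_{22}G_4\right)\varepsilon LH
\end{aligned}
$$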

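If $\left(A_{22} + \tilde{B}_{21}G_2 + \tilde{B}_{22}G_4\right) + L\left(A_{12} + \tilde{B}_{11}G_2 + \tilde{B}_{12}G_4\right)$ is stable, the solution of (24) is approximated for all finite $t \ge 0$ by

$$
x_1(t) = \exp\left[\left(A_{11} + \tilde{B}_{11}G_1 + \tilde{B}_{12}G_3 - A_{12}L - \tilde{B}_{11}G_2L - \tilde{B}_{12}G_4L\right)t\right]x_{1s}(0) + o(\varepsilon) \tag{28a}
$$

$$
\begin{aligned}
x_2(t) &= -A_{22}^{-1}\left(A_{21} + G_0\right)\exp\left[\left(A_{11} + \tilde{B}_{11}G_1 + \tilde{B}_{12}G_3 - A_{12}L - \tilde{B}_{11}G_2L - \tilde{B}_{12}G_4L\right)t\right]x_{1s}(0) \\
&\quad + \exp\left[\left(A_{22} + \tilde{B}_{21}G_2 + \tilde{B}_{22}G_4 + LA_{12} + L\tilde{B}_{11}G_2 + L\tilde{B}_{12}G_4\right)t/\varepsilon\right]x_{2f}(0) + o(\varepsilon)
\end{aligned}
\tag{28b}
$$

where $x_{1s}(0)$ and $x_{2f}(0)$ are given by (7a) and (15). If in addition $\left(A_{11} + \tilde{B}_{11}G_1 + \tilde{B}_{12}G_3\right) - \left(A_{12} + \tilde{B}_{11}G_2 + \tilde{B}_{12}G_4\right)L$ is also stable, (28) holds for all $t \in [0, \infty)$. Then (23) follows directly from (28), (7) and (15). □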
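The decoupling role of $L$ and $H$ can be checked numerically in the scalar case. The sketch below solves the scalar versions of the two transformation equations and verifies that the Chang transformation block-diagonalizes the state matrix. The function names and the data are hypothetical, and the state matrix is written for $[\dot{x}_1, \dot{x}_2]$, that is, with the second row divided by ε, which is an assumption of this sketch about how the transformation is applied.

```python
def chang_factors(a11, a12, a21, a22, eps, iters=50):
    """Solve the scalar transformation equations for L and H.

    L: eps*L*a11 + a21 - (eps*L*a12 + a22)*L = 0   (fixed-point iteration)
    H: a12 + eps*(a11 - a12*L)*H - H*(eps*L*a12 + a22) = 0   (linear in H)
    """
    L = a21 / a22  # solution at eps = 0, used as the starting point
    for _ in range(iters):
        L = (a21 + eps * L * a11 - eps * a12 * L * L) / a22
    H = a12 / (a22 + eps * L * a12 - eps * (a11 - a12 * L))
    return L, H


def matmul2(X, Y):
    """Product of two 2x2 list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical scalar data.
a11, a12, a21, a22, eps = -1.0, 1.0, 1.0, -2.0, 0.1
L, H = chang_factors(a11, a12, a21, a22, eps)

T = [[1.0 - eps * H * L, -eps * H], [L, 1.0]]
T_inv = [[1.0, eps * H], [-L, 1.0 - eps * L * H]]
M = [[a11, a12], [a21 / eps, a22 / eps]]  # state matrix of d/dt [x1, x2]
D = matmul2(matmul2(T, M), T_inv)          # expected to be diagonal
```

The diagonal entries of `D` are the slow coefficient `a11 - a12*L` and the fast coefficient `(a22 + eps*L*a12)/eps`, whose stability is what the two conditions in the proof refer to.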

5 A Numerical Example

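To demonstrate the proposed decomposition method, we solve a simple numerical example. All matrices, chosen randomly, are given by

$$
A_{11} = \begin{bmatrix}0 & 0.4\\ 0 & 0\end{bmatrix},\quad
A_{12} = \begin{bmatrix}0 & 0\\ 0.345 & 0\end{bmatrix},\quad
A_{21} = \begin{bmatrix}0 & -0.524\\ 0 & 0\end{bmatrix},\quad
A_{22} = \begin{bmatrix}-0.465 & 0.262\\ 0 & -1\end{bmatrix},
$$

$$
B_{11} = \begin{bmatrix}0\\ 0\end{bmatrix},\quad
B_{12} = \begin{bmatrix}0\\ 0\end{bmatrix},\quad
B_{21} = \begin{bmatrix}0\\ 1\end{bmatrix},\quad
B_{22} = \begin{bmatrix}0.2\\ 1\end{bmatrix},
$$

$$
M_1 = N_1 = \begin{bmatrix}1 & 0 & 0 & 0\end{bmatrix}^T,\quad
M_2 = N_2 = \begin{bmatrix}0 & 1 & 0 & 0\end{bmatrix}^T,\quad
M_3 = N_3 = \begin{bmatrix}0 & 0 & 1 & 0\end{bmatrix}^T,\quad
M_4 = N_4 = \begin{bmatrix}0 & 0 & 0 & 1\end{bmatrix}^T
$$

and a quadratic cost function

$$
J_1(u,v) = \frac{1}{2}\int_0^\infty \left(x^TQ_1x + u^2 + 2v^2\right)dt,\qquad
J_2(u,v) = \frac{1}{2}\int_0^\infty \left(x^TQ_2x + 2u^2 + v^2\right)dt
$$

where

$$
Q_1 = \mathrm{diag}\{1, 0, 1, 0\},\quad Q_2 = \mathrm{diag}\{1, 0, 1, 0\},\quad x_{10} = x_{20} = \begin{bmatrix}1 & 1\end{bmatrix}^T.
$$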

The simulation result is presented in Figure 1.
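The simulation behind Figure 1 is not reproduced here. As a rough, independent illustration of the two-time-scale approximation in (23a) and (23b), the sketch below integrates a scalar linear singularly perturbed system without controls and compares it with the sum of its slow and fast approximations. The system data, the step size and the function names are hypothetical, and the classical fourth-order Runge-Kutta scheme is a choice of this sketch.

```python
import math


def rk4_full(a11, a12, a21, a22, eps, x10, x20, t_end, dt):
    """Integrate x1' = a11*x1 + a12*x2, eps*x2' = a21*x1 + a22*x2 with RK4."""
    def f(x1, x2):
        return a11 * x1 + a12 * x2, (a21 * x1 + a22 * x2) / eps

    x1, x2 = x10, x20
    traj = [(0.0, x1, x2)]
    for k in range(int(round(t_end / dt))):
        k1 = f(x1, x2)
        k2 = f(x1 + 0.5 * dt * k1[0], x2 + 0.5 * dt * k1[1])
        k3 = f(x1 + 0.5 * dt * k2[0], x2 + 0.5 * dt * k2[1])
        k4 = f(x1 + dt * k3[0], x2 + dt * k3[1])
        x1 += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        x2 += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        traj.append(((k + 1) * dt, x1, x2))
    return traj


def composite_approx(a11, a12, a21, a22, eps, x10, x20, t):
    """Slow-plus-fast approximation of (x1(t), x2(t))."""
    a0 = a11 - a12 * a21 / a22                 # reduced slow coefficient
    x1s = x10 * math.exp(a0 * t)               # slow state
    x2s = -(a21 / a22) * x1s                   # quasi-steady state
    x2f = (x20 + (a21 / a22) * x10) * math.exp(a22 * t / eps)  # boundary layer
    return x1s, x2s + x2f


def max_error(eps, a11=-1.0, a12=1.0, a21=1.0, a22=-2.0, x10=1.0, x20=1.0):
    """Largest deviation between the full and the approximate trajectories on [0, 2]."""
    err = 0.0
    for t, x1, x2 in rk4_full(a11, a12, a21, a22, eps, x10, x20, 2.0, 1e-3):
        x1a, x2a = composite_approx(a11, a12, a21, a22, eps, x10, x20, t)
        err = max(err, abs(x1 - x1a), abs(x2 - x2a))
    return err


err_large = max_error(0.1)
err_small = max_error(0.01)
```

In this toy setting the approximation error shrinks as ε decreases, which is the qualitative behaviour that Theorem 3 asserts for the closed-loop system.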

Figure 1 Simulation curves of the composite Stackelberg strategy $(u_c, v_c)$

6 Conclusions

Many real systems have the structure of singularly perturbed bilinear control systems, for example motor drives, robust control, multi-sector input-output analysis and option pricing. In this paper, we have studied Stackelberg games for singularly perturbed bilinear systems. We decompose the full-order system into two subsystems, one on a slow time scale and one on a fast time scale. A fixed-point iterative algorithm solves the cross-coupled algebraic Riccati equations, which yields the equilibrium strategies of the two subsystems and, from them, the composite strategy of the original full-order system. It has been proved that this composite strategy forms an o(ε) (near) Stackelberg equilibrium, and a numerical example illustrates the algorithm. The results could be applied to practical problems in industrial engineering and financial engineering.


Supported by the National Natural Science Fund of China (71171061), the 2014 Natural Science Fund of Guangdong Province (Non-cooperative Game Theory of Singularly Perturbed Markov System), the Philosophy and Social Science “Twelfth Five-Year” Plan Project of Guangdong Province (GD14YGL01), and the 2014 Guangzhou Philosophy and Social Science Project (14Q21).


References

[1] Simaan M, Cruz Jr J B. On the Stackelberg strategy in non-zero sum games. Journal of Optimization Theory and Applications, 1973, 11(5): 533–555. doi:10.1007/BF00935665

[2] Basar T, Olsder G T. Dynamic non-cooperative game theory. Academic Press, New York, 1991.

[3] Medanic J. Closed-loop Stackelberg strategies in linear-quadratic problems. IEEE Transactions on Automatic Control, 1978, 23(4): 632–637. doi:10.1109/TAC.1978.1101788

[4] Mizukami K, Xu H. Closed-loop Stackelberg strategies for linear-quadratic descriptor systems. Journal of Optimization Theory and Applications, 1992, 74: 151–170. doi:10.1007/BF00939897

[5] Khalil H K, Kokotovic P V. Feedback and well-posedness of singularly perturbed Nash games. IEEE Transactions on Automatic Control, 1979, 24(5): 699–708. doi:10.1109/CDC.1978.268107

[6] Xu H, Mizukami K. Infinite-horizon differential games of singularly perturbed systems: A unified approach. Automatica, 1997, 33(2): 273–276. doi:10.1016/S0005-1098(96)00173-2

[7] Mukaidani H, Xu H, Mizukami K. A new algorithm for solving cross-coupled algebraic Riccati equations of singularly perturbed Nash games. Proceedings of the 39th IEEE Conference on Decision and Control, Sydney, Australia, 2000: 3648–3653.

[8] Mukaidani H. A computational efficient numerical algorithm for solving cross-coupled algebraic Riccati equation and its application to multimodeling systems. Proceedings of the 2006 American Control Conference, Minneapolis, Minnesota, USA, 2006: 725–730. doi:10.1109/ACC.2006.1655442

[9] Mukaidani H. Efficient numerical procedures for solving closed-loop Stackelberg strategies with small singular perturbation parameter. Applied Mathematics and Computation, 2007, 188: 1173–1183. doi:10.1016/j.amc.2006.10.068

[10] Mukaidani H, Unno M, Yamamoto T. Stackelberg strategies for singularly perturbed stochastic systems. Proceedings of the 2013 European Control Conference, Zurich, Switzerland, 2013: 730–735. doi:10.23919/ECC.2013.6669247

[11] Aganovic Z, Gajic Z. Linear optimal control of bilinear systems with applications to singularly perturbations and weak coupling. Springer-Verlag, London, 1995. doi:10.1007/3-540-19976-4

[12] Mceneaney W. A robust control framework for option pricing. Mathematics of Operations Research, 1997, 22(1): 203–221. doi:10.1287/moor.22.1.202

[13] Aoki M. Some examples of dynamic bilinear models in economics. Lecture Notes in Economics and Mathematical Systems, Springer-Verlag, Berlin Heidelberg, 1975: 163–169. doi:10.1007/978-3-642-47457-6_9

[14] Chander P. The nonlinear input output model. Journal of Economic Theory, 1983, 30: 219–229. doi:10.1016/0022-0531(83)90105-9

[15] Tokao F. Nonlinear Leontief model in abstract spaces. Journal of Mathematical Economics, 1986, 15: 151–156. doi:10.1016/0304-4068(86)90006-6

[16] Kim B S, Lim M T. Composite control for singularly perturbed bilinear systems via successive Galerkin approximation. IEEE Proceedings of Control Theory, 2003, 150(5): 483–488. doi:10.1049/ip-cta:20030814

Received: 2014-4-22
Accepted: 2014-9-1
Published Online: 2015-4-25

© 2015 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 4.11.2025 from https://www.degruyterbrill.com/document/doi/10.1515/JSSI-2015-0154/html