
Smoothing Approximation to the Square-Root Exact Penalty Function

Yaqiong Duan and Shujun Lian
Published/Copyright: February 25, 2016

Abstract

In this paper, a smoothing approximation to the square-root exact penalty function is devised for inequality constrained optimization. It is shown that an approximately optimal solution of the smoothed penalty problem is an approximately optimal solution of the original problem. An algorithm based on the new smoothed penalty function is proposed and shown to be convergent under mild conditions. Three numerical examples show that the algorithm is efficient.

1 Introduction

Consider the following nonlinear constrained optimization problem

$$[P] \qquad \min f(x) \quad \text{s.t.} \quad g_i(x) \le 0, \ \ i = 1, 2, \cdots, m, \ \ x \in R^n,$$

where $f: R^n \to R$ and $g_i: R^n \to R$, $i \in I = \{1, 2, \cdots, m\}$, are twice continuously differentiable functions. Let

$$G_0 = \{x \in R^n \mid g_i(x) \le 0, \ i = 1, 2, \cdots, m\}.$$

To solve [P], many penalty function methods have been proposed in the literature (see, e.g., [1–10]). In [1], the classical l1 exact penalty function is defined as follows:

$$f(x, q) = f(x) + q\sum_{i=1}^{m} g_i^+(x), \tag{1}$$

where $g_i^+(x) = \max\{0, g_i(x)\}$, $i = 1, 2, \cdots, m$.

A nonlinear penalty function of the following form has been investigated in [11] and [12]:

$$L_k(x, d) = \Big[f(x)^k + \sum_{i=1}^{m} d_i \left(g_i^+(x)\right)^k\Big]^{1/k},$$

where f(x) is assumed to be positive, k > 0 is a given number, and $d = (d_1, d_2, \cdots, d_m) \in R_+^m$ is the penalty parameter. In [11], it was shown that the exact penalty parameter corresponding to k ∈ (0, 1] is substantially smaller than that of the classical l1 exact penalty function.

In [13], the lower order exact penalty functions

$$\varphi_{q,k}(x) = f(x) + q\sum_{i=1}^{m} \left(g_i^+(x)\right)^k, \quad k \in (0, 1)$$

have been investigated. It is shown that any strict local minimizer satisfying the second-order sufficiency condition for the original problem is a strict local minimizer of the lower order penalty function with any positive penalty parameter. However, the lower order penalty function is not smooth. When k = 1/2, smoothing of the nonlinear penalty function

$$\varphi_q(x) = f(x) + q\sum_{i=1}^{m} \left(g_i^+(x)\right)^{1/2} \tag{2}$$

was investigated in [14] and [15].

In this paper, we propose a method for smoothing the square-root penalty function of the form (2). Different from the smoothing functions given in [14] and [15], we construct a function that approximates the original function from the left side of 0. The rest of this paper is organized as follows. In Section 2, a new smoothing function for the square-root penalty function is introduced. It is shown that an approximately optimal solution of the smoothed penalty problem is an approximately optimal solution of the original problem. In Section 3, we give an algorithm to compute an approximate solution to [P] based on the smoothed penalty function and show that the algorithm is convergent. In Section 4, three numerical examples are given to show the efficiency of the algorithm.

2 Smoothing Exact Lower Order Penalty Function

Consider the following lower order penalty problem

$$[LOP] \qquad \min_{x \in R^n} \varphi_q(x).$$

In order to establish the exact penalization, we need the following assumptions given in [13].

Assumption 1

f(x) satisfies the following coercive condition:

$$\lim_{\|x\| \to +\infty} f(x) = +\infty.$$

Under Assumption 1, there exists a box X such that G([P]) ⊂ int(X), where G([P]) is the set of global minima of problem [P] and int(X) denotes the interior of the set X. Consider the following problem

$$[P'] \qquad \min f(x) \quad \text{s.t.} \quad g_i(x) \le 0, \ \ i = 1, 2, \cdots, m, \ \ x \in X.$$

Let G([P′]) denote the set of global minima of problem [P′]. Then G([P′]) = G([P]).

Assumption 2

The set G([P]) is a finite set.

Then we consider the penalty problem of the form

$$[LOP'] \qquad \min_{x \in X} \varphi_q(x).$$

Let $p(u) = (\max\{0, u\})^{1/2}$, that is,

$$p(u) = \begin{cases} u^{1/2}, & \text{if } u > 0, \\ 0, & \text{otherwise}, \end{cases}$$

then

$$\varphi_q(x) = f(x) + q\sum_{i=1}^{m} p(g_i(x)).$$

For any ϵ > 0, let

$$p_\epsilon(u) = \begin{cases} \frac{2}{3}\epsilon^{1/2}, & \text{if } u \le 0, \\ \frac{1}{3}\epsilon^{-1}u^{3/2} + \frac{2}{3}\epsilon^{1/2}, & \text{if } 0 < u \le \epsilon, \\ u^{1/2}, & \text{if } u > \epsilon. \end{cases} \tag{3}$$

It follows that

$$p'_\epsilon(u) = \begin{cases} 0, & \text{if } u \le 0, \\ \frac{1}{2}\epsilon^{-1}u^{1/2}, & \text{if } 0 < u \le \epsilon, \\ \frac{1}{2}u^{-1/2}, & \text{if } u > \epsilon. \end{cases}$$

It is easy to see that pϵ(u) is continuously differentiable on R: the values and first derivatives of the three pieces agree at u = 0 and u = ϵ. Furthermore, pϵ(u) → p(u) as ϵ → 0.
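The piecewise formulas above translate directly into code. The following is a minimal NumPy sketch of p(u), pϵ(u) and p′ϵ(u) (the names p, p_eps and p_eps_prime are ours, not the authors'); it is an illustration of (3), not the paper's implementation.

```python
import numpy as np

def p(u):
    # p(u) = (max{0, u})^(1/2)
    return np.sqrt(np.maximum(u, 0.0))

def p_eps(u, eps):
    # Smoothing function (3), defined piecewise on u <= 0, (0, eps], (eps, inf)
    u = np.asarray(u, dtype=float)
    up = np.maximum(u, 0.0)  # clip so no branch takes sqrt of a negative number
    mid = up ** 1.5 / (3.0 * eps) + 2.0 / 3.0 * np.sqrt(eps)
    return np.where(u <= 0.0, 2.0 / 3.0 * np.sqrt(eps),
                    np.where(u <= eps, mid, np.sqrt(up)))

def p_eps_prime(u, eps):
    # Derivative of p_eps: 0, u^(1/2)/(2 eps), and u^(-1/2)/2 on the three pieces
    u = np.asarray(u, dtype=float)
    up = np.maximum(u, 0.0)
    return np.where(u <= 0.0, 0.0,
                    np.where(u <= eps, np.sqrt(up) / (2.0 * eps),
                             0.5 / np.sqrt(np.maximum(up, eps))))
```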

Figure 1 shows the behavior of p(u) (solid line), p0.1(u) (dotted line), p0.01(u) (dashed line) and p0.001(u) (dash-dot line).

Figure 1: The behavior of pϵ(u) and p(u)

Let

$$\varphi_{q,\epsilon}(x) = f(x) + q\sum_{i=1}^{m} p_\epsilon(g_i(x)).$$

Then φq,ϵ(x) is continuously differentiable on $R^n$. Consider the following smoothed optimization problem

$$[SP] \qquad \min_{x \in X} \varphi_{q,\epsilon}(x).$$
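Given pϵ, the smoothed penalty φq,ϵ can be assembled in a few lines. A sketch, assuming f is a callable objective and gs a list of callable constraints (the helper name make_smoothed_penalty is ours):

```python
def make_smoothed_penalty(f, gs, q, eps):
    # phi_{q,eps}(x) = f(x) + q * sum_i p_eps(g_i(x))
    def phi(x):
        return f(x) + q * sum(p_eps(g(x), eps) for g in gs)
    return phi
```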

Lemma 1

For any x ∈ X and ϵ > 0,

$$0 \le \varphi_{q,\epsilon}(x) - \varphi_q(x) \le \frac{2}{3}mq\epsilon^{1/2}.$$

Proof

Note that

$$p_\epsilon(g_i(x)) - p(g_i(x)) = \begin{cases} \frac{2}{3}\epsilon^{1/2}, & \text{if } g_i(x) \le 0, \\ -\left(g_i(x)\right)^{1/2} + \frac{1}{3}\epsilon^{-1}\left(g_i(x)\right)^{3/2} + \frac{2}{3}\epsilon^{1/2}, & \text{if } 0 < g_i(x) \le \epsilon, \\ 0, & \text{if } g_i(x) > \epsilon. \end{cases}$$

When gi(x) ∈ (0, ϵ], let

$$F(u) = -u^{1/2} + \frac{1}{3}\epsilon^{-1}u^{3/2} + \frac{2}{3}\epsilon^{1/2},$$

since

$$F'(u) = -\frac{1}{2}u^{-1/2} + \frac{1}{2}\epsilon^{-1}u^{1/2} = \frac{1}{2}\epsilon^{-1}u^{-1/2}(u - \epsilon) \le 0,$$

F(u) is decreasing on (0, ϵ], with F(ϵ) = 0 and F(u) → $\frac{2}{3}\epsilon^{1/2}$ as u → 0⁺. Hence

$$0 \le p_\epsilon(g_i(x)) - p(g_i(x)) \le \frac{2}{3}\epsilon^{1/2}.$$

Summing this bound over i = 1, ⋯, m and multiplying by q, we obtain

$$0 \le \varphi_{q,\epsilon}(x) - \varphi_q(x) \le \frac{2}{3}mq\epsilon^{1/2}.$$

This completes the proof.
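The pointwise bound 0 ≤ pϵ(u) − p(u) ≤ (2/3)ϵ^{1/2} established in the proof is easy to check numerically with the sketches above; summing it over the m constraints and multiplying by q gives exactly Lemma 1. A quick grid check (the grid and tolerances are ours):

```python
eps = 0.01
u = np.linspace(-1.0, 1.0, 100001)
gap = p_eps(u, eps) - p(u)
assert gap.min() >= -1e-12                            # lower bound: 0
assert gap.max() <= 2.0 / 3.0 * np.sqrt(eps) + 1e-12  # upper bound: (2/3) eps^(1/2)
```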

Theorem 2

Let {ϵj} be a sequence of positive numbers with ϵj → 0, and assume that xj is a solution to $\min_{x\in X} \varphi_{q,\epsilon_j}(x)$ for some q > 0. Let x̄ be an accumulation point of the sequence {xj}. Then x̄ is an optimal solution to $\min_{x\in X} \varphi_q(x)$.

Proof

Because xj is a solution to $\min_{x\in X} \varphi_{q,\epsilon_j}(x)$, we have

$$\varphi_{q,\epsilon_j}(x_j) \le \varphi_{q,\epsilon_j}(x), \quad \forall x \in X.$$

By Lemma 1, we have

$$\varphi_q(x_j) \le \varphi_{q,\epsilon_j}(x_j)$$

and

$$\varphi_{q,\epsilon_j}(x) \le \varphi_q(x) + \frac{2}{3}mq\epsilon_j^{1/2}.$$

It follows that

$$\varphi_q(x_j) \le \varphi_{q,\epsilon_j}(x_j) \le \varphi_{q,\epsilon_j}(x) \le \varphi_q(x) + \frac{2}{3}mq\epsilon_j^{1/2}.$$

Letting j → +∞, so that ϵj → 0 and xj → x̄ along the convergent subsequence, we have

$$\varphi_q(\bar{x}) \le \varphi_q(x), \quad \forall x \in X.$$

Theorem 3

Let x̄q ∈ X be an optimal solution of problem [LOP′] and x̄q,ϵ ∈ X be an optimal solution of problem [SP] for some q > 0 and ϵ > 0. Then

$$0 \le \varphi_{q,\epsilon}(\bar{x}_{q,\epsilon}) - \varphi_q(\bar{x}_q) \le \frac{2}{3}mq\epsilon^{1/2}.$$

Proof

By Lemma 1 and the optimality of x̄q and x̄q,ϵ, we have

$$0 \le \varphi_{q,\epsilon}(\bar{x}_{q,\epsilon}) - \varphi_q(\bar{x}_{q,\epsilon}) \le \varphi_{q,\epsilon}(\bar{x}_{q,\epsilon}) - \varphi_q(\bar{x}_q) \le \varphi_{q,\epsilon}(\bar{x}_q) - \varphi_q(\bar{x}_q) \le \frac{2}{3}mq\epsilon^{1/2}.$$

Corollary 4

Suppose that Assumptions 1 and 2 hold, and that for any x* ∈ G([P]) there exists a $\lambda^* \in R_+^m$ such that the pair (x*, λ*) satisfies the second-order sufficiency condition defined in [2]. Let x* ∈ X be a global solution of problem [P] and x̄q,ϵ ∈ X be a global solution of problem [SP] for ϵ > 0. Then there exists q* > 0 such that for any q > q*,

$$0 \le \varphi_{q,\epsilon}(\bar{x}_{q,\epsilon}) - f(x^*) \le \frac{2}{3}mq\epsilon^{1/2},$$

where q* is defined in Corollary 2.3 in [13].

Proof

By Corollary 2.3 in [13], x* is a global solution of problem [LOP′] for any q > q*, where q* > 0 is chosen appropriately. Then by Theorem 3, we have

$$0 \le \varphi_{q,\epsilon}(\bar{x}_{q,\epsilon}) - \varphi_q(x^*) \le \frac{2}{3}mq\epsilon^{1/2}.$$

Since $\sum_{i=1}^{m} p(g_i(x^*)) = 0$, we have

$$\varphi_q(x^*) = f(x^*) + q\sum_{i=1}^{m} p(g_i(x^*)) = f(x^*).$$

Definition 5

For ϵ > 0, a point x ∈ X is said to be an ϵ-feasible solution of problem [P] if gi(x) ≤ ϵ for any i ∈ I.
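In code, Definition 5 is a one-line test; a sketch using the same conventions as above (the name is_eps_feasible is ours):

```python
def is_eps_feasible(x, gs, eps):
    # Definition 5: g_i(x) <= eps for every i in I
    return all(g(x) <= eps for g in gs)
```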

Theorem 6

Let x̄q ∈ X be an optimal solution of problem [LOP′] and x̄q,ϵ ∈ X be an optimal solution of problem [SP]. Furthermore, let x̄q be a feasible solution of problem [P] and x̄q,ϵ be an ϵ-feasible solution of problem [P]. Then we have

$$-mq\epsilon^{1/2} \le f(\bar{x}_{q,\epsilon}) - f(\bar{x}_q) \le 0.$$

Proof

It is clear that $\sum_{i=1}^{m} p(g_i(\bar{x}_q)) = 0$. By Theorem 3 we have

$$0 \le \varphi_{q,\epsilon}(\bar{x}_{q,\epsilon}) - \varphi_q(\bar{x}_q) = f(\bar{x}_{q,\epsilon}) + q\sum_{i=1}^{m} p_\epsilon(g_i(\bar{x}_{q,\epsilon})) - \Big(f(\bar{x}_q) + q\sum_{i=1}^{m} p(g_i(\bar{x}_q))\Big) \le \frac{2}{3}mq\epsilon^{1/2},$$

which implies

$$-q\sum_{i=1}^{m} p_\epsilon(g_i(\bar{x}_{q,\epsilon})) \le f(\bar{x}_{q,\epsilon}) - f(\bar{x}_q) \le \frac{2}{3}mq\epsilon^{1/2} - q\sum_{i=1}^{m} p_\epsilon(g_i(\bar{x}_{q,\epsilon})). \tag{4}$$

Since x̄q,ϵ is an ϵ-feasible solution of [P], gi(x̄q,ϵ) ≤ ϵ for each i ∈ I, and by (3) we have

$$\frac{2}{3}\epsilon^{1/2} \le p_\epsilon(g_i(\bar{x}_{q,\epsilon})) \le \epsilon^{1/2}. \tag{5}$$

It then follows from (4) and (5) that

$$-mq\epsilon^{1/2} \le f(\bar{x}_{q,\epsilon}) - f(\bar{x}_q) \le 0.$$

This completes the proof.

Theorem 2 and Theorem 3 show that an approximate solution of [SP] is also an approximate solution of [LOP′] when the error ϵ is sufficiently small. Furthermore, by Theorem 6, an optimal solution of [SP] becomes an approximately optimal solution of [P] if it is ϵ-feasible.

3 A Smoothing Method

We propose the following algorithm to solve [P].

Algorithm 7

Step 1 Choose an initial point x0. Given ϵ0 > 0, q0 > 0, 0 < η < 1 and N > 1, let j = 0 and go to Step 2.

Step 2 Use xj as the starting point to solve $\min_{x\in R^n} \varphi_{q_j,\epsilon_j}(x)$. Let x̄qj be the optimal solution obtained (x̄qj is computed by a quasi-Newton method with a finite-difference gradient). Go to Step 3.

Step 3 If x̄qj is ϵj-feasible to [P], then stop: x̄qj is an approximately optimal solution of the original problem [P]. Otherwise, let qj+1 = Nqj, ϵj+1 = ηϵj, xj+1 = x̄qj and j = j + 1, then go to Step 2.
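The loop below is a minimal sketch of Algorithm 7 built from the helpers sketched in Section 2. SciPy's BFGS quasi-Newton method falls back to a finite-difference gradient when no Jacobian is supplied, which matches the description of Step 2; note that we interpret the stopping test "ϵ-feasible" as ϵj-feasible, and the max_iter safeguard is ours, not part of the algorithm.

```python
from scipy.optimize import minimize

def algorithm7(f, gs, x0, q0, eps0, eta, N, max_iter=50):
    x, q, eps = np.asarray(x0, dtype=float), q0, eps0
    for _ in range(max_iter):
        phi = make_smoothed_penalty(f, gs, q, eps)  # phi_{q_j, eps_j}
        x = minimize(phi, x, method="BFGS").x       # Step 2: quasi-Newton, FD gradient
        if is_eps_feasible(x, gs, eps):             # Step 3: stop if eps_j-feasible
            return x
        q, eps = N * q, eta * eps                   # Step 3: enlarge q, shrink eps
    return x
```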

Remark 8

From 0 < η < 1 and N > 1, we see at once that the sequence {ϵj} decreases to 0 and the sequence {qj} increases to +∞ as j → +∞.

Now we prove the convergence of the algorithm under mild conditions.

Theorem 9

Suppose that for any q ∈ [q0, +∞) and ϵ ∈ (0, ϵ0], the set

$$\mathop{\arg\min}_{x\in R^n} \varphi_{q,\epsilon}(x)$$

is nonempty. Let {x̄qj} be the sequence generated by Algorithm 7. If {x̄qj} has a limit point, then any limit point of {x̄qj} is an optimal solution of [P] for any m ≥ 3.

Proof

Let x̄ be any limit point of {x̄qj}. Then there exists a subset J ⊆ N such that x̄qj → x̄ for j ∈ J. If we can prove that (i) x̄ ∈ G0 and (ii) $f(\bar{x}) \le \inf_{x \in G_0} f(x)$, then x̄ is an optimal solution of [P].

  1. Suppose to the contrary that x̄ ∉ G0. Then there exist δ0 > 0, i0 ∈ I and a subset J1 ⊆ J such that gi0(x̄qj) ≥ δ0 for any j ∈ J1.

    If ϵj ≥ gi0(x̄qj) ≥ δ0, it follows from Step 2 in Algorithm 7 and (3) that

$$f(\bar{x}_{q_j}) + \frac{1}{3}q_j\epsilon_j^{-1}\delta_0^{3/2} + \frac{2}{3}mq_j\epsilon_j^{1/2} \le \varphi_{q_j,\epsilon_j}(\bar{x}_{q_j}) \le \varphi_{q_j,\epsilon_j}(x) = f(x) + \frac{2}{3}mq_j\epsilon_j^{1/2}, \quad \forall x \in G_0.$$

    Thus,

$$f(\bar{x}_{q_j}) + \frac{1}{3}q_j\epsilon_j^{-1}\delta_0^{3/2} \le \varphi_{q_j,\epsilon_j}(\bar{x}_{q_j}) - \frac{2}{3}mq_j\epsilon_j^{1/2} \le f(x), \quad \forall x \in G_0,$$

    which contradicts ϵj → 0 and qj → +∞.

    If gi0(x̄qj) ≥ δ0 > ϵj or gi0(x̄qj) > ϵj ≥ δ0, it follows from Step 2 in Algorithm 7 and (3) that

$$f(\bar{x}_{q_j}) + q_j\delta_0^{1/2} + q_j(m-1)\epsilon_j^{1/2} \le \varphi_{q_j,\epsilon_j}(\bar{x}_{q_j}) \le \varphi_{q_j,\epsilon_j}(x) = f(x) + \frac{2}{3}mq_j\epsilon_j^{1/2}, \quad \forall x \in G_0.$$

    Thus,

$$f(\bar{x}_{q_j}) + q_j\delta_0^{1/2} + \Big(\frac{1}{3}m - 1\Big)q_j\epsilon_j^{1/2} \le \varphi_{q_j,\epsilon_j}(\bar{x}_{q_j}) - \frac{2}{3}mq_j\epsilon_j^{1/2} \le f(x), \quad \forall x \in G_0,$$

    which contradicts ϵj → 0 and qj → +∞ when m ≥ 3.

    Then we have x̄ ∈ G0.

  2. For any x ∈ G0, it holds that

$$f(\bar{x}_{q_j}) + \frac{2}{3}mq_j\epsilon_j^{1/2} \le \varphi_{q_j,\epsilon_j}(\bar{x}_{q_j}) \le \varphi_{q_j,\epsilon_j}(x) = f(x) + \frac{2}{3}mq_j\epsilon_j^{1/2},$$

    so f(x̄qj) ≤ f(x) for every j ∈ J. Letting j → +∞ along J gives f(x̄) ≤ f(x) for any x ∈ G0, and hence $f(\bar{x}) \le \inf_{x\in G_0} f(x)$ holds. This completes the proof.

4 Numerical Examples

In this section, we solve three numerical examples in Matlab to show the applicability of Algorithm 7.

Example 1

(Example 4.2 in [14] and Example 2 in [15])

$$\begin{aligned}
\min\ & f(x) = x_1^2 + x_2^2 + 2x_3^2 + x_4^2 - 5x_1 - 5x_2 - 21x_3 + 7x_4 \\
\text{s.t.}\ & g_1(x) = 2x_1^2 + x_2^2 + x_3^2 + 2x_1 + x_2 + x_4 - 5 \le 0, \\
& g_2(x) = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_1 - x_2 + x_3 - x_4 - 8 \le 0, \\
& g_3(x) = x_1^2 + 2x_2^2 + x_3^2 + 2x_4^2 - x_1 - x_4 - 10 \le 0.
\end{aligned}$$

Let x0 = (1, 1, 1, 1), q0 = 2.0, ϵ0 = 0.1, η = 0.1 and N = 2. The results obtained by Algorithm 7 are shown in Table 1.
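For reference, Example 1 can be passed to the algorithm7 sketch as follows; the iterates depend on the quasi-Newton implementation, so the output should be close to, but need not reproduce, Table 1 digit for digit.

```python
f1 = lambda x: (x[0]**2 + x[1]**2 + 2*x[2]**2 + x[3]**2
                - 5*x[0] - 5*x[1] - 21*x[2] + 7*x[3])
g1 = lambda x: 2*x[0]**2 + x[1]**2 + x[2]**2 + 2*x[0] + x[1] + x[3] - 5
g2 = lambda x: (x[0]**2 + x[1]**2 + x[2]**2 + x[3]**2
                + x[0] - x[1] + x[2] - x[3] - 8)
g3 = lambda x: x[0]**2 + 2*x[1]**2 + x[2]**2 + 2*x[3]**2 - x[0] - x[3] - 10

x_star = algorithm7(f1, [g1, g2, g3], x0=[1, 1, 1, 1],
                    q0=2.0, eps0=0.1, eta=0.1, N=2)
print(x_star, f1(x_star))  # expected near f(x*) = -44.233
```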

Table 1

Numerical results for Example 1

j = 0: xj = (1.138375, 1.271269, 3.818479, −1.996515), qj = 2, ϵj = 0.1, g1(xj) = 5.340209, g2(xj) = 19.160985, g3(xj) = 17.939222, f(xj) = −70.152230
j = 1: xj = (0.177165, 0.826666, 2.008946, −0.962929), qj = 4, ϵj = 0.01, g1(xj) = 0.000079, g2(xj) = 0.000236, g3(xj) = −1.925763, f(xj) = −44.233800
j = 2: xj = (0.177135, 0.826640, 2.008904, −0.962939), qj = 8, ϵj = 0.001, g1(xj) = −0.000241, g2(xj) = 0.000000, g3(xj) = −1.925947, f(xj) = −44.233093
j = 3: xj = (0.177135, 0.826640, 2.008903, −0.962938), qj = 16, ϵj = 0.0001, g1(xj) = −0.000245, g2(xj) = −0.000006, g3(xj) = −1.925953, f(xj) = −44.233076

The obtained approximately optimal solution is x* = (0.1843219, 0.8502275, 1.992824, −0.9814662) with corresponding objective function value −44.233076. From [14], the obtained approximately optimal solution is x* = (0.169234, 0.835656, 2.008690, −0.964901) with corresponding objective function value −44.233582. From [15], the obtained approximately optimal solution is x* = (0.1585001, 0.8339736, 2.014753, −0.959688) with corresponding objective function value −44.22965.

For the j-th iteration of the algorithm, we define a constraint error ej by

$$e_j = \sum_{i=1}^{m} \max(g_i(x_j),\, 0).$$

It is clear that xj is ϵ-feasible to [P] when ej < ϵ.
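In code, the constraint error is a one-liner (the name constraint_error is ours):

```python
def constraint_error(x, gs):
    # e_j = sum_i max(g_i(x), 0)
    return sum(max(float(g(x)), 0.0) for g in gs)
```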

Example 2

(Example 4.3 in [14])

$$\begin{aligned}
\min\ & f(x) = 1000 - x_1^2 - 2x_2^2 - x_3^2 - x_1x_2 - x_1x_3 \\
\text{s.t.}\ & g_1(x) = x_1^2 + x_2^2 + x_3^2 - 25 = 0, \\
& g_2(x) = (x_1 - 5)^2 + x_2^2 + x_3^2 - 25 = 0, \\
& g_3(x) = (x_1 - 5)^2 + (x_2 - 5)^2 + (x_3 - 5)^2 - 25 \le 0.
\end{aligned}$$

Starting from the point x0 = (2, 2, 2) with q0 = 100, ϵ0 = 10, η = 0.01 and N = 10, we obtain the results by Algorithm 7 shown in Table 2.

Table 2

Numerical results for Example 2

j = 0: xj = (2.518662, 4.231055, 0.970488), qj = 100, ϵj = 10, ej = 0.18804328, f(xj) = 943.809922
j = 1: xj = (2.499999, 4.220526, 0.968073), qj = 1000, ϵj = 0.1, ej = 0.000001, f(xj) = 944.215680
j = 2: xj = (2.499999, 4.220525, 0.968073), qj = 10000, ϵj = 0.001, ej = 0.000000, f(xj) = 944.215671

The results show that the obtained approximate global solution is x̄ = (2.499999, 4.220525, 0.968073) with objective function value f(x̄) = 944.215671. From [14], the global solution is (2.500000, 4.220720, 0.967224) with global optimal value 944.215662.

Example 3

(Example 4.5 in [14])

$$\begin{aligned}
\min\ & f(x) = 10x_2 + 2x_3 + x_4 + 3x_5 + 4x_6 \\
\text{s.t.}\ & g_1(x) = x_1 + x_2 - 10 = 0, \\
& g_2(x) = -x_1 + x_3 + x_4 + x_5 = 0, \\
& g_3(x) = -x_2 - x_3 + x_5 + x_6 = 0, \\
& g_4(x) = 10x_1 - 2x_3 + 3x_4 - 2x_5 - 16 \le 0, \\
& g_5(x) = x_1 + 4x_3 + x_5 - 10 \le 0, \\
& 0 \le x_1 \le 12,\ 0 \le x_2 \le 18,\ 0 \le x_3 \le 5,\ 0 \le x_4 \le 12,\ 0 \le x_5 \le 1,\ 0 \le x_6 \le 16.
\end{aligned}$$

Let x0 = (3, 3, 3, 3, 1, 3), q0 = 1000, ϵ0 = 0.1, η = 0.01 and N = 2. Numerical results by Algorithm 7 are shown in Table 3.

Table 3

Numerical results for Example 3 with x0 = (3, 3, 3, 3, 1, 3)

j = 0: xj = (1.657190, 8.341187, 0.119599, 0.548394, 0.988944, 7.471706), qj = 1000, ϵj = 0.1, ej = 0.002008, f(xj) = 117.053718
j = 1: xj = (1.657434, 8.342565, 0.119498, 0.548084, 0.989852, 7.472210), qj = 2000, ϵj = 0.001, ej = 0.000000, f(xj) = 117.071132

Let x0 = (4, 4, 4, 4, 1, 4), q0 = 1000, ϵ0 = 0.1, η = 0.01 and N = 2. The results by Algorithm 7 are shown in Table 4.

Table 4

Numerical results for Example 3 with x0 = (4, 4, 4, 4, 1, 4)

j = 0: xj = (1.619428, 8.382967, 0.021739, 0.600295, 0.997596, 7.408475), qj = 1000, ϵj = 0.1, ej = 0.003962, f(xj) = 117.100131
j = 1: xj = (1.618086, 8.381914, 0.022029, 0.599781, 0.996276, 7.407669), qj = 2000, ϵj = 0.001, ej = 0.000002, f(xj) = 117.082494
j = 2: xj = (1.618086, 8.381914, 0.022030, 0.599780, 0.996275, 7.407668), qj = 4000, ϵj = 0.00001, ej = 0.000000, f(xj) = 117.082487

Let x0 = (9, 9, 5, 9, 1, 9), q0 = 1000, ϵ0 = 0.1, η = 0.01 and N = 2. The results by Algorithm 7 are shown in Table 5.

Table 5

Numerical results for Example 3 with x0 = (9, 9, 5, 9, 1, 9)

j = 0: xj = (1.739398, 8.258030, 0.322256, 0.415438, 1.000000, 7.580521), qj = 1000, ϵj = 0.1, ej = 0.004508, f(xj) = 116.962334
j = 1: xj = (1.739321, 8.260678, 0.322508, 0.416937, 0.999876, 7.583312), qj = 2000, ϵj = 0.001, ej = 0.000000, f(xj) = 117.001623

The obtained approximately optimal solution is x* = (1.739321, 8.260678, 0.322508, 0.416937, 0.999876, 7.583312) with corresponding objective function value 117.001623. From [14], the obtained approximately optimal solution is x* = (1.847052, 8.152948, 0.607878, 0.244707, 0.994467, 7.766359) with corresponding objective function value 117.038781.

One can see that the numerical results in Table 3, Table 4 and Table 5 are similar, which means that the performance of Algorithm 7 does not depend strongly on the choice of starting point in this example.

As demonstrated by the numerical examples, Algorithm 7 is applicable for finding approximate global solutions of inequality-constrained global optimization problems.

In our experience, the initial penalty parameter q0 may be taken as 0.1, 1, 5, 10, 100, 1000 or 10000 with N = 2, 5, 10 or 100 and the update q ← Nq, while the initial smoothing parameter ϵ0 may be taken as 10, 5, 1, 0.5 or 0.1 with η = 0.5, 0.1, 0.05 or 0.01 and the update ϵ ← ηϵ.


Supported by the National Natural Science Foundation of China (71371107 and 61373027) and the Natural Science Foundation of Shandong Province (ZR2013AM013 and ZR2012AL07)


References

[1] Zangwill W I. Non-linear programming via penalty functions. Management Science, 1967, 13: 344–358. DOI: 10.1287/mnsc.13.5.344.

[2] Bazaraa M S, Sherali H D, Shetty C M. Nonlinear Programming: Theory and Algorithms. 2nd ed. John Wiley & Sons, Inc., New York, 1993.

[3] Bai F S, Luo X Y. Modified lower order penalty functions based on quadratic smoothing approximation. Operations Research Transactions, 2012, 16: 9–22.

[4] Lian S J. Smoothing approximation to l1 exact penalty function for inequality constrained optimization. Applied Mathematics and Computation, 2012, 219: 3113–3121. DOI: 10.1016/j.amc.2012.09.042.

[5] Sun X L, Li D. Value-estimation function method for constrained global optimization. Journal of Optimization Theory and Applications, 1999, 24: 385–409. DOI: 10.1023/A:1021736608968.

[6] Wang C Y, Zhao W L, Zhou J C, et al. Global convergence and finite termination of a class of smooth penalty function algorithms. Optimization Methods and Software, 2013, 28: 1–25. DOI: 10.1080/10556788.2011.579965.

[7] Xu X S, Meng Z Q, Sun J W, et al. A second-order smooth penalty function algorithm for constrained optimization problems. Computational Optimization and Applications, 2013, 55: 155–172. DOI: 10.1007/s10589-012-9504-9.

[8] Yang X Q, Meng Z Q, Huang X X, et al. Smoothing nonlinear penalty functions for constrained optimization. Numerical Functional Analysis and Optimization, 2003, 24: 351–364. DOI: 10.1081/NFA-120022928.

[9] Yu C J, Teo K L, Zhang L S, et al. A new exact penalty function method for continuous inequality constrained optimization problems. Journal of Industrial and Management Optimization, 2010, 6: 895–910. DOI: 10.3934/jimo.2010.6.895.

[10] Jiang M, Shen R, Xu X, et al. Second-order smoothing objective penalty function for constrained optimization problems. Numerical Functional Analysis and Optimization, 2014, 35: 294–309. DOI: 10.1080/01630563.2013.811421.

[11] Rubinov A M, Yang X Q, Bagirov A M. Penalty functions with a small penalty parameter. Optimization Methods and Software, 2002, 17: 931–964. DOI: 10.1080/1055678021000066058.

[12] Huang X X, Yang X Q. Convergence analysis of a class of nonlinear penalization methods for constrained optimization via first-order necessary optimality conditions. Journal of Optimization Theory and Applications, 2003, 116: 311–332. DOI: 10.1023/A:1022503820909.

[13] Wu Z Y, Bai F S, Yang X Q, et al. An exact lower order penalty function and its smoothing in nonlinear programming. Optimization, 2004, 53: 51–68. DOI: 10.1080/02331930410001662199.

[14] Meng Z Q, Dang C Y, Yang X Q. On the smoothing of the square-root exact penalty function for inequality constrained optimization. Computational Optimization and Applications, 2006, 35: 375–398. DOI: 10.1007/s10589-006-8720-6.

[15] Lian S J. Smoothing approximation to the square-order exact penalty functions for constrained optimization. Journal of Applied Mathematics, 2013, Article ID 568316, 7 pages. DOI: 10.1155/2013/568316.

Received: 2015-9-14
Accepted: 2015-10-16
Published Online: 2016-2-25

© 2016 Walter de Gruyter GmbH, Berlin/Boston
