
On 3-by-3 row stochastic matrices

  • Nhi Pham and Ilya M. Spitkovsky
Published/Copyright: September 27, 2023

Abstract

The known constructive tests for the shapes of the numerical ranges in the 3-by-3 case are further specified when the matrices in question are row stochastic. Auxiliary results on the unitary (ir)reducibility of such matrices are also obtained.

MSC 2010: 15A60; 15B51

1 Introduction

We will use the standard notations C and R for the fields of complex and real numbers, respectively. Also, X^{m×n} will stand for the linear space (the algebra, if m = n) of m-by-n matrices with the entries in X = R or C. As usual, X^{n×1} will be abbreviated to just X^n, ⟨·,·⟩ will stand for the scalar product on C^n, and ‖·‖ for the norm associated with it. We will use the standard notation

Re A = (A + A*)/2, Im A = (A − A*)/(2i)

for any A C n × n , with Re A usually called the Hermitian part of A .

The numerical range W(A) of a matrix A ∈ C^{n×n} is the range of the (continuous) function f_A(x) := ⟨Ax, x⟩ considered on the unit sphere of C^n, and thus a compact subset of C. It is also convex, per the celebrated Toeplitz-Hausdorff theorem. A plethora of articles is devoted to further description of W(A) for various classes of matrices; see [2, Chapter 1] or the more recent [9, Chapter 6] for a detailed discussion of known results, as well as the history of the subject.
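For readers who want to experiment along the way, here is a minimal numpy sketch (ours, not part of the article) that samples boundary points of W(A) by the standard rotation trick: for each angle θ, a top eigenvector x of Re(e^{−iθ}A) produces the boundary point ⟨Ax, x⟩. The function name and the test matrix are chosen purely for illustration.

```python
import numpy as np

def numerical_range_boundary(A, n=360):
    """Sample boundary points of W(A): for each angle theta, the top
    eigenvector x of Re(exp(-i*theta)*A) gives the boundary point <Ax, x>."""
    A = np.asarray(A, dtype=complex)
    pts = []
    for theta in np.linspace(0, 2 * np.pi, n, endpoint=False):
        H = (np.exp(-1j * theta) * A + np.exp(1j * theta) * A.conj().T) / 2
        _, V = np.linalg.eigh(H)       # Hermitian eigendecomposition, ascending order
        x = V[:, -1]                   # eigenvector of the largest eigenvalue
        pts.append(x.conj() @ A @ x)   # f_A(x) = <Ax, x>
    return np.array(pts)

# A 2-by-2 nonnormal example: the boundary is an ellipse (Elliptical Range theorem)
print(numerical_range_boundary(np.array([[0.0, 1.0], [0.0, 1.0]]), n=6))
```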

Recall, in particular, that for A ∈ C^{2×2}, the numerical range W(A) is the elliptical disk with the foci at the eigenvalues λ_1 and λ_2 of A (the Elliptical Range theorem), for normal A degenerating into the line segment [λ_1, λ_2]. This immediately implies that for a unitarily reducible A ∈ C^{3×3}, its numerical range is the convex hull of an ellipse and a point (lying inside or outside the ellipse), degenerating into a triangle, a line segment, or a point when A is normal. For any n, as was shown by Kippenhahn [4] (see also the English translation [5]), W(A) is the convex hull of a certain algebraic curve C(A) of class n, nowadays called the Kippenhahn curve of A. For n = 3 and A unitarily irreducible, there are exactly three cases: (i) C(A) consists of an ellipse and a point inside, so that W(A) is an elliptical disk; (ii) C(A) is of degree four, while W(A) has a flat portion on its boundary ∂W(A); and (iii) C(A) has degree six, and W(A) has an ovular (oval, in the terminology of [9]) shape; see [3] or [9, Section 6.2] for the tests to determine which of the three shapes materializes for a given A.

This article is prompted by Gau et al. [1], where row stochastic matrices were considered. We, therefore, recall that a matrix A ∈ R^{n×n} is row stochastic if a_{ij} ≥ 0 for i, j = 1, …, n and Σ_{j=1}^n a_{ij} = 1 for i = 1, …, n. The latter condition is equivalent to e := [1, …, 1]^T being an eigenvector of A corresponding to the eigenvalue 1.

Furthermore, A R n × n is column stochastic if its transpose A T is row stochastic; matrices that are simultaneously row and column stochastic are called doubly stochastic. As the names suggest, row, column, and doubly stochastic matrices play an important role in probability theory and statistics.

We will be writing 3-by-3 row stochastic matrices as follows:

(1.1) $A = \begin{bmatrix} 1-(a+b) & a & b \\ c & 1-(c+d) & d \\ e & f & 1-(e+f) \end{bmatrix}.$

The entrywise nonnegativity requirement amounts to

(1.2) a, b, c, d, e, f ≥ 0; a + b ≤ 1, c + d ≤ 1, e + f ≤ 1.

In the study by Gau et al. [1, Section 4], examples were provided of matrices (1.1) with W ( A ) being ovular or having a flat portion on the boundary (and thus unitarily irreducible). However, in the only example of an elliptical W ( A ) , the respective matrix A was unitarily reducible. One of the goals of our study is to establish the existence of unitarily irreducible matrices (1.1) with elliptical numerical ranges and to provide the method of their construction. This is done in Section 5. In the preceding Section 4, criteria are established for the Kippenhahn curve of row stochastic 3-by-3 matrices to contain an elliptical component. Section 3 contains the description of all matrices (1.1) with W ( A ) containing a line segment. A preliminary Section 2 contains the unitary reducibility criterion for matrices (1.1), and Section 6 is a short comment on matrices with ovular numerical ranges.

2 Unitary reducibility

Recall that a matrix A C n × n is unitarily reducible if it is unitarily similar to a block diagonal matrix, say B , with two or more diagonal blocks, and unitarily irreducible otherwise. Note that B can be chosen with a one-dimensional diagonal block, say [ λ ] , if and only if A and A * share an eigenvector, which in this case is called a reducing eigenvector of A (or A * ). This eigenvector corresponds to the eigenvalue λ of A , as well as λ ¯ of A * , and these eigenvalues are called normal. Furthermore, B can be chosen diagonal if and only if the matrix A is normal, in which case all eigenvectors of A are reducing.

Since e is an eigenvector of a row stochastic matrix A , the latter can be normal only if e is also an eigenvector of A T corresponding to the same eigenvalue 1, i.e., when A is doubly stochastic [1, Proposition 3.3(b)]. For n = 2 , the converse is true, because a doubly stochastic 2-by-2 matrix is symmetric. Also, normality for 2-by-2 matrices is of course equivalent to unitary reducibility.

Starting with n = 3 , however, there exist row stochastic matrices that are unitarily reducible but not doubly stochastic, and doubly stochastic matrices that are not normal. The normality and unitary reducibility criteria for row stochastic matrices are therefore of some interest.

Proposition 1

A row stochastic matrix (1.1) is normal if and only if either (i) a = c , b = e , and d = f or (ii) a = d = e and b = c = f .

Proof

Necessity. Equating the diagonal entries of A A T and A T A yields

(2.1) a² + b² = c² + e², a² + f² = c² + d², b² + d² = e² + f².

Since A is in fact doubly stochastic, we also have

(2.2) a + b = c + e , a + f = c + d , b + d = e + f .

Comparing equation (2.1) with equation (2.2) we see that

(2.3) { a , b } = { c , e } , { a , f } = { c , d } , { b , d } = { e , f }

as nonordered pairs. The first equality in (2.3) yields a = c , b = e , or a = e , b = c . From the remaining two equalities, we derive that (i) holds in the former case and (ii) in the latter.

Sufficiency. The matrix (1.1) is real symmetric if (i) holds and a circulant if (ii) holds. Either way, it is normal.□

Recall that for any normal matrix A, its numerical range is the convex hull of its spectrum: W(A) = conv σ(A). So, in the setting of Proposition 1, W(A) is the segment [1 − μ, 1] or the triangle with the vertices 1, 1 − μ, 1 − μ̄, where μ is

$\mu = a + b + d + \sqrt{a^2 + b^2 + d^2 - (ab + ad + bd)} \quad\text{or}\quad \mu = \tfrac{3}{2}(a+b) \pm \tfrac{\sqrt{3}}{2}(a-b)\,i$

in case (i) or (ii), respectively.
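As a quick sanity check (ours, not the article's), the case (i) value of μ can be compared with a direct eigenvalue computation:

```python
import numpy as np

# Case (i) of Proposition 1: a = c, b = e, d = f, so A is real symmetric
a, b, d = 0.2, 0.3, 0.1
A = np.array([[1-(a+b), a, b],
              [a, 1-(a+d), d],
              [b, d, 1-(b+d)]])
mu = a + b + d + np.sqrt(a**2 + b**2 + d**2 - (a*b + a*d + b*d))
print(np.linalg.eigvalsh(A)[0], 1 - mu)   # smallest eigenvalue equals 1 - mu, so W(A) = [1 - mu, 1]
```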

The normality criterion of 3-by-3 row stochastic matrices in terms of their numerical ranges goes back to [8, Theorem 2.3].

Proposition 2

A row stochastic matrix (1.1) is unitarily reducible if and only if

(2.4) $a + b + \dfrac{ae - bc}{f - d} \;=\; c + d + \dfrac{ad - cf}{b - e} \;=\; e + f + \dfrac{de - bf}{c - a}.$

By convention, the equality in equation (2.4) involving an expression with the zero denominator holds if the respective numerator is also zero.

Proof

Case 1. A is normal, and thus unitarily reducible. Then, one of the conditions (i) or (ii) of Proposition 1 holds. For matrices satisfying (i), all three denominators in equation (2.4) vanish, and so equation (2.4) holds by convention. On the other hand, if condition (ii) but not (i) is met, equation (2.4) takes the form 0 = 0 = 0 , and thus also holds.

Case 2. A is not normal. Recall that for 3-by-3 matrices, having a reducing eigenvector is not only sufficient but also necessary to be unitarily reducible. Suppose that such an eigenvector x exists and corresponds to the eigenvalue λ of A .

If λ ∉ R, then x̄ is also a reducing eigenvector of A, corresponding to the eigenvalue λ̄ ≠ λ. The vectors x and x̄ are, therefore, linearly independent, and the orthogonal complement to their span is a reducing subspace of A. Consequently, A is normal – a contradiction.

So, λ must be real, and along with Ax = λx, we have A^T x = λx, implying x ∈ ker(A − A^T). Since A is not normal, it cannot be symmetric, so the kernel of A − A^T is one-dimensional, and (up to an inconsequential scalar multiple)

x = [f − d, b − e, c − a]^T.

It remains to be checked whether this x is indeed an eigenvector of A: if it is, then, being an eigenvector of A − A^T, it is also an eigenvector of A^T.

A direct computation shows that A x and x are collinear if and only if equation (2.4) holds. This completes the proof.□

Remark 1

In Case 2, the common value of the expressions in equation (2.4) is actually 1 − λ, where λ is the (unique) normal eigenvalue of A.

If λ ≠ 1 (equivalently, A is not doubly stochastic), the reducing eigenvector x has to be orthogonal to e:

(2.5) a + d + e = b + c + f .

Now consider (2.5), along with any two of the equations in (2.4), as a system of three polynomial equations. Using Mathematica, we observe that its Gröbner basis can be chosen as consisting of (2.5) and either one of the equations in (2.4). Note that the former is nothing but the condition that x is orthogonal to the eigenvector [cf + ce + de, ae + af + bf, ad + bc + bd]^T of A^T corresponding to the eigenvalue 1. The unitary reducibility criterion for a nonnormal A simplifies accordingly.
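A small numerical illustration of Proposition 2 and Remark 1 (our sketch, not from the article): the three expressions in (2.4) agree exactly for a unitarily reducible matrix, with common value 1 − λ, and x = [f − d, b − e, c − a]^T is then a reducing eigenvector.

```python
import numpy as np

def criterion_2_4(a, b, c, d, e, f):
    """The three expressions in (2.4); A is unitarily reducible iff they agree
    (with the zero-denominator convention)."""
    return (a + b + (a*e - b*c) / (f - d),
            c + d + (a*d - c*f) / (b - e),
            e + f + (d*e - b*f) / (c - a))

# A doubly stochastic (hence unitarily reducible) example: all three values are 0 = 1 - lambda
a, b, c, d, e, f = 0.3, 0.1, 0.1, 0.4, 0.3, 0.2
print(criterion_2_4(a, b, c, d, e, f))

A = np.array([[1-(a+b), a, b], [c, 1-(c+d), d], [e, f, 1-(e+f)]])
x = np.array([f - d, b - e, c - a])
print(A @ x, A.T @ x)          # both equal x: a reducing eigenvector for lambda = 1

# A generic choice gives three distinct values, so the matrix is unitarily irreducible
print(criterion_2_4(0.1, 0.2, 0.3, 0.1, 0.4, 0.3))
```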

Remark 2

Propositions 1 and 2 hold for any A R 3 × 3 of the form (1.1), not necessarily row stochastic. Indeed, the nonnegativity constraints (1.2) are not used in their proofs.

3 Multiply generated points on the boundary of W ( A )

As any convex set in C , the numerical range W ( A ) of A C n × n is completely defined by the family of its supporting lines. The latter are given by the formula

ℓ_θ := e^{iθ}(λ_A(θ) + iR), θ ∈ [−π, π],

where λ_A(θ) is the maximal eigenvalue of Re(e^{−iθ}A). As in the study by Leake et al. [6], we will say that ℓ_θ is an exceptional supporting line of W(A) if there exists z ∈ ℓ_θ for which f_A^{−1}(z) contains linearly independent vectors. This happens if and only if λ_A(θ) is a multiple eigenvalue.

If A is real, then the characteristic polynomials of Re(e^{iθ}A) and Re(e^{−iθ}A) are the same, and so the supporting lines ℓ_{±θ} are exceptional (or not) only simultaneously. If they are, each of the matrices Re(e^{±iθ}A) has a multi-dimensional eigenspace corresponding to λ_A(θ). For θ ≢ 0 (mod π), this implies that the intersection of these eigenspaces is invariant under both A and A^T. In the three-dimensional setting, this intersection is guaranteed to be nonzero, and so A is unitarily reducible.

Since we are interested in the case when W ( A ) contains a multiply generated point while A is unitarily irreducible, let us therefore restrict our attention to θ = 0 .

Proposition 3

Let A R 3 × 3 be a row stochastic matrix. Then, Re A has a multiple eigenvalue if and only if either

  1. A is permutationally similar to $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1-d & d \\ 0 & d & 1-d \end{bmatrix}$ with some d ∈ [0, 1], or

  2. A is as in (1.1), with

    (3.1) a = t, b = x − t, c = u − t, d = y − u + t, e = v − x + t, f = u + w − y − t.

    Here, u, v, and w are subject to the constraints

    (3.2) u/v + u/w, v/u + v/w, w/u + w/v ∈ [√3 − 1, √6 + 2]

    and

    (3.3) $1 + \tfrac{1}{2}\min\left\{\tfrac{uv}{w}, \tfrac{vw}{u}, \tfrac{uw}{v}\right\} \;\geq\; \tfrac{1}{3}(u+v+w) + \tfrac{1}{6}\left(\tfrac{uv}{w} + \tfrac{vw}{u} + \tfrac{uw}{v}\right) =: \omega,$

    x, y, and z are given by

    (3.4) x = ω − uv/(2w), y = ω − uw/(2v), z = ω − vw/(2u),

    while t satisfies

    (3.5) max{0, u − y, x − v} ≤ t ≤ min{x, u, u + w − y}.

Proof

Necessity. Denote

(3.6) a + b = x , c + d = y , e + f = z , a + c = u , b + e = v , d + f = w .

Then,

(3.7) $\mathrm{Re}\,A = \begin{bmatrix} 1-x & u/2 & v/2 \\ u/2 & 1-y & w/2 \\ v/2 & w/2 & 1-z \end{bmatrix},$

and for the latter to have a multiple eigenvalue λ, it is necessary and sufficient that all 2 × 2 minors of Re A − λI are equal to 0. Computing the cofactors of the (3,2), (3,1), and (1,2) entries, we have

(3.8) uv = 2w(1 − x − λ), uw = 2v(1 − y − λ), vw = 2u(1 − z − λ).

Case 1. u v w = 0 .

From equation (3.8) it is then easy to see that at least two of u , v , and w have to equal zero. Suppose, for example, that u = v = 0 . The other two situations can be treated similarly. Then, a = b = c = e = 0 , and A takes the form

$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1-d & d \\ 0 & f & 1-f \end{bmatrix}$, with the real part $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1-d & (d+f)/2 \\ 0 & (d+f)/2 & 1-f \end{bmatrix}$.

The latter matrix has a multiple eigenvalue if and only if d = f , in which case A itself is exactly as in (i).

Case 2. uvw ≠ 0.

Equation (3.8) can be rewritten as follows:

x + uv/(2w) = y + uw/(2v) = z + vw/(2u) (= 1 − λ).

Relabeling 1 − λ =: ω, we arrive at equation (3.4). In turn, adding the three equalities in equation (3.4), we obtain

x + y + z = 3ω − (1/2)(uv/w + vw/u + uw/v).

But x + y + z = u + v + w, which agrees with the formula for ω in equation (3.3). Finally, setting a = t and solving equations (3.6) for the remaining entries yields equations (3.1).

We now turn to proving the inequality constraints (3.2) and (3.3). The nonnegativity of x given by the first formula in equation (3.4) implies

u + v + w + uw/(2v) + vw/(2u) ≥ uv/w.

Multiplying both sides by w/(uv), we arrive at the quadratic inequality ξ + ξ²/2 ≥ 1 for ξ = w/u + w/v. Since ξ > 0, this yields one of the lower bounds in equation (3.2). The other two follow along the same lines from the nonnegativity of y and z. The inequalities in (3.3) are just the restatements of x, y, z ≤ 1, and (3.5) is equivalent to the nonnegativity of a, b, c, d, e, and f due to equations (3.1).

Finally, observe that in order for inequalities (3.5) to be consistent, it is necessary and sufficient that

x ≤ u + v, u ≤ x + y, y ≤ u + w.

Plugging in x , y , and z from equations (3.4) and solving the resulting inequalities yields the upper bounds in (3.2). So, (ii) holds.

Sufficiency. In case (i), A is symmetric, and Re A = A has a multiple eigenvalue 1. In case (ii), going over our reasoning in the first part of the proof, we see that conditions (3.2) and (3.3) guarantee the entrywise nonnegativity of A. Moreover, since uvw ≠ 0, condition (3.8) implies that all the 2-by-2 minors of the matrix (3.7), not just the chosen three, vanish.□

Proposition 3 yields the following constructive description of all row stochastic 3-by-3 matrices the real part of which has a multiple eigenvalue. Namely, conditions (3.2) can be rewritten as a system of linear inequalities in the two variables ξ = u/v and η = u/w as follows:

$\max\left\{\sqrt{3} - 1 - \xi,\ (\sqrt{3} - 1)\xi - 1,\ \left(\sqrt{3/2} - 1\right)(1 + \xi)\right\} \;\leq\; \eta \;\leq\; \min\left\{\sqrt{6} + 2 - \xi,\ (\sqrt{6} + 2)\xi - 1,\ \tfrac{(\sqrt{3} + 1)(\xi + 1)}{2}\right\}.$

Solving this system, we arrive at the solution of (3.2) in the form

{(u, u/ξ, u/η) : u > 0, (ξ, η) ∈ H},

where H is the hexagon (including the boundary ∂H) with the vertices

A(√3 + √2, √6 + 2 − √3 − √2) ≈ (3.146, 1.303), B(√2 + 1, √6 − √2 + √3 − 2) ≈ (2.414, 0.767), C(√2 − 1, √3 − √2) ≈ (0.414, 0.318),

and their reflections A′, B′, and C′ through the first quadrant bisector (Figure 1).

Figure 1

(a) Admissible values of ( ξ , η ) and (b) U as a function of ( ξ , η ) .

With ξ and η fixed, condition (3.3) can be rewritten as the upper bound for u as follows:

(3.9) $u \leq \dfrac{6\xi\eta}{(\xi + \eta + 1)^2 - 3\min\{\xi^2, \eta^2, 1\}}.$

Denote by U ( ξ , η ) the right-hand side of the inequality (3.9). This is a continuous and piece-wise differentiable function of ξ and η on H , as shown in Figure 1(b). Its breaks in differentiability occur on the line segments in H stemming from the point ( 1 , 1 ) , as shown in Figure 1(a).

In agreement with this,

S = {(U(ξ, η), U(ξ, η)/ξ, U(ξ, η)/η) : (ξ, η) ∈ H}

is a piece-wise smooth surface in the first octant of R 3 , as shown in Figure 2.
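The hexagon inequalities and the bound (3.9) are easy to evaluate numerically; the following helper functions (ours, with names chosen purely for illustration) implement them directly.

```python
import numpy as np

SQ3, SQ6 = np.sqrt(3), np.sqrt(6)

def in_hexagon(xi, eta):
    """Membership test for (xi, eta) in H: the linear inequalities equivalent to (3.2)."""
    lower = max(SQ3 - 1 - xi, (SQ3 - 1)*xi - 1, (np.sqrt(1.5) - 1)*(1 + xi))
    upper = min(SQ6 + 2 - xi, (SQ6 + 2)*xi - 1, (SQ3 + 1)*(xi + 1)/2)
    return lower <= eta <= upper

def U(xi, eta):
    """Right-hand side of (3.9): the admissible upper bound for u."""
    return 6*xi*eta / ((xi + eta + 1)**2 - 3*min(xi**2, eta**2, 1))

print(in_hexagon(1.5, 1.0), U(1.5, 1.0))   # True, 36/37 (used in the example below)
print(in_hexagon(5.0, 1.0))                # False: (5, 1) lies outside H
```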

Figure 2

Base S of the admissible cone of the triples ( u , v , w ) .

The triples ( u , v , w ) satisfying conditions (3.2) and (3.3) form the cone C with the vertex at the origin and S as its base.

For each positive value of u satisfying inequality (3.9), we then obtain a family of matrices (1.1) parametrized by t as in (3.5). This family is infinite (and even uncountable) whenever the interval defined by (3.5) is proper, which happens if and only if the upper bounds in inclusions (3.2) are strict. In terms of the pairs (ξ, η), this excludes the dotted portion of ∂H.

On the other hand, substituting (3.1) into equation (2.5) and solving the resulting linear equation in t yields

$t = \frac{u}{2} + \frac{1}{6}(w - v)\left(\frac{u}{v} + \frac{u}{w} + 1\right).$

Consequently, all (except at most one) matrices A corresponding to the fixed triple (u, v, w) are unitarily irreducible. According to [6, Theorem 3.2], the vertical supporting line of W(A) corresponding to the multiple eigenvalue of Re A cannot have a single point of intersection with W(A), and thus contains a flat portion of the boundary ∂W(A).
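The construction just described is easy to carry out numerically. Below is a small sketch (ours, under the assumption that any admissible t from (3.5) may be taken) that assembles a matrix (1.1) from a point (ξ, η) ∈ H and an admissible u; it reproduces the matrix (3.10) constructed later in this section.

```python
import numpy as np

def flat_portion_matrix(xi, eta, u, t=None):
    """Row stochastic matrix (1.1) whose real part has a multiple eigenvalue,
    assembled from (3.1)-(3.5) with v = u/xi and w = u/eta."""
    v, w = u / xi, u / eta
    omega = (u + v + w) / 3 + (u*v/w + v*w/u + u*w/v) / 6              # (3.3)
    x, y, z = omega - u*v/(2*w), omega - u*w/(2*v), omega - v*w/(2*u)  # (3.4)
    lo, hi = max(0, u - y, x - v), min(x, u, u + w - y)                # (3.5)
    if t is None:
        t = (lo + hi) / 2                 # any t in [lo, hi] will do
    a, b, c = t, x - t, u - t                                          # (3.1)
    d, e, f = y - u + t, v - x + t, u + w - y - t
    return np.array([[1 - (a + b), a, b],
                     [c, 1 - (c + d), d],
                     [e, f, 1 - (e + f)]])

A = flat_portion_matrix(1.5, 1.0, 0.4, t=1/3)   # (xi, eta) = (3/2, 1), u = 2/5, t = 1/3
print(np.round(90 * A))                          # recovers the matrix (3.10)
print(np.linalg.eigvalsh((A + A.T) / 2))         # 41/90 is a double eigenvalue of Re A
```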

To illustrate, recall [1, Example 4.5], where

$\frac{1}{6}\begin{bmatrix} 1 & 2 & 3 \\ 2\sqrt{30} - 7 & 13 - 2\sqrt{30} & 0 \\ 12 - 2\sqrt{30} & 8\sqrt{30} - 39 & 33 - 6\sqrt{30} \end{bmatrix}$

was suggested as a specific row stochastic 3-by-3 matrix with a flat portion on the boundary of its numerical range. This matrix corresponds to

(ξ, η) = ((9 + 4√30)/21, (15 + 2√30)/21) ≈ (1.472, 1.236),

lying in the interior of H, u = (2√30 − 5)/6, and t = a = 1/3.

A simpler example, not involving irrationalities, can be constructed as follows. Let (ξ, η) = (3/2, 1), which is also an interior point of H. Condition (3.9) then becomes u ≤ 36/37, and so u = 2/5 is a valid choice. Respectively, v = 4/15 and w = 2/5, while formulas (3.4) yield

x = z = 37/90, y = 22/90.

The constraints (3.5) on t simplify to t ∈ [7/45, 2/5], making t = 1/3 a valid option. Finally, from (3.1),

(3.10) $A = \frac{1}{90}\begin{bmatrix} 53 & 30 & 7 \\ 6 & 68 & 16 \\ 17 & 20 & 53 \end{bmatrix}.$

Direct computations show that with this choice of A , equalities (2.4) fail, and so A is unitarily irreducible. On the other hand, the eigenvalues of

$\mathrm{Re}\,A = \frac{1}{90}\begin{bmatrix} 53 & 18 & 12 \\ 18 & 68 & 18 \\ 12 & 18 & 53 \end{bmatrix}$

are 41/90 of multiplicity two, and 46/45. Moreover, the compression of Im A onto the two-dimensional eigenspace of Re A has the eigenvalues ±7√17/306 ≈ ±0.0943. Respectively, the boundary of W(A) contains a vertical line segment with the endpoints 41/90 ± 7√17 i/306, as illustrated in Figure 3.
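These numbers are easy to confirm; here is a short check (ours) of the double eigenvalue of Re A and of the compression of Im A onto its eigenspace:

```python
import numpy as np

A = np.array([[53, 30, 7], [6, 68, 16], [17, 20, 53]]) / 90   # the matrix (3.10)
H, K = (A + A.T) / 2, (A - A.T) / 2j                          # Re A and Im A
w, V = np.linalg.eigh(H)
print(w)                                       # 41/90, 41/90, 46/45
Q = V[:, :2]                                   # eigenspace of the double eigenvalue 41/90
print(np.linalg.eigvalsh(Q.conj().T @ K @ Q))  # +/- 0.0943... = +/- 7*sqrt(17)/306
print(7 * np.sqrt(17) / 306)
```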

Figure 3

W ( A ) for the matrix given by (3.10).

4 Matrices with their Kippenhahn curve containing an ellipse

For an arbitrary A ∈ C^{3×3}, let {λ_1, λ_2, λ_3} be the set of its eigenvalues. According to [3, Theorem 2.3] (see also [9, Theorem 6.2.6]), the Kippenhahn curve of A consists of a nondegenerate ellipse and a point if and only if the following two conditions hold:

(4.1) D := Tr(A*A) − Σ_{j=1}^3 |λ_j|² > 0

and the number

(4.2) $\lambda := \operatorname{Tr} A + D^{-1}\left(\sum_{j=1}^{3} |\lambda_j|^2 \lambda_j - \operatorname{Tr}(A^*A^2)\right)$

coincides with one of λ j .

If these conditions are satisfied, then

(4.3) C(A) = E ∪ {λ},

the ellipse E has its foci at the two other eigenvalues λ_± of A and minor axis of length √D. Furthermore, according to [3, Theorem 2.4] or [9, Corollary 6.2.7], λ lies inside the ellipse E, and thus W(A) is the elliptical disk bounded by E, if and only if

(4.4) (|λ − λ_+| + |λ − λ_−|)² − |λ_+ − λ_−|² ≤ D.
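For experimentation, here is a small numpy implementation (ours, not the article's) of the test (4.1)–(4.4) for an arbitrary 3-by-3 matrix; the tolerances are ad hoc, and the demonstration uses the exact-entry matrix from Example 2 below.

```python
import numpy as np

def ellipse_test(A, tol=1e-9, match_tol=1e-6):
    """Check (4.1)-(4.2): is C(A) an ellipse plus a point?  If so, also check
    (4.4): does the point lie inside, making W(A) an elliptical disk?"""
    A = np.asarray(A, dtype=complex)
    lam = np.linalg.eigvals(A)
    D = np.trace(A.conj().T @ A).real - np.sum(np.abs(lam)**2)        # (4.1)
    if D <= tol:
        return False, False        # A is normal
    mu = np.trace(A) + (np.sum(np.abs(lam)**2 * lam)
                        - np.trace(A.conj().T @ A @ A)) / D           # (4.2)
    k = int(np.argmin(np.abs(lam - mu)))
    if np.abs(lam[k] - mu) > match_tol:
        return False, False        # Kippenhahn curve of degree 4 or 6
    foci = np.delete(lam, k)
    inside = (np.abs(mu - foci[0]) + np.abs(mu - foci[1]))**2 \
             - np.abs(foci[0] - foci[1])**2 <= D + tol                # (4.4)
    return True, bool(inside)

A2 = np.array([[0.19, 0.13, 0.68], [0.13, 0.19, 0.68], [0.34, 0.34, 0.32]])
print(ellipse_test(A2))            # (True, True): an elliptical numerical range
```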

For A as in (1.1), we have

(4.5) Tr(A*A) = 3 + 2(a² + b² + c² + d² + e² + f² + ab + cd + ef) − 2(a + b + c + d + e + f),

while its eigenvalues λ 1 , 2 different from λ 3 = 1 satisfy

(4.6) λ_1 + λ_2 = 2 − (a + b + c + d + e + f), λ_1 λ_2 = det A = 1 − (a + b + c + d + e + f) + (ce + bc + cf + ae + de + ad + bd + af + bf).

From formulas (4.6), we have in particular that

(4.7) λ_1² + λ_2² = 2 − 2(a + b + c + d + e + f) + (a + b + c + d + e + f)² − 2(ce + bc + cf + ae + de + ad + bd + af + bf).

These eigenvalues are either both real or complex conjugates of each other, depending on the sign of the discriminant

(4.8) Δ = a² + b² + c² + d² + e² + f² + 2(ab + ac + be + cd + df + ef) − 2(ad + ae + af + bc + bd + bf + ce + cf + de).

Consequently, for a row stochastic 3-by-3 matrix A with the Kippenhahn curve as in equation (4.3), the major axis of the ellipse E is either horizontal or vertical.

Proposition 4

The Kippenhahn curve of a row stochastic matrix (1.1) contains an ellipse with a vertical major axis if and only if A is in fact doubly stochastic and

(4.9) (a + 2b + c + 2d)² − 12(bc + ad + bd) < 0.

Proof

Necessity. The foci of E are nonreal, and therefore,

(4.10) λ_± = λ_{1,2} = ξ ± iη with η ≠ 0,

C(A) = E ∪ {1}, and

Σ_{j=1}^3 |λ_j|² = 1 + 2(ξ² + η²) = 1 + (λ_1² + λ_2²) + 4η².

Using formula (4.7), we can further write

Σ_{j=1}^3 |λ_j|² = 3 + 4η² − 2(a + b + c + d + e + f) + (a + b + c + d + e + f)² − 2(ce + bc + cf + ae + de + ad + bd + af + bf).

From here and conditions (4.1) and (4.5),

D = (a − c)² + (b − e)² + (d − f)² − 4η².

On the other hand, the left-hand side of inequality (4.4) takes the form

(|λ_1 − 1| + |λ_2 − 1|)² − |λ_1 − λ_2|² = 4((ξ − 1)² + η²) − 4η² = (λ_1 + λ_2 − 2)²,

which, due to formulas (4.6), coincides with (a + b + c + d + e + f)² and is thus strictly larger than D. Thus, 1 lies outside of E and hence is a corner point of conv(E ∪ {1}) = W(A). As such, by the Donoghue theorem [9, Proposition 2.6], it is a normal eigenvalue of A, and the respective eigenvector e is reducing. In other words, A is doubly stochastic.

Finally, for doubly stochastic matrices, the discriminant (4.8) boils down to the left-hand side of inequality (4.9). Therefore, condition (4.9) is equivalent to the eigenvalues λ 1 , 2 not being real.

Sufficiency. If A is doubly stochastic, then it is unitarily reducible to a direct sum B ⊕ [1] with the spectrum {λ_1, λ_2} of B ∈ C^{2×2} as given by (4.10), due to condition (4.9). Consequently, C(A) = E ∪ {1}, and the major axis of E, passing through λ_{1,2}, is therefore vertical.□

So, the numerical range of a unitarily irreducible row stochastic 3-by-3 matrix cannot be a “vertical” elliptical disk. Moving to the case of “horizontal” E in equation (4.3), let us take into consideration that all λ j are real and rewrite formula (4.2) as follows:

(4.11) D(1 − λ) = Tr(A*A²) − D(λ_1 + λ_2) − (λ_1³ + λ_2³ + 1).

Formula (4.1) for D in the case of real λ j , combined with equations (4.5) and (4.7), implies

(4.12) D = (a − c)² + (b − e)² + (d − f)².

Furthermore, with the use of equations (4.6) and (4.7), we have:

(4.13) λ_1³ + λ_2³ = −(a³ + b³ + c³ + d³ + e³ + f³) + a²(3 − 3b − 3c) + a(−3 + 6b − 3b² + 6c − 3bc − 3c² − 3cd − 3be + 3de) + 3(b² + c² + d² + e² + f²) − 3(b + c + d + e + f) − 3c²d − 3cd² − 3b²e − 3be² − 3d²f − 3e²f − 3df² − 3ef² + 3bcf − 3cdf − 3bef − 3def + 6df + 6cd + 6be + 6ef + 2,

while a direct computation shows that

(4.14) Tr(A*A²) = −2a³ − 2b³ − 2c³ − 2d³ − 2e³ − 2f³ + 5a² + 5b² + 5c² + 5d² + 5e² + 5f² − 3a − 3b − 3c − 3d − 3e − 3f + 3 − 4a²b − 4ab² − 4cd² − 4c²d − 4ef² − 4e²f − 2a²c − 2ac² − 2b²e − 2be² − 2d²f − 2df² − bc² − a²d − d²e − ae² − cf² − b²f + 6ab + 6cd + 6ef + 2ac + 2be + 2df − abc + abd − acd + bcd − abe + cde + abf − cdf + aef − bef + cef − def.

According to equations (4.6) and (4.12)–(4.14), the right-hand side of equation (4.11) is, therefore, nothing but

(4.15) Φ := a²(e + f) + a(bd + d² − 2ce − 3de + bf − 2cf − 2df + ef + f²) + b²c + b²d + bcd + bd² − 2bce + c²e − 2bde + cde + ce² + de² − 3bcf + c²f − 2bdf + cef + bf²,

a homogeneous cubic polynomial in a , b , c , d , e , and f .
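The identity just stated, namely that the right-hand side of (4.11) equals Φ, and the fact (used in the proof below) that Φ vanishes for doubly stochastic matrices, can be spot-checked numerically; the following sketch (ours) does exactly that.

```python
import numpy as np

def phi(a, b, c, d, e, f):
    """The cubic form (4.15)."""
    return (a**2*(e + f)
            + a*(b*d + d**2 - 2*c*e - 3*d*e + b*f - 2*c*f - 2*d*f + e*f + f**2)
            + b**2*c + b**2*d + b*c*d + b*d**2 - 2*b*c*e + c**2*e - 2*b*d*e
            + c*d*e + c*e**2 + d*e**2 - 3*b*c*f + c**2*f - 2*b*d*f + c*e*f + b*f**2)

def rhs_4_11(a, b, c, d, e, f):
    """Right-hand side of (4.11), assembled from (4.6) and (4.12)-(4.14)."""
    A = np.array([[1-(a+b), a, b], [c, 1-(c+d), d], [e, f, 1-(e+f)]])
    s, p = 2 - (a+b+c+d+e+f), np.linalg.det(A)   # lambda_1 + lambda_2 and lambda_1 * lambda_2
    D = (a-c)**2 + (b-e)**2 + (d-f)**2           # (4.12)
    return np.trace(A.T @ A @ A) - D*s - (s**3 - 3*p*s + 1)

vals = (0.1, 0.2, 0.3, 0.1, 0.2, 0.3)
print(phi(*vals), rhs_4_11(*vals))       # both equal 0.016
print(phi(0.3, 0.1, 0.1, 0.4, 0.3, 0.2)) # ~ 0 for a doubly stochastic choice, cf. (4.17)
```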

Proposition 5

Let A be given by (1.1) and (1.2). Then, C(A) = E ∪ {1}, where E is an ellipse with real foci, if and only if A is not symmetric and either

  1. it is in fact doubly stochastic and

    (4.16) (a + 2b + c + 2d)² − 12(bc + ad + bd) ≥ 0

    or

  2. A − I contains two zero rows or a zero row and a zero column passing through the same diagonal entry.

Proof

Comparing the desired structure of C ( A ) with equation (4.3), we see that it materializes if and only if D > 0 (equivalently: A is not normal) and the right-hand side of formula (4.2) is equal to 1 (equivalently: Φ defined by formula (4.15) is equal to 0). The first condition explains why A cannot be symmetric. In addition, inequality (4.16) is equivalent to the eigenvalues of A (and thus, the foci of E ) being real, provided that A is doubly stochastic. It suffices, therefore, to show that Φ = 0 if and only if A is either doubly stochastic or as described in case (ii).

Solving the extrema problem for Φ , we observe the following: under the additional condition that at most one of the parameters a , b , c , d , e , and f is equal to 0, Φ ( a , b , c , d , e , f ) is nonnegative and, moreover, it is equal to 0 if and only if

(4.17) c − a = b − e = f − d

This is exactly the criterion for A to be doubly stochastic.

It remains to tackle the cases in which at least two of the variables are equal to 0 and show that either equalities (4.17) hold or A is as described in condition (ii). Without loss of generality, we may suppose that a = 0 , since all other situations can be reduced to this one by appropriate permutational similarities of A and relabeling the variables as needed. We, therefore, have five cases to consider.

Case 1. a = b = 0. Then, Φ takes the form (ce + de + cf)(c + e), and hence is equal to 0 if and only if c = e = 0, or c = d = 0, or e = f = 0. Respectively, A − I has all zeroes in the first row and column, the first two rows, or the first and third rows. All these cases fall under condition (ii).

Case 2. a = c = 0 while b ≠ 0. Then, Φ takes the form d(b − e)² + b(d − f)², and hence is equal to 0 if and only if d = f = 0 or b = e, d = f. The matrix A − I has all zeroes in the second row and column in the former case, which falls under condition (ii). The latter case should be excluded, since A is then symmetric.

Case 3. a = d = 0 while b, c ≠ 0. Then,

(4.18) Φ = c²(e + f) + c(b² + e² + ef − 2be − 3bf) + bf².

Solving the extrema problem for Φ under these constraints, we conclude that it is nonnegative and is equal to 0 if and only if c = f and b = e + f . This falls under conditions (4.17).

Case 4. a = e = 0 while b, c, d ≠ 0. Then,

Φ = b²(c + d) + b(d² + f² + cd − 2df − 3cf) + c²f,

which is nothing but expression (4.18) after the change of variables c → b → f → c and e → d. Consequently, Φ vanishes if and only if b = c and f = c + d, which again implies equalities (4.17).

Case 5. a = f = 0 while b, c, d, and e ≠ 0. Then,

Φ = (c + d)((b − e)² + bd + ce) > 0,

so this case should be excluded.

This exhausts all the possibilities, and thus completes the proof.□

5 Ellipticity of the numerical range

Note that row stochastic matrices A satisfying condition (ii) of Proposition 5 all have 1 as a multiple eigenvalue. So, the elliptical component E of their Kippenhahn curves has 1 as a focus. Consequently, the numerical range of such matrices is the elliptical disk bounded by E .

In contrast with this, for matrices A as in Proposition 5 part (i), 1 is a simple eigenvalue. Indeed, since A is doubly stochastic, 1 is its normal eigenvalue. Therefore, if its (algebraic) multiplicity were greater than 1, so would be its geometric multiplicity. Since n = 3, the rank of A − I would not exceed 1. From the sign pattern of A − I, it would then follow that A is nothing but the identity matrix. This is in contradiction with A not being symmetric.

So, the foci of E, coinciding with the eigenvalues of A, are both different from 1. Moreover, the point 1 actually lies outside of E. Indeed, the left-hand side of inequality (4.4) simplifies to 4(1 − (λ_1 + λ_2) + λ_1λ_2), which, due to equations (4.6), can be rewritten as 4(ce + bc + cf + ae + de + ad + bd + af + bf).

Invoking equalities (4.17), we can exclude the variables e and f and rewrite this further as 12(ad + bc + bd). On the other hand, formulas (4.12) and (4.17) imply that D = 3(a − c)². So, inequality (4.4) takes the form

(5.1) 12(ad + bc + bd) ≤ 3(a − c)².

A somewhat tedious manipulation with quadratic inequalities shows that under the constraints c ≤ a + b and a ≤ c + d (which of course hold for a doubly stochastic A) condition (5.1) actually fails. So, 1 is a corner point of W(A), which therefore is not elliptical. Combining this observation with Proposition 4, we arrive at the following conclusion.

Theorem 6

A row stochastic 3-by-3 matrix A cannot have an elliptical numerical range with both foci different from 1. If W ( A ) contains an arc of such an ellipse on its boundary, then A is actually doubly stochastic, and 1 is a corner point of W ( A ) .

Note that for a priori doubly stochastic matrices A, the result of Theorem 6 is implicitly contained in the study by Nylen and Tam [8, Theorem 2.4]. For example, according to this theorem, the endpoints x ± z/2 and x ± iy/2 of an ellipse E ⊆ C(A) are such that max{2y/3, z/6 + (2/3)(1 − x)} and min{1 − x − y/3, z/3 + (2/3)(1 − x)} are separated by some α ∈ [0, 1]. In particular, z/6 + (2/3)(1 − x) < 1 − x, and so x + z/2 < 1.

Also, the first sentence in the statement of Theorem 6 is actually a particular case of the following fact:

If A is any nonnegative (not necessarily row stochastic) n -by- n matrix (with n arbitrary) with the numerical range W ( A ) bounded by an ellipse E , then the foci of E are real, and the larger focus coincides with the spectral radius of A .

As was pointed out by the anonymous reviewers, this statement can be derived from Corollary 4.6 and Theorem 4.8 in the study by Maroulas et al. [7].

Let us now turn to the situation when the Kippenhahn curve of A contains an ellipse E with one of the foci located at the point 1. This happens if and only if equation (4.11) is satisfied, with λ in its left-hand side being one of λ 1 , 2 given by

$\lambda_{1,2} = 1 - \frac{a+b+c+d+e+f}{2} \pm \frac{\sqrt{\Delta}}{2},$

Δ as in equation (4.8) and nonnegative. Equivalently (the upper signs throughout corresponding to λ_1),

(5.2) $\frac{1}{2}D\left(a+b+c+d+e+f \mp \sqrt{\Delta}\right) = \Phi.$

Recall that D and Φ are given by formulas (4.12) and (4.15), respectively, and requirement (5.2) boils down (after isolating the radical and squaring) to a homogeneous polynomial equation of degree 6. Finding its general solution seems like a difficult task. However, equation (5.2) can be used to generate numerical examples of row stochastic 3-by-3 matrices with elliptical numerical ranges.

Indeed, choosing five of the six positive parameters a, b, c, d, e, and f at random, we can solve equation (5.2) for the remaining parameter, picking a positive solution (if any). Then, by scaling, it is always possible to bring the solution into compliance with the last three inequalities in (1.2).
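For instance, the computation behind Example 1 below can be reproduced with a few lines of numpy (our sketch; the residual corresponds to the λ_1 branch of (5.2), and the bracket [0, 0.1] for the bisection was chosen by inspection).

```python
import numpy as np

def residual(a, b, c, d, e, f):
    """(1/2) D (a+...+f - sqrt(Delta)) - Phi, i.e., one branch of (5.2);
    Phi is evaluated through the right-hand side of (4.11)."""
    A = np.array([[1-(a+b), a, b], [c, 1-(c+d), d], [e, f, 1-(e+f)]])
    s, p = 2 - (a+b+c+d+e+f), np.linalg.det(A)       # lambda_1 + lambda_2, lambda_1 * lambda_2
    D = (a-c)**2 + (b-e)**2 + (d-f)**2               # (4.12)
    phi = np.trace(A.T @ A @ A) - D*s - (s**3 - 3*p*s + 1)
    return 0.5*D*((a+b+c+d+e+f) - np.sqrt(s**2 - 4*p)) - phi

# Example 1: fix b, ..., f and solve for the remaining parameter a by bisection
b, c, d, e, f = 0.11, 0.12, 0.42, 0.83, 0.16
lo, hi = 0.0, 0.1                                    # the residual changes sign on this bracket
for _ in range(60):
    mid = (lo + hi) / 2
    if residual(lo, b, c, d, e, f) * residual(mid, b, c, d, e, f) <= 0:
        hi = mid
    else:
        lo = mid
print((lo + hi) / 2)                                 # approximately 0.048, as in Example 1
```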

Example 1

Let b = 0.11 , c = 0.12 , d = 0.42 , e = 0.83 , and f = 0.16 .

Using Mathematica, we find a ≈ 0.048, and so

$A \approx \begin{bmatrix} 0.842 & 0.048 & 0.11 \\ 0.12 & 0.46 & 0.42 \\ 0.83 & 0.16 & 0.01 \end{bmatrix}.$

The eigenvalues of A are λ_1 ≈ 0.472632, λ_2 ≈ −0.160632, and λ_3 = 1, and equality (5.2) holds with the + sign (i.e., for λ = λ_1). Therefore, C(A) = E ∪ {λ}, with λ = λ_1 lying between the foci λ_2 and λ_3 of E. Consequently, W(A) is the elliptical disk bounded by E, as illustrated in Figure 4.

Figure 4

Elliptical W ( A ) of the (unitarily irreducible) matrix from Example 1.

A direct computation shows that

a + b + (ae − bc)/(f − d) ≈ 0.0555, c + d + (ad − cf)/(b − e) ≈ 0.5387, e + f + (de − bf)/(c − a) ≈ 5.5872

are pairwise distinct. By Proposition 2, the matrix under consideration is unitarily irreducible.

Recall that Proposition 5 describes all unitarily reducible matrices (1.1) with elliptical numerical ranges and 1 as a normal eigenvalue. There also exist plenty of matrices with elliptical W(A) and a normal eigenvalue different from 1. For example, any solution of the system consisting of equations (2.5), (5.2), and either one of the equalities in (2.4), satisfying also the inequalities (1.2), generates a matrix (1.1) with C(A) as in (4.3) for which λ_+ = 1, λ_− = λ_1 (or λ_2), and λ = λ_2 (resp., λ_1) is a normal eigenvalue. The numerical range of A is then either an elliptical disk bounded by E or the ice-cream-cone shaped conv C(A) with λ as a corner point, depending on whether (4.4) holds.

To illustrate, consider matrices (1.1) with a = c , b = d , and e = f :

(5.3) $A = \begin{bmatrix} 1-(a+b) & a & b \\ a & 1-(a+b) & b \\ e & e & 1-2e \end{bmatrix}.$

The eigenvalues of the matrix (5.3) are

(5.4) λ_1 = 1 − b − 2e, λ_2 = 1 − 2a − b, and λ_3 = 1,

with λ 2 being the normal one. Condition (5.2) can be verified directly and implies formula (4.3). In turn, with λ j given by (5.4) condition (4.4) holds if and only if

$a \leq \frac{2e - b + \sqrt{3b^2 + 6e^2}}{4}.$

In particular, it holds if a ≤ e. Indeed, then λ_2 = λ lies between the foci λ_1, 1 of E.

Example 2

In formula (5.3), let a = 0.13, b = 0.68, and e = 0.34, i.e.,

$A = \begin{bmatrix} 0.19 & 0.13 & 0.68 \\ 0.13 & 0.19 & 0.68 \\ 0.34 & 0.34 & 0.32 \end{bmatrix}.$

The numerical range of A is the elliptical disk with the foci 1 and −0.36, and the normal eigenvalue 0.06 of A lying between them, as illustrated in Figure 5.
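A quick check (ours) of the eigenvalue formulas (5.4) and of the ellipticity condition above for this choice of parameters:

```python
import numpy as np

a, b, e = 0.13, 0.68, 0.34
A = np.array([[1-(a+b), a, b], [a, 1-(a+b), b], [e, e, 1-2*e]])    # (5.3)

print(sorted(np.linalg.eigvals(A).real))                  # -0.36, 0.06, 1, as in (5.4)
print(a <= (2*e - b + np.sqrt(3*b**2 + 6*e**2)) / 4)      # True: condition (4.4) holds
```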

Figure 5

Elliptical W ( A ) of the (unitarily reducible) matrix from Example 2.

6 Final remark

Recall that the set R of row stochastic 3-by-3 matrices is semialgebraic. Propositions 2 and 3, and Theorem 6 along with equation (5.2) imply, respectively, that the set of the matrices A ∈ R which are unitarily reducible, or have a line segment or an elliptical arc contained in ∂W(A), is also semialgebraic but of dimension lower than that of R itself. So, generically, matrices of R have an ovular numerical range.

One such matrix, randomly generated, is provided in the following example:

Figure 6

Ovular W ( A ) of the matrix from Example 3.

Example 3

(see Figure 6)

$A = \begin{bmatrix} 0.33 & 0.35 & 0.32 \\ 0.90 & 0.02 & 0.08 \\ 0.48 & 0.48 & 0.04 \end{bmatrix}.$



Acknowledgments

This work is based on the capstone project of the first author [NP] supervised by the second author [IS]. The latter was supported in part by Faculty Research funding from the Division of Science, New York University Abu Dhabi. The authors are thankful to the anonymous reviewers for carefully reading the manuscript and for several helpful suggestions.

  1. Conflict of interest: The second author is a member of the Editorial Advisory Board of the journal, but this did not affect the final decision for the article.

  2. Data availability statement: Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

References

[1] H.-L. Gau, K.-Z. Wang, and P. Y. Wu, Numerical ranges of row stochastic matrices, Linear Algebra Appl. 506 (2016), 478–505. doi:10.1016/j.laa.2016.06.010

[2] R. A. Horn and C. R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge, 1994.

[3] D. Keeler, L. Rodman, and I. Spitkovsky, The numerical range of 3×3 matrices, Linear Algebra Appl. 252 (1997), 115–139. doi:10.1016/0024-3795(95)00674-5

[4] R. Kippenhahn, Über den Wertevorrat einer Matrix, Math. Nachr. 6 (1951), 193–228. doi:10.1002/mana.19510060306

[5] R. Kippenhahn, On the numerical range of a matrix, Linear Multilinear Algebra 56 (2008), no. 1–2, 185–225. doi:10.1080/03081080701553768

[6] T. Leake, B. Lins, and I. M. Spitkovsky, Pre-images of boundary points of the numerical range, Operators and Matrices 8 (2014), 699–724. doi:10.7153/oam-08-39

[7] J. Maroulas, P. J. Psarrakos, and M. J. Tsatsomeros, Perron-Frobenius type results on the numerical range, Linear Algebra Appl. 348 (2002), 49–62. doi:10.1016/S0024-3795(01)00574-2

[8] P. Nylen and T. Y. Tam, Numerical range of a doubly stochastic matrix, Linear Algebra Appl. 153 (1991), 161–176. doi:10.1016/0024-3795(91)90216-J

[9] P. Y. Wu and H.-L. Gau, Numerical Ranges of Hilbert Space Operators, Encyclopedia of Mathematics and its Applications, vol. 179, Cambridge University Press, Cambridge, 2021.

Received: 2023-04-14
Revised: 2023-08-02
Accepted: 2023-08-14
Published Online: 2023-09-27

© 2023 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
