The diameter of the Birkhoff polytope

  • Ludovick Bouthat, Javad Mashreghi and Frédéric Morneau-Guérin
Published/Copyright: February 24, 2024

Abstract

The geometry of the compact convex set of all $n \times n$ doubly stochastic matrices, a structure frequently referred to as the Birkhoff polytope, has been an active subject of research as of late. Geometric characteristics such as the Chebyshev center and the Chebyshev radius with respect to the operator norms from $\ell_n^p$ to $\ell_n^p$ and the Schatten $p$-norms, both for the range $1 \le p \le \infty$, have only recently been studied in depth. In this article, we continue in this vein by determining the diameter of the Birkhoff polytope with respect to the metrics induced by the aforementioned matrix norms.

MSC 2010: 15B51; 46B20; 52B12; 47B10

1 Introduction

A square matrix $A = [a_{ij}]$ with nonnegative real entries is said to be row-stochastic (or right stochastic) if

$$\sum_{j=1}^{n} a_{ij} = 1, \qquad i = 1, \dots, n.$$

Similarly, it is said to be column-stochastic (or left stochastic) if

$$\sum_{i=1}^{n} a_{ij} = 1, \qquad j = 1, \dots, n.$$

A doubly stochastic matrix is one that is both row-stochastic and column-stochastic.
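
The two defining conditions can be checked mechanically. The following is a minimal sketch, assuming matrices are stored as lists of rows with exact `Fraction` entries; the function name `is_doubly_stochastic` is illustrative and not part of the article.

```python
from fractions import Fraction

def is_doubly_stochastic(matrix):
    """Return True if `matrix` (a list of rows) is square, entrywise
    nonnegative, and has every row sum and column sum equal to 1."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    if any(entry < 0 for row in matrix for entry in row):
        return False
    rows_ok = all(sum(row) == 1 for row in matrix)
    cols_ok = all(sum(matrix[i][j] for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok

half = Fraction(1, 2)
example = [[half, half, 0],
           [half, 0, half],
           [0, half, half]]
# Row-stochastic but not column-stochastic: column sums are 2 and 0.
row_stochastic_only = [[1, 0], [1, 0]]
```

Exact rational arithmetic avoids rounding issues; with floating-point entries the equality tests would have to be replaced by a tolerance.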

The theory of stochastic and doubly stochastic matrices was first developed alongside the Markov chain by the Russian mathematician Andrei Andreevich Markov (1856–1922) at the beginning of the twentieth century.

Markov first began developing his ideas about chains of linked events (where what happens next depends on the current state of the system) in 1906. The initial intended uses of the new branch of probability theory that he was elaborating were for linguistic analysis. Delving into the text of Alexander Pushkin’s novel in verse Eugene Onegin, Markov treated the text as a mere stream of letters. He spent hours sifting through patterns of vowels and consonants with the aim of estimating to what extent there is an exaggerated tendency in Pushkin’s text for vowels and consonants to alternate, thus violating the principle of independence. Although Markov’s analysis remains on a superficial level from a linguistic point of view, it made a lasting impression because the technique involved presciently extended the theory of probability in a new direction [27].

In the 100 years since Markov’s early work, stochastic and doubly stochastic matrices have found use in almost every field of science, from econometrics [31,35] to geology [37,38], ecology [23], and population genetics [24,33].

The ubiquity of stochastic and doubly stochastic matrices in science as a tool for statistical analysis alone justifies the study of these sets of matrices. But they are not without intrinsic interest. For instance, it is well known that the set of doubly stochastic matrices of size $n \times n$, which we denote by $\Omega_n$, forms a semigroup with respect to matrix multiplication. It is also a convex polytope (i.e., a compact convex set with a finite number of extreme points) of dimension $(n-1)^2$ in the Euclidean space of dimension $n^2$. Most interestingly, it was shown by Birkhoff [3] that the extreme points of $\Omega_n$ are precisely the $n \times n$ permutation matrices, denoted by $\mathcal{P}_n$, and that each $D \in \Omega_n$ admits a (not necessarily unique) Birkhoff decomposition $D = \sum_{i=1}^{r} \alpha_i P_i$, where $P_i \in \mathcal{P}_n$, $\alpha_i \ge 0$, and $\sum_{i=1}^{r} \alpha_i = 1$. Due to this fundamental characterization, $\Omega_n$ is sometimes referred to as the Birkhoff polytope.
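
As a small illustration of the Birkhoff decomposition and of its non-uniqueness, the sketch below builds convex combinations of permutation matrices. It assumes permutations are encoded as tuples `perm` with a 1 in position `(i, perm[i])`; the helper name `convex_combination` is hypothetical.

```python
from fractions import Fraction

def convex_combination(weights, perms):
    """Form D = sum_i weights[i] * P_i, where each permutation matrix P_i
    is encoded as a tuple `perm` with a 1 in position (i, perm[i])."""
    assert all(w >= 0 for w in weights) and sum(weights) == 1
    n = len(perms[0])
    D = [[Fraction(0)] * n for _ in range(n)]
    for w, perm in zip(weights, perms):
        for i in range(n):
            D[i][perm[i]] += w
    return D

third = [Fraction(1, 3)] * 3
cyclic = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]        # identity and the two 3-cycles
transpositions = [(0, 2, 1), (2, 1, 0), (1, 0, 2)]

# Both combinations give the matrix with every entry equal to 1/3,
# so the decomposition of that matrix is not unique.
from_cycles = convex_combination(third, cyclic)
from_transpositions = convex_combination(third, transpositions)
```

Any such combination has unit row and column sums, since each permutation matrix does and the weights sum to 1.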

In the last few decades, the geometric features of the Birkhoff polytope have been an active subject of research. For instance, in the 1970s, Brualdi and Gibson studied the Euclidean geometric structure of $\Omega_n$. In a series of four articles [7–10], they characterized the faces, the edges, and the facets of $\Omega_n$. At the turn of the millennium, several teams of researchers sought to characterize the volume of the Birkhoff polytope. In particular, formulas for the volume of $\Omega_n$ were given by Sturmfels [36] for $n \le 7$, by Chan and Robbins [13] for $n = 8$, and by Beck and Pixton [2] for $n = 9, 10$. As for the case of $10 \le n \le 15$, estimates were obtained by Emiris and Fisikopoulos in 2014 [21], and by Cousins and Vempala in 2016 [16]. In 2009, De Loera et al. [19] presented a combinatorial formula for the volume of $\Omega_n$, while Canfield and McKay [11] discovered an asymptotic formula during the same year. In a different but related vein, combinatorial and geometric properties of the acyclic Birkhoff polytope have been investigated intensively in [14,15,22,29], and interesting combinatorial peculiarities of the facial structure of the polytope consisting of the tridiagonal doubly stochastic matrices of order $n$ have been extensively considered in [12,17,18].

Despite their frequent applications in some areas of mathematics, geometric notions such as the radius and the diameter of $\Omega_n$ have so far received little attention. The few articles dealing with these issues include papers from 1998 by Glunt et al. [25,26], by Khoury [30], and more recently by the authors of this manuscript [5,6]. While studying numerical simulation of large linear semiconductor circuit networks, Glunt et al. [25] naturally came across the following question: given a matrix $B$ of order $n$, which generalized doubly stochastic matrix $D$ subject to the constraints $e_1^{\top} D^k e_1 = e_1^{\top} B^k e_1$, $k \ge 1$, where $e_1 = (1, 0, \dots, 0)^{\top}$, is closest to $B$ in the Frobenius norm? They were able to give an algorithm to find the solution numerically. Khoury [30] independently studied the same question, bar the constraints mentioned above. He showed that $D = WBW + J_n$, where $W = I_n - J_n$ and $J_n$ is the $n \times n$ matrix with every entry equal to $1/n$. This line of investigation was later pursued by Glunt et al. in [26] and further developed in the more specific case of doubly stochastic matrices (as opposed to generalized doubly stochastic matrices) in [1] by Bai et al. in 2007. Finally, in [5,6], the authors carried out a detailed analysis of the Chebyshev center and radius of the Birkhoff polytope, when equipped with the operator norms from $\ell_n^p$ to $\ell_n^p$ ($1 \le p \le \infty$) and with the Schatten $p$-norms ($1 \le p < \infty$). Along the way, they discussed classical problems such as finding the radius of a minimal bounding ball for the Birkhoff polytope and the smallest enclosing ball problem.

The present article is fully in line with the latter two above-mentioned papers in that the question addressed is centered around studying the geometry of the Birkhoff polytope when the ambient space is endowed with the metric induced by the operator norm from $\ell_n^p$ to $\ell_n^p$ ($1 \le p \le \infty$) and, in turn, with the Schatten $p$-norm ($p \ge 1$).

This article is structured as follows. In Section 2, we establish the preliminaries needed for later use. More precisely, we recall the definitions of the norms considered therein. We then state a few elementary results on the spectrum of doubly stochastic matrices as well as on a special matrix that plays a central role in the theory of doubly stochastic matrices. In Section 3, we briefly review the definition of the diameter of a nonempty set in a metric space. In Section 4, we determine by fairly elementary means that the diameter of the Birkhoff polytope in the case of the operator norms from $\ell_n^p$ to $\ell_n^p$ is 2 for $1 \le p \le \infty$. Finally, in Section 5, we first derive that the diameter of the Birkhoff polytope with respect to the metric induced by the Schatten $p$-norms ($1 \le p < \infty$) is given by $\max_{P \in \mathcal{P}_n} \|I_n - P\|_{S_p}$. A calculation then shows that

$$\|I_n - P\|_{S_p} = 2\left(\sum_{k=1}^{n_1} \sin^p\frac{\pi k}{n_1} + \sum_{k=1}^{n_2} \sin^p\frac{\pi k}{n_2} + \dots + \sum_{k=1}^{n_r} \sin^p\frac{\pi k}{n_r}\right)^{1/p},$$

for some positive integers $n_1, \dots, n_r$ such that $n_1 + \dots + n_r = n$. From this point onward, the problem of calculating the diameter of the Birkhoff polytope becomes a trigonometric maximization problem. Using numerous identities, we establish in turn that

$$\operatorname{diam}_{S_p}(\Omega_n) = \begin{cases} 2\cot\dfrac{\pi}{2n}, & \text{if } p = 1, \\[2mm] 2\left(\displaystyle\sum_{k=1}^{n} \sin^p\frac{\pi k}{n}\right)^{1/p}, & \text{if } 1 < p < 2, \\[2mm] \sqrt{2n}, & \text{if } p = 2, \\[2mm] 2\left(\dfrac{n - \sin^2\left(\frac{n\pi}{2}\right)\min\left\{1,\, 3 - 4\left(\frac{\sqrt{3}}{2}\right)^{p}\right\}}{2}\right)^{1/p}, & \text{if } p > 2. \end{cases}$$

2 Definitions, properties, and preliminary results

In this section, we recall the definition of some standard matrix norms that will be considered in the rest of this article. We also explore some properties of doubly stochastic matrices that are used extensively in the rest of this article.

2.1 Operator norms induced by the vector $p$-norms

For $p \ge 1$, the $p$-norm of a given vector $x = (x_1, \dots, x_n)$ is defined by:

$$\|x\|_p = \left(\sum_{k=1}^{n} |x_k|^p\right)^{1/p},$$

and the $\infty$-norm is defined by:

$$\|x\|_\infty = \max\{|x_1|, \dots, |x_n|\}.$$

We shall denote the space $\mathbb{C}^n$ equipped with the vector $p$-norm by $\ell_n^p(\mathbb{C})$, or by $\ell_n^p$ for short.

Any $n \times n$ matrix $A$ can be interpreted as an operator from $\ell_n^p$ to $\ell_n^p$. The operator norm of $A$ induced by the vector $p$-norm is given by:

$$\|A\|_{\ell_n^p \to \ell_n^p} := \sup_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}.$$

For all $1 \le p \le \infty$, we have $\|B\|_{\ell_n^p \to \ell_n^p} = \|B^*\|_{\ell_n^q \to \ell_n^q}$, where $B^*$ denotes the conjugate transpose of $B$ and $q$ is the Hölder conjugate exponent of $p$, i.e., $\frac{1}{p} + \frac{1}{q} = 1$ [28, p. 357].
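
For the conjugate pair $p = 1$, $q = \infty$, the duality relation can be checked directly, because both operator norms have well-known closed forms (largest absolute column sum and largest absolute row sum, respectively). The sketch below relies on those standard formulas; the helper names and the sample matrix `B` are illustrative assumptions.

```python
def op_norm_1(A):
    """Operator norm l^1 -> l^1: the largest absolute column sum."""
    n = len(A)
    return max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))

def op_norm_inf(A):
    """Operator norm l^inf -> l^inf: the largest absolute row sum."""
    return max(sum(abs(entry) for entry in row) for row in A)

def conjugate_transpose(A):
    """Return A* (transpose with complex-conjugated entries)."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

B = [[1, -2],
     [3j, 4]]
norm_B_1 = op_norm_1(B)                               # columns: 1 + 3 and 2 + 4
norm_Bstar_inf = op_norm_inf(conjugate_transpose(B))  # rows of B* are the conjugated columns of B
```

For general $p$ no such closed form is available, which is why this sketch is limited to the endpoint cases.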

It is also worth mentioning two properties of the operator norms induced by the vector p -norms, which will prove useful in the remainder of this article. Both easily derive from the definition. These are:

  1. Sub-multiplicativity: For all $p \in [1, \infty]$ and all $n \times n$ matrices $A$ and $B$,

    $\|AB\|_{\ell_n^p \to \ell_n^p} \le \|A\|_{\ell_n^p \to \ell_n^p} \, \|B\|_{\ell_n^p \to \ell_n^p}$;

  2. Permutation invariance: For all $p \in [1, \infty]$, every $n \times n$ matrix $A$, and all $n \times n$ permutation matrices $P$ and $Q$,

    $\|PAQ\|_{\ell_n^p \to \ell_n^p} = \|A\|_{\ell_n^p \to \ell_n^p}$.

Finally, since the sets of interest to us in what follows (i.e., $\Omega_n$ and $\mathcal{P}_n$) are invariant under the conjugate transpose action, many operators considered below have the same norm, whether we see them as $\ell_n^p \to \ell_n^p$ or as $\ell_n^q \to \ell_n^q$ mappings. For the sake of concision, when such situations arise, we shall only consider the case $1 \le p \le 2$.

2.2 Schatten $p$-norms

Given an $n \times n$ matrix $A$, let $\lambda_1, \dots, \lambda_n$ denote the eigenvalues of $A^*A$, with repetitions counted. Note that for all $i = 1, 2, \dots, n$, if $x$ is an eigenvector of $A^*A$ associated with $\lambda_i$, we have

$$\lambda_i \|x\|_2^2 = \langle \lambda_i x, x \rangle = \langle A^*A x, x \rangle = \langle Ax, Ax \rangle \ge 0.$$

Hence, the $\lambda_i$ are nonnegative real numbers. Order these so that $\lambda_1 \ge \dots \ge \lambda_n \ge 0$. Let $\sigma_i := \sqrt{\lambda_i}$, so that $\sigma_1 \ge \dots \ge \sigma_n \ge 0$. These latter numbers are called the singular values of $A$.

It follows from the Courant-Fischer min-max theorem [28, Thm 4.2.6] that the $i$th singular value of $A$ is given by:

$$\sigma_i(A) = \inf_{\substack{V \subseteq \mathbb{C}^n \\ \dim(V) = n - i + 1}} \; \sup_{\substack{x \in V \\ \|x\|_2 = 1}} \|Ax\|_2, \qquad (i = 1, 2, \dots, n).$$

It is easy to see that the largest singular value of $A$ is equal to the operator norm of $A$ induced by the vector 2-norm. More explicitly,

$$\sigma_1(A) = \|A\|_{\ell_n^2 \to \ell_n^2}. \tag{2.1}$$

For a given $p \in [1, \infty]$, the Schatten $p$-norm of an $n \times n$ matrix $A$, which we denote by $\|A\|_{S_p}$, is defined as the $p$-norm of the sequence of its singular values, i.e.,

$$\|A\|_{S_p} := \left(\sum_{i=1}^{n} \sigma_i(A)^p\right)^{1/p}, \qquad (1 \le p < \infty),$$

and

$$\|A\|_{S_\infty} := \max\{\sigma_1(A), \dots, \sigma_n(A)\} = \sigma_1(A).$$

Note that the Schatten $\infty$-norm coincides with the spectral norm (2.1).

Being closely related to vector $p$-norms, Schatten $p$-norms naturally inherit the following monotonic behavior of the latter:

$$\|A\|_{S_1} \ge \|A\|_{S_p} \ge \|A\|_{S_q} \ge \|A\|_{S_\infty},$$

for every $n \times n$ matrix $A$ and for $1 \le p \le q \le \infty$.

Finally, the following two properties of Schatten $p$-norms are immediate consequences of the singular value decomposition (SVD):

  1. Sub-multiplicativity: For all $p \in [1, \infty]$ and all $n \times n$ matrices $A$ and $B$,

    $\|AB\|_{S_p} \le \|A\|_{S_p} \, \|B\|_{S_p}$;

  2. Unitary invariance (and a fortiori permutation invariance): For all $p \in [1, \infty]$, every $n \times n$ matrix $A$, and all $n \times n$ unitary matrices $U$ and $V$,

    $\|UAV\|_{S_p} = \|A\|_{S_p}$.

2.3 Elementary spectral properties of doubly stochastic matrices

Let us recall some basic yet useful facts concerning doubly stochastic matrices, which will be needed in our studies:

  1. The spectrum of a doubly stochastic matrix always includes 1. This eigenvalue is associated with the “obvious” eigenvector $e = (1, 1, \dots, 1)^{\top}$. All other eigenvalues have magnitude at most 1 [34, Lemma 1].

  2. The spectrum of a doubly stochastic matrix $D$ is a subset of the unit circle if and only if $D$ is a permutation matrix [34, Theorem 5].

  3. A convex real-valued function on $\Omega_n$ attains its maximum at a permutation matrix [28, Corollary 8.7.4].

2.4 A special doubly stochastic matrix

The $n \times n$ matrix where every entry is equal to $1/n$, which we denote by $J_n$, plays a central role in the theory of doubly stochastic matrices and is quite special in a number of regards. But for the purposes of the following discussion, it suffices to note that it acts as the absorbing element of $\Omega_n$, i.e.,

$$D J_n = J_n D = J_n,$$

for every $n \times n$ doubly stochastic matrix $D$.
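
The absorbing property follows from the unit row and column sums of a doubly stochastic matrix. A quick check in exact arithmetic, as a sketch with an arbitrarily chosen $3 \times 3$ example (the helper `matmul` and the sample matrix `D` are illustrative):

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 3
J = [[Fraction(1, n)] * n for _ in range(n)]   # every entry equals 1/n
half = Fraction(1, 2)
D = [[half, half, 0],
     [half, 0, half],
     [0, half, half]]                          # a doubly stochastic matrix

left = matmul(D, J)    # each entry is (row sum of D) / n = 1/n
right = matmul(J, D)   # each entry is (column sum of D) / n = 1/n
```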

2.5 Minimum and maximum distance of an element of the Birkhoff polytope from the origin

For the purposes of our study of the diameter of the Birkhoff polytope with respect to various norms, it will be useful to have upper and lower bounds for the value of these norms when the operand runs through the Birkhoff polytope. The following lemma provides such elementary bounds.

Lemma 2.1

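Let $\|\cdot\|$ be a permutation invariant sub-multiplicative matrix norm and let $D$ be an $n \times n$ doubly stochastic matrix. Then, $1 \le \|D\| \le \|I_n\|$, where $I_n$ is the $n \times n$ identity matrix.

Proof

On the one hand, it follows from the absorbing property of the special matrix $J_n$ that

$$\|J_n\| = \|D J_n\| \le \|D\| \, \|J_n\|. \tag{2.2}$$

Thus, $\|D\| \ge 1$. On the other hand, if $\sum_{i=1}^{r} \alpha_i P_i$ is a Birkhoff decomposition of $D$, then the triangle inequality and permutation invariance give

$$\|D\| = \left\|\sum_{i=1}^{r} \alpha_i P_i\right\| \le \sum_{i=1}^{r} \alpha_i \|P_i\| = \sum_{i=1}^{r} \alpha_i \|I_n\| = \|I_n\|.$$

□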

3 Definition of the diameter of a set

In any given metric space $(X, d)$, the diameter of a nonempty set of points $\mathcal{B} \subseteq X$ is defined as the supremum of the distances between pairs of points in $\mathcal{B}$, i.e.,

$$\operatorname{diam}_d(\mathcal{B}) := \sup_{x, y \in \mathcal{B}} d(x, y).$$

If $\operatorname{diam}_d(\mathcal{B}) < \infty$, then $\mathcal{B}$ is called a bounded set (Figure 1).

Figure 1: Diameter of the nonempty closed bounded set $\mathcal{B}$ with respect to the metric $d$.

In the following two sections, we determine the diameter of $\Omega_n$ with respect to the operator norms from $\ell_n^p$ to $\ell_n^p$ and also with respect to the Schatten $p$-norms ($1 \le p \le \infty$). While the former is fairly straightforward, the latter requires a few non-trivial results and identities and gives rise to quite a surprising answer.

4 Diameter of the Birkhoff polytope relative to the operator norms from $\ell_n^p$ to $\ell_n^p$ for $1 \le p \le \infty$

First, we turn our attention to the diameter of the Birkhoff polytope relative to the operator norms from $\ell_n^p$ to $\ell_n^p$ ($1 \le p \le \infty$). Here, we find that this diameter is constant and independent of the parameter $p$. This elegant and simple result mainly rests on Lemma 2.1.

Theorem 4.1

For every $1 \le p \le \infty$, $\operatorname{diam}_{\ell_n^p}(\Omega_n) = 2$.

Proof

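One can easily check straight from the definition of the operator norm from $\ell_n^p$ to $\ell_n^p$ that $\|I_n\|_{\ell_n^p \to \ell_n^p} = 1$ for all $1 \le p \le \infty$. Lemma 2.1 therefore ensures that for any $D \in \Omega_n$, $\|D\|_{\ell_n^p \to \ell_n^p} = 1$. Thus,

$$\|D - S\|_{\ell_n^p \to \ell_n^p} \le \|D\|_{\ell_n^p \to \ell_n^p} + \|S\|_{\ell_n^p \to \ell_n^p} = 2$$

for every $D, S \in \Omega_n$. Hence, $\operatorname{diam}_{\ell_n^p}(\Omega_n) \le 2$. Moreover, let $D = I_{n-2} \oplus \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and $S = I_n$. Then, $D - S = 0_{n-2} \oplus \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}$. The singular values of this matrix are 2 with multiplicity 1, and 0 with multiplicity $n - 1$, which settles the case $p = 2$. More generally, the vector $x = e_{n-1} - e_n$ satisfies $(D - S)x = -2x$, so that $\|D - S\|_{\ell_n^p \to \ell_n^p} \ge 2$ for every $1 \le p \le \infty$. Consequently, the diameter of $\Omega_n$ relative to the operator norm from $\ell_n^p$ to $\ell_n^p$ is equal to 2.□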
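
The lower bound in this proof can be illustrated numerically: applying $D - S$ to the vector $e_{n-1} - e_n$ doubles its $p$-norm for every $p$. The sketch below assumes $n = 5$ and a handful of sample exponents; the helper names are illustrative.

```python
def vector_p_norm(x, p):
    """Vector p-norm for 1 <= p < inf, or the max norm when p is float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def apply_matrix(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

n = 5
# D - S with D = I_{n-2} (+) [[0, 1], [1, 0]] and S = I_n.
diff = [[0] * n for _ in range(n)]
diff[n - 2][n - 2], diff[n - 2][n - 1] = -1, 1
diff[n - 1][n - 2], diff[n - 1][n - 1] = 1, -1

x = [0] * (n - 2) + [1, -1]            # the vector e_{n-1} - e_n
image = apply_matrix(diff, x)          # equals -2 * x
ratios = {p: vector_p_norm(image, p) / vector_p_norm(x, p)
          for p in (1, 1.5, 2, 3, float("inf"))}
```

Each ratio is a lower bound for the corresponding operator norm of $D - S$; together with the upper bound 2 from the triangle inequality, this pins the diameter at 2.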

Remark 4.2

In the space of $n \times n$ matrices with real coefficients equipped with the metric induced by the $\ell_n^p \to \ell_n^p$ norm, the Birkhoff polytope $\Omega_n$ has certain properties reminiscent of those of the unit circle. For example, $\|D\|_{\ell_n^p \to \ell_n^p} = 1$ for all $D \in \Omega_n$ and $\operatorname{diam}_{\ell_n^p}(\Omega_n) = 2$.

Note, however, that the unit circle also verifies a property that is easily taken for granted: its diameter is equal to twice its radius (in Chebyshev’s sense). A set in a metric space that verifies both of the aforementioned properties of the unit circle as well as this latter property is called a centrable set.

The authors of this article have shown in [5, Theorem 6.8] that the Chebyshev radius of the Birkhoff polytope with respect to the metric induced by the spectral norm is 1, which makes $\Omega_n$ a centrable set with respect to this metric space. However, for $p \neq 2$, $\Omega_n$ is not a centrable set with respect to $\ell_n^p \to \ell_n^p$ since its radius is strictly greater than 1.

5 Diameter of the Birkhoff polytope relative to the Schatten $p$-norms for $p \ge 1$

We begin by using the fact that $\|D - S\|_{S_p}$ is convex relative to both operands $D, S \in \Omega_n$ to deduce from property 3 of doubly stochastic matrices in Section 2.3 that $\operatorname{diam}_{S_p}(\Omega_n) = \operatorname{diam}_{S_p}(\mathcal{P}_n)$. But, since Schatten $p$-norms are unitarily invariant (and thus permutation invariant), we have

$$\operatorname{diam}_{S_p}(\Omega_n) = \max_{P, Q \in \mathcal{P}_n} \|P - Q\|_{S_p} = \max_{P, Q \in \mathcal{P}_n} \|I_n - Q^* P\|_{S_p} = \max_{P \in \mathcal{P}_n} \|I_n - P\|_{S_p}.$$

Now observe that $(I_n - P)^*(I_n - P) = 2I_n - P - P^*$ and that, if $(\lambda, x)$ is an eigenpair of the permutation matrix $P$, then

$$(2I_n - P - P^*)x = (2 - \lambda - \lambda^{-1})x.$$

Thus, if $\lambda_k$ ($1 \le k \le n$) are the eigenvalues of $P$, then the singular values of $I_n - P$ are equal to $\sqrt{2 - \lambda_k - \lambda_k^{-1}}$. But the eigenvalues of permutation matrices are unimodular. So if $\lambda_k = e^{i\theta_k}$, we find

$$\sqrt{2 - \lambda_k - \lambda_k^{-1}} = \sqrt{2(1 - \cos(\theta_k))} = 2\sin\frac{\theta_k}{2}, \qquad (0 \le \theta_k \le 2\pi).$$
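
Since $I_n - P$ is a normal matrix, its singular values are the moduli $|1 - \lambda_k|$ of its eigenvalues, so the identity above amounts to $|1 - e^{i\theta}| = 2\sin(\theta/2)$ on $[0, 2\pi]$. A short numerical check, assuming the angles of the 7th roots of unity as sample points (the function name is illustrative):

```python
import cmath
import math

def singular_value_from_angle(theta):
    """|1 - e^{i theta}|: the singular value of I - P attached to the
    eigenvalue e^{i theta} of the permutation matrix P."""
    return abs(1 - cmath.exp(1j * theta))

# Angles of the eigenvalues of a 7-cycle: 2*pi*k/7 for k = 1, ..., 7.
angles = [2 * math.pi * k / 7 for k in range(1, 8)]
max_gap = max(abs(singular_value_from_angle(t) - 2 * math.sin(t / 2))
              for t in angles)
```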

It follows that

$$\|I_n - P\|_{S_p} = 2\left(\sum_{k=1}^{n} \sin^p\frac{\theta_k}{2}\right)^{1/p},$$

where $\theta_k$ are the arguments of the eigenvalues of $P$. Recall that the eigenvalues of a permutation matrix are of the following particular form: there exist natural numbers $n_1, n_2, \dots, n_r$ satisfying $n_1 + n_2 + \dots + n_r = n$ for which the eigenvalues of $P$ are $e^{2\pi i k/n_1}$ ($1 \le k \le n_1$), $e^{2\pi i k/n_2}$ ($1 \le k \le n_2$), $\dots$, $e^{2\pi i k/n_r}$ ($1 \le k \le n_r$) [20]. Therefore,

$$\|I_n - P\|_{S_p} = 2\left(\sum_{k=1}^{n_1} \sin^p\frac{\pi k}{n_1} + \sum_{k=1}^{n_2} \sin^p\frac{\pi k}{n_2} + \dots + \sum_{k=1}^{n_r} \sin^p\frac{\pi k}{n_r}\right)^{1/p},$$

and thus,

$$\operatorname{diam}_{S_p}(\Omega_n) = 2\max_{\sum n_j = n}\left(\sum_{k=1}^{n_1} \sin^p\frac{\pi k}{n_1} + \dots + \sum_{k=1}^{n_r} \sin^p\frac{\pi k}{n_r}\right)^{1/p}.$$
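
For small $n$, the maximization over partitions can be carried out by brute force, which gives a way to sanity-check the closed forms derived in the following subsections. The sketch below is not the authors' method; it simply enumerates all cycle types, and every function name in it is an assumption made for illustration.

```python
import math

def partitions(n, largest=None):
    """Yield every partition of n as a non-increasing tuple of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def power_sine_sum(m, p):
    """sum_{k=1}^{m} sin^p(pi k / m); the k = m term is sin(pi) = 0 and is skipped."""
    return sum(math.sin(math.pi * k / m) ** p for k in range(1, m))

def norm_identity_minus_perm(cycle_lengths, p):
    """Schatten p-norm of I - P for a permutation P with the given cycle lengths."""
    return 2 * sum(power_sine_sum(m, p) for m in cycle_lengths) ** (1.0 / p)

def diameter_brute_force(n, p):
    """Maximize over all partitions of n; returns (value, maximizing partition)."""
    return max((norm_identity_minus_perm(part, p), part) for part in partitions(n))

value_p1, best_p1 = diameter_brute_force(4, 1)   # a single 4-cycle wins for p = 1
value_p4, best_p4 = diameter_brute_force(4, 4)   # two 2-cycles win for p = 4
```

The number of partitions grows quickly with $n$, so this enumeration is only practical for modest sizes.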

Finding this maximum is a difficult problem. Indeed, a partition maximizing the aforementioned quantity not only depends on the parameter $p$, but also, as we shall see, on the parity of $n$ when $p > 2$. We therefore divide the problem into three subcases. In the first case, where $1 \le p < 2$, we will use some non-trivial identities to achieve our goal. In the second case, certain peculiarities of the Schatten 2-norm will be exploited to find precisely for which doubly stochastic matrices the diameter is realized. Finally, we will characterize the case $p > 2$ using estimates.

5.1 Case $1 \le p < 2$

In [4], it is shown that for every $p \in [0, 2)$, the function

$$S(n) := \frac{1}{n}\sum_{k=1}^{n} \sin^p\frac{\pi k}{n}$$

is monotonically increasing relative to the natural numbers $n$. It thus follows that

$$\begin{aligned} \operatorname{diam}_{S_p}(\Omega_n) &= 2\max_{\sum n_j = n}\left(\sum_{k=1}^{n_1} \sin^p\frac{\pi k}{n_1} + \dots + \sum_{k=1}^{n_r} \sin^p\frac{\pi k}{n_r}\right)^{1/p} \\ &= 2\max_{\sum n_j = n}\bigl(n_1 S(n_1) + \dots + n_r S(n_r)\bigr)^{1/p} \\ &\le 2\max_{\sum n_j = n}\bigl(n_1 S(n) + \dots + n_r S(n)\bigr)^{1/p} \\ &= 2\bigl(n S(n)\bigr)^{1/p} = 2\left(\sum_{k=1}^{n} \sin^p\frac{\pi k}{n}\right)^{1/p}. \end{aligned}$$

The reverse inequality being realized by the trivial partition $(n)$, it follows that

$$\operatorname{diam}_{S_p}(\Omega_n) = 2\left(\sum_{k=1}^{n} \sin^p\frac{\pi k}{n}\right)^{1/p}, \qquad (1 \le p < 2).$$

Hence, the following holds true:

Theorem 5.1

For $1 \le p < 2$, the diameter of $\Omega_n$ relative to the Schatten $p$-norm is

$$\operatorname{diam}_{S_p}(\Omega_n) = 2\left(\sum_{k=1}^{n} \sin^p\frac{\pi k}{n}\right)^{1/p}.$$
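
The monotonicity of $S(n)$ that drives this argument is easy to observe numerically. The sketch below is only a spot check for one sample exponent ($p = 1.5$) and a finite range of $n$, not a proof; the function name `mean_power_sine` is illustrative.

```python
import math

def mean_power_sine(n, p):
    """S(n) = (1/n) * sum_{k=1}^{n} sin^p(pi k / n); the k = n term vanishes."""
    return sum(math.sin(math.pi * k / n) ** p for k in range(1, n)) / n

p = 1.5
values = [mean_power_sine(n, p) for n in range(1, 41)]
is_increasing = all(a < b for a, b in zip(values, values[1:]))

# At p = 2 the sequence is constant from n = 2 onward, so the strict
# monotonicity used in this subsection is no longer available.
values_p2 = [mean_power_sine(n, 2) for n in range(2, 10)]
```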

Remark 5.2

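It can be shown (see [4]) that

$$\lim_{n \to \infty} \frac{1}{n}\sum_{k=1}^{n} \sin^p\frac{\pi k}{n} = \int_0^1 \sin^p(\pi x)\, dx = \binom{\frac{p-1}{2}}{\frac{p}{2}},$$

where $\binom{z}{w} := \lim_{u \to z} \lim_{v \to w} \dfrac{\Gamma(u+1)}{\Gamma(v+1)\,\Gamma(u-v+1)}$. Hence, for $1 \le p < 2$, we have

$$\operatorname{diam}_{S_p}^p(\Omega_n) = 2^p n \left(\frac{1}{n}\sum_{k=1}^{n} \sin^p\frac{\pi k}{n}\right) \sim 2^p \binom{\frac{p-1}{2}}{\frac{p}{2}} n \qquad (n \to \infty).$$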
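
The limiting constant $\int_0^1 \sin^p(\pi x)\,dx$ can be evaluated through Gamma functions as $\Gamma\bigl(\tfrac{p+1}{2}\bigr) / \bigl(\Gamma\bigl(\tfrac{p}{2}+1\bigr)\Gamma\bigl(\tfrac12\bigr)\bigr)$, which is the Gamma-function form of the generalized binomial coefficient with upper entry $(p-1)/2$ and lower entry $p/2$. The following sketch compares that value with the Riemann sum for a large $n$; the sample values $p = 1.5$ and $n = 20000$ are arbitrary choices.

```python
import math

def limit_constant(p):
    """Gamma((p+1)/2) / (Gamma(p/2 + 1) * Gamma(1/2)), i.e. the integral
    of sin^p(pi x) over [0, 1]."""
    return math.gamma((p + 1) / 2) / (math.gamma(p / 2 + 1) * math.gamma(0.5))

def riemann_sum(n, p):
    """(1/n) * sum_{k=1}^{n} sin^p(pi k / n); the k = n term vanishes."""
    return sum(math.sin(math.pi * k / n) ** p for k in range(1, n)) / n

p = 1.5
gap = abs(riemann_sum(20000, p) - limit_constant(p))
```

For $p = 1$ the constant reduces to $2/\pi$, the mean value of $\sin(\pi x)$ on $[0, 1]$.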

The case of the Schatten 1-norm, also known as the nuclear norm, trace norm, or Ky Fan norm, is of marked interest. Using the trigonometric identity $\sum_{k=1}^{n} \sin\frac{\pi k}{n} = \cot\frac{\pi}{2n}$, Theorem 5.1 yields:

Corollary 5.3

The diameter of $\Omega_n$ relative to the nuclear norm is given by $\operatorname{diam}_{S_1}(\Omega_n) = 2\cot\frac{\pi}{2n}$.
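
The trigonometric identity behind this corollary can be spot-checked numerically. The sketch below compares both sides for $n$ up to 199, a range chosen arbitrarily; the helper names are illustrative.

```python
import math

def sine_sum(n):
    """sum_{k=1}^{n} sin(pi k / n)."""
    return sum(math.sin(math.pi * k / n) for k in range(1, n + 1))

def cot(x):
    return math.cos(x) / math.sin(x)

max_gap = max(abs(sine_sum(n) - cot(math.pi / (2 * n))) for n in range(1, 200))

# Nuclear-norm diameter for n = 4 according to the corollary.
diameter_n4 = 2 * cot(math.pi / 8)
```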

Finally, note that the argument presented in this section for the case $1 \le p < 2$ holds equally for $p$ satisfying $0 \le p < 1$. For such $p$, though the function $\|\cdot\|_{S_p}$ defines only a quasi-norm, the function $d(x, y) = \|x - y\|_{S_p}^p$ is still a metric and the following holds true:

Proposition 5.4

For $0 \le p < 1$, the diameter of $\Omega_n$ relative to the metric $d(x, y) = \|x - y\|_{S_p}^p$ is given by:

$$\operatorname{diam}_{S_p}(\Omega_n) = 2^p \sum_{k=1}^{n} \sin^p\frac{\pi k}{n}.$$

5.2 Case p = 2

Recall that $\operatorname{diam}_{S_p}(\Omega_n) = \operatorname{diam}_{S_p}(\mathcal{P}_n)$. This implies that there exists a pair of permutation matrices $P_1$ and $P_2$ such that $\operatorname{diam}_{S_p}(\Omega_n) = \|P_1 - P_2\|_{S_p}$. For the particular case of the Schatten 2-norm, we can use some more precise knowledge about the norm of a doubly stochastic matrix to derive a stronger result.

Theorem 5.5

The diameter of $\Omega_n$ relative to the Schatten 2-norm is equal to $\sqrt{2n}$. Moreover, it is realized by a pair $D_1, D_2 \in \Omega_n$ if and only if $D_1, D_2 \in \mathcal{P}_n$ and $\operatorname{tr}(D_1^* D_2) = 0$.

Proof

Let $D_1, D_2 \in \Omega_n$. Using Lemma 2.1, we have

$$\|D_1 - D_2\|_{S_2}^2 = \|D_1\|_{S_2}^2 + \|D_2\|_{S_2}^2 - 2\operatorname{tr}(D_1^* D_2) \le 2\|I_n\|_{S_2}^2 - 2\operatorname{tr}(D_1^* D_2) \le 2n. \tag{5.1}$$

Taking the supremum on each side yields $\operatorname{diam}_{S_2}(\Omega_n) \le \sqrt{2n}$. We easily obtain the reverse inequality by taking the pair of doubly stochastic matrices $I_n$ and $Q$, where $Q$ is any permutation matrix with trace zero. Hence, $\operatorname{diam}_{S_2}(\Omega_n) = \sqrt{2n}$.

Note that we have equality in both inequalities of (5.1) if and only if $\|D_1\|_{S_2}^2 = \|D_2\|_{S_2}^2 = n$ and $\operatorname{tr}(D_1^* D_2) = 0$. The former condition holds if and only if $\sigma_i(D_j) = 1$ for all $i = 1, \dots, n$ and for $j = 1, 2$. Hence, the SVD of $D_j$ is given by $D_j = U_j I_n V_j$, where $U_j$ and $V_j$ are unitary matrices. Thus, the $D_j$ are unitary matrices and their eigenvalues must all lie on the unit circle. It therefore follows from property 2 mentioned in Section 2.3 that the $D_j$ are permutation matrices.□
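
Restricted to permutation matrices, the characterization in Theorem 5.5 can be verified exhaustively for a small $n$. The sketch below assumes $n = 4$ and fixes one of the two matrices to the identity (which loses no generality, by permutation invariance); it uses the fact that each row in which two permutation matrices differ contributes 2 to the squared Frobenius norm of their difference.

```python
from itertools import permutations

def frobenius_distance_squared(perm_a, perm_b):
    """Squared Schatten 2-norm (Frobenius norm) of P_a - P_b: every row where
    the permutations differ holds one +1 and one -1, contributing 2."""
    return 2 * sum(1 for a, b in zip(perm_a, perm_b) if a != b)

n = 4
identity = tuple(range(n))
all_perms = list(permutations(range(n)))

largest = max(frobenius_distance_squared(identity, q) for q in all_perms)
maximizers = [q for q in all_perms
              if frobenius_distance_squared(identity, q) == largest]
# Permutations without fixed points, i.e. with tr(I^* Q) = 0.
derangements = [q for q in all_perms if all(q[i] != i for i in range(n))]
```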

5.3 Case p > 2

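Clearly, $\sin^p(x) \le \sin^2(x)$ for any $x \in [0, \pi]$ and $p > 2$. Moreover, it is easy to show (using Euler's identity) that $\sum_{k=1}^{n} \sin^2\frac{\pi k}{n} = \frac{n}{2}$ for $n \ge 2$ and $\sum_{k=1}^{1} \sin^2\frac{\pi k}{1} = 0$. The combination of these basic facts gives us the following estimate:

$$\sum_{k=1}^{n_1} \sin^p\frac{\pi k}{n_1} + \dots + \sum_{k=1}^{n_r} \sin^p\frac{\pi k}{n_r} \le \sum_{k=1}^{n_1} \sin^2\frac{\pi k}{n_1} + \dots + \sum_{k=1}^{n_r} \sin^2\frac{\pi k}{n_r} \le \frac{n_1}{2} + \dots + \frac{n_r}{2} = \frac{n}{2}.$$

It follows that $\operatorname{diam}_{S_p}(\Omega_n) \le 2\left(\frac{n}{2}\right)^{1/p}$ for $p > 2$. If $n$ is even, the partition of $n$ into 2s yields the reverse inequality, and thus,

$$\operatorname{diam}_{S_p}(\Omega_n) = 2\left(\frac{n}{2}\right)^{1/p}, \qquad (p > 2,\ n \text{ even}).$$

If $n$ is odd, the situation is trickier. Clearly, any partition of $n$ must contain at least one odd term, say $n_r = m$. Proceeding as before, we have

$$\begin{aligned} \operatorname{diam}_{S_p}(\Omega_n) &= 2\max_{\sum n_j = n}\left(\sum_{k=1}^{n_1} \sin^p\frac{\pi k}{n_1} + \dots + \sum_{k=1}^{n_{r-1}} \sin^p\frac{\pi k}{n_{r-1}} + \sum_{k=1}^{m} \sin^p\frac{\pi k}{m}\right)^{1/p} \\ &\le 2\max_{\sum n_j = n}\left(\frac{n_1}{2} + \dots + \frac{n_{r-1}}{2} + \sum_{k=1}^{m} \sin^p\frac{\pi k}{m}\right)^{1/p} \\ &= 2\max_{\substack{1 \le m \le n \\ m \text{ odd}}}\left(\frac{n - m}{2} + \sum_{k=1}^{m} \sin^p\frac{\pi k}{m}\right)^{1/p}. \end{aligned}$$

The reverse inequality is readily obtained by considering the partition $(2, \dots, 2, m)$, and thus,

$$\operatorname{diam}_{S_p}(\Omega_n) = 2\max_{\substack{1 \le m \le n \\ m \text{ odd}}}\left(\frac{n - m}{2} + \sum_{k=1}^{m} \sin^p\frac{\pi k}{m}\right)^{1/p} = 2\left(\frac{n}{2} + \max_{\substack{1 \le m \le n \\ m \text{ odd}}} S_p(m)\right)^{1/p},$$

where

$$S_p(m) := \sum_{k=1}^{m} \sin^p\frac{\pi k}{m} - \frac{m}{2}.$$

We therefore seek to determine for which odd numbers $m$ the term $S_p(m)$ (seen as a function of $p$) attains its maximum. To achieve this, we shall use the following trigonometric identity twice [32, Corollary 1]:

$$\sum_{k=1}^{m} \sin^6\frac{\pi k}{m} = \begin{cases} 0, & \text{if } m = 1, \\[1mm] \dfrac{27}{32}, & \text{if } m = 3, \\[2mm] \dfrac{5m}{16}, & \text{if } m \ge 5. \end{cases} \tag{5.2}$$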
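
Identity (5.2) can be confirmed numerically for the odd values of $m$ that matter here. The sketch below checks odd $m$ up to 59, an arbitrary cutoff; the function names are illustrative.

```python
import math

def sine_sixth_power_sum(m):
    """sum_{k=1}^{m} sin^6(pi k / m)."""
    return sum(math.sin(math.pi * k / m) ** 6 for k in range(1, m + 1))

def closed_form(m):
    """Right-hand side of (5.2) for odd m."""
    if m == 1:
        return 0.0
    if m == 3:
        return 27 / 32
    return 5 * m / 16          # stated in the text for m >= 5

odd_values = [1, 3] + list(range(5, 60, 2))
max_gap = max(abs(sine_sixth_power_sum(m) - closed_form(m)) for m in odd_values)
```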

Let us first suppose that $p \ge 6$. Note that under this additional hypothesis, the fact that $0 \le \sin(x) \le 1$ for $x \in [0, \pi]$ implies that $S_p(m)$ decreases relative to $p$ for any integer $m$. So,

$$S_p(1) \le \max_{\substack{1 \le m \le n \\ m \text{ odd}}} S_p(m) \le \max_{\substack{1 \le m \le n \\ m \text{ odd}}} S_6(m), \qquad (p \ge 6).$$

The trigonometric identity (5.2) then ensures that

$$S_6(m) = \sum_{k=1}^{m} \sin^6\frac{\pi k}{m} - \frac{m}{2} = \begin{cases} -\dfrac{1}{2}, & \text{if } m = 1, \\[2mm] -\dfrac{21}{32}, & \text{if } m = 3, \\[2mm] -\dfrac{3m}{16}, & \text{if } m \ge 5. \end{cases}$$

But $-\frac{3m}{16} < -\frac{21}{32} < -\frac{1}{2}$ for every odd integer $m \ge 5$, and it follows that the maximum of $S_6(m)$ is realized when $m = 1$. Moreover, observe that $S_p(1) = -\frac{1}{2}$ for every $p \in \mathbb{R}$. Therefore, we have

$$S_p(1) \le \max_{\substack{1 \le m \le n \\ m \text{ odd}}} S_p(m) \le \max_{\substack{1 \le m \le n \\ m \text{ odd}}} S_6(m) = S_6(1) = S_p(1), \qquad (p \ge 6).$$

Thus, the maximum of $S_p(m)$ when $m$ is an odd integer and $p \ge 6$ is realized by $m = 1$.

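Let us then go back to the case where $2 < p < 6$. By Hölder’s inequality,

$$S_p(m) + \frac{m}{2} = \sum_{k=1}^{m} \sin^p\frac{\pi k}{m} = \sum_{k=1}^{m} \sin^{pt}\frac{\pi k}{m}\, \sin^{p(1-t)}\frac{\pi k}{m} \le \left(\sum_{k=1}^{m} \sin^{pta}\frac{\pi k}{m}\right)^{1/a} \left(\sum_{k=1}^{m} \sin^{p(1-t)b}\frac{\pi k}{m}\right)^{1/b},$$

for each $0 \le t \le 1$ and every $a, b \ge 1$ such that $\frac{1}{a} + \frac{1}{b} = 1$. By choosing $t = \frac{6-p}{2p}$ and $a = \frac{4}{6-p}$, we find that $0 \le t \le 1$ and $1 < a, b < \infty$, and we obtain

$$S_p(m) \le \left(\sum_{k=1}^{m} \sin^2\frac{\pi k}{m}\right)^{\frac{6-p}{4}} \left(\sum_{k=1}^{m} \sin^6\frac{\pi k}{m}\right)^{\frac{p-2}{4}} - \frac{m}{2} \le \left(\frac{m}{2}\right)^{\frac{6-p}{4}} \left(\sum_{k=1}^{m} \sin^6\frac{\pi k}{m}\right)^{\frac{p-2}{4}} - \frac{m}{2} =: T_p(m).$$

Once again, we use the trigonometric identity (5.2) to obtain

$$S_p(m) \le T_p(m) = \begin{cases} -\dfrac{1}{2}, & \text{if } m = 1, \\[2mm] 2\left(\dfrac{3}{4}\right)^{\frac{p}{2}} - \dfrac{3}{2}, & \text{if } m = 3, \\[2mm] \left(\left(\dfrac{5}{8}\right)^{\frac{p-2}{4}} - 1\right)\dfrac{m}{2}, & \text{if } m \ge 5. \end{cases}$$

Note that $\left(\frac{5}{8}\right)^{\frac{p-2}{4}} - 1$ is a decreasing function of $p$, and thus,

$$\left(\frac{5}{8}\right)^{\frac{p-2}{4}} - 1 \le \left(\frac{5}{8}\right)^{\frac{2-2}{4}} - 1 = 0.$$

Hence, $T_p(m)$ is decreasing relative to $m$ for each $p$ with $2 < p < 6$, and every odd integer $m \ge 5$. It follows that $T_p(5) \ge T_p(m)$ for any such $p$ and every odd integer $m \ge 5$. Moreover,

$$T_p(3) \ge T_p(5) \iff 2 \ge 5\left(\frac{5}{8}\right)^{\frac{p-2}{4}} - 4\left(\frac{3}{4}\right)^{\frac{p}{2}}.$$

It is straightforward to verify that $5\left(\frac{5}{8}\right)^{\frac{p-2}{4}} - 4\left(\frac{3}{4}\right)^{\frac{p}{2}}$ is a decreasing function of $p$ whenever $p > 2$, and thus,

$$5\left(\frac{5}{8}\right)^{\frac{p-2}{4}} - 4\left(\frac{3}{4}\right)^{\frac{p}{2}} \le 5\left(\frac{5}{8}\right)^{\frac{2-2}{4}} - 4\left(\frac{3}{4}\right)^{\frac{2}{2}} = 2, \qquad (p > 2).$$

Hence, $T_p(3) \ge T_p(5)$ for every $p$ with $2 < p < 6$, and it follows that

$$S_p(m) \le T_p(m) \le \max\{T_p(1), T_p(3)\} = \max\left\{-\frac{1}{2},\ 2\left(\frac{\sqrt{3}}{2}\right)^{p} - \frac{3}{2}\right\}, \qquad (2 < p < 6).$$

Moreover, one can readily verify that $S_p(1) = -\frac{1}{2}$ and $S_p(3) = 2\left(\frac{\sqrt{3}}{2}\right)^{p} - \frac{3}{2}$. Therefore, this upper bound on $S_p(m)$ is also realized by $S_p(1)$ and $S_p(3)$. Hence,

$$\max_{\substack{1 \le m \le n \\ m \text{ odd}}} S_p(m) = \max\{S_p(1), S_p(3)\} = \max\left\{-\frac{1}{2},\ 2\left(\frac{\sqrt{3}}{2}\right)^{p} - \frac{3}{2}\right\}, \qquad (2 < p < 6).$$

It then follows from a few elementary computations that $S_p(1) \ge S_p(3)$ if and only if $-\frac{1}{2} \ge 2\left(\frac{\sqrt{3}}{2}\right)^{p} - \frac{3}{2}$, that is, if and only if $p \ge \frac{\log(4)}{\log(4/3)} =: c \approx 4.819$. Therefore, for $p$ satisfying $2 < p \le c$, the maximum of $S_p(m)$ is reached for $m = 3$, whereas for $p$ satisfying $c \le p < 6$ (and thus also for $c \le p < \infty$), the maximum is reached for $m = 1$. Moreover, a direct calculation and some simplifications reveal that one can write the results of this section in the following closed form for every $n \in \mathbb{N}$: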

Theorem 5.6

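For $p > 2$, the diameter of $\Omega_n$ relative to the Schatten $p$-norm is

$$\operatorname{diam}_{S_p}(\Omega_n) = 2\left(\frac{n - \sin^2\left(\frac{n\pi}{2}\right)\min\left\{1,\ 3 - 4\left(\frac{\sqrt{3}}{2}\right)^{p}\right\}}{2}\right)^{1/p}. \tag{5.3}$$

In particular, if $n$ is even, then $\operatorname{diam}_{S_p}(\Omega_n) = 2\left(\frac{n}{2}\right)^{1/p}$.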
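
For odd $n$, the closed form (5.3) can be compared with a direct maximization over the odd part $m$ of the partition $(2, \dots, 2, m)$. The sketch below does this for a few odd $n \ge 3$ and several exponents, including the threshold $c = \log 4 / \log(4/3)$; the function names are illustrative, and the degenerate one-point case $n = 1$ is deliberately left out of this check.

```python
import math

def power_sine_sum(m, p):
    """sum_{k=1}^{m} sin^p(pi k / m); the k = m term vanishes and is skipped."""
    return sum(math.sin(math.pi * k / m) ** p for k in range(1, m))

def diameter_via_odd_part(n, p):
    """Odd n, p > 2: maximize (n - m)/2 + sum_k sin^p(pi k / m) over odd m <= n."""
    best = max((n - m) / 2 + power_sine_sum(m, p) for m in range(1, n + 1, 2))
    return 2 * best ** (1.0 / p)

def diameter_closed_form(n, p):
    """Formula (5.3); sin^2(n*pi/2) equals n % 2 for integer n."""
    correction = (n % 2) * min(1.0, 3 - 4 * (math.sqrt(3) / 2) ** p)
    return 2 * ((n - correction) / 2) ** (1.0 / p)

c = math.log(4) / math.log(4 / 3)      # threshold between m = 3 and m = 1
checks = [(n, p) for n in (3, 5, 7, 9, 11) for p in (2.5, 4.0, c, 5.5, 8.0)]
max_gap = max(abs(diameter_via_odd_part(n, p) - diameter_closed_form(n, p))
              for n, p in checks)
```

Below the threshold the odd part $m = 3$ is optimal, and above it a single fixed point ($m = 1$) takes over, which is what the `min` in (5.3) encodes.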

5.4 Summary

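The following theorem condenses the results derived in this section concerning the diameter of the Birkhoff polytope when the metric used is the one induced by the Schatten $p$-norms.

Theorem 5.7

Given $p$ with $p \ge 1$, the diameter of $\Omega_n$ relative to the Schatten $p$-norm is given by:

$$\operatorname{diam}_{S_p}^p(\Omega_n) = \begin{cases} 2^p \displaystyle\sum_{k=1}^{n} \sin^p\frac{\pi k}{n}, & \text{if } 1 \le p \le 2, \\[3mm] 2^p\, \dfrac{n - \sin^2\left(\frac{n\pi}{2}\right)\min\left(1,\ 3 - 4\left(\frac{\sqrt{3}}{2}\right)^{p}\right)}{2}, & \text{if } 2 \le p < \infty. \end{cases} \tag{5.4}$$

In particular,

  1. $\operatorname{diam}_{S_1}(\Omega_n) = 2\cot\frac{\pi}{2n}$,

  2. $\operatorname{diam}_{S_2}(\Omega_n) = \sqrt{2n}$,

  3. $\operatorname{diam}_{S_p}(\Omega_n) = 2\left(\frac{n}{2}\right)^{1/p}$ if $n$ is even and $p \ge 2$.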
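
The two branches of Theorem 5.7 translate into a short function. This is a sketch under the assumptions $n \ge 2$ and $p \ge 1$ (the name `diameter_schatten` is hypothetical); the three special cases listed above serve as consistency checks.

```python
import math

def diameter_schatten(n, p):
    """Diameter of the Birkhoff polytope in the Schatten p-norm, following the
    two-branch formula (5.4); intended for n >= 2 and p >= 1."""
    if p <= 2:
        # Branch 1 <= p <= 2: sum of sin^p(pi k / n), the k = n term being zero.
        total = sum(math.sin(math.pi * k / n) ** p for k in range(1, n))
    else:
        # Branch p > 2: sin^2(n*pi/2) is 1 for odd n and 0 for even n.
        correction = (n % 2) * min(1.0, 3 - 4 * (math.sqrt(3) / 2) ** p)
        total = (n - correction) / 2
    return 2 * total ** (1.0 / p)

nuclear_n6 = diameter_schatten(6, 1)      # expected 2 * cot(pi / 12)
frobenius_n7 = diameter_schatten(7, 2)    # expected sqrt(14)
even_case = diameter_schatten(6, 3)       # expected 2 * 3 ** (1/3)
```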

Acknowledgements

We thank the anonymous reviewers for their careful reading of our manuscript and their many insightful comments and suggestions.

  1. Funding information: This work was partially supported by a research grant from NSERC, a research grant from Fonds de recherche du Québec-Nature et technologies, and the Vanier scholarship.

  2. Conflict of interest: The authors declare no conflicts of interest.

  3. Data availability statement: Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.

References

[1] Z.-J. Bai, D. Chu, and R. C. E. Tan, Computing the nearest doubly stochastic matrix with a prescribed entry, SIAM J. Sci. Comput. 29 (2007), no. 2, 635–655. doi:10.1137/050639831

[2] M. Beck and D. Pixton, The Ehrhart polynomial of the Birkhoff polytope, Discrete Comput. Geom. 30 (2003), no. 4, 623–637. doi:10.1007/s00454-003-2850-8

[3] G. Birkhoff, Tres observaciones sobre el algebra lineal, Univ. Nac. Tucumán. Revista A. 5 (1946), 147–154.

[4] L. Bouthat, J. Mashreghi, and F. Morneau-Guérin, Monotonicity of certain left and right Riemann sums, Recent developments in operator theory, mathematical physics and complex analysis, Oper. Theory Adv. Appl., vol. 290, Birkhäuser/Springer, Cham, 2023, pp. 89–113. doi:10.1007/978-3-031-21460-8_3

[5] L. Bouthat, J. Mashreghi, and F. Morneau-Guérin, On the Geometry of the Birkhoff Polytope. I. The operator $\ell^p$-norms, Acta Sci. Math. (Szeged) (2024, submitted). doi:10.1007/s44146-024-00152-8

[6] L. Bouthat, J. Mashreghi, and F. Morneau-Guérin, On the Geometry of the Birkhoff Polytope. II. The Schatten p-norms, Acta Sci. Math. (Szeged) (2024, submitted). doi:10.1007/s44146-024-00153-7

[7] R. A. Brualdi and P. M. Gibson, Convex polyhedra of doubly stochastic matrices. IV, Linear Algebra Appl. 15 (1976), no. 2, 153–172. doi:10.1016/0024-3795(76)90013-6

[8] R. A. Brualdi and P. M. Gibson, Convex polyhedra of doubly stochastic matrices. I. Applications of the permanent function, J. Comb. Theory Ser. A 22 (1977), no. 2, 194–230. doi:10.1016/0097-3165(77)90051-6

[9] R. A. Brualdi and P. M. Gibson, Convex polyhedra of doubly stochastic matrices. II. Graph of $\Omega_n$, J. Comb. Theory Ser. B 22 (1977), 175–198. doi:10.1016/0095-8956(77)90010-7

[10] R. A. Brualdi and P. M. Gibson, Convex polyhedra of doubly stochastic matrices. III. Affine and combinatorial properties of $\Omega_n$, J. Comb. Theory Ser. A 22 (1977), no. 3, 338–351. doi:10.1016/0097-3165(77)90008-5

[11] E. R. Canfield and B. D. McKay, The asymptotic volume of the Birkhoff polytope, Online J. Anal. Comb. (2009), no. 4, 4.

[12] L. Cao, D. McLaren, and S. Plosker, The complete positivity of symmetric tridiagonal and pentadiagonal matrices, Spec. Matrices 11 (2023), 20220173. doi:10.1515/spma-2022-0173

[13] C. S. Chan and D. P. Robbins, On the volume of the polytope of doubly stochastic matrices, Experiment. Math. 8 (1999), no. 3, 291–300. doi:10.1080/10586458.1999.10504406

[14] L. Costa, C. M. da Fonseca, and E. A. Martins, The diameter of the acyclic Birkhoff polytope, Linear Algebra Appl. 428 (2008), no. 7, 1524–1537. doi:10.1016/j.laa.2007.09.028

[15] L. Costa, C. M. da Fonseca, and E. A. Martins, Face counting on an acyclic Birkhoff polytope, Linear Algebra Appl. 430 (2009), no. 4, 1216–1235. doi:10.1016/j.laa.2008.10.015

[16] B. Cousins and S. Vempala, A practical volume algorithm, Math. Program. Comput. 8 (2016), no. 2, 133–160. doi:10.1007/s12532-015-0097-z

[17] C. M. da Fonseca and E. Marques de Sá, Fibonacci numbers, alternating parity sequences and faces of the tridiagonal Birkhoff polytope, Discrete Math. 308 (2008), no. 7, 1308–1318. doi:10.1016/j.disc.2007.03.077

[18] G. Dahl, Tridiagonal doubly stochastic matrices, Linear Algebra Appl. 390 (2004), 197–208. doi:10.1016/j.laa.2004.04.017

[19] J. A. De Loera, F. Liu, and R. Yoshida, A generating function for all semi-magic squares and the volume of the Birkhoff polytope, J. Algebraic Combin. 30 (2009), no. 1, 113–139. doi:10.1007/s10801-008-0155-y

[20] J. J. Dionísio, A rule for computing the eigen-values and the eigen-vectors of a permutation matrix, Rev. Fac. Ci. Univ. Coimbra 23 (1954), 53–55.

[21] I. Z. Emiris and V. Fisikopoulos, Efficient random-walk methods for approximating polytope volume, Computational geometry (SoCG'14), ACM, New York, 2014, pp. 318–327. doi:10.1145/2582112.2582133

[22] R. Fernandes, Computing the degree of a vertex in the skeleton of acyclic Birkhoff polytopes, Linear Algebra Appl. 475 (2015), 119–133. 10.1016/j.laa.2015.02.005Search in Google Scholar

[23] J. Fieberg and S. P. Ellner, Stochastic matrix models for conservation and management: a comparative review of methods, Ecology Letters 4 (2001), no. 3, 244–266. 10.1046/j.1461-0248.2001.00202.x

[24] K. Gladstien, The characteristic values and vectors for a class of stochastic matrices arising in genetics, SIAM J. Appl. Math. 34 (1978), no. 4, 630–642. 10.1137/0134050

[25] W. Glunt, T. L. Hayden, and R. Reams, The nearest ‘doubly stochastic’ matrix to a real matrix with the same first moment, Numer. Linear Algebra Appl. 5 (1998), no. 6, 475–482 (1999). 10.1002/(SICI)1099-1506(199811/12)5:6<475::AID-NLA155>3.3.CO;2-X

[26] W. Glunt, T. L. Hayden, and R. Reams, The nearest generalized doubly stochastic matrix to a real matrix with the same first and second moments, Comput. Appl. Math. 27 (2008), no. 2, 201–210. 10.1590/S0101-82052008000200005

[27] B. Hayes et al., First links in the Markov chain, Amer. Sci. 101 (2013), no. 2, 92. 10.1511/2013.101.92

[28] R. A. Horn and C. R. Johnson, Matrix Analysis, second ed., Cambridge University Press, Cambridge, 2013.

[29] D. Jojić, Some remarks about acyclic and tridiagonal Birkhoff polytopes, Linear Algebra Appl. 495 (2016), 108–121. 10.1016/j.laa.2016.01.035

[30] R. N. Khoury, Closest matrices in the space of generalized doubly stochastic matrices, J. Math. Anal. Appl. 222 (1998), no. 2, 562–568. 10.1006/jmaa.1998.5970

[31] J. LeSage and R. K. Pace, Introduction to Spatial Econometrics, Chapman and Hall/CRC, New York, 2009. 10.1201/9781420064254

[32] M. Merca, On some power sums of sine or cosine, Amer. Math. Monthly 121 (2014), no. 3, 244–248. 10.4169/amer.math.monthly.121.03.244

[33] I. Paniello, Stochastic matrices arising from genetic inheritance, Linear Algebra Appl. 434 (2011), no. 3, 791–800. 10.1016/j.laa.2010.09.042

[34] H. Perfect and L. Mirsky, Spectral properties of doubly-stochastic matrices, Monatsh. Math. 69 (1965), 35–57. 10.1007/BF01313442

[35] R. Solow, On the structure of linear models, J. Econ. Soc. 20 (1952), no. 1, 29–46. 10.2307/1907805

[36] B. Sturmfels, Equations defining toric varieties, Algebraic geometry–Santa Cruz 1995, Proc. Sympos. Pure Math., vol. 62, Amer. Math. Soc., Providence, RI, 1997, pp. 437–449. 10.1090/pspum/062.2/1492542

[37] A. B. Vistelius, Mathematical geology and the progress of geological sciences, J. Geol. 84 (1976), no. 6, 629–651. 10.1086/628246

[38] E. H. T. Whitten, Stochastic models in geology, J. Geol. 85 (1977), no. 3, 321–330. 10.1086/628302

Received: 2023-10-30
Revised: 2024-01-10
Accepted: 2024-01-22
Published Online: 2024-02-24

© 2024 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 9.1.2026 from https://www.degruyterbrill.com/document/doi/10.1515/spma-2023-0113/html