
Functions operating on several multivariate distribution functions

  • Paul Ressel
Published/Copyright: October 17, 2023

Abstract

Functions $f$ on $[0,1]^m$ such that every composition $f(g_1,\dots,g_m)$ with $d$-dimensional distribution functions $g_1,\dots,g_m$ is again a distribution function turn out to be characterized by a very natural monotonicity condition, which for $d = 2$ means ultramodularity. For $m = 1$ (and $d = 2$), this is equivalent to increasing convexity.

MSC 2010: 26E05; 26B40; 60E05

1 Introduction

Which functions of distribution functions ("d.f.s") are again d.f.s? This is a very general question, with an obvious answer only in dimension one. If $g_1$ and $g_2$ are both univariate d.f.s, and $f : [0,1]^2 \to \mathbb{R}_+$ is just increasing (i.e., $f(x) \le f(y)$ for $x \le y$), then $[f(g_1, g_2)](t) \coloneqq f(g_1(t), g_2(t))$ is again a d.f. (disregarding right continuity), but $[f(g_1 \times g_2)](x) \coloneqq f(g_1(x_1), g_2(x_2))$ need not be a bivariate d.f., as the example $f \coloneqq 1_{\{(s,t) \in [0,1]^2 \,\mid\, s + t \ge 1\}}$ and $g_1(t) = g_2(t) \coloneqq (t \vee 0) \wedge 1$ shows. In this situation, $f$ itself has to be a two-dimensional d.f. in order to guarantee that $f(g_1 \times g_2)$ is also of this type.
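The introductory counterexample can be checked numerically. The following sketch (with the clamped identity for $g_1 = g_2$ and the indicator $f$ as above, coded under the stated reconstruction) shows that $f(g_1 \times g_2)$ assigns negative mass to the unit square, while the diagonal composition $f(g_1, g_2)$ remains a perfectly good univariate d.f.

```python
# Numerical check of the introductory counterexample (a sketch; the
# clamped identities g1, g2 and the indicator f follow the text above).
def g(t):                      # g1 = g2: clamp t to [0, 1]
    return min(max(t, 0.0), 1.0)

def f(s, t):                   # indicator of {(s, t) : s + t >= 1}
    return 1.0 if s + t >= 1.0 else 0.0

def F(x1, x2):                 # candidate bivariate "d.f." f(g1 x g2)
    return f(g(x1), g(x2))

# Rectangle mass of [0,1]^2 under F: must be >= 0 for a genuine d.f.
mass = F(1, 1) - F(1, 0) - F(0, 1) + F(0, 0)
print(mass)   # -1.0: F is not 2-increasing, hence not a d.f.
```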

We see already that we are confronted with two related but different questions. The first one: given $m$ multivariate d.f.s $g_1,\dots,g_m$ of (possibly different) dimensions $n_1,\dots,n_m$, and $f : [0,1]^m \to \mathbb{R}_+$, under which general conditions on $f$ is then $f(g_1 \times \cdots \times g_m)$ again a d.f.? And the second question: if $n_1 = \cdots = n_m \eqqcolon d$, what are necessary and sufficient conditions on $f$ such that $f(g_1,\dots,g_m)$ is another $d$-variate d.f.?

The first question was solved some years ago: $f$ has to be "$n$-↑" with $n \coloneqq (n_1,\dots,n_m)$, a notion to be explained shortly (see [12, Theorem 12]). The second question was already posed in [5] and will be answered in this article. It turns out that the function $f$ then has to fulfill a very natural condition, known for $d = 2$ as being an "ultramodular aggregation function."

An important role in the proof of the main result will be played by a multivariate generalization of the famous Faà di Bruno formula. In order to apply it, we have to use $C^\infty$ approximations, in particular multivariate Bernstein polynomials.

Notations:

$\mathbb{R}_+ = [0, \infty[$, $\mathbb{N} = \{1,2,3,\dots\}$, $\mathbb{N}_0 = \{0,1,2,\dots\}$,

$|n| \coloneqq \sum_{i=1}^d n_i$ for $n \in \mathbb{N}_0^d$, $r_d \coloneqq (r,r,\dots,r) \in \mathbb{N}_0^d$ for $r \in \mathbb{N}_0$ (mostly for $r \in \{0,1\}$), $[d] \coloneqq \{1,2,\dots,d\}$,

$1_\alpha(i) \coloneqq 1$ for $i \in \alpha$, $\coloneqq 0$ for $i \notin \alpha$, where $\alpha \subseteq [d]$; $x_\alpha \coloneqq (x_i)_{i \in \alpha}$,

$(f \times g)(x, y) \coloneqq (f(x), g(y))$ for mappings $f, g$,

$(f, g)(x) \coloneqq (f(x), g(x))$ for mappings with the same domain,

$(f \otimes g)(x, y) \coloneqq f(x)\,g(y)$ for real-valued $f, g$,

e 1 , , e d are the usual unit vectors in R d ,

d.f. means distribution function.

2 Some notions of multivariate monotonicity

Let $I_1,\dots,I_d \subseteq \mathbb{R}$ be non-degenerate intervals, $I \coloneqq I_1 \times \cdots \times I_d$, and let $f : I \to \mathbb{R}$ be any function. For $s \in I$ and $h \in \mathbb{R}_+^d$ such that also $s + h \in I$, put

$(E_h f)(s) \coloneqq f(s + h)$

and $\Delta_h \coloneqq E_h - E_0$, i.e., $(\Delta_h f)(s) = f(s+h) - f(s)$. Since $\{E_h \mid h \in \mathbb{R}_+^d\}$ is commutative (where defined), so also is $\{\Delta_h \mid h \in \mathbb{R}_+^d\}$. In particular, with $e_1,\dots,e_d$ denoting the standard unit vectors in $\mathbb{R}^d$, the operators $\Delta_{h_1 e_1},\dots,\Delta_{h_d e_d}$ commute. As usual, $\Delta_h^0 f \coloneqq f$ (also for $h = 0$, but clearly $\Delta_0 f = 0$). For $n = (n_1,\dots,n_d) \in \mathbb{N}_0^d$ and $h = (h_1,\dots,h_d) \in \mathbb{R}_+^d$, we put

$\Delta_h^n \coloneqq \Delta_{h_1 e_1}^{n_1} \circ \Delta_{h_2 e_2}^{n_2} \circ \cdots \circ \Delta_{h_d e_d}^{n_d},$

so that $(\Delta_h^n f)(s)$ is defined for $s,\ s + \sum_{i=1}^d n_i h_i e_i \in I$.
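As an illustration, the iterated difference operator $\Delta_h^n$ can be coded directly from this definition; the sketch below (hypothetical helper names, not from the paper) applies the one-directional differences coordinate by coordinate.

```python
# A sketch of the operator Δ_h^n defined above: apply the one-directional
# difference Δ_{h_i e_i} exactly n_i times in coordinate i.
def diff_op(f, n, h):
    """Return the function s -> (Δ_h^n f)(s) for f defined on d-tuples."""
    def delta(func, i, step):
        # Δ_{step·e_i}: difference in the i-th coordinate
        def g(s):
            s_plus = list(s)
            s_plus[i] += step
            return func(tuple(s_plus)) - func(s)
        return g
    g = f
    for i in range(len(n)):
        for _ in range(n[i]):
            g = delta(g, i, h[i])
    return g

# Example: for f(x, y) = x*y, (Δ_h^{(1,1)} f)(0, 0) = h_1 * h_2 >= 0
f = lambda s: s[0] * s[1]
print(diff_op(f, (1, 1), (0.5, 0.25))((0.0, 0.0)))  # 0.125
```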

Definition

$f : I \to \mathbb{R}$ is $n$-↑ (read "$n$-increasing") iff $(\Delta_h^p f)(s) \ge 0$ $\forall s \in I$, $h \in \mathbb{R}_+^d$, $p \in \mathbb{N}_0^d$ with $0 \ne p \le n$, such that $s_j + p_j h_j \in I_j\ \forall j \in [d]$.

A specially important case is $n = 1_d$; being $1_d$-↑ is the "crucial" property of d.f.s. More precisely, $f : I \to \mathbb{R}_+$ is the d.f. of a (non-negative) measure $\mu$, i.e., $f(s) = \mu([-\infty, s] \cap \bar{I})\ \forall s \in I$, if and only if $f$ is $1_d$-↑ and right-continuous; cf. [10, Theorem 7].

Let us for a moment consider the case $d = 1$. Then $I \subseteq \mathbb{R}$, $n = n \in \mathbb{N}$, we assume $n \ge 2$, and a famous old result of Boas and Widder [1, Lemma 1] shows that a continuous function $f : I \to \mathbb{R}$ is $n$-↑ (i.e., $\Delta_h^j f \ge 0\ \forall j \in [n]$, $h > 0$) iff

$(\Delta_{h_1} \Delta_{h_2} \cdots \Delta_{h_j} f)(s) \ge 0$

$\forall j \in [n]$, $h_1,\dots,h_j > 0$ such that $s,\ s + h_1 + \cdots + h_j \in I$. For $n = 2$, $f$ is $2$-↑ iff it is increasing and convex (and automatically continuous on $I \setminus \{\sup I\}$).

The following definition now seems to be natural:

Definition

Let $I_1,\dots,I_d \subseteq \mathbb{R}$ be non-degenerate intervals, $I = I_1 \times \cdots \times I_d$, $f : I \to \mathbb{R}$, and $k \in \mathbb{N}$. Then, $f$ is called $k$-increasing ("$k$-↑") iff $\forall j \in [k]$, $h^{(1)},\dots,h^{(j)} \in \mathbb{R}_+^d$, $s \in I$ such that $s + h^{(1)} + \cdots + h^{(j)} \in I$:

$(\Delta_{h^{(1)}} \cdots \Delta_{h^{(j)}} f)(s) \ge 0.$

(We do not assume f to be continuous.)

We mentioned already that a univariate $f$ is $2$-↑ iff it is increasing and convex. But multivariate $2$-↑ functions are also well known: they are called ultramodular, mostly ultramodular aggregation functions, the latter meaning that they are also increasing and defined as functions $f : [0,1]^d \to [0,1]$ with $f(0_d) = 0$ and $f(1_d) = 1$. The restriction to the standard unit cube $[0,1]^d$ is of course not essential, but sometimes convenient, as we will see. In our terminology, if $f$ is $k$-↑, it is by definition also $j$-↑ for $1 \le j \le k$, in particular just increasing.

In this connection, increasing supermodular functions should also be mentioned: in the bivariate case, they coincide with $(1,1)$-↑ functions, and in higher dimensions, they are "pairwise $(1,1)$-↑" in the obvious meaning; cf. [6].

Already in dimension two, increasing convexity and being $2$-↑ are incomparable properties: on $\mathbb{R}_+^2$ the product is $2$-↑, but not convex; and the Euclidean norm is convex, but not $2$-↑.
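This incomparability is easy to confirm numerically. The following sketch evaluates one second mixed difference $(\Delta_{h^{(1)}} \Delta_{h^{(2)}} f)(s)$ for the product and for the Euclidean norm on $\mathbb{R}_+^2$ (function and point choices are illustrative, not from the paper).

```python
import math

# Second mixed difference (Δ_{h1} Δ_{h2} f)(s) for directions h1, h2 in R_+^2
# — the quantity that must be >= 0 for a 2-increasing function.
def second_diff(f, s, h1, h2):
    def at(*steps):
        p = [s[i] + sum(h[i] for h in steps) for i in range(2)]
        return f(p)
    return at(h1, h2) - at(h1) - at(h2) + at()

prod = lambda p: p[0] * p[1]             # 2-increasing, not convex
norm = lambda p: math.hypot(p[0], p[1])  # convex, not 2-increasing

s, h1, h2 = (1.0, 0.0), (1.0, 0.0), (0.0, 1.0)
print(second_diff(prod, s, h1, h2))  # 1.0 (nonnegative)
print(second_diff(norm, s, h1, h2))  # negative: the norm is not 2-↑
```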

Remark 1

Already in 2005, Bronevich [2] introduced $k$-↑ functions, calling them "$k$-monotone." Later on, the name "strongly $k$-monotone" was used [5,8], a terminology usually associated with strict inequalities and therefore not really adequate.

Some simple properties of $k$-↑ functions are shown first.

Lemma 1

Let $f : [0,1]^d \to \mathbb{R}$ be $2$-↑. Then:

  1. $f$ is continuous iff $f$ is continuous at $1_d$.

  2. $f$ is right-continuous, and continuous on $[0,1[^d$.

  3. If $f(x^0) = f(1_d)$ for some $x^0 \ne 1_d$, hence $\alpha \coloneqq \{i \le d \mid x_i^0 = 1\} \ne [d]$, then, for $\alpha \ne \emptyset$, $f$ depends only on $x_\alpha \coloneqq (x_i)_{i \in \alpha}$. In case $\alpha = \emptyset$, $f$ is constant.

  4. For each $y \in [0,1]^d \setminus [0,1[^d$, the limit

     $f_0(y) \coloneqq \lim_{x \nearrow y,\, x < y} f(x)$

     exists, $f_0(y) \le f(y)$, and $f_0$ is a $2$-↑ and continuous extension of $f|_{[0,1[^d}$. If $f$ is $k$-↑, so is $f_0$.

Proof

(i) For $x, h \in [0,1]^d$ such that also $x \pm h \in [0,1]^d$, we have (with $1 \coloneqq 1_d$)

$0 \le f(x) - f(x-h) \le f(x+h) - f(x) \le f(1) - f(1-h), \qquad (*)$

from which the claim follows, $f$ being increasing.

(ii) For any $x \in [0,1[^d$, the univariate function $[0,1] \ni t \mapsto f(tx)$ is convex, hence continuous, also at $t = 1$ (being defined and convex in a neighborhood of 1). Since $f$ is increasing, $f$ is continuous at $x$. For $x \in [0,1]^d \setminus [0,1[^d$, $x \ne 1_d$, let $\alpha \coloneqq \{i \le d \mid x_i = 1\}$. Then, for $y \ge x$, also $y_i = 1\ \forall i \in \alpha$, and $[0,1[^{\alpha^c} \ni z \mapsto f(1_\alpha, z)$ is continuous, in particular at the point $x_{\alpha^c}$, implying $f$ to be right-continuous at $x = (1_\alpha, x_{\alpha^c})$.

(iii) If $\alpha = \emptyset$, then $x^0 = 1 - h^0$ for some $h^0 \in\ ]0,1]^d$, $f(1) = f(1-h)\ \forall\, 0 \le h \le h^0$, and $f$ is constant by $(*)$. For $\alpha \ne \emptyset$, we define $g : [0,1]^{\alpha^c} \to \mathbb{R}_+$ by $g(z) \coloneqq f(1_\alpha, z)$. Also $g$ is $2$-↑, and

$g((x^0)_{\alpha^c}) = f(x^0) = f(1_d) = g(1_{\alpha^c}),$

hence $g$ is constant. But then, $\forall y \in [0,1]^\alpha$,

$0 \le f(y, 1_{\alpha^c}) - f(y, 0_{\alpha^c}) \le f(1_\alpha, 1_{\alpha^c}) - f(1_\alpha, 0_{\alpha^c}) = g(1_{\alpha^c}) - g(0_{\alpha^c}) = 0,$

showing $f(y, z)$ to be independent of $z$.

(iv) The existence of $f_0(y)$ is clear, $f$ being increasing and bounded. The defining inequalities for $f$ being $k$-↑ persist for $f_0$, for any $k \ge 2$. Since $f_0$ is continuous at $1$, it is everywhere continuous.□

Our first theorem will state some equivalent conditions for $f$ to be $k$-↑. An essential ingredient will be positive linear (or affine) mappings: a linear $\psi : \mathbb{R}^m \to \mathbb{R}^d$ is called positive iff $\psi(\mathbb{R}_+^m) \subseteq \mathbb{R}_+^d$; and an affine $\varphi : \mathbb{R}^m \to \mathbb{R}^d$ is positive iff its "linear part" $\varphi - \varphi(0)$ is.

Theorem 1

Let $I \subseteq \mathbb{R}^d$ be a non-degenerate interval, $f : I \to \mathbb{R}$, $k, d \in \mathbb{N}$. Then, the following are equivalent:

  1. $f$ is $k$-↑,

  2. $f$ is $n$-↑ $\forall n \in \mathbb{N}_0^d$ with $0 < |n| \le k$,

  3. $\forall m \in \mathbb{N}$, every non-degenerate interval $J \subseteq \mathbb{R}^m$, and every positive affine $\varphi : \mathbb{R}^m \to \mathbb{R}^d$ with $\varphi(J) \subseteq I$, also $f \circ \varphi$ is $k$-↑,

  4. $\forall m, J, \varphi$ as before, and $\forall n \in \mathbb{N}_0^m$ with $0 < |n| \le k$, the function $f \circ \varphi$ is $n$-↑,

  5. $\forall m, J, \varphi$ as before, and $\forall n \in \{0,1\}^m$ with $0 < |n| \le k$, the function $f \circ \varphi$ is $n$-↑.

Remark 2

For $k \ge d \ge 2$ and $f \ge 0$, the aforementioned function $f$ is not only right-continuous (Lemma 1(ii)), but also $1_d$-↑, hence a d.f. on $I$; however, with the extra property that $f \circ \varphi$ is a d.f., too, for any positive affine $\varphi : \mathbb{R}^m \to \mathbb{R}^d$. In other words, if $f(x) = P(X \le x)$ for some $d$-dimensional random vector $X$, and $k \ge m$, then also $y \mapsto P(X \le \varphi(y))$ is an $m$-dimensional d.f.

Proof

We show (i) ⇔ (ii) and (i) ⇒ (iii) ⇒ (iv) ⇒ (v) ⇒ (i).

(i) ⇒ (ii) is clear.

(ii) ⇒ (i): We use induction on $k$, the case $k = 1$ being obvious. Let now $k \ge 2$ and suppose the case $k-1$ is already known. Fix some $h \in \mathbb{R}_+^d$ and consider $g \coloneqq \Delta_h f$, i.e., $g(s) = f(s+h) - f(s)$. With

$g_1(s) \coloneqq f(s + h_1 e_1) - f(s),\quad g_2(s) \coloneqq f(s + h_1 e_1 + h_2 e_2) - f(s + h_1 e_1),\ \dots$

we have $g = g_1 + g_2 + \cdots + g_d$. Each $g_i$ is $n$-↑ for any $n \in \mathbb{N}_0^d$ with $|n| \le k-1$; hence $(k-1)$-↑ by assumption, and so then is $g$. Since $h \in \mathbb{R}_+^d$ was arbitrary, $f$ is $k$-↑.

(i) ⇒ (iii): Let $j \in [k]$, $h^{(1)},\dots,h^{(j)} \in \mathbb{R}_+^m$, $x \in J$ such that also $x + \sum_{i=1}^j h^{(i)} \in J$. Then, with $\psi \coloneqq \varphi - \varphi(0)$,

$[\Delta_{h^{(1)}} \cdots \Delta_{h^{(j)}} (f \circ \varphi)](x) = f(\varphi(x + h^{(1)} + \cdots + h^{(j)})) - \cdots + (-1)^j f(\varphi(x))$
$= f(\varphi(0) + \psi(x) + \psi(h^{(1)}) + \cdots + \psi(h^{(j)})) - \cdots + (-1)^j f(\varphi(0) + \psi(x))$
$= [\Delta_{\psi(h^{(1)})} \cdots \Delta_{\psi(h^{(j)})} f](\varphi(x)) \ge 0$

(note that $\varphi(x) + \sum_{i=1}^j \psi(h^{(i)}) = \varphi(x + \sum_{i=1}^j h^{(i)}) \in I$).

(iii) ⇒ (iv) ⇒ (v) is clear.

(v) ⇒ (i): Let $j \in [k]$, $x \in I$, $h^{(1)},\dots,h^{(j)} \in \mathbb{R}_+^d$ such that $x + h^{(1)} + \cdots + h^{(j)} \in I$. Denote by $\psi : \mathbb{R}^j \to \mathbb{R}^d$ the linear map whose matrix with respect to the standard bases is $(h^{(1)},\dots,h^{(j)})$ (the $h^{(i)}$ as column vectors), and put $\varphi \coloneqq x + \psi$; obviously, $\psi$ (and $\varphi$) are positive. For $J \coloneqq [0_j, 1_j] \subseteq \mathbb{R}^j$, we have $\varphi(J) \subseteq I$, since $\psi(e_i) = h^{(i)}\ \forall i \le j$, $\varphi(0_j) = x$ and $\varphi(1_j) = x + \sum_{i=1}^j h^{(i)}$. By assumption,

$0 \le [\Delta_{1_j}^{1_j} (f \circ \varphi)](0_j) = f(x + h^{(1)} + \cdots + h^{(j)}) - \cdots + (-1)^j f(x) = (\Delta_{h^{(1)}} \cdots \Delta_{h^{(j)}} f)(x).$□

Corollary 1

Let $I \subseteq \mathbb{R}^d$ and $B \subseteq \mathbb{R}$ be non-degenerate intervals. If $g : I \to B$ and $f : B \to \mathbb{R}$ are both $k$-↑, then so is $f \circ g$.

Proof

We show Condition (iv) in Theorem 1 to hold for $f \circ g$. We know that $g \circ \varphi$ is $n$-↑ for $|n| \le k$. A special case of Theorem 12 in [12] implies $f \circ (g \circ \varphi)$ to be also $n$-↑, and $f \circ (g \circ \varphi) = (f \circ g) \circ \varphi$.□

Theorem 2

Let $I \subseteq \mathbb{R}^{d_1}$ and $J \subseteq \mathbb{R}^{d_2}$ be non-degenerate intervals, $f : I \to \mathbb{R}$ and $g : J \to \mathbb{R}$, both non-negative and $k$-↑. Then, also $f \otimes g$ is $k$-↑ on $I \times J$, and in case $I = J$, the product $f \cdot g$ is $k$-↑, too.

Proof

We first apply (ii) of Theorem 1. For $0 \ne (m,n) \in \mathbb{N}_0^{d_1} \times \mathbb{N}_0^{d_2}$, $(x,y) \in I \times J$, $h^{(1)} \in \mathbb{R}_+^{d_1}$, $h^{(2)} \in \mathbb{R}_+^{d_2}$, we have

$[\Delta_{(h^{(1)}, h^{(2)})}^{(m,n)} (f \otimes g)](x,y) = (\Delta_{h^{(1)}}^m f)(x) \cdot (\Delta_{h^{(2)}}^n g)(y),$

and for $|(m,n)| = |m| + |n| \le k$, both factors on the right-hand side are non-negative. Since $m = 0$ or $n = 0$ is allowed, only $(m,n) \ne 0$ being required, we do in fact need $f \ge 0$ and $g \ge 0$.

For $I = J$ (with $d_1 = d_2 \eqqcolon d$), let $\varphi : \mathbb{R}^d \to \mathbb{R}^{2d}$ be given by $\varphi(x) \coloneqq (x,x)$, a positive linear map. Then, $\varphi(I) \subseteq I \times I$, and by Theorem 1(iii), $(f \otimes g) \circ \varphi = f \cdot g$ is also $k$-↑.□

We see that any monomial $f(x) = \prod_{i=1}^d x_i^{n_i}$ ($n_i \in \mathbb{N}$) is $k$-↑ on $\mathbb{R}_+^d$ for each $k \in \mathbb{N}$. If $c_i \in\ ]0,\infty[$, then $\prod_{i=1}^d x_i^{c_i}$ is $k$-↑ on $\mathbb{R}_+^d$ at least for $c_i \ge k-1$, $i = 1,\dots,d$.

Examples 1

(a) For $a > 0$, the function $f(x,y) \coloneqq (xy - a)_+$ is $2$-↑ on $\mathbb{R}_+^2$, since $t \mapsto (t-a)_+$ is $2$-↑ on $\mathbb{R}_+$, by Corollary 1. Its restriction to $[0, \sqrt{1+a}]^2$ is therefore the d.f. of some random vector. In [12] on page 261, it was shown that $f$ is not $(2,1)$-↑ (resp. $(1,2)$-↑), but it is of course $(1,1)$-↑. The tensor product $g(x,y) \coloneqq (x-a)_+ (y-b)_+$ is $(2,2)$-↑ $\forall a, b > 0$; hence, certainly $2$-↑, but not $3$-↑, since $x \mapsto (x-a)_+$ is not.

Similarly, $(xyz - a)_+^2$ is $3$-↑ on $\mathbb{R}_+^3$, for $a > 0$, and of course, $(xy - a)_+^2$ is $3$-↑ on $\mathbb{R}_+^2$. We will see later on that $xy + xz + yz - xyz$ is $2$-↑ on $[0,1]^3$, but not $3$-↑.

(b) Consider $f_n(t) \coloneqq t^n/(1+t)$ for $t \ge 0$. It was shown in [7, Lemma 2.4] that $f_n$ is $n$-↑ (it is not $(n+1)$-↑). So for any non-negative $n$-↑ function $g$ on any interval in any dimension, $g^n/(1+g)$ is $n$-↑, too.

If $g$ is "only" an $n$-dimensional d.f., then so is also $g^n/(1+g)$ – this follows from [11, Theorem 2], but is also a special case of Theorem 6 below.

3 Approximation by Bernstein polynomials

The proof of our main result relies heavily on these special polynomials, since they inherit the monotonicity properties of interest. To define them, we introduce for $r \in \mathbb{N}$, $i \in \{0,1,\dots,r\}$,

$b_{i,r}(t) \coloneqq \binom{r}{i} t^i (1-t)^{r-i}, \quad t \in \mathbb{R},$

and for $i = (i_1,\dots,i_d) \in \{0,1,\dots,r\}^d$,

$B_{i,r} \coloneqq b_{i_1,r} \otimes \cdots \otimes b_{i_d,r}.$

For any $f : [0,1]^d \to \mathbb{R}$, the associated Bernstein polynomials $f^{(1)}, f^{(2)}, \dots$ are defined by

$f^{(r)} \coloneqq \sum_{0_d \le i \le r_d} f\!\left(\frac{i}{r}\right) B_{i,r}.$

It is perhaps not so well known that for each continuity point $x$ of $f$, we have

$f^{(r)}(x) \to f(x), \quad r \to \infty.$

This is shown, for example, in [14, page 296], based on the strong law of large numbers for independent Bernoulli trials. Well known is the uniform convergence of $f^{(r)}$ to $f$ on $[0,1]^d$ for a continuous function $f$.
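For a concrete impression of this pointwise convergence, here is a small sketch (a naive, unoptimized implementation of the definition above) evaluating $f^{(r)}$ for $f(x,y) = x \wedge y$ at an interior point:

```python
from math import comb

# A naive sketch of the multivariate Bernstein polynomial f^(r) on [0,1]^d,
# coded directly from the definition above (tensor products of the b_{i,r}).
def bernstein(f, r, x):
    d = len(x)
    def b(i, t):                       # b_{i,r}(t)
        return comb(r, i) * t**i * (1 - t)**(r - i)
    total = 0.0
    idx = [0] * d                      # runs over all 0_d <= i <= r_d
    while True:
        w = 1.0
        for k in range(d):
            w *= b(idx[k], x[k])
        total += f([i / r for i in idx]) * w
        for k in range(d):             # advance the multi-index "odometer"
            idx[k] += 1
            if idx[k] <= r:
                break
            idx[k] = 0
        else:                          # all indices wrapped around: done
            return total

# Pointwise convergence at a continuity point of f(x, y) = min(x, y):
f = lambda p: min(p[0], p[1])
for r in (4, 16, 64):
    print(round(bernstein(f, r, (0.5, 0.5)), 4))   # approaches f(0.5, 0.5) = 0.5
```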

In the following, the "upper right boundary" of $[0,1]^d$ will play a role. Let for $\alpha \subseteq [d]$

$T_\alpha \coloneqq \{x \in [0,1]^d \mid x_i < 1\ \forall i \in \alpha,\ x_i = 1\ \forall i \notin \alpha\}.$

Then, $[0,1]^d = \bigcup_{\alpha \subseteq [d]} T_\alpha$ is a disjoint union, $T_\emptyset = \{1_d\}$ and $T_{[d]} = [0,1[^d$. The union $\bigcup_{\alpha \subsetneq [d]} T_\alpha$ is called the upper right boundary of $[0,1]^d$.

For $f : [0,1]^d \to \mathbb{R}$ with Bernstein polynomials $f^{(1)}, f^{(2)}, \dots$, let $f^{(\alpha)} \coloneqq f|_{T_\alpha}$, where for $\alpha \ne \emptyset$, $T_\alpha$ may be identified with $[0,1[^\alpha$. Since for $y \in [0,1[^\alpha$

$B_{i,r}(y, 1_{\alpha^c}) = \begin{cases} B_{i_\alpha, r}(y) & \text{if } i_{\alpha^c} = r_{\alpha^c} \\ 0 & \text{else,} \end{cases}$

we obtain for $\emptyset \ne \alpha \subseteq [d]$

$(f^{(\alpha)})^{(r)}(y) = \sum_{i_\alpha \le r_\alpha} f^{(\alpha)}\!\left(\frac{i_\alpha}{r}\right) B_{i_\alpha,r}(y),$

where $f^{(\alpha)}\!\left(\frac{i_\alpha}{r}\right) = f\!\left(\frac{i_\alpha}{r}, 1_{\alpha^c}\right) = f\!\left(\frac{(i_\alpha, r_{\alpha^c})}{r}\right)$, and then

$f^{(r)}(y, 1_{\alpha^c}) = \sum_{i \le r_d} f\!\left(\frac{i}{r}\right) B_{i,r}(y, 1_{\alpha^c}) = \sum_{i_\alpha \le r_\alpha} f\!\left(\frac{(i_\alpha, r_{\alpha^c})}{r}\right) B_{i_\alpha,r}(y) = (f^{(\alpha)})^{(r)}(y).$

In other words, the restriction of $f$ to one of the parts $T_\alpha$ of the upper right boundary has as its Bernstein polynomials the restrictions of the original ones to $T_\alpha$. This leads to the following.

Theorem 3

Let $f : [0,1]^d \to \mathbb{R}$ have the property that each restriction $f|_{T_\alpha}$ for $\emptyset \ne \alpha \subseteq [d]$ is continuous. Then,

$\lim_{r \to \infty} f^{(r)}(x) = f(x) \quad \forall x \in [0,1]^d,$

i.e., the Bernstein polynomials converge pointwise to $f$ everywhere.

Proof

Each $x \in [0,1]^d \setminus \{1_d\}$ lies in exactly one $T_\alpha$, i.e., $x = (x_\alpha, 1_{\alpha^c})$ with $x_\alpha < 1_\alpha$, where $\emptyset \ne \alpha \subseteq [d]$, and is thus a continuity point of $f^{(\alpha)} \coloneqq f|_{T_\alpha}$. As already mentioned, this implies

$(f^{(\alpha)})^{(r)}(x_\alpha) \to f^{(\alpha)}(x_\alpha) = f(x_\alpha, 1_{\alpha^c}) = f(x),$

and we saw also that

$(f^{(\alpha)})^{(r)}(x_\alpha) = f^{(r)}(x_\alpha, 1_{\alpha^c}) = f^{(r)}(x).$

Since $f^{(r)}(1_d) = f(1_d)\ \forall r$, the proof is complete.□

For a function $f$ of $d$ variables, we will use a short notation for its partial derivatives (if they exist). Let $0 \ne p \in \mathbb{N}_0^d$; then

$f_p \coloneqq \frac{\partial^{|p|} f}{\partial x_1^{p_1} \cdots \partial x_d^{p_d}},$

complemented by $f_{0_d} \coloneqq f$.

Lemma 2

Let $f : [0,1]^d \to \mathbb{R}$ be arbitrary, $0 \ne p \in \mathbb{N}_0^d$.

  1. If $\Delta_h^p f \ge 0\ \forall h \in \mathbb{R}_+^d$, then $(f^{(r)})_p \ge 0\ \forall r \in \mathbb{N}$.

  2. If $f$ is in addition $C^\infty$, then $f_p \ge 0$.

Proof

(i) Applying the formula for derivatives of one-dimensional Bernstein polynomials [12, p. 273] $d$ times, we obtain

$(f^{(r)})_p = c_p \sum_{i \le r_d - p} \left(\Delta_{\frac{1}{r} 1_d}^p f\right)\!\left(\frac{i}{r}\right) b_{i_1, r-p_1} \otimes \cdots \otimes b_{i_d, r-p_d}$

with $c_p \coloneqq \prod_{i=1}^d r(r-1)\cdots(r-p_i+1)$. Hence, $(f^{(r)})_p \ge 0$.

(ii) By [14, Theorem 4], $(f^{(r)})_p \to f_p$, even uniformly, so $f_p \ge 0$, too.□

Theorem 4

Let $f : [0,1]^d \to \mathbb{R}$ be a $C^\infty$-function, $n \in \mathbb{N}^d$, $k \in \mathbb{N}$. Then:

  1. $f$ is $n$-↑ $\iff$ $f_p \ge 0\ \forall\, 0 \ne p \le n$, $p \in \mathbb{N}_0^d$.

  2. $f$ is $k$-↑ $\iff$ $f_p \ge 0\ \forall\, 0 < |p| \le k$, $p \in \mathbb{N}_0^d$.

Proof

(i) "⇒" follows from Lemma 2.

"⇐": Let for $m \in \mathbb{N}$, $\sigma_m : \mathbb{R}^m \to \mathbb{R}$ be the sum function, and $\sigma^n \coloneqq \sigma_{n_1} \times \sigma_{n_2} \times \cdots \times \sigma_{n_d}$. By [13, Theorem 5], we have

$f$ is $n$-↑ $\iff$ $f \circ \sigma^n$ is $1_{|n|}$-↑ on $J \coloneqq \prod_{i=1}^d \left[0, \tfrac{1}{n_i}\right]^{n_i}$.

The chain rule gives

$(f \circ \sigma^n)_{1_{|n|}} = f_n \circ \sigma^n \ge 0,$

so that for $x, x+h \in J$, $h \ge 0$, by Fubini's theorem

$(\Delta_h^{1_{|n|}} (f \circ \sigma^n))(x) = \int_{[x, x+h]} (f \circ \sigma^n)_{1_{|n|}}\, \mathrm{d}\lambda^{|n|} \ge 0.$

The same reasoning can be applied to $0 \ne q \le 1_{|n|}$, so that indeed $f \circ \sigma^n$ is $1_{|n|}$-↑, i.e., $f$ is $n$-↑.

(ii) This follows immediately from the first equivalence in Theorem 1.□

Examples 2

  1. $f(x,y) \coloneqq x^2 y - a x^2 y^2 + y^2$ on $[0,1]^2$, $0 < a \le \frac{1}{2}$. Since $f_p \ge 0$ for $p \in \{(1,0), (0,1), (1,1), (2,0), (0,2)\}$, $f$ is $2$-↑; but $f_{(1,2)}(x,y) = -4ax$ shows that $f$ is neither $3$-↑ nor $(2,2)$-↑.

  2. $f : \mathbb{R}_+^2 \to \mathbb{R}$ is defined by $f(0,0) \coloneqq 0$ and else

     $f(x,y) \coloneqq \frac{xy(x^2 - y^2)}{x^2 + y^2} + 13(x^2 + y^2) + 3xy$

     (see [9, p. 321]), where it is given as an example of an ultramodular function on $\mathbb{R}_+^2$ (which does not automatically entail that it is increasing). However, all partial derivatives $f_p$ with $0 < |p| \le 2$ are $\ge 0$; hence, $f$ is $2$-↑ (and not $3$-↑).

  3. With the abbreviation $x^\alpha \coloneqq \prod_{i \in \alpha} x_i$ for $\alpha \subseteq [d]$, $x^\emptyset \coloneqq 1$, a polynomial of the form

     $f(x) = \sum_{\alpha \subseteq [d]} c_\alpha x^\alpha$

     is called multilinear. $f$ is affine in each variable; therefore, $f_p = 0$ whenever $p_i > 1$ for some $i$. Hence, $f$ is $k$-↑ iff $f_p \ge 0\ \forall p \le 1_d$ with $0 < |p| \le k$, and $n$-↑ iff $f$ is $(n \wedge 1_d)$-↑. The example ($d = 3$)

     $f(x) \coloneqq x_1 x_2 + x_1 x_3 + x_2 x_3 - x_1 x_2 x_3$

     is thus $2$-↑ on $[0,1]^3$, but not $3$-↑, since $f_{(1,1,1)} = -1$. And $f$ is $(n,n,0)$-↑ $\forall n$.
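The claims about this multilinear example can be spot-checked numerically; the following sketch (grid and step sizes chosen for illustration) tests nonnegativity of second mixed differences over a crude grid and exhibits one negative third difference.

```python
import itertools

# Sketch check of the multilinear example: second mixed differences of f
# stay nonnegative on [0,1]^3, while a third mixed difference is negative.
f = lambda x: x[0]*x[1] + x[0]*x[2] + x[1]*x[2] - x[0]*x[1]*x[2]

def mixed_diff(f, s, steps):
    """(Δ_{h^(1)} ... Δ_{h^(j)} f)(s) as an alternating sum over subsets."""
    total = 0.0
    for choice in itertools.product((0, 1), repeat=len(steps)):
        p = [s[i] + sum(c * st[i] for c, st in zip(choice, steps))
             for i in range(3)]
        total += (-1) ** (len(steps) - sum(choice)) * f(p)
    return total

h = [(0.3, 0, 0), (0, 0.3, 0), (0, 0, 0.3), (0.2, 0.2, 0.2)]
grid = [(a/4, b/4, c/4) for a in range(3) for b in range(3) for c in range(3)]

ok = all(mixed_diff(f, s, (u, v)) >= -1e-12
         for s in grid for u in h for v in h)
print(ok)  # True, consistent with f being 2-increasing

# ... but the third difference in the coordinate directions is negative:
print(mixed_diff(f, (0.0, 0.0, 0.0), (h[0], h[1], h[2])))  # about -(0.3)^3
```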

Theorem 5

Let $f : [0,1]^d \to \mathbb{R}$, $2_d \le n \in \mathbb{N}_0^d$, $2 \le k \in \mathbb{N}$. The Bernstein polynomials of $f$ are denoted $f^{(1)}, f^{(2)}, \dots$.

  1. If $f$ is $n$-↑, then so is each $f^{(r)}$, and $f^{(r)} \to f$ pointwise.

  2. If $f$ is $k$-↑, then so is each $f^{(r)}$, and $f^{(r)} \to f$ pointwise.

Proof

In both cases, $f$ is (at least) $2$-↑; therefore (by Lemma 1(ii)), the restriction $f|_{[0,1[^d}$ is continuous, and so are the other restrictions $f|_{T_\alpha}$ for each non-empty $\alpha \subseteq [d]$. By Theorem 3, $f^{(r)}(x) \to f(x)\ \forall x$.

  1. Lemma 2 implies $(f^{(r)})_p \ge 0\ \forall r$ and $0 \ne p \le n$; hence, $f^{(r)}$ is $n$-↑ $\forall r$.

  2. Similarly now, $(f^{(r)})_p \ge 0\ \forall r$ and $0 < |p| \le k$, showing $f^{(r)}$ to be $k$-↑.□

Of course, a similar result holds if $[0,1]^d$ is replaced by any non-degenerate compact interval in $\mathbb{R}^d$.

4 Main results

The proof of Theorem 6 makes use of a far-reaching simultaneous generalization of the usual multivariate chain rule and Faà di Bruno’s formula. This admirable result was shown by Constantine and Savits [3, Theorem 2.1], and we present it here, keeping (almost) their notation.

Let $d, m \in \mathbb{N}$, let $g_1,\dots,g_m$ be real-valued, defined and $C^\infty$ in a neighborhood of $x^{(0)} \in \mathbb{R}^d$, put $g \coloneqq (g_1,\dots,g_m)$, and let $f$ be defined and $C^\infty$ in a neighborhood of $y^{(0)} \coloneqq g(x^{(0)}) \in \mathbb{R}^m$.

For $\mu, \nu \in \mathbb{N}_0^d$, the relation $\mu \prec \nu$ holds iff one of the following three assertions is true:

  1. $|\mu| < |\nu|$,

  2. $|\mu| = |\nu|$ and $\mu_1 < \nu_1$,

  3. $|\mu| = |\nu|$, $\mu_1 = \nu_1, \dots, \mu_k = \nu_k$, $\mu_{k+1} < \nu_{k+1}$ for some $k \in [d-1]$

(implying $\mu \ne \nu$).

Examples:

  1. $(1,3,0,4,1) \prec (1,3,1,1,3)$, here $k = 2$,

  2. $e_d \prec e_{d-1} \prec \cdots \prec e_1$,

  3. For $d = 1$, we have $\mu \prec \nu \iff \mu < \nu$.

We need some abbreviations:

$D_x^\nu \coloneqq \frac{\partial^{|\nu|}}{\partial x_1^{\nu_1} \cdots \partial x_d^{\nu_d}}$ for $|\nu| > 0$, $\quad D_x^0 f \coloneqq f$, $\quad x^\nu \coloneqq \prod_{i=1}^d x_i^{\nu_i}$, $\quad \nu! \coloneqq \prod_{i=1}^d \nu_i!$, $\quad |\nu| \coloneqq \sum_{i=1}^d \nu_i$,

$g_\mu^{(i)} \coloneqq (D_x^\mu g_i)(x^{(0)})$, $\quad g_\mu \coloneqq (g_\mu^{(1)}, \dots, g_\mu^{(m)})$, $\quad f_\lambda \coloneqq (D_y^\lambda f)(y^{(0)})$, $\quad h \coloneqq f \circ g$, $\quad h_\nu \coloneqq (D_x^\nu h)(x^{(0)})$,

and, for $\nu \in \mathbb{N}_0^d$, $\lambda \in \mathbb{N}_0^m$, $s \in \mathbb{N}$, $s \le |\nu|$,

$P_s(\nu, \lambda) \coloneqq \left\{ (k_1, \dots, k_s;\ l_1, \dots, l_s) \,\middle|\, k_j > 0,\ 0 \prec l_1 \prec \cdots \prec l_s,\ \sum_{j=1}^s k_j = \lambda,\ \sum_{j=1}^s |k_j|\, l_j = \nu \right\},$

where (of course) $k_j \in \mathbb{N}_0^m$ and $l_j \in \mathbb{N}_0^d$. (For some values of $s$, these sets may be empty.)

The announced formula by Constantine and Savits then reads

$(**)\qquad h_\nu = \sum_{1 \le |\lambda| \le |\nu|} f_\lambda\ \sum_{s=1}^{|\nu|}\ \sum_{P_s(\nu, \lambda)} \nu! \prod_{j=1}^s \frac{(g_{l_j})^{k_j}}{(k_j!)\,(l_j!)^{|k_j|}}.$

For $d = 1$, this formula reduces to the classical one of Faà di Bruno from 1855 (see [3,4]).
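For $d = 1$ and $|\nu| = 3$, the formula collapses to the classical third-order identity $(f \circ g)''' = f'''(g)\,g'^3 + 3 f''(g)\,g'\,g'' + f'(g)\,g'''$; the following sketch checks this instance for $f = \exp$, $g = \sin$ (illustrative choices, not from the paper) against a finite-difference third derivative.

```python
import math

# The d = 1, |ν| = 3 instance of (**): classical Faà di Bruno,
#   (f∘g)''' = f'''(g)·g'^3 + 3·f''(g)·g'·g'' + f'(g)·g'''.
# Sketch check for f = exp, g = sin, where all derivatives are explicit.
x = 0.7
g, g1, g2, g3 = math.sin(x), math.cos(x), -math.sin(x), -math.cos(x)
e = math.exp(g)                        # f'(g) = f''(g) = f'''(g) = e^{sin x}
faa = e * g1**3 + 3 * e * g1 * g2 + e * g3

# Central finite-difference third derivative of h = exp∘sin for comparison
h = lambda t: math.exp(math.sin(t))
eps = 1e-2
fd = (h(x + 2*eps) - 2*h(x + eps) + 2*h(x - eps) - h(x - 2*eps)) / (2 * eps**3)
print(abs(faa - fd) < 1e-2)            # True
```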

One more result is needed, allowing general d.f.s to be "replaced" by $C^\infty$ ones:

Lemma 3

  1. Let $(\Omega, \mathcal{A}, \rho)$ be a finite measure space and $\mathcal{E} \subseteq \mathcal{A}$ a finite collection of measurable sets. Then, there is another finite measure $\rho_0$ on $\mathcal{A}$ with finite support such that $\rho_0 = \rho$ on $\mathcal{E}$.

  2. Let $F$ on $\mathbb{R}^d$ be the d.f. of some finite measure and $B \subseteq \mathbb{R}^d$ a finite subset. Then, there is a $C^\infty$ d.f. $\tilde{F}$ on $\mathbb{R}^d$ such that $\tilde{F}|_B = F|_B$.

Proof

(i) The set algebra generated by $\mathcal{E}$ is still finite, and thus generated by a (unique) partition $\{A_1,\dots,A_n\}$ of $\Omega$. Choose $x_i \in A_i$ for each $i \le n$, and put $\rho_0 \coloneqq \sum_{i=1}^n \rho(A_i)\, \varepsilon_{x_i}$.

(ii) Let $F$ be the d.f. of $\rho$, i.e., $F(x) = \rho(\,]-\infty, x]\,)\ \forall x \in \mathbb{R}^d$. Then, apply (i) to $\{\,]-\infty, b] \mid b \in B\}$, and denote by $F_0$ the d.f. of $\rho_0$. Since $\rho_0$ has finite support, Lemma 3 of [11] is applicable, whose (short) proof provides a d.f. $\tilde{F}$ as desired.□

Theorem 6

Let $f : [0,1]^m \to \mathbb{R}_+$ be $d$-↑ ($d \ge 2$), and let $g_1,\dots,g_m : \mathbb{R}^d \to [0,1]$ be d.f.s of (subprobability) measures on $\mathbb{R}^d$. Then, also $f(g_1,\dots,g_m)$ is a d.f. on $\mathbb{R}^d$.

Proof

Put $g \coloneqq (g_1,\dots,g_m) : \mathbb{R}^d \to [0,1]^m$, $h \coloneqq f \circ g$. By Lemma 1, also $h$ is right-continuous, and it remains to show that $h$ is $1_d$-↑, the crucial property of a d.f. on $\mathbb{R}^d$.

A consequence of Theorem 5 is that we may assume $f$ to be $C^\infty$, and we first let also $g_1,\dots,g_m$ be $C^\infty$ functions.

Switching to the terminology in connection with the aforementioned generalized Faà di Bruno formula, we have to show $h_\nu \ge 0$ for $0 \ne \nu \le 1_d$. Then, $|\nu| \le d$, and for $\lambda \in \mathbb{N}_0^m$ with $|\lambda| \le |\nu|$, we have $f_\lambda \ge 0$, by Theorem 4(ii). The condition

$\sum_{j=1}^s |k_j|\, l_j = \nu$

in the set $P_s(\nu, \lambda)$, together with $k_j > 0$ and $l_j \ne 0\ \forall j$, reduces to $|k_j| = 1\ \forall j$ and

$\sum_{j=1}^s l_j = \nu,$

so that the $l_j$ are "disjoint" in an obvious sense, i.e., $l_j \in \{0,1\}^d \setminus \{0_d\}$ and $l_i \wedge l_j = 0_d$ for $i \ne j$. In particular, $g_{l_j} \ge 0\ \forall j$, each $g_i$ being a d.f. Formula $(**)$ now shows $h_\nu \ge 0$.

Now to the general case: in order to see that $h = f \circ g$ is $1_d$-↑, we have to show for given $x \in \mathbb{R}^d$ and $\xi \in \mathbb{R}_+^d$

$(\Delta_\xi^{1_d} h)(x) = h(x + \xi) - \cdots + (-1)^d h(x) \ge 0$

(as well as the analogue with some variables fixed, which is shown similarly).

In Lemma 3, we choose the finite set

$B \coloneqq \left\{ x + \sum_{i \in \alpha} \xi_i e_i \,\middle|\, \alpha \subseteq [d] \right\}$

and find $C^\infty$ d.f.s $\tilde{g}_1,\dots,\tilde{g}_m$ such that $\tilde{g}_i|_B = g_i|_B$ for each $i \le m$. Then,

$0 \le (\Delta_\xi^{1_d} (f \circ \tilde{g}))(x) = (\Delta_\xi^{1_d} h)(x),$

thus finishing the proof.□

Remark 3

The aforementioned theorem answers in the affirmative a question raised in the concluding remarks of [5]. For $d = 2$, this result was shown in [6, Theorem 3.1].

Remark 4

If for a given $f$ the conclusion of Theorem 6 holds for all d.f.s $g_1,\dots,g_m$, then $f$ must be $d$-↑. This follows from Theorem 1(v), since each component of a positive affine function $\varphi$ is of course $1_d$-↑.

Examples 3

  1. We saw before that $f(x) \coloneqq x_1 x_2 + x_1 x_3 + x_2 x_3 - x_1 x_2 x_3$ is $2$-↑ on $[0,1]^3$. Hence, for arbitrary bivariate d.f.s $g_1, g_2$, and $g_3$, also $g_1 g_2 + g_1 g_3 + g_2 g_3 - g_1 g_2 g_3$ is a d.f., while $f$ itself is not a three-dimensional d.f.

  2. Put $f_a(t) \coloneqq (t-a)_+/(1-a)$ for $t \in [0,1]$ and $a \in [0,1[$, complemented by $f_1 \coloneqq 1_{\{1\}}$. Then, $\{f_a^n \mid a \in [0,1]\}$ are the "essential" extreme points for $(n+1)$-↑ functions on $[0,1]$, and $\{f_{a_1}^{n_1} \otimes \cdots \otimes f_{a_d}^{n_d} \mid a \in [0,1]^d\}$ correspondingly for $(n + 1_d)$-↑ functions on $[0,1]^d$, cf. [12]. In the bivariate case, $f_a \otimes f_b$ is $(2,2)$-↑, in particular $2$-↑, so that $f_c \circ (f_a \otimes f_b)$ is $2$-↑ on $[0,1]^2$. For any bivariate d.f.s $g_1$ and $g_2$, we see that

     $\left( \frac{(g_1 - a)_+ (g_2 - b)_+}{(1-a)(1-b)} - c \right)_+, \quad (a,b,c) \in [0,1[^3,$

     is again a bivariate d.f.
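The first example above can also be verified empirically. With the three classical copulas $\Pi$, $M$, $W$ (extended to all of $\mathbb{R}^2$ by clamping the arguments — an illustrative choice of d.f.s, not from the paper), the combination assigns nonnegative mass to every rectangle of a test grid. A sketch:

```python
# Sketch check of the first example: for three bivariate d.f.s g1, g2, g3,
# the combination g1g2 + g1g3 + g2g3 - g1g2g3 should again be a d.f., i.e.
# assign nonnegative mass to every rectangle. As (assumed) test d.f.s we
# take the copulas Pi, M, W, extended to R^2 by clamping the arguments.
u = lambda t: min(max(t, 0.0), 1.0)

g1 = lambda x, y: u(x) * u(y)                  # independence copula Pi
g2 = lambda x, y: min(u(x), u(y))              # comonotone copula M
g3 = lambda x, y: max(u(x) + u(y) - 1.0, 0.0)  # countermonotone copula W

def H(x, y):
    a, b, c = g1(x, y), g2(x, y), g3(x, y)
    return a*b + a*c + b*c - a*b*c

pts = [i / 10 for i in range(11)]
ok = all(H(x2, y2) - H(x1, y2) - H(x2, y1) + H(x1, y1) >= -1e-12
         for x1 in pts for x2 in pts if x2 > x1
         for y1 in pts for y2 in pts if y2 > y1)
print(ok)   # True: every rectangle receives nonnegative mass
```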

Another important property of $k$-↑ functions is their "universal" compatibility and composability within their class, which is made precise in the following.

Theorem 7

Let $m, d, k \in \mathbb{N}$, $J \subseteq \mathbb{R}^m$ and $I \subseteq \mathbb{R}^d$ be non-degenerate intervals, $g = (g_1,\dots,g_m) : I \to J$, $f : J \to \mathbb{R}$, each $g_i$ and $f$ being $k$-↑. Then, also $f \circ g$ is $k$-↑.

Proof

The case $k = 1$ being obvious, let us assume $k \ge 2$. Since any non-degenerate interval is an increasing union of compact non-degenerate subintervals, we may assume $I = [0,1]^d$ and $J = [0,1]^m$.

By Theorem 1, we have to show that $h \coloneqq f \circ g$ is $n$-↑ for any $n \in \mathbb{N}_0^d$ such that $0 < |n| \le k$. Since the variables $i$ with $n_i = 0$ do not enter, we may and do assume $n \in \mathbb{N}^d$, in particular $k \ge d$. Then, each $g_i$ is $n$-↑, or equivalently, by [13, Theorem 5], $g_i \circ \sigma^n$ is $1_{|n|}$-↑ on $\prod_{i \le d} \left[0, \tfrac{1}{n_i}\right]^{n_i}$. Theorem 6 above now implies that also

$f \circ (g_1 \circ \sigma^n, \dots, g_m \circ \sigma^n) = h \circ \sigma^n$

is $1_{|n|}$-↑, which in turn means that $h$ is $n$-↑.□

Remark 5

We mentioned earlier that $k$-↑ functions were considered already in [2], where our Theorem 7 is stated as Theorem 2. However, the proof given there is not a real one, in my opinion: the function $g$ more or less disappears after a few lines, the terminology and notation are nearly "chaotic," and I consider the reasoning incomprehensible. Of course, in theory, a completely "elementary" proof might be possible, but then discrete analogues of formula $(**)$ would have to appear, and this might get "out of control." In [5,8], Bronevich's Theorem 2 is cited, without any comment on the proof. The special case $k = 2$ is proved in [6].

An open problem

While $n$-↑ functions on $[0,1]^d$, non-negative and normalized, form a Bauer simplex, with "essentially" certain powers of $\{f_{a_1} \otimes \cdots \otimes f_{a_d} \mid a \in [0,1]^d\}$ as their extreme points (Example 3(2)), not much is known so far for $k$-↑ functions. Let us consider $d = k = 2$ and

$K \coloneqq \{f : [0,1]^2 \to [0,1] \mid f \text{ is } 2\text{-↑ and } f(1,1) = 1\}.$

$K$ is obviously convex and compact, and also stable under (pointwise) multiplication. It is easy to see that each $f_c \circ (f_a \otimes f_b)$ is an extreme point of $K$, but that is it for the time being.

Acknowledgments

Thanks are due to the two reviewers for their constructive suggestions that certainly improved the presentation.

  1. Funding information: No funding is involved.

  2. Conflict of interest: The author states no conflict of interest.

References

[1] Boas, R. P., & Widder, D. V. (1940). Functions with positive differences. Duke Mathematical Journal, 7, 496–503. doi:10.1215/S0012-7094-40-00729-3

[2] Bronevich, A. G. (2005). On the closure of families of fuzzy measures under eventwise aggregations. Fuzzy Sets and Systems, 153, 45–70. doi:10.1016/j.fss.2004.12.005

[3] Constantine, G. M., & Savits, T. H. (1996). A multivariate Faa di Bruno formula with applications. Transactions of the AMS, 348, 503–520. doi:10.1090/S0002-9947-96-01501-2

[4] Faà di Bruno, F. (1855). Sullo sviluppo delle funzioni. Annali di Scienze Matematiche e Fisiche, 6, 479–480.

[5] Klement, E. P., Manzi, M., & Mesiar, R. (2010). Aggregation functions with stronger types of monotonicity (pp. 418–424). In: LNAI 6178. New York: Springer. doi:10.1007/978-3-642-14049-5_43

[6] Klement, E. P., Manzi, M., & Mesiar, R. (2011). Ultramodular aggregation functions. Information Sciences, 181, 4101–4111. doi:10.1016/j.ins.2011.05.021

[7] Koumandos, S., & Pedersen, H. L. (2009). Completely monotonic functions of positive order and asymptotic expansions of the logarithm of Barnes double gamma function and Euler's gamma function. Journal of Mathematical Analysis and Applications, 355, 33–40. doi:10.1016/j.jmaa.2009.01.042

[8] Manzi, M. (2011). New construction methods for copulas and the multivariate case. Tesi (Padova), BN 2013-396T.

[9] Marinacci, M., & Montrucchio, L. (2005). Ultramodular functions. Mathematics of Operations Research, 30, 311–332. doi:10.1287/moor.1040.0143

[10] Ressel, P. (2011). Monotonicity properties of multivariate distribution and survival functions – With an application to Lévy-frailty copulas. Journal of Multivariate Analysis, 102, 393–404. doi:10.1016/j.jmva.2010.10.001

[11] Ressel, P. (2012). Functions operating on multivariate distribution and survival functions – With applications to classical mean values and to copulas. Journal of Multivariate Analysis, 105, 55–67. doi:10.1016/j.jmva.2011.08.007

[12] Ressel, P. (2014). Higher order monotonic functions of several variables. Positivity, 18, 257–285. doi:10.1007/s11117-013-0244-6

[13] Ressel, P. (2019). Copulas, stable tail dependence functions, and multivariate monotonicity. Dependence Modeling, 7, 247–258. doi:10.1515/demo-2019-0013

[14] Veretennikov, A. Y., & Veretennikova, E. V. (2016). On partial derivatives of multivariate Bernstein polynomials. Siberian Advances in Mathematics, 26, 294–305. doi:10.3103/S1055134416040039

Received: 2023-06-28
Revised: 2023-08-17
Accepted: 2023-09-22
Published Online: 2023-10-17

© 2023 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
