
Adaptive Image Compression via Optimal Mesh Refinement

  • Michael Feischl and Hubert Hackl
Published/Copyright: January 2, 2024

Abstract

The JPEG algorithm is a de facto standard for image compression. We investigate whether adaptive mesh refinement can be used to optimize the compression ratio and propose a new adaptive image compression algorithm. We prove that it produces a quasi-optimal subdivision grid for a given error norm with high probability. This subdivision can be stored with very little overhead and thus leads to an efficient compression algorithm. We demonstrate experimentally that the new algorithm can achieve better compression ratios than standard JPEG compression with no visible loss of quality on many images. The mathematical core of this work shows that Binev’s optimal tree approximation algorithm is applicable to image compression with high probability, when we assume small additive Gaussian noise on the pixels of the image.

MSC 2020: 65D15; 65D18

1 Introduction

The JPEG algorithm has been one of the standard image compression algorithms for decades. It decomposes any given image into a fixed mesh of 8 × 8 pixel blocks. On each of these blocks the image is transformed via the discrete 2D cosine transform. The resulting sequence of coefficients is then quantized by use of a quantization matrix Q and a rounding step which is designed in such a way that higher frequencies are stored with less precision (or completely omitted). This leads to a significant storage space reduction compared to a bitmap representation of the image. Particularly for photographs, which typically do not exhibit sharp color or brightness changes, this compression method is very effective. The goal of the present work is to generate the background mesh adaptively in an optimal way instead of using a fixed block size of 8 × 8 in order to improve the compression rate. The mesh refinement algorithms are inspired by adaptive mesh refinement for the approximation of partial differential equations, particularly the adaptive finite element method.

Adaptive mesh refinement for partial differential equations was first mathematically analyzed for the finite element method in the 1980s and 1990s. I. M. Babuška initiated a wave of mathematical research by developing the first a posteriori error estimators [3] for the finite element method. This opened a new field of numerical analysis and led to tremendous research activity; we only mention the books [2, 24, 19]. Dörfler [10] gave a first rigorous strategy for mesh refinement based on error estimators in 2D more than a decade later, and after another five years Morin, Nochetto, and Siebert [18] presented the first unconditional convergence proof.

Optimality of the adaptive strategy, i.e., the fact that asymptotically the optimal meshes are generated, was first proven by Babuška and Vogelius in 1D [4] and for d ≥ 2 by Binev, Dahmen, and DeVore [6] in 2004 for adaptive FEM for the Poisson model problem. Building on that, Stevenson [21] and Cascon, Kreuzer, Nochetto, and Siebert [9] simplified the algorithms and extended the problem class. We refer to [8, 14] for an overview on adaptive mesh refinement. The present work is inspired by adaptive tree approximation, which is an important building block of adaptive algorithms (particularly for time-dependent problems, when mesh coarsening is necessary) developed in [7] and extended to the hp-setting in [5]. Adaptive tree approximation produces meshes on which a given target function can be approximated with near optimal accuracy vs. cost ratio. We use this algorithm for the JPEG compression by approximating the target image (which can be seen as a function that is constant on each pixel) on a mesh of rectangles. On each of these rectangles, we perform the standard quantization of the image via the discrete 2D cosine transform as is done for classical JPEG compression.

We propose a compression and a decompression algorithm: The first (encoding) algorithm takes a given image I and produces a vector S_I which encodes the compressed image. The second (decoding) algorithm produces an approximation I_app to the original image by (almost) reversing the compression steps in the vector S_I. We roughly show the following properties:

  1. Given a tolerance τ > 0, the approximation quality of I_app is not worse than the sum of τ and the standard JPEG approximation error.

  2. The number of mesh elements is (near) minimal (see Theorem 4 for details).

Several other adaptive versions of JPEG can be found in the literature: In [12, 20], the compression quality on each 8 × 8 subgrid is chosen adaptively. In [23], an adaptive grid is chosen via a greedy algorithm based on interpolation error and this algorithm is used in [22] to compress images, however, no (near) optimality of the algorithm is proved (or can be expected).

The present work is structured as follows: In Section 2, we establish the adaptive algorithm for general norms. In Section 3, we apply the algorithm for the L 2 -norm and derive a near-optimality result. Section 4 discusses effective storage of the compressed image and a final Section 5 tests the algorithm on realistic images.

1.1 The JPEG Compression Algorithm

The JPEG compression algorithm was first proposed in the 90s (standardized in [26]) and has been a de facto standard for image compression ever since.

The compression algorithm transforms the image into the YCbCr color space, which separates the colors into blue-difference chroma, red-difference chroma, and a gamma-corrected luminance channel. Since human eyes are more sensitive to changes in luminance than in color, most implementations of the JPEG algorithm use some form of subsampling of the chromatic channels. A common method is 4:2:0 subsampling, which means that the resolution of the chromatic channels is halved in each dimension (e.g., by averaging values on 2 × 2 pixel blocks) [17, p. 209].

However, the basic algorithm is the same on all channels: The image is split into 8 × 8 blocks of pixels on which the discrete cosine transform is applied, resulting in an 8 × 8 matrix containing the coefficients. Each entry of this matrix is divided by the corresponding entry of the so-called quantization matrix Q of the same size. This matrix is predefined in the JPEG standard and controls the compression ratio of the algorithm. The result of the division is then rounded to the nearest integer, which usually produces many zeros among the higher-frequency coefficients. The resulting sequence of integers is further compressed via a lossless run-length encoding algorithm. Note that the rounding step is the only irreversible step of the compression algorithm and hence the reason for quality loss in JPEG (see, e.g., [16] for further details and references).
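The per-block pipeline described above (DCT, division by Q, rounding, and the reverse steps) can be sketched as follows. This is a minimal numpy sketch, not the reference JPEG implementation; the run-length and entropy coding steps are omitted, and all function names are ours.

```python
import numpy as np
from math import sqrt, cos, pi

# Standard JPEG luminance quantization matrix (ISO/IEC 10918-1, Annex K).
Q = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
], dtype=float)

def dct_matrix(n):
    """Orthonormal 1D DCT-II matrix of size n x n."""
    C = np.array([[sqrt(2.0 / n) * cos((2 * k + 1) * i * pi / (2 * n))
                   for k in range(n)] for i in range(n)])
    C[0, :] /= sqrt(2.0)   # C(0) = 1/sqrt(2) normalization
    return C

def jpeg_block_roundtrip(block):
    """Quantize and dequantize one 8x8 pixel block (the lossy step of JPEG)."""
    C = dct_matrix(8)
    coeffs = C @ block @ C.T                 # 2D DCT
    quantized = np.round(coeffs / Q)         # the only irreversible step
    return C.T @ (quantized * Q) @ C         # dequantize + inverse DCT
```

Since the DCT matrices are orthonormal, the inverse transform is just the transpose; the only information loss comes from the rounding step.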

2 The Adaptive Compression Algorithm

In the following, we will develop a mathematical formulation of adaptive JPEG compression.

2.1 Representation of Images

We suppose the image with h (height) times w (width) pixels is given by a function I : {1, …, h} × {1, …, w} → ℝ³, where I(i, j) corresponds to the RGB-values of the pixel (i, j), which might be normalized to [0, 1] (although the precise encoding does not matter for what follows). We will also use (i, j) ∈ I as a shortcut for (i, j) ∈ {1, …, h} × {1, …, w}. The coordinates of the pixels are chosen such that the pixel (1, 1) is the top left pixel and (h, w) is the bottom right pixel of the image; hence we can think of the image as a matrix of size h × w with values in ℝ³.

The principal idea for obtaining an approximate image I app is to divide the image into smaller elements and approximate the image on those via the classical JPEG strategy.

To that end, we define elements R as rectangular subsets of { 1 , , h } × { 1 , , w } with the same aspect ratio as the original image, i.e.

R := {j_R, …, j_R + h_R − 1} × {k_R, …, k_R + w_R − 1}  with  h_R / w_R = h / w.

The aspect ratio condition is not strictly necessary, but makes it easier to efficiently store the mesh, as only one dimension and the position of the element have to be stored. Without loss of generality, we may assume that w and h are powers of two in order to ensure that admissible elements exist. A collection of elements R which forms a disjoint union of { 1 , , h } × { 1 , , w } is called a mesh 𝒯 . Corresponding to a mesh 𝒯 , we consider the approximate image I app , 𝒯 = I app computed on that mesh, where we hide the dependency on 𝒯 when clear from the context. By 𝒯 0 , we denote the initial mesh which consists of only one element 𝒯 0 = { R 0 } covering the entire image, i.e., R 0 = { 1 , , h } × { 1 , , w } .

Finally, by I|_R, we denote the restriction of the image onto a given element R. In order to produce an optimally adapted mesh, we consider a simple refinement rule via bisection, illustrated in Figure 1. Given a mesh 𝒯, refinement of an element R ∈ 𝒯 produces a new mesh 𝒯′ such that

𝒯′ ∖ 𝒯 = {R_1, …, R_4},

i.e., we do not consider mesh closure (since the approximation problems below will be entirely local). Note that due to the assumptions on the aspect ratio of the elements, all newly created elements are geometrically similar to R 0 .
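The refinement rule above (two bisections producing four children) can be sketched as follows. This is a minimal sketch in which an element is represented by its top-left pixel coordinates and its dimensions; this encoding is ours and not fixed by the text.

```python
def refine(element):
    """Split an element (top-left row j_R, column k_R, height h_R, width w_R)
    into four children of half size by bisecting both sides.
    Since h_R and w_R are both halved, the aspect ratio h_R / w_R (and hence
    geometric similarity to R_0) is preserved."""
    j, k, h, w = element
    assert h % 2 == 0 and w % 2 == 0, "element dimensions must be even"
    h2, w2 = h // 2, w // 2
    return [(j,      k,      h2, w2), (j,      k + w2, h2, w2),
            (j + h2, k,      h2, w2), (j + h2, k + w2, h2, w2)]
```

Repeated refinement starting from R_0 generates exactly the quadtree of admissible elements used below.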

Figure 1

Elements are refined into four children by two bisections of their sides.

A mesh 𝒯 is called coarser than a mesh 𝒯′ if 𝒯′ can be obtained by applying refinement steps to 𝒯. In this case, we also say that 𝒯′ is finer than 𝒯 and write 𝒯 < 𝒯′ or 𝒯′ > 𝒯. With this notion we further define the sets

𝕋 := {𝒯 : 𝒯_0 ≤ 𝒯}

and

𝕋_n := {𝒯 ∈ 𝕋 : #𝒯 − #𝒯_0 ≤ n}.

Note that while two meshes might not be comparable, the refinement rule of our choice still gives local nestedness. The following result is well-known and repeated for the convenience of the reader.

Lemma 1.

Let 𝒯, 𝒯′ ∈ 𝕋. Any element R ∈ 𝒯 satisfies either R ⊆ R′ for some R′ ∈ 𝒯′ or

(2.1) R = ⋃_{R′ ∈ 𝒯′, R′ ⊆ R} R′.

Proof.

If (2.1) is not true, there must exist some element R′ ∈ 𝒯′ with R′ ∩ R ≠ ∅ and R′ ⊈ R. Since R and R′ are both generated starting from R_0, they are nested, and we conclude R ⊆ R′. ∎

2.2 The Adaptive Algorithm

We follow the strategy of JPEG explained in Section 1.1 and convert the image to the YCbCr color basis, resulting in three image components I_Y, I_Cb, and I_Cr. We downsample the Cb and Cr components to half the image size by computing means of 2 × 2 blocks of pixels. The adaptive algorithm below (Algorithm 1) is then applied to each of the components.
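The preprocessing step above can be sketched as follows. This is a minimal numpy sketch using the full-range JFIF conversion coefficients; implementations may use slightly different constants, and the function names are ours.

```python
import numpy as np

def rgb_to_ycbcr(img):
    """Full-range JPEG (JFIF) RGB -> YCbCr conversion; img has shape (h, w, 3)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def downsample_420(channel):
    """Halve the resolution in each dimension by averaging 2x2 pixel blocks."""
    h, w = channel.shape
    return channel.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```

For a gray pixel (r = g = b) the chroma channels equal the neutral value 128, which is why chroma subsampling is nearly invisible on low-saturation images.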

The general strategy of the adaptive algorithm will be to start with 𝒯 0 and to iteratively refine the mesh until the compression error in the given norm is sufficiently small. To that end, we require local errors which steer the refinement.

Definition 1.

Let ‖·‖ be a norm of our choice on the space of images of size h × w, let I be an image of size h × w, and let 𝒯 be a mesh. We define:

  1. The local error

    (2.2) η(R) := ‖(I − I_app,𝒯)|_R‖  for all R ∈ 𝒯.

  2. The global error

    (2.3) E(𝒯) := ‖I − I_app,𝒯‖.

  3. The best approximation error for n elements

    E_n := min_{𝒯 ∈ 𝕋_n} E(𝒯).

As the adaptive algorithm below is essentially a greedy algorithm, these straightforward definitions will not lead to optimal results for most error norms ‖·‖. It is necessary to consider a modified error, first proposed and analyzed in [7].

We denote the modified local error by η̃ and set η̃(R_0) = η(R_0) for the initial element R_0. Let 𝒯 ∈ 𝕋 and assume that η̃(R) is already defined for an element R ∈ 𝒯. We define η̃(R_i) for the children R_i, i = 1, …, 4, of R by

η̃(R_i)² := (Σ_{j=1}^{4} η(R_j)²) · η̃(R)² / (η(R)² + η̃(R)²).

This means η̃ is constant among siblings. The general idea behind this definition is that, through the relation to the parent element introduced here, the modified error implicitly depends on the level of the element, i.e., on how many refinements were necessary to produce it. This forces a non-local behavior of the algorithm which would be impossible with a standard greedy algorithm and turns out to be the key to showing optimality of the procedure [7].
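The recursion for the modified error can be sketched as a small helper (a sketch of the formula as reconstructed above; the name is ours). Since the right-hand side does not depend on i, the returned value is shared by all four children.

```python
def modified_child_error_sq(eta_children_sq, eta_parent_sq, etatilde_parent_sq):
    """Squared modified error shared by the four children of an element:
    the children's total local error, damped by the parent's accumulated
    history through the factor etatilde^2 / (eta^2 + etatilde^2) <= 1."""
    s = sum(eta_children_sq)
    denom = eta_parent_sq + etatilde_parent_sq
    return s * etatilde_parent_sq / denom if denom > 0 else 0.0
```

At the root, where η̃(R_0) = η(R_0), the damping factor is 1/2; deeper in the tree it shrinks further, which penalizes elements that already required many refinements.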

We are now in a position to formulate the adaptive algorithm.

Algorithm 1.

Input: Image component I of size h × w such that h, w are powers of two, initial mesh 𝒯_0, tolerance τ > 0, counter ℓ = 0. While E(𝒯_ℓ) > τ do:

  1. Compute modified errors η̃(R) for all R ∈ 𝒯_ℓ.

  2. Mark R ∈ 𝒯_ℓ if η̃(R) = max_{R′ ∈ 𝒯_ℓ} η̃(R′) and min{h_R, w_R} ≥ 16.

  3. Refine marked elements to generate 𝒯_{ℓ+1} and set ℓ = ℓ + 1. If no marked elements exist: Stop.

Output: A sequence of meshes 𝒯_ℓ.
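The loop of Algorithm 1 can be sketched as follows. This is a simplified sketch under our own conventions: `local_error` is a placeholder for the squared local error η(R)² (in Section 3, the DCT-truncation error), elements are encoded as (row, column, height, width), and the modified-error recursion is the one reconstructed above.

```python
def adaptive_refine(local_error, h, w, tol, min_size=16):
    """Greedily refine the element with the largest modified error until the
    global error E drops below tol; `local_error(elem)` returns eta(elem)^2."""
    root = (1, 1, h, w)
    e0 = local_error(root)
    # each leaf carries (element, eta^2, modified eta^2); at the root both agree
    leaves = [(root, e0, e0)]
    while sum(e for _, e, _ in leaves) > tol ** 2:      # E(T)^2 = sum eta(R)^2
        refinable = [lf for lf in leaves if min(lf[0][2], lf[0][3]) >= min_size]
        if not refinable:
            break                      # minimum size criterion blocks refinement
        parent = max(refinable, key=lambda lf: lf[2])   # largest modified error
        (j, k, hh, ww), eta2, mod2 = parent
        h2, w2 = hh // 2, ww // 2
        children = [(j, k, h2, w2), (j, k + w2, h2, w2),
                    (j + h2, k, h2, w2), (j + h2, k + w2, h2, w2)]
        child_eta2 = [local_error(c) for c in children]
        shared = sum(child_eta2) * mod2 / (eta2 + mod2) if eta2 + mod2 > 0 else 0.0
        leaves.remove(parent)
        leaves.extend(zip(children, child_eta2, [shared] * 4))
    return [lf[0] for lf in leaves]
```

For illustration, with a synthetic error that shrinks by a factor of four per refinement, the sketch uniformly refines a 64 × 64 root twice before the tolerance is met.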

Remark 1.

Note that the algorithm does not mark for refinement elements R which have fewer than sixteen pixels in some dimension. Hence, newly created elements consist of at least eight pixels in each dimension, and the final grid is not finer than the standard JPEG grid.

Remark 2.

For square images with side length a power of two, the minimum size criterion in Step 2 of Algorithm 1 is redundant: the only rectangles R which do not satisfy it are of size 8 × 8, which is the block size of the JPEG grid, hence η(R) = η̃(R) = 0. This implies that the global error is already zero at this point, so the algorithm must have terminated before entering this iteration. In other cases, it can happen that no rectangles are refined due to the minimum size criterion even though the error bound has not been reached. In this case, rectangles R of minimal size with maximal local error should be compressed with the standard JPEG algorithm, leading to η̃(R) = 0.

The main goal of the following section will be to prove near-optimality of Algorithm 1 by employing the framework developed in [7].

Definition 2.

Let an algorithm compute an approximation of a given image I on some (adaptively refined) mesh 𝒯_I ∈ 𝕋. We call the algorithm near-optimal (with respect to the number of refinements) if there exist positive constants C_1, C_2 such that for all n, k ∈ ℕ with k ≥ C_1 n,

𝒯_I ∈ 𝕋_k  ⟹  E(𝒯_I) ≤ C_2 E_n.

The constants are universal, i.e., they do not depend on the image I.

3 Adaptive JPEG Compression in the L²-Norm

In this section, we employ Algorithm 1 with ‖·‖ denoting the L²-norm weighted with (wh)^{−1/2}, i.e., the inverse square root of the total number of pixels. It is a natural question whether the L²-norm appropriately reflects human tolerance for quality loss in images (the answer is likely negative, see, e.g., [25]), and we refer to Section 3.3 and Section 5 for further discussion and experiments.

3.1 Image Compression on Mesh Elements

The approximate image on a given mesh 𝒯 is produced similarly to the original JPEG algorithm, i.e., we use the discrete cosine transform (DCT) and a quantization matrix Q to quantize the image. We use a quantization matrix defined in the JPEG standard [26]:

Q :=
( 16  11  10  16  24  40  51  61
  12  12  14  19  26  58  60  55
  14  13  16  24  40  57  69  56
  14  17  22  29  51  87  80  62
  18  22  37  56  68 109 103  77
  24  35  55  64  81 104 113  92
  49  64  78  87 103 121 120 101
  72  92  95  98 112 100 103  99 ).

In the following, we will require some operators to formalize the adaptive compression. Let R be an element of size h̃ × w̃ with h̃ ≤ h and w̃ ≤ w. We define the restriction to the top-left 8 × 8 block of R,

(3.1a) TL_R : ℝ^{h̃×w̃} → ℝ^{8×8},  (a_{ij})_{1≤i≤h̃, 1≤j≤w̃} ↦ (a_{ij})_{1≤i,j≤8},

as well as its right-inverse, the embedding of ℝ^{8×8} into ℝ^{h̃×w̃} obtained by padding with zeros,

(3.1b) TL_R^{−1} : ℝ^{8×8} → ℝ^{h̃×w̃},  TL_R ∘ TL_R^{−1} = Id : ℝ^{8×8} → ℝ^{8×8}.

Furthermore, let DCT_R denote the 2D discrete cosine transform on ℝ^{h̃×w̃}, i.e.,

(3.1c) DCT_R : ℝ^{h̃×w̃} → ℝ^{h̃×w̃},  (a_{kl}) ↦ ( (2/√(h̃w̃)) C(i) C(j) Σ_{k=0}^{h̃−1} Σ_{l=0}^{w̃−1} a_{kl} cos((2k+1)iπ/(2h̃)) cos((2l+1)jπ/(2w̃)) )_{0≤i≤h̃−1, 0≤j≤w̃−1},

and IDCT_R its inverse, i.e.,

(3.1d) IDCT_R : ℝ^{h̃×w̃} → ℝ^{h̃×w̃},  (a_{kl}) ↦ ( (2/√(h̃w̃)) Σ_{k=0}^{h̃−1} Σ_{l=0}^{w̃−1} C(k) C(l) a_{kl} cos((2i+1)kπ/(2h̃)) cos((2j+1)lπ/(2w̃)) )_{0≤i≤h̃−1, 0≤j≤w̃−1},

where C(n) = 1/√2 if n = 0 and C(n) = 1 otherwise.
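Since the 2D transforms factor into orthonormal 1D DCT matrices, (3.1c) and (3.1d) can be sketched compactly (a numpy sketch under our naming; a library routine such as `scipy.fft.dctn` with `norm='ortho'` could replace the explicit matrices):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal 1D DCT-II matrix: entry (i, k) for frequency i, position k."""
    M = np.array([[np.sqrt(2.0 / n) * np.cos((2 * k + 1) * i * np.pi / (2 * n))
                   for k in range(n)] for i in range(n)])
    M[0, :] /= np.sqrt(2.0)          # the C(0) = 1/sqrt(2) factor
    return M

def dct2(a):
    """2D transform DCT_R of (3.1c): apply the 1D DCT to rows and columns."""
    h, w = a.shape
    return dct_matrix(h) @ a @ dct_matrix(w).T

def idct2(a):
    """Inverse transform IDCT_R of (3.1d); by orthogonality the inverse of
    each 1D DCT matrix is its transpose."""
    h, w = a.shape
    return dct_matrix(h).T @ a @ dct_matrix(w)
```

Orthonormality also gives Parseval's identity, i.e., the transform preserves the (unweighted) L²-norm of the array, which is used repeatedly below.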

Note that these operators solely depend on the size of R and are oblivious to the position of R within the image. Since no elements R of size smaller than 8 × 8 occur in Algorithm 1, TL_R is always well-defined. In the following, let ⊙ and ./ denote entrywise multiplication and division for matrices of the same size. For a given image I and a mesh 𝒯, we compute an approximate image I_app,𝒯^final via

(3.2) I_app,𝒯^final|_R = IDCT_R(TL_R^{−1}(Q ⊙ round(TL_R(DCT_R(I|_R)) ./ Q)))  for all R ∈ 𝒯.

However, for the adaptive procedure we ignore the quantization and therefore compute the approximate image this way only for the final mesh. For a coarser mesh 𝒯 arising during the adaptive procedure, we compute an approximate image I_app,𝒯 by simply restricting the image to the first 8 × 8 DCT coefficients on each element R ∈ 𝒯, i.e.,

(3.3) I_app,𝒯|_R = IDCT_R(TL_R^{−1}(TL_R(DCT_R(I|_R))))  for all R ∈ 𝒯.

In the next sections we will only use (3.3) and return to (3.2) when the goal is to efficiently store and reconstruct the approximate image. The local and global errors defined in (2.2)–(2.3) thus read

η(R)² := (1/(hw)) Σ_{(i,j)∈R} (I_app,𝒯(i,j) − I(i,j))²

and

E(𝒯)² := Σ_{R∈𝒯} η(R)² = (1/(hw)) Σ_{(i,j)∈I} (I_app,𝒯(i,j) − I(i,j))².
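By orthonormality of the DCT (Parseval), the local error of (3.3) equals the energy of the discarded coefficients, so η(R)² can be evaluated without reconstructing the image. A sketch under our naming:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal 1D DCT-II matrix (as in (3.1c))."""
    M = np.array([[np.sqrt(2.0 / n) * np.cos((2 * k + 1) * i * np.pi / (2 * n))
                   for k in range(n)] for i in range(n)])
    M[0, :] /= np.sqrt(2.0)
    return M

def eta_sq(block, h, w):
    """Squared local error eta(R)^2 for the approximation (3.3): by Parseval
    it equals the energy of the DCT coefficients outside the kept top-left
    8x8 block, divided by the total pixel count h*w of the image."""
    coeffs = dct_matrix(block.shape[0]) @ block @ dct_matrix(block.shape[1]).T
    discarded = coeffs.copy()
    discarded[:8, :8] = 0.0          # the kept coefficients do not contribute
    return float(np.sum(discarded ** 2)) / (h * w)
```

In particular, any block that is exactly representable by the first 8 × 8 coefficients (e.g., a constant block) has η(R) = 0 and will never be refined.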

3.2 The Refinement Property

The key for optimality of Algorithm 1 is the so-called refinement property (also called the subadditivity of the estimator): There exists a universal constant C 0 > 0 such that

(3.4) Σ_{i=1}^{4} η(R_i)² ≤ C_0 η(R)²  with R_1, …, R_4 the children of R, for all R ∈ 𝒯 ∈ 𝕋.

Essentially, the refinement property (3.4) states that refinement of the mesh does not significantly increase the approximation error. Unfortunately, it turns out that this is not true in general in our application, as demonstrated in Figure 2. However, we show that under some assumptions on the distribution of images, there is a high probability that the refinement property (3.4) holds for a given image. To this end, we assume a distortion of each pixel I(i,j) of the original image I of the form ε ξ_{i,j}, where the ξ_{i,j} are i.i.d. standard normal random variables and ε is the amplitude of the distortion. The squared L²-norm of the distorted image Ĩ is then a sum of squares of non-centered normally distributed random variables, i.e.,

‖Ĩ‖₂² = Σ_{(i,j)∈I} (I(i,j) + ε ξ_{i,j})².

The sum of squared standard normal random variables is Chi-squared distributed. The Chi-squared distribution has two parameters: the number of degrees of freedom k, which is the number of terms in the sum, and the non-centrality parameter λ, which is the sum of the squared means of the underlying normal distributions. Thus ‖Ĩ‖₂² is non-central Chi-squared distributed with wh degrees of freedom and non-centrality parameter λ = ‖I‖₂². By Φ_{k,λ}, we denote the cumulative distribution function (cdf) of the non-central Chi-squared distribution (see, e.g., [11] for explicit formulas). The following lemma shows some elementary monotonicity properties of Φ_{k,λ}, which will help us to deal with different non-centrality parameters in the proof of Theorem 3 below.
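The distributional claim can be checked empirically. In the Monte Carlo sketch below we rescale by ε² so that the noise has unit variance, which is the normalization under which the non-central Chi-squared statement applies directly; the stand-in image and all names are ours.

```python
import numpy as np

# For unit-variance noise, ||I~||_2^2 / eps^2 is non-central Chi-squared with
# k = w*h degrees of freedom and non-centrality lambda = ||I||_2^2 / eps^2,
# whose mean is k + lambda.
rng = np.random.default_rng(1)
h, w, eps = 16, 16, 0.5
I = rng.uniform(0.0, 1.0, size=(h, w))          # a stand-in image channel
k, lam = h * w, float(np.sum(I ** 2)) / eps ** 2

samples = np.array([
    np.sum((I + eps * rng.standard_normal((h, w))) ** 2) / eps ** 2
    for _ in range(20000)
])
# the empirical mean should be close to the theoretical mean k + lambda
assert abs(samples.mean() - (k + lam)) / (k + lam) < 0.02
```

The same simulation, restricted to a single element R, is the mechanism behind the high-probability statement for the refinement property (3.4).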

Figure 2: Bottom-left: Coefficient matrix DCT_R(I|_R) of an element R of size 32 × 32 with only one non-zero coefficient, at position (5,5), with amplitude 15. Top: Corresponding image I|_R with its children R_1, …, R_4 marked in red. Bottom-right: Coefficient matrices DCT_{R_i}(I|_{R_i}). The green lines mark the blocks of 8 × 8 coefficients which do not contribute to η(R) or η(R_i), respectively. Thus, for this example, the right-hand side of (3.4) is zero (no non-zero coefficient outside the green block in the left image), while the left-hand side is not.
                                 …
                                 ,
                                 
                                    R
                                    4
                                 
                              
                           
                           
                           {R_{1},\ldots,R_{4}}
                        
                      marked in red. Bottom-right: Coefficient matrices 
                        
                           
                              
                                 
                                    DCT
                                    
                                       R
                                       i
                                    
                                    
                                       -
                                       1
                                    
                                 
                                 ⁢
                                 
                                    (
                                    
                                       
                                          I
                                          |
                                       
                                       
                                          R
                                          i
                                       
                                    
                                    )
                                 
                              
                           
                           
                           {{\rm DCT}_{R_{i}}^{-1}(I|_{R_{i}})}
                        
                     . The green lines mark the block of 
                        
                           
                              
                                 8
                                 ×
                                 8
                              
                           
                           
                           {8\times 8}
                        
                      coefficients which do not contribute to 
                        
                           
                              
                                 η
                                 ⁢
                                 
                                    (
                                    R
                                    )
                                 
                              
                           
                           
                           {\eta(R)}
                        
                      or 
                        
                           
                              
                                 η
                                 ⁢
                                 
                                    (
                                    
                                       R
                                       i
                                    
                                    )
                                 
                              
                           
                           
                           {\eta(R_{i})}
                        
                      respectively. Thus, for this example, the right-hand side of (3.4) is zero (no non-zero coefficient outside the green block in the left image), while the left-hand side is not.
Figure 2

Bottom-left: Coefficient matrix $\mathrm{DCT}_R(I|_R)$ of element R with size $32\times 32$ and only one non-zero coefficient $(5,5)$ with amplitude 15. Top: Corresponding image $I|_R$ with children $R_1,\ldots,R_4$ marked in red. Bottom-right: Coefficient matrices $\mathrm{DCT}_{R_i}^{-1}(I|_{R_i})$. The green lines mark the block of $8\times 8$ coefficients which do not contribute to $\eta(R)$ or $\eta(R_i)$, respectively. Thus, for this example, the right-hand side of (3.4) is zero (no non-zero coefficient outside the green block in the left image), while the left-hand side is not.

Lemma 2.

The cdf of the non-central chi-squared distribution with $k\in\mathbb{N}$ degrees of freedom and non-centrality parameter $\lambda\geq 0$ satisfies $\Phi_{k,\lambda}(x)\geq\Phi_{k,\lambda'}(x)$ for all $0\leq x<\infty$ and all $\lambda\leq\lambda'$.

Proof.

We use the well-known identity $\Phi_{k,\lambda}(x)=1-Q_{k/2}(\sqrt{\lambda},\sqrt{x})$, where Q denotes the Marcum Q-function. According to [13, equation (2)], the derivative can be represented via

$$\frac{\partial}{\partial\mu}Q_{k/2}(\mu,y)=\mu\bigl(Q_{k/2+1}(\mu,y)-Q_{k/2}(\mu,y)\bigr).$$

Together with [1, equation (6)], this leads to

$$(3.5)\qquad \frac{\partial}{\partial\lambda}\Phi_{k,\lambda}(x)=\frac{\partial}{\partial\lambda}\bigl(1-Q_{k/2}(\sqrt{\lambda},\sqrt{x})\bigr)=-\frac{1}{2\sqrt{\lambda}}\Bigl(\sqrt{\lambda}\,\Bigl(\frac{\sqrt{x}}{\sqrt{\lambda}}\Bigr)^{\frac{k}{2}}e^{-\frac{\lambda+x}{2}}I_{k/2}(\sqrt{x\lambda})\Bigr),$$

where the modified Bessel function of the first kind is defined as

$$I_{k/2}(x):=\sum_{m=0}^{\infty}\frac{1}{m!\,\Gamma(m+k/2+1)}\Bigl(\frac{x}{2}\Bigr)^{2m+\frac{k}{2}}.$$

Since every factor on the right-hand side of (3.5) is non-negative, this shows $\frac{\partial}{\partial\lambda}\Phi_{k,\lambda}(x)\leq 0$ for all $0\leq x<\infty$ and concludes the proof. ∎
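The monotonicity of Lemma 2 can also be sanity-checked numerically; the following is a quick sketch (not part of the proof) using SciPy's non-central chi-squared distribution, with arbitrary choices of degrees of freedom and non-centrality parameters:

```python
import numpy as np
from scipy.stats import ncx2

# Check Lemma 2 numerically: for fixed df k, the cdf of the non-central
# chi-squared distribution is non-increasing in the non-centrality lambda.
x = np.linspace(0.0, 60.0, 121)
for k in (1, 7, 64, 448):                                  # arbitrary df values
    for lam_small, lam_big in [(0.5, 1.0), (1.0, 5.0), (5.0, 20.0)]:
        assert np.all(ncx2.cdf(x, df=k, nc=lam_small) >= ncx2.cdf(x, df=k, nc=lam_big))
print("Lemma 2 monotonicity holds on the test grid")
```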

Theorem 3.

Let $I\in\mathbb{R}^{h\times w}$ denote the original image and let $\widetilde I\in\mathbb{R}^{h\times w}$ denote a random distortion obtained by adding $\varepsilon\xi_{i,j}$ to the $(i,j)$-th pixel, where the $\xi_{i,j}$ are i.i.d. standard normal random numbers. Then there exists a constant $0<\delta\leq 1$ such that, for any choice of parameter $C>\delta$, the refinement property (3.4) with constant $C_0=2+2C$ is true with probability

$$p_{\rm ref}\geq\Bigl(1-\Phi_{448,0}\Bigl(\frac{\delta^{2}z}{C^{2}-\delta^{2}}\Bigr)\Bigr)\,\Phi_{64,\varepsilon^{-2}}(z)\quad\text{for all }z\geq 0.$$

The constant δ can be tracked numerically to $\delta\leq 0.13$ for realistic image sizes.

Proof.

Let R denote an element with size $\tilde h\times\tilde w$ and let $R_i$, $i\in\{1,\ldots,4\}$, denote the children of R. We recall the operators $\mathrm{TL}_{R_i}$, $\mathrm{TL}_{R_i}^{-1}$, $\mathrm{DCT}_{R_i}$, and $\mathrm{IDCT}_{R_i}$ defined in (3.1) above. For $i\in\{1,\ldots,4\}$, we consider the operators

$$A_{R_i}\colon\mathbb{R}^{\tilde h\times\tilde w}\to\mathbb{R}^{\frac{\tilde h}{2}\times\frac{\tilde w}{2}},\qquad A_{R_i}:=(\mathrm{Id}-\mathrm{TL}_{R_i}^{-1}\mathrm{TL}_{R_i})\,\mathrm{DCT}_{R_i}\,P_{R_i}\,\mathrm{IDCT}_R\,\mathrm{TL}_R^{-1}\mathrm{TL}_R\,\mathrm{DCT}_R,$$

where $P_{R_i}$ denotes the restriction from R to $R_i$, as well as

$$A\colon\mathbb{R}^{\tilde h\times\tilde w}\to\mathbb{R}^{\tilde h\times\tilde w},\qquad A(I|_R)|_{R_i}:=A_{R_i}(I|_R)\quad\text{for }i=1,\ldots,4.$$

As we will see below, the operator A encodes the difference between the error on the coarse element R and the fine elements $R_i$. As compositions of orthogonal projections, embeddings and (inverse) discrete cosine transforms, the operators $A_{R_i}$ are bounded uniformly in $\tilde h$ and $\tilde w$. Given $\tilde h$ and $\tilde w$, we may compute their norm exactly by calculating the corresponding matrix norm. Obviously, there holds $\delta:=\|A\|\leq 1$, but we can track the norm numerically for various realistic image sizes (see Figure 3 for the results). This suggests that $\delta=0.13$ is a realistic upper bound and improves the constant in the statement compared to $\delta=1$. Note that all the remaining arguments of the proof remain valid with the naive bound $\delta=1$.
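The numerical norm computation can be sketched as follows. This is an illustrative reimplementation, not the authors' code: we assume a square element of size $32\times 32$, the orthonormal 2D DCT-II for $\mathrm{DCT}_R$, and $\mathrm{TL}_R^{-1}\mathrm{TL}_R$ as restriction to the top-left $8\times 8$ coefficient block; the matrix of A is assembled column by column and δ is its largest singular value:

```python
import numpy as np
from scipy.fft import dctn, idctn

n = 32  # assumed square element size (h~ = w~ = 32)

def tl(J):
    # TL^{-1} TL: keep only the top-left 8x8 block of coefficients
    K = np.zeros_like(J)
    K[:8, :8] = J[:8, :8]
    return K

def apply_A(I_R):
    # IDCT_R TL_R^{-1} TL_R DCT_R: low-frequency reconstruction on the parent
    low = idctn(tl(dctn(I_R, norm='ortho')), norm='ortho')
    out = np.zeros_like(I_R)
    m = n // 2
    for r0, c0 in [(0, 0), (0, m), (m, 0), (m, m)]:       # the four children R_i
        coeff = dctn(low[r0:r0 + m, c0:c0 + m], norm='ortho')
        out[r0:r0 + m, c0:c0 + m] = coeff - tl(coeff)     # (Id - TL^{-1}TL) DCT_{R_i}
    return out

# assemble the matrix of A by applying it to the canonical basis vectors
M = np.zeros((n * n, n * n))
for k in range(n * n):
    e = np.zeros(n * n)
    e[k] = 1.0
    M[:, k] = apply_A(e.reshape(n, n)).ravel()
delta = np.linalg.svd(M, compute_uv=False)[0]             # operator norm of A
assert 0.0 < delta <= 1.0 + 1e-9  # A is a composition of contractions
```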

Figure 3

Left: The norm of A for various sizes of R. Right: Upper bound on $1-p_{\rm ref}$ for different values of ε and $z\geq 0$.

We aim to argue that the refinement property (3.4) is satisfied for most images. By assumption, we have $\widetilde I(i,j)=I(i,j)+\varepsilon\xi_{i,j}$ for i.i.d. standard normal variables $\xi_{i,j}$. Note that $\widetilde I$ is a random variable and the law of its squared $L^2$-norm follows a (scaled) non-central chi-squared distribution. The local error $\eta(R_i)$ can be identified with the difference of $I|_{R_i}$ and the corresponding approximate image on $R_i$. Therefore, we compute for the children $R_i$ of R

$$\begin{aligned}\sum_{i=1}^{4}\eta(R_i)^2&=\sum_{i=1}^{4}\bigl\|(\mathrm{Id}-\mathrm{TL}_{R_i}^{-1}\mathrm{TL}_{R_i})\,\mathrm{DCT}_{R_i}P_{R_i}(\widetilde I)\bigr\|_2^2\\&\leq\sum_{i=1}^{4}2\bigl\|(\mathrm{Id}-\mathrm{TL}_{R_i}^{-1}\mathrm{TL}_{R_i})\,\mathrm{DCT}_{R_i}P_{R_i}\mathrm{IDCT}_R(\mathrm{Id}-\mathrm{TL}_R^{-1}\mathrm{TL}_R)\mathrm{DCT}_R(\widetilde I)\bigr\|_2^2+2\|A(\widetilde I)\|_2^2\\&\leq\sum_{i=1}^{4}2\bigl\|P_{R_i}\mathrm{IDCT}_R(\mathrm{Id}-\mathrm{TL}_R^{-1}\mathrm{TL}_R)\mathrm{DCT}_R(\widetilde I)\bigr\|_2^2+2\|A(\widetilde I)\|_2^2,\end{aligned}$$

where we used the triangle inequality and the identity

$$\mathrm{Id}=\mathrm{IDCT}_R(\mathrm{Id}-\mathrm{TL}_R^{-1}\mathrm{TL}_R)\mathrm{DCT}_R+\mathrm{IDCT}_R\,\mathrm{TL}_R^{-1}\mathrm{TL}_R\,\mathrm{DCT}_R.$$

Since $\sum_{i=1}^{4}\|P_{R_i}J\|_2^2=\|J\|_2^2$ for any image J with the size of R, $\|\mathrm{IDCT}_R\|=1$, and $\|A(\widetilde I)\|_2^2\leq\delta^2\|\widetilde I\|_2^2\leq\delta\|\widetilde I\|_2^2$ (using $\delta\leq 1$), we can further estimate

$$\sum_{i=1}^{4}\eta(R_i)^2\leq 2\bigl\|(\mathrm{Id}-\mathrm{TL}_R^{-1}\mathrm{TL}_R)\mathrm{DCT}_R(\widetilde I)\bigr\|_2^2+2\delta\|\widetilde I\|_2^2=2\eta(R)^2+2\delta\|\widetilde I\|_2^2.$$

Next, we show for $C>0$ that

$$(3.6)\qquad \frac{\delta}{C}\|\widetilde I\|_2=\frac{\delta}{C}\|\mathrm{DCT}_R(\widetilde I)\|_2\leq\bigl\|(\mathrm{Id}-\mathrm{TL}_R^{-1}\mathrm{TL}_R)\mathrm{DCT}_R(\widetilde I)\bigr\|_2=\eta(R)$$

holds with high probability. We denote this probability by $0\leq p_{\rm ref}\leq 1$. Note that adding the noise to I is equivalent to adding the noise to $\mathrm{DCT}_R(I)$, since the discrete cosine transform is just a change of orthonormal bases. Thus, it remains to estimate the norm of

$$\bigl((\mathrm{Id}-\mathrm{TL}_R^{-1}\mathrm{TL}_R)\widetilde J\bigr)_{ij}=\begin{cases}0,&1\leq i,j\leq 8,\\\widetilde J_{ij},&\text{else},\end{cases}$$

where $\widetilde J:=(\mathrm{DCT}_R(I)_{i,j}+\varepsilon\xi_{i,j})_{i,j}$. Thus, it suffices to estimate the probability of

$$\frac{\delta}{C}\|\widetilde J\|_2\leq\bigl\|(\mathrm{Id}-\mathrm{TL}_R^{-1}\mathrm{TL}_R)\widetilde J\bigr\|_2.$$

With $M:=\{(i,j)\in\mathbb{N}^2:\ 1\leq i\leq h_R,\ 1\leq j\leq w_R\}$ and $N:=M\setminus\{1,\ldots,8\}^2$, this can be rewritten as

$$p_{\rm ref}=\mathbb{P}\Bigl(\frac{\delta^2}{C^2}\sum_{(i,j)\in M}\bigl(\mathrm{DCT}_R(I)_{i,j}+\varepsilon\xi_{i,j}\bigr)^2\leq\sum_{(i,j)\in N}\bigl(\mathrm{DCT}_R(I)_{i,j}+\varepsilon\xi_{i,j}\bigr)^2\Bigr).$$

Without loss of generality, we may assume that $\|I\|_2=1$ and hence $\sum_{1\leq i,j\leq 8}\mathrm{DCT}_R(I)_{i,j}^2\leq 1$, which leads to

$$p_{\rm ref}=\mathbb{P}\Bigl(\frac{\delta^2}{C^2}X\leq\Bigl(1-\frac{\delta^2}{C^2}\Bigr)Y\Bigr),$$

with independent random variables $X:=\sum_{1\leq i,j\leq 8}(\mathrm{DCT}_R(I)_{i,j}/\varepsilon+\xi_{i,j})^2$ and $Y:=\sum_{(i,j)\in N}(\varepsilon^{-1}\mathrm{DCT}_R(I)_{i,j}+\xi_{i,j})^2$. We set $\mu=\delta^2/C^2$ and denote by $\phi_X,\phi_Y$ and $\Phi_X,\Phi_Y$ the respective density and cumulative distribution functions of X and Y. This leads to

$$p_{\rm ref}=\int_0^{\infty}\int_{\frac{\mu x}{1-\mu}}^{\infty}\phi_Y(y)\,dy\,\phi_X(x)\,dx=\int_0^{\infty}\Bigl(1-\Phi_Y\Bigl(\frac{\mu x}{1-\mu}\Bigr)\Bigr)\phi_X(x)\,dx\geq\int_0^{\infty}\Bigl(1-\Phi_{448,0}\Bigl(\frac{\mu x}{1-\mu}\Bigr)\Bigr)\phi_X(x)\,dx,$$

where we used Lemma 2 for the last inequality as well as the fact that Y contains at least $32\cdot 16-64=448$ terms. This follows from the fact that Algorithm 1 only refines elements with $\tilde h,\tilde w\geq 16$: for $\tilde h=\tilde w=16$, the left-hand side of (3.4) is zero by definition, and since $w_R,h_R$ are powers of two by assumption, the next smallest refined element R satisfies $\max\{h_R,w_R\}\geq 32$, such that $\#N\geq 32\cdot 16-64=448$. For $z\geq 0$, we may further estimate

$$\begin{aligned}p_{\rm ref}&\geq 1-\int_0^{\infty}\Phi_{448,0}\Bigl(\frac{\mu x}{1-\mu}\Bigr)\phi_X(x)\,dx\\&\geq 1-\int_0^{z}\Phi_{448,0}\Bigl(\frac{\mu x}{1-\mu}\Bigr)\phi_X(x)\,dx-(1-\Phi_X(z))\\&\geq\Bigl(1-\Phi_{448,0}\Bigl(\frac{\mu z}{1-\mu}\Bigr)\Bigr)\Phi_X(z)\\&\geq\Bigl(1-\Phi_{448,0}\Bigl(\frac{\mu z}{1-\mu}\Bigr)\Bigr)\Phi_{64,\varepsilon^{-2}}(z).\end{aligned}$$

For the last inequality, we used Lemma 2 and the fact that X is the sum of 64 squared normal distributions whose non-centrality parameter is given by the sum of the squared means, i.e.,

$$\sum_{1\leq i,j\leq 8}\Bigl(\frac{\mathrm{DCT}_R(I)_{i,j}}{\varepsilon}\Bigr)^2\leq\varepsilon^{-2}\|I\|_2^2=\varepsilon^{-2}.$$

Combining the estimates, we obtain with probability $p_{\rm ref}$ that

$$\sum_{i=1}^{4}\eta(R_i)^2\leq(2+2C)\,\eta(R)^2.$$

This concludes the proof. ∎

The cdfs of non-central chi-squared distributions can be found, e.g., in [11] and can be calculated numerically, e.g., in Matlab via the command ncx2cdf. This allows us to evaluate the probability bound from Theorem 3 for the realistic values $\delta=0.13$ and $C=1$. This implies the refinement property (3.4) with $C_0=4$ with probability

$$p_{\rm ref}\geq\begin{cases}1-7.3\cdot 10^{-6},&\varepsilon=0.0075,\\1-5.4\cdot 10^{-9},&\varepsilon=0.008,\\1-1.6\cdot 10^{-12},&\varepsilon=0.0085.\end{cases}$$

See Figure 3 for the behavior of the lower bound for $p_{\rm ref}$ from Theorem 3 with respect to z.
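The same evaluation can be reproduced with SciPy in place of Matlab's ncx2cdf. The following sketch maximizes the bound of Theorem 3 over an ad-hoc grid of z values (grid range and resolution are arbitrary choices):

```python
import numpy as np
from scipy.stats import chi2, ncx2

def p_ref_lower_bound(eps, delta=0.13, C=1.0):
    # Evaluate the bound of Theorem 3,
    #   p_ref >= (1 - Phi_{448,0}(mu*z/(1-mu))) * Phi_{64, eps^-2}(z),  mu = delta^2/C^2,
    # and take the best value over a grid of z >= 0.
    mu = delta**2 / C**2
    z = np.linspace(0.0, 4.0 * eps**-2, 50_000)
    bound = (1.0 - chi2.cdf(mu * z / (1.0 - mu), df=448)) * ncx2.cdf(z, df=64, nc=eps**-2)
    return bound.max()

for eps in (0.0075, 0.008, 0.0085):
    print(eps, 1.0 - p_ref_lower_bound(eps))
```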

Remark 3.

We note that Theorem 3 implies that a noise amplitude of $\varepsilon=0.0075$ ensures a probability of over 99.99% of satisfying the refinement property (3.4) with $C_0=4$. We also emphasize that such noise is not noticeable to the human eye, as demonstrated in Figure 4 even for an amplitude of 0.015. This suggests that images which do not satisfy (3.4) are the rare exception.

Figure 4

Left: The original image. Right: The original image with an additional noise of amplitude 0.015. Image from unsplash.com by artist Kadyn Pierce.

The following result states the optimality of Algorithm 1.

Theorem 4.

Let I be such that the refinement property (3.4) is true. Then, for all iterations $\ell=0,1,\ldots$ of Algorithm 1, the mesh $\mathcal{T}_\ell$ satisfies

$$(3.7)\qquad E(\mathcal{T}_\ell)\leq 12C_0^2\,E_m\quad\text{for all }m\leq\frac{\#\mathcal{T}_\ell}{10},$$

where $C_0>0$ is the constant from (3.4). To create $\mathcal{T}_\ell$, the algorithm uses at most $12C_0^2(\#\mathcal{T}_\ell-\#\mathcal{T}_0+1)$ arithmetic operations and computations of η.

Proof.

Since the refinement property (3.4) is true, the statement follows from [7, Theorem 5.2]. In our case, the number of children is K = 4 (using the notation from [7]), leading to the constant C in [7, Theorem 5.2] being equal to $12C_0^2$. Similarly, the upper bound for m stated in [7] contains the factor $\frac{1}{10}$. We note that we use a slightly different definition of $\mathbb{T}_n$ and hence $E_n$: in [7], $\mathbb{T}_n$ denotes the set of meshes generated by exactly n refinements, while we count the number of elements. Since each refinement increases the element count by three, the number of refinements corresponds to $\frac{1}{3}(\#\mathcal{T}-\#\mathcal{T}_0)$. Thus, both definitions are equivalent and the statement above is proved. ∎

Remark 4.

Using the values $\varepsilon=0.0075$ and $C=1$ from Remark 3, Theorems 3 and 4 imply $E(\mathcal{T}_\ell)\leq 192\,E_m$ for all $m\leq\frac{\#\mathcal{T}_\ell}{10}$ with probability over 99.99%.

Remark 5.

In Algorithm 1, the use of the largest $\tilde\eta(R)$ may require a sorting procedure with complexity $O(N\log N)$. To keep the number of operations of order $O(N)$, we can use binary bins instead of sorting. The result and its proof can be adapted accordingly, with only the absolute constant in the estimate of the theorem changing.
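The binning idea can be sketched as follows (a hypothetical helper, not from the paper): elements are kept in dyadic bins $[2^{-k-1},2^{-k})$ keyed by the binary exponent of $\tilde\eta(R)$, so pushing costs O(1) and popping returns an element whose indicator is at least half the current maximum.

```python
import math
from collections import defaultdict

class DyadicBins:
    """Approximate max-selection for error indicators via binary bins."""
    def __init__(self):
        self.bins = defaultdict(list)
    def push(self, elem, eta):
        # math.frexp(eta) = (m, e) with eta = m * 2**e and 0.5 <= m < 1,
        # hence eta lies in the dyadic interval [2**(e-1), 2**e)
        _, e = math.frexp(eta)
        self.bins[-e].append((elem, eta))
    def pop_near_max(self):
        k = min(self.bins)                  # bin holding the largest indicators
        elem, eta = self.bins[k].pop()
        if not self.bins[k]:
            del self.bins[k]
        return elem, eta

b = DyadicBins()
b.push("R1", 0.9)
b.push("R2", 0.3)
b.push("R3", 0.26)
assert b.pop_near_max()[0] == "R1"          # 0.9 sits alone in the top bin
```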

3.3 Adaptive Compression in Other Norms

A norm which might be better suited to capture the quality loss in images is the BV-norm. A heuristic explanation for this is the fact that the human eye is more sensitive to sharp color/brightness changes, which are not captured by the $L^2$-norm. The BV-norm of a function $u\in\mathrm{BV}\subseteq L^1((0,h]\times(0,w])$ is defined by

$$(3.8)\qquad \|u\|_{\rm BV}:=\|u\|_1+V(u),$$

where

$$V(u):=\sup\Bigl\{\int_{(0,h]\times(0,w]}u(x)\,\mathrm{div}(\phi(x))\,dx:\ \phi\in C_c^1\bigl((0,h]\times(0,w],\mathbb{R}^n\bigr),\ \|\phi\|_\infty\leq 1\Bigr\}$$

denotes the total variation of u. For weakly differentiable functions $w\in L^1$, i.e., $w\in W^{1,1}$, the BV-norm of w is given by

$$\|w\|_{\rm BV}=\|w\|_1+\|\nabla w\|_1.$$

However, in the present setting, the images I and $I_{\rm app}$ represent functions which are constant on each pixel of the image. For such a function u, the total variation computes to

$$V(u)=\sup_{\|\phi\|_\infty\leq 1}\int_{\Omega_I}u\,\mathrm{div}(\phi)\,dx=\sup_{\|\phi\|_\infty\leq 1}\sum_{e\in\mathcal{F}}\int_{e}[u]\,n\cdot\phi\,ds=\sum_{e\in\mathcal{F}}\bigl\|[u]\bigr\|_{L^1(e)},$$

where n denotes the unit normal vector on e, $\Omega_I$ denotes the open subset of $\mathbb{R}^2$ occupied by the pixels of I, and $\mathcal{F}$ denotes the set of interior interfaces between the pixels. We also use the common notation $[u]|_e=u|_{P}-u|_{P'}$ to denote the jump of u over the interface $e=\partial P\cap\partial P'$ between two pixels P and P'. Thus the BV-norm of the approximation error $I-I_{\rm app}$ can be written as

$$\|I-I_{\rm app}\|_{\rm BV}=\frac{1}{hw}\Bigl(\|I-I_{\rm app}\|_1+\sum_{e\in\mathcal{F}}\bigl|[I-I_{\rm app}]|_e\bigr|\Bigr).$$

The new norm induces new local and global errors

$$(3.9)\qquad \eta(R)=\bigl\|(I-I_{\rm app})|_R\bigr\|_{\rm BV}\quad\text{and}\quad E(\mathcal{T})=\sum_{R\in\mathcal{T}}\eta(R)$$

which can be used to drive Algorithm 1.
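For piecewise-constant pixel images, the discrete BV-norm above reduces to a sum of absolute pixel values plus absolute jumps across interior pixel interfaces. A minimal sketch with NumPy (the $1/(hw)$ scaling follows the formula above):

```python
import numpy as np

def bv_norm(u):
    # ||u||_BV = (||u||_1 + sum of |jumps| over interior pixel interfaces) / (h*w)
    h, w = u.shape
    jumps = np.abs(np.diff(u, axis=0)).sum() + np.abs(np.diff(u, axis=1)).sum()
    return (np.abs(u).sum() + jumps) / (h * w)

# a single bright pixel in a 2x2 block: ||u||_1 = 1, two unit jumps
assert abs(bv_norm(np.array([[1.0, 0.0], [0.0, 0.0]])) - 0.75) < 1e-12
```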

Remark 6.

Since we consider norms on finite-dimensional spaces, we immediately obtain equivalence of the $L^2$-norm and the BV-norm. Therefore, the refinement property (3.4) also holds for the BV-norm for most images and hence also the near-optimality result of Theorem 4. However, the constants in (3.4) for the BV-norm depend on the norm equivalence constants, which in turn depend on the image size. A direct computation of the constant seems very difficult, as it would involve mapping properties of the discrete cosine transform as an operator on BV.

4 Efficient Storage of the Adaptive Compression

The adaptive grid produced by Algorithm 1 has to be stored in order to reconstruct the compressed image. This potentially produces overhead compared to standard JPEG compression, which uses a predefined grid. However, using some simple optimizations, we demonstrate that this overhead is not significant.

4.1 Storing I app

Algorithm 1 is applied to each of the color components $X\in\{Y,C_b,C_r\}$ and produces a corresponding mesh $\mathcal{T}^X$. As we neglected the rounding step of the JPEG compression in the adaptive algorithm, we compress the X component of $I_{\rm app}$, denoted $I_{\rm app}^X$, for each element $R\in\mathcal{T}^X$ by:

  1. Compute $F^X(R):=\mathrm{round}\bigl(\mathrm{TL}_R(\mathrm{DCT}_R(I^X|_R))\,./\,Q\bigr)$.

  2. Store the entries of $F^X(R)$ in the order depicted in Figure 5 in a vector $S_R^X$, up to the last non-zero entry.

Note that the discrete cosine transform returns a matrix of coefficients of basis functions, where the coefficients towards the bottom and right correspond to basis functions of higher frequency. Typical images usually have small coefficients for the high-frequency basis functions, which means that after rounding they become zero and are dropped.

Figure 5

Visualizing the enumeration N of the matrix entries.
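Steps (1)-(2) can be sketched as follows, assuming the enumeration in Figure 5 is the JPEG-style zig-zag order (an assumption for illustration; `Q8` is a hypothetical flat quantization matrix standing in for the actual Q, which is not reproduced here):

```python
import numpy as np

def zigzag_indices(n):
    # enumerate an n x n matrix along antidiagonals, alternating direction
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(max(0, s - n + 1), min(s, n - 1) + 1)]
        order.extend(diag if s % 2 else diag[::-1])
    return order

def serialize_block(coeffs, Q8):
    # Step 1: quantize the top-left 8x8 coefficient block entrywise
    F = np.round(coeffs[:8, :8] / Q8).astype(int)
    # Step 2: store in zig-zag order up to the last non-zero entry
    vals = [F[i, j] for i, j in zigzag_indices(8)]
    while vals and vals[-1] == 0:
        vals.pop()
    return vals

Q8 = np.full((8, 8), 16)                    # hypothetical quantization matrix
coeffs = np.zeros((32, 32))
coeffs[0, 0], coeffs[1, 0] = 80.0, 48.0
assert serialize_block(coeffs, Q8) == [5, 0, 3]
```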

To reconstruct $I_{\rm app}^X$ from the vectors $S_R^X$, we also need to store the width w of $I_{\rm app}$, the ratio $\frac{h}{w}$ of $I_{\rm app}$ as well as the width of each $R\in\mathcal{T}^X$. Furthermore, we need some separator bits, denoted by sep, in order to mark the start of the next element in $S_I$. Surprisingly, it is not necessary to store the location of the element R within I. To that end, we introduce a total order on the elements R within a mesh $\mathcal{T}^X$. Recall that an element R is given by the position of its top-left pixel and its width. Therefore, we can simply order the top-left pixels of the elements:

We define the lexicographic order < on the pixels ( i , j ) of I by

$$(4.1)\qquad (i,j)<(i',j')\quad:\Longleftrightarrow\quad (i<i')\ \text{ or }\ \bigl((i=i')\text{ and }(j<j')\bigr).$$

We then sort the elements $\mathcal{T}^X=\{R_1^X,\ldots,R_{n_X}^X\}$ such that their top-left pixels $(j_{R_i},k_{R_i})$ are ordered with respect to < in an ascending fashion.

We consider the vector $S_I$ for an approximate image $I_{\rm app}$ with corresponding meshes $\mathcal{T}^X$ consisting of elements $R_i^X$, $i=1,\ldots,n_X$, with $X\in\{Y,C_b,C_r\}$. Then the vector $S_I$ has the form

$$\begin{aligned}S_I=\Bigl(&\tfrac{h}{w},\,w,\,\mathrm{width}(R_1^Y),\,S_{R_1^Y}^Y,\,\mathrm{sep},\,\mathrm{width}(R_2^Y),\,S_{R_2^Y}^Y,\,\mathrm{sep},\,\ldots,\,\mathrm{width}(R_{n_Y}^Y),\,S_{R_{n_Y}^Y}^Y,\\&\mathrm{width}(R_1^{C_b}),\,S_{R_1^{C_b}}^{C_b},\,\mathrm{sep},\,\ldots,\,\mathrm{width}(R_{n_{C_b}}^{C_b}),\,S_{R_{n_{C_b}}^{C_b}}^{C_b},\\&\mathrm{width}(R_1^{C_r}),\,S_{R_1^{C_r}}^{C_r},\,\mathrm{sep},\,\ldots,\,\mathrm{width}(R_{n_{C_r}}^{C_r}),\,S_{R_{n_{C_r}}^{C_r}}^{C_r}\Bigr).\end{aligned}$$

A further lossless run-length compression step [15] is applied to $S_I$ before storing the data. This is similar to the standard JPEG compression and removes the typically many zeros appearing in the vectors $S_R^X$.

4.2 Reconstruction of the Compressed Image

To reconstruct $I_{\rm app}$, we first reverse the lossless compression of $S_I$. We remove $\frac{h}{w}$ and w from $S_I$ and store them. Then we reconstruct each color component $X\in\{Y,C_b,C_r\}$ individually with the following algorithm.

Algorithm 2.

  1. Initialize an empty $I_{\rm app}^X$ with size $h\times w$.

  2. Find the <-minimal empty pixel $(i,j)$ in $I_{\rm app}^X$; if no empty pixel exists, move to the next color component.

  3. Remove the first available block $(\mathrm{width}(R^X),S_{R^X}^X)$ from $S_I$.

  4. Compute $w_{R^X}=\mathrm{width}(R^X)$ and $h_{R^X}=w_{R^X}\cdot\frac{h}{w}$.

  5. Set

     $$I_{\rm app}^X|_{\{i,\ldots,i+h_{R^X}-1\}\times\{j,\ldots,j+w_{R^X}-1\}}=\mathrm{IDCT}_R\bigl(\mathrm{TL}_{R^X}^{-1}(Q\,.\!*\,F^X(R^X))\bigr)$$

     and go to Step (2).

Remark 7.

Searching for the minimal pixel in Step (2) of Algorithm 2 is quite costly if done in a naive way. However, we notice that the minimum is always located next to the corner pixels of previously reconstructed elements R (precisely, next to the top-right and bottom-left pixels, see Figure 6). Updating a list of those candidate pixels reduces the search effort significantly.
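The candidate-list idea can be sketched as a toy placement routine (hypothetical names, not the authors' code): elements arrive in the stored order, and after placing each one, the pixel to the right of its top-right corner and the pixel below its bottom-left corner become the new candidates.

```python
def place_elements(h, w, sizes):
    """Recover element positions from the stored stream of (height, width) pairs."""
    filled = [[False] * w for _ in range(h)]
    candidates = {(0, 0)}
    placements = []
    for eh, ew in sizes:
        # Step 2: the "<"-minimal empty pixel is the lexicographic minimum
        # over the small candidate set instead of a full scan of the image
        i, j = min(c for c in candidates if not filled[c[0]][c[1]])
        for r in range(i, i + eh):
            for c in range(j, j + ew):
                filled[r][c] = True
        candidates.discard((i, j))
        if j + ew < w:
            candidates.add((i, j + ew))     # right of the top-right corner
        if i + eh < h:
            candidates.add((i + eh, j))     # below the bottom-left corner
        placements.append((i, j))
    return placements

# four 2x2 children of a 4x4 image appear in "<"-ascending order
assert place_elements(4, 4, [(2, 2)] * 4) == [(0, 0), (0, 2), (2, 0), (2, 2)]
```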

Figure 6

Visualizing the list of candidates for the “ < ”-minimal empty pixel. The arrow marks the minimum.

5 Practical Performance of the Adaptive Compression Algorithm

We perform a limited number of experiments with Algorithm 1 in order to test its performance on the realistic images depicted in Figure 7. As shown in Figures 9 and 10, we choose the tolerance τ in Algorithm 1 such that no quality difference between the original and the compression is immediately visible. Figure 8 compares the storage size of the original JPEG file and the adaptive recompression with Algorithm 1.

We observe that the storage space for larger JPEG files can be significantly reduced with adaptive compression, while smaller files show little improvement. Since Algorithm 1 does not produce elements smaller than $8\times 8$, the finest obtainable mesh is the one of the standard JPEG algorithm. Hence, heuristically, the required storage space for any given image is at most about the same as for JPEG, although a rigorous statement on the worst case is non-trivial.

Figure 7

From left to right, top to bottom: Test images from unsplash.com (artist and size): (1) Kadyn Pierce (4000x6000), (2) Willian Justen de Vasconcellos (3024x4032), (3) Emre Raizyq (6240x4160), (4) Kevin Charit (5780x3853), (5) Vishwajeet Nishad (2560x1920), (6) Christoph Nolte (2400x1920), (7) Lukasz Rawa (960x640), (8) Mark Stuckey (1440x1920).

Figure 8

Storage space required for the original JPEG files (JPG) and their adaptive recompression using Algorithm 1 (AJPG).

Figure 9

Comparison of the original image (top) with the two outputs of the adaptive algorithm (middle) for small and large τ. Closeup of the original (bottom-left) and the recompression (bottom-right) used in Figure 8. Image from unsplash.com by artist Kadyn Pierce.

Figure 10

Comparison of the original image (top) with the two output of the adaptive algorithm (middle) for small and large τ. Closeup of original (bottom-left) and recompression (bottom-right) used in Figure 8. Image from unsplash.com by artist Willian Justen de Vasconcellos.

Figure 11

Comparison of the original image (top) with the output of the adaptive algorithm with respect to the L2 norm (middle-left) and the BV-norm (middle-right). The bottom row compares a detail of the L2 output (left) with the BV output (right), which suggests that the BV-norm may be better suited. The respective threshold parameters are chosen such that both outputs take up almost the same storage space. Image from unsplash.com by artist Kevin Charit.

6 Conclusion

We demonstrated the feasibility of adaptive JPEG compression both in practical experiments and in mathematical optimality results. From an efficiency standpoint, it would make sense to start with a finer uniform grid 𝒯₀. Both Algorithm 1 and Algorithm 2 parallelize in a straightforward fashion over the elements R ∈ 𝒯₀ of the initial grid.
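Since the initial-grid elements do not interact, the per-element work distributes over a worker pool without synchronization. The following is a minimal sketch of this idea, not the authors' implementation: `compress_element` is a hypothetical stand-in for running the adaptive algorithm on one element.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def compress_element(block):
    """Stand-in for running the adaptive refinement on one initial-grid
    element; here it just returns the block's mean as a dummy result."""
    return float(block.mean())


def compress_parallel(image, block_size):
    """Split the image into disjoint initial-grid elements and process
    them independently.  A thread pool is used here for simplicity; a
    process pool would suit CPU-bound per-element work."""
    h, w = image.shape
    blocks = [image[i:i + block_size, j:j + block_size]
              for i in range(0, h, block_size)
              for j in range(0, w, block_size)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(compress_element, blocks))
```
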

The method can be extended in several other ways. A first idea is to apply the algorithm to videos instead of images. This would require running Algorithm 1 on a mesh consisting of three-dimensional cuboid elements R (each cuboid can be split into eight children) and approximating I|_R by a discrete 3D cosine transform (or a variant which treats the time coordinate separately).
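As a hedged illustration of the transform step on one such cuboid (the block size 8×8×8 and threshold τ are assumptions for the example, not values from the paper), the separable 3D DCT-II with orthonormal scaling makes the discarded coefficients directly control the L2 error:

```python
import numpy as np
from scipy.fft import dctn, idctn

# Hypothetical 8x8x8 cuboid element R: two spatial axes and one time axis.
rng = np.random.default_rng(0)
cuboid = rng.uniform(0.0, 255.0, size=(8, 8, 8))

# 3D DCT-II coefficients of I|_R; for smooth content the energy
# concentrates in the low frequencies.
coeffs = dctn(cuboid, type=2, norm="ortho")

# Crude quantization sketch: discard coefficients below a threshold tau.
tau = 5.0
kept = np.where(np.abs(coeffs) >= tau, coeffs, 0.0)

# The inverse transform reconstructs an approximation of the cuboid;
# by Parseval the L2 error is the norm of the discarded coefficients.
approx = idctn(kept, type=2, norm="ortho")
```

With `norm="ortho"` the transform is orthonormal, so thresholding the coefficients bounds the reconstruction error by τ times the square root of the number of discarded entries.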

Another extension of the method applies Binev’s hp-approximation algorithm [5]. Instead of increasing the polynomial degree (as is done in the finite element method), p-refinement can modify the quantization matrix locally on each element R, or it can alter the number of frequencies that are stored for each R.
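A possible reading of the second variant, as a hypothetical sketch (the cutoff rule i + j ≤ p is an assumption for illustration, not taken from [5]): a local parameter p on each element determines how many DCT frequencies survive.

```python
import numpy as np


def truncate_frequencies(coeffs, p):
    """Hypothetical local 'p refinement': keep only the 2D DCT
    coefficients whose total frequency index i + j does not exceed p,
    zeroing the rest.  A larger p on an element R stores more
    frequencies there, at a higher storage cost."""
    n, m = coeffs.shape
    mask = np.add.outer(np.arange(n), np.arange(m)) <= p
    return np.where(mask, coeffs, 0.0)
```
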

Finally, it would be interesting to replace the norm in the local error indicators by a more sophisticated measure of the subjective quality loss of images.

Award Identifier / Grant number: 258734477

Funding statement: Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 258734477 – SFB 1173 as well as the Austrian Science Fund (FWF) under the special research program Taming complexity in PDE systems (grant SFB F65). The authors acknowledge TU Wien Bibliothek for financial support through its Open Access Funding Programme.

References

[1] A. Annamalai, C. Tellambura and J. Matyjas, A new twist on the generalized Marcum Q-function Q_M(a, b) with fractional-order M and its applications, 2009 6th IEEE Consumer Communications and Networking Conference, IEEE Press, Piscataway (2009), 1–5. 10.1109/CCNC.2009.4784840

[2] M. Ainsworth and J. T. Oden, A Posteriori Error Estimation in Finite Element Analysis, Pure Appl. Math. (New York), Wiley-Interscience, New York, 2000. 10.1002/9781118032824

[3] I. Babuška and A. Miller, A feedback finite element method with a posteriori error estimation. I. The finite element method and some basic properties of the a posteriori error estimator, Comput. Methods Appl. Mech. Engrg. 61 (1987), no. 1, 1–40. 10.1016/0045-7825(87)90114-9

[4] I. Babuška and M. Vogelius, Feedback and adaptive finite element solution of one-dimensional boundary value problems, Numer. Math. 44 (1984), no. 1, 75–102. 10.1007/BF01389757

[5] P. Binev, Tree approximation for hp-adaptivity, SIAM J. Numer. Anal. 56 (2018), no. 6, 3346–3357. 10.1137/18M1175070

[6] P. Binev, W. Dahmen and R. DeVore, Adaptive finite element methods with convergence rates, Numer. Math. 97 (2004), no. 2, 219–268. 10.1007/s00211-003-0492-7

[7] P. Binev and R. DeVore, Fast computation in adaptive tree approximation, Numer. Math. 97 (2004), no. 2, 193–217. 10.1007/s00211-003-0493-6

[8] C. Carstensen, M. Feischl, M. Page and D. Praetorius, Axioms of adaptivity, Comput. Math. Appl. 67 (2014), no. 6, 1195–1253. 10.1016/j.camwa.2013.12.003

[9] J. M. Cascon, C. Kreuzer, R. H. Nochetto and K. G. Siebert, Quasi-optimal convergence rate for an adaptive finite element method, SIAM J. Numer. Anal. 46 (2008), no. 5, 2524–2550. 10.1137/07069047X

[10] W. Dörfler, A convergent adaptive algorithm for Poisson’s equation, SIAM J. Numer. Anal. 33 (1996), no. 3, 1106–1124. 10.1137/0733054

[11] B. Elpelt, J. Hartung and K. Klösener, Statistik, Oldenbourg Verlag, München, 1999.

[12] F. Ernawan, N. A. Abu and N. Suryana, An adaptive JPEG image compression using psychovisual model, Adv. Sci. Lett. 20 (2014), 26–31. 10.1166/asl.2014.5255

[13] R. Esposito, Comment on partial differentials of Marcum’s Q function, Proc. IEEE 56 (1968), 2195. 10.1109/PROC.1968.6856

[14] M. Feischl, Inf-sup stability implies quasi-orthogonality, Math. Comp. 91 (2022), no. 337, 2059–2094. 10.1090/mcom/3748

[15] J. Hopkins, Compression routines, MATLAB Central File Exchange, 2021, accessed 31 May 2022.

[16] G. Hudson, A. Léger, B. Niss, I. Sebestyén and J. Vaaben, JPEG-1 standard 25 years: Past, present, and future reasons for a success, J. Electron. Imag. 27 (2018), no. 4, Article ID 040901. 10.1117/1.JEI.27.4.040901

[17] C. Lambrecht, Vision models and applications to image and video processing, 2001.

[18] P. Morin, R. H. Nochetto and K. G. Siebert, Data oscillation and convergence of adaptive FEM, SIAM J. Numer. Anal. 38 (2000), no. 2, 466–488. 10.1137/S0036142999360044

[19] S. Repin, A Posteriori Estimates for Partial Differential Equations, Radon Ser. Comput. Appl. Math. 4, Walter de Gruyter, Berlin, 2008. 10.1515/9783110203042

[20] R. Rosenholtz and A. B. Watson, Perceptual adaptive JPEG coding, Proceedings of 3rd IEEE International Conference on Image Processing, IEEE Press, Piscataway (1996), 901–904. 10.1109/ICIP.1996.559645

[21] R. Stevenson, Optimality of a standard adaptive finite element method, Found. Comput. Math. 7 (2007), no. 2, 245–269. 10.1007/s10208-005-0183-0

[22] U. K. Tak, N. Ji, D. Qi and Z. Tang, An adaptive quantization technique for JPEG based on non-uniform rectangular partition, Future Wireless Networks and Information Systems, Springer, Berlin (2012), 179–187. 10.1007/978-3-642-27323-0_23

[23] U. K. Tak, Z. Tang and D. Qi, A non-uniform rectangular partition coding of digital image and its application, 2009 International Conference on Information and Automation, IEEE Press, Piscataway (2009), 995–999. 10.1109/ICINFA.2009.5205063

[24] R. Verfürth, A Posteriori Error Estimation Techniques for Finite Element Methods, Numer. Math. Sci. Comput., Oxford University, Oxford, 2013. 10.1093/acprof:oso/9780199679423.001.0001

[25] Z. Wang and A. C. Bovik, Modern Image Quality Assessment, Springer, Cham, 2006. 10.1007/978-3-031-02238-8

[26] ISO/IEC 10918-1:1994, Information technology – Digital compression and coding of continuous-tone still images: Requirements and guidelines, 1994.

Received: 2023-04-04
Revised: 2023-11-06
Accepted: 2023-11-08
Published Online: 2024-01-02
Published in Print: 2024-04-01

© 2024 Walter de Gruyter GmbH, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
