
Refining uniform approximation algorithm for low-rank Chebyshev embeddings

Stanislav Morozov, Dmitry Zheltkov, and Alexander Osinsky
Published/Copyright: October 31, 2024

Abstract

Nowadays, low-rank approximations are a critical component of many numerical procedures. Traditionally, the problem of low-rank matrix approximation is solved in unitarily invariant norms, such as the Frobenius or spectral norm, because efficient methods for constructing such approximations exist. However, recent results reveal the potential of low-rank approximations in the Chebyshev norm, which arises naturally in many applications. In this paper, we investigate the problem of uniform approximation of vectors, which is the main component of low-rank matrix approximation in the Chebyshev norm. The principal novelty of this paper is an accelerated algorithm for solving the uniform approximation problem. We also analyze the iterative procedure of the proposed algorithm and demonstrate that it has a geometric convergence rate. Finally, we provide an extensive numerical evaluation, which demonstrates the effectiveness of the proposed procedures.

MSC 2010: 65F30

Funding statement: This work was supported by the Moscow Center of Fundamental and Applied Mathematics at INM RAS (Agreement with the Ministry of Education and Science of the Russian Federation No. 075-15-2022-286).


A Proof of the lemma

In this section we provide the proof of Lemma 4.1. We will need several technical lemmas.

Lemma A.1

For any pairwise distinct i, j, and k, if

$$\operatorname{sign}\bigl(\hat D_{ji} D_j \hat D_{ki} D_k\bigr) = -1$$

then

$$\operatorname{sign}\bigl(\hat D_{ij} D_i \hat D_{kj} D_k\bigr) = 1.$$

Proof

Let us assume the opposite, that $\operatorname{sign}(\hat D_{ij} D_i \hat D_{kj} D_k) = -1$. Since $\hat D_{ij} = (-1)^{i-j+1} \hat D_{ji}$, we have

$$\operatorname{sign}\bigl(\hat D_{ji} D_j \hat D_{ki} D_k\bigr) = -1, \qquad \operatorname{sign}\bigl(\hat D_{ji} D_i \hat D_{kj} D_k\bigr) = (-1)^{i-j}.$$

Let $\operatorname{sign}(D_k \hat D_{ji}) = \delta$, where $\delta = \pm 1$. Then

$$\operatorname{sign}(D_j \hat D_{ki}) = -\delta, \qquad \operatorname{sign}(D_i \hat D_{kj}) = (-1)^{i-j}\delta, \qquad \operatorname{sign}(D_k \hat D_{ji}) = \delta.$$

Since $V$ is a Chebyshev matrix, the vector $v_i$ can be uniquely represented as

$$v_i = \alpha v_j + \gamma v_k + z_1 \tag{A.1}$$

where $z_1$ belongs to the linear hull of those vectors among $v_1, \dots, v_{r+1}$ that are distinct from $v_i$, $v_j$, and $v_k$. We also represent

$$v_j = \beta v_i + z_2 \tag{A.2}$$

where $z_2$ belongs to the linear hull of those vectors among $v_1, \dots, v_{r+1}, v_{r+2}$ that are distinct from $v_i$, $v_j$, and $v_k$.

From (A.1) and the properties of the determinant we obtain

$$D_j = \alpha (-1)^{i-j+1} D_i.$$

Similarly, from (A.2) and the properties of the determinant we deduce that

$$\hat D_{ki} = -\beta \hat D_{kj}.$$

Hence,

$$D_j \hat D_{ki} = \alpha (-1)^{i-j+1} D_i \,\bigl(-\beta \hat D_{kj}\bigr) = \alpha\beta (-1)^{i-j} D_i \hat D_{kj}.$$

Taking the signs of both sides we get

$$-\delta = \operatorname{sign}(\alpha\beta)\,(-1)^{i-j}(-1)^{i-j}\,\delta$$

whence

$$\operatorname{sign}(\alpha\beta) = -1.$$

Note that the coefficients in (A.1) can be expressed by Cramer's formulas, and since $V$ is a Chebyshev matrix, we have $\alpha, \gamma \neq 0$. Then we can express the vector $v_k$ from (A.1) as

$$v_k = \frac{1}{\gamma} v_i - \frac{\alpha}{\gamma} v_j - \frac{1}{\gamma} z_1.$$

Then

$$D_j = -\frac{\alpha}{\gamma}(-1)^{k-j+1} D_k = \frac{\alpha}{\gamma}(-1)^{k-j} D_k. \tag{A.3}$$

From (A.1) and (A.2) we have

$$v_j = \beta(\alpha v_j + \gamma v_k + z_1) + z_2$$

hence

$$(1 - \alpha\beta)v_j = \beta\gamma v_k + \beta z_1 + z_2. \tag{A.4}$$

Note that $\beta z_1 + z_2$ is a linear combination of vectors among $v_1, \dots, v_{r+2}$ distinct from $v_i$, $v_j$, and $v_k$. Since $\beta\gamma \neq 0$ by the arguments above, the right-hand side of (A.4) is a non-trivial linear combination of $r$ linearly independent vectors, whence $\beta\gamma v_k + \beta z_1 + z_2 \neq 0$. Therefore, $1 - \alpha\beta \neq 0$. Then

$$v_j = \frac{\beta\gamma}{1 - \alpha\beta} v_k + \frac{\beta z_1 + z_2}{1 - \alpha\beta}$$

and we can derive that

$$\hat D_{ki} = \frac{\beta\gamma}{1 - \alpha\beta}(-1)^{k-j+1}\hat D_{ji} \tag{A.5}$$

so we have from (A.3) and (A.5) that

$$D_j \hat D_{ki} = \frac{\alpha}{\gamma}(-1)^{k-j} D_k\, \frac{\beta\gamma}{1-\alpha\beta}(-1)^{k-j+1}\hat D_{ji} = -\frac{\alpha\beta}{1-\alpha\beta} D_k \hat D_{ji}.$$

Taking the signs of both sides we get

$$-\delta = \operatorname{sign}(\alpha\beta)\,\operatorname{sign}(1-\alpha\beta)\,(-1)\,\delta$$

therefore,

$$\operatorname{sign}(\alpha\beta) = \operatorname{sign}(1-\alpha\beta) = -1$$

which is a contradiction: $\alpha\beta < 0$ implies $1 - \alpha\beta > 0$.□

Lemma A.2

Let all components of the vector $w \in \mathbb{R}^{r+2}$ be non-zero and such that the signs in the sequence

$$w_1 D_1, w_2 D_2, \dots, w_{r+1} D_{r+1}$$

alternate. Let

$$\operatorname{sign}\bigl(\hat w_{ji}\hat D_{ji}\,\hat w_{ki}\hat D_{ki}\bigr) = (-1)^{j-k+1}.$$

Then

  1. if $i = j$, then

     $$\operatorname{sign}\bigl(\hat w_{jk}\hat D_{jk}\,\hat w_{kk}\hat D_{kk}\bigr) = (-1)^{j-k};$$

  2. if $i = k$, then

     $$\operatorname{sign}\bigl(\hat w_{kj}\hat D_{kj}\,\hat w_{jj}\hat D_{jj}\bigr) = (-1)^{j-k};$$

  3. if $i \neq j$ and $i \neq k$, then

     $$\operatorname{sign}\bigl(\hat w_{ij}\hat D_{ij}\,\hat w_{kj}\hat D_{kj}\bigr) = (-1)^{i-k}$$

     and

     $$\operatorname{sign}\bigl(\hat w_{ik}\hat D_{ik}\,\hat w_{jk}\hat D_{jk}\bigr) = (-1)^{i-j}.$$
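As an illustration of the indexing, the first case for the sample triple $(i, j, k) = (1, 1, 3)$ reads as follows, using the values $\hat w_{11} = w_{r+2}$, $\hat D_{11} = D_1$, $\hat w_{31} = w_3$, $\hat w_{13} = w_1$, $\hat w_{33} = w_{r+2}$, and $\hat D_{33} = D_3$ that the proof below relies on:

$$\operatorname{sign}\bigl(w_{r+2} D_1\, w_3 \hat D_{31}\bigr) = (-1)^{1-3+1} = -1 \quad\Longrightarrow\quad \operatorname{sign}\bigl(w_1 \hat D_{13}\, w_{r+2} D_3\bigr) = (-1)^{1-3} = 1.$$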

Proof

Let $j = k$. Then the condition of the lemma cannot be fulfilled, since

$$\operatorname{sign}\bigl(\hat w_{ji}\hat D_{ji}\,\hat w_{ji}\hat D_{ji}\bigr) = 1 \neq (-1)^{j-k+1} = -1.$$

Let $i = j \neq k$. Then the condition of the lemma can be written as

$$\operatorname{sign}\bigl(\hat w_{ii}\hat D_{ii}\,\hat w_{ki}\hat D_{ki}\bigr) = (-1)^{i-k+1}$$

which, by the definitions of $\hat D_{ii}$ and $\hat w_{ii}$, can be rewritten as

$$\operatorname{sign}\bigl(w_{r+2} D_i\, w_k \hat D_{ki}\bigr) = (-1)^{i-k+1}.$$

Since the signs in the sequence

$$w_1 D_1, w_2 D_2, \dots, w_{r+1} D_{r+1}$$

alternate,

$$\operatorname{sign}(w_i D_i\, w_k D_k) = (-1)^{i-k}.$$

Let us multiply the last two equations and get

$$\operatorname{sign}\bigl(w_{r+2} w_i D_k \hat D_{ki}\bigr) = \operatorname{sign}(w_i D_i\, w_k D_k)\,\operatorname{sign}\bigl(w_{r+2} D_i\, w_k \hat D_{ki}\bigr) = (-1)^{i-k}(-1)^{i-k+1} = -1.$$

Since $\hat D_{ik} = (-1)^{i-k+1}\hat D_{ki}$, we have

$$\operatorname{sign}\bigl(w_{r+2} D_k\, w_i \hat D_{ik}\bigr) = (-1)(-1)^{i-k+1} = (-1)^{i-k}.$$

It remains to note that $\hat w_{kk} = w_{r+2}$, $\hat w_{ik} = w_i$, and $\hat D_{kk} = D_k$, whence

$$\operatorname{sign}\bigl(\hat w_{kk}\hat D_{kk}\,\hat w_{ik}\hat D_{ik}\bigr) = (-1)^{i-k}.$$

Thus, the first part of the lemma is proved. The second part can be obtained from the first one by rearranging the factors.

Let us prove the third part. Let i, j, and k be pairwise distinct. Then the condition of the lemma can be written as

$$\operatorname{sign}\bigl(w_j \hat D_{ji}\, w_k \hat D_{ki}\bigr) = (-1)^{j-k+1}. \tag{A.6}$$

Let us prove that in this case

$$\operatorname{sign}\bigl(w_i \hat D_{ij}\, w_k \hat D_{kj}\bigr) = (-1)^{i-k}.$$

On the contrary, let

$$\operatorname{sign}\bigl(w_i \hat D_{ij}\, w_k \hat D_{kj}\bigr) = (-1)^{i-k+1}. \tag{A.7}$$

Multiplying (A.6) and

$$\operatorname{sign}(w_j D_j\, w_k D_k) = (-1)^{j-k}$$

we get

$$\operatorname{sign}\bigl(\hat D_{ji} D_j \hat D_{ki} D_k\bigr) = \operatorname{sign}\bigl(w_j\hat D_{ji}\,w_k\hat D_{ki}\bigr)\,\operatorname{sign}(w_j D_j\, w_k D_k) = (-1)^{j-k+1}(-1)^{j-k} = -1. \tag{A.8}$$

Similarly, multiplying (A.7) and

$$\operatorname{sign}(w_i D_i\, w_k D_k) = (-1)^{i-k}$$

we get

$$\operatorname{sign}\bigl(\hat D_{ij} D_i \hat D_{kj} D_k\bigr) = \operatorname{sign}\bigl(w_i\hat D_{ij}\,w_k\hat D_{kj}\bigr)\,\operatorname{sign}(w_i D_i\, w_k D_k) = (-1)^{i-k+1}(-1)^{i-k} = -1. \tag{A.9}$$

It remains to note that (A.8) and (A.9) contradict Lemma A.1, whence

$$\operatorname{sign}\bigl(w_i \hat D_{ij}\, w_k \hat D_{kj}\bigr) = (-1)^{i-k}. \tag{A.10}$$

If we multiply (A.6) and (A.10), we get

$$\operatorname{sign}\bigl(w_j\hat D_{ji}\,w_k\hat D_{ki}\bigr)\,\operatorname{sign}\bigl(w_i\hat D_{ij}\,w_k\hat D_{kj}\bigr) = \operatorname{sign}\bigl(w_j\hat D_{ki}\,w_i\hat D_{kj}\,\hat D_{ij}\hat D_{ji}\bigr) = (-1)^{i-j+1}\operatorname{sign}\bigl(w_j\hat D_{kj}\,w_i\hat D_{ki}\bigr)$$

whence

$$\operatorname{sign}\bigl(w_j\hat D_{kj}\,w_i\hat D_{ki}\bigr) = (-1)^{i-j+1}(-1)^{j-k+1}(-1)^{i-k} = 1.$$

Due to $\hat D_{ki} = (-1)^{k-i+1}\hat D_{ik}$ and $\hat D_{kj} = (-1)^{j-k+1}\hat D_{jk}$, we get

$$\operatorname{sign}\bigl(w_j\hat D_{jk}\,w_i\hat D_{ik}\bigr) = (-1)^{k-i+1}(-1)^{j-k+1} = (-1)^{j-i}. \tag{A.11}$$

It remains to note that (A.10) and (A.11), up to notation, correspond to the statement of the lemma.□

Lemma A.3

Let a matrix $S \in \mathbb{R}^{r \times r}$ be such that $s_{ij} \in \{-1, 1\}$ and the following property holds: if for a triple (i, j, k), where $1 \leqslant i, j, k \leqslant r$, we have $s_{ij}s_{ik} = (-1)^{j-k+1}$, then

  1. if $i = j$, then

     $$s_{ki}s_{kk} = (-1)^{i-k};$$

  2. if $i = k$, then

     $$s_{ji}s_{jj} = (-1)^{i-j};$$

  3. if i, j, and k are pairwise distinct, then

     $$s_{ji}s_{jk} = (-1)^{i-k}$$

     and

     $$s_{ki}s_{kj} = (-1)^{i-j}.$$

Then the matrix S has a row with alternating signs.

Proof

We prove the lemma by induction on r. Let r = 2. If $s_{11}s_{12} = -1$, then the first row has alternating signs. Otherwise, $s_{11}s_{12} = 1$ and the condition of the lemma is fulfilled for i = 1, j = 1, k = 2. Then

$$s_{21}s_{22} = -1$$

and the second row has alternating signs.

Let us assume the statement is true for r − 1 and prove it for a matrix of size r × r. If the condition of the lemma is fulfilled for a matrix, it is also fulfilled for its leading principal submatrix of order r − 1. By the induction hypothesis, this submatrix contains a row t with alternating signs. If $s_{t,r-1}s_{t,r} = -1$, then the signs alternate in the row t of the whole matrix. Let $s_{t,r-1}s_{t,r} = 1$. Then, due to the alternation of signs in the row t, we have

$$s_{t,i}s_{t,r} = (-1)^{i-r+1}, \qquad i = 1, \dots, r-1.$$

Then for i = t we have

$$s_{r,t}s_{r,r} = (-1)^{t-r}$$

and for $i \neq t$ we have

$$s_{r,t}s_{r,i} = (-1)^{t-i}, \qquad i \neq t$$

so

$$s_{r,t}s_{r,i} = (-1)^{t-i}, \qquad i = 1, \dots, r$$

which corresponds to the alternation of signs in the last row.□
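Since Lemma A.3 is a purely finite statement about $\pm 1$ matrices, it can also be checked exhaustively for small orders. Below is a minimal brute-force sketch in Python; the indices are 0-based, which leaves all parities in the exponents unchanged, and the helper names are illustrative rather than taken from the paper.

```python
import itertools

def has_property(S, r):
    """Check the sign property of Lemma A.3 for a +/-1 matrix S (0-based)."""
    for i, j, k in itertools.product(range(r), repeat=3):
        # Hypothesis of the property: s_ij * s_ik = (-1)^(j-k+1).
        if S[i][j] * S[i][k] != (-1) ** ((j - k + 1) % 2):
            continue
        if i == j:
            if S[k][i] * S[k][k] != (-1) ** ((i - k) % 2):
                return False
        elif i == k:
            if S[j][i] * S[j][j] != (-1) ** ((i - j) % 2):
                return False
        elif j != k:  # i, j, k pairwise distinct
            if S[j][i] * S[j][k] != (-1) ** ((i - k) % 2):
                return False
            if S[k][i] * S[k][j] != (-1) ** ((i - j) % 2):
                return False
    return True

def has_alternating_row(S, r):
    """Check whether some row of S has alternating signs."""
    return any(all(S[t][i] * S[t][i + 1] == -1 for i in range(r - 1))
               for t in range(r))

r = 3
for bits in itertools.product((-1, 1), repeat=r * r):
    S = [list(bits[m * r:(m + 1) * r]) for m in range(r)]
    if has_property(S, r):
        assert has_alternating_row(S, r)  # Lemma A.3 for this S
print("Lemma A.3 verified for all sign matrices of order", r)
```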

Proof of Lemma 4.1

Let us define the matrix $S \in \mathbb{R}^{(r+1)\times(r+1)}$ by $s_{ki} = \operatorname{sign}\bigl(\hat w_{ik}\hat D_{ik}\bigr)$. It is easy to see that the conditions of Lemma A.2 are met, from which the requirements of Lemma A.3 follow, which in turn implies the statement of Lemma 4.1.□
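As a usage illustration, once such a sign matrix $S$ is formed, the row promised by Lemma A.3 can be located by a direct scan. The sketch below uses a hand-made toy $\pm 1$ matrix rather than signs of actual minors $\hat D_{ik}$:

```python
import numpy as np

def alternating_row(S):
    """Return the index of a row of the +/-1 matrix S whose signs alternate.

    Lemma A.3 guarantees that such a row exists whenever S satisfies
    the sign property established via Lemma A.2.
    """
    r = S.shape[0]
    for t in range(r):
        if all(S[t, i] * S[t, i + 1] == -1 for i in range(r - 1)):
            return t
    raise ValueError("no row with alternating signs")

# Toy +/-1 matrix: row 1 alternates as (1, -1, 1).
S = np.array([[ 1,  1, -1],
              [ 1, -1,  1],
              [-1, -1, -1]])
print(alternating_row(S))  # prints 1
```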

Received: 2024-08-21
Accepted: 2024-08-29
Published Online: 2024-10-31
Published in Print: 2024-11-26

© 2024 Walter de Gruyter GmbH, Berlin/Boston
