
Gaussian variant of Freivalds’ algorithm for efficient and reliable matrix product verification

  • Hao Ji, Michael Mascagni and Yaohang Li
Published/Copyright: October 8, 2020

Abstract

In this article, we consider the general problem of checking the correctness of matrix multiplication. Given three n×n matrices 𝐴, 𝐵 and 𝐶, the goal is to verify that A×B=C without carrying out the computationally costly operations of multiplying 𝐴 and 𝐵 and comparing the product A×B with 𝐶, term by term. This is especially important when some or all of these matrices are very large, and when the computing environment is prone to soft errors. Here we extend Freivalds’ algorithm to a Gaussian Variant of Freivalds’ Algorithm (GVFA) by projecting the product A×B as well as 𝐶 onto a Gaussian random vector and then comparing the resulting vectors. The computational complexity of GVFA is consistent with that of Freivalds’ algorithm, which is O(n²). However, unlike Freivalds’ algorithm, whose probability of a false positive is 2⁻ᵏ, where 𝑘 is the number of iterations, our theoretical analysis shows that, when A×B≠C, GVFA produces a false positive on a set of inputs of measure zero with exact arithmetic. When we introduce round-off error and floating-point arithmetic into our analysis, we can show that the larger this error, the higher the probability that GVFA avoids false positives. Moreover, by iterating GVFA 𝑘 times, the probability of a false positive decreases as pᵏ, where 𝑝 is a very small value depending on the nature of the fault in the result matrix and the arithmetic system’s floating-point precision. Unlike deterministic algorithms, there do not exist any fault patterns that are completely undetectable with GVFA. Thus GVFA can be used to provide efficient fault tolerance in numerical linear algebra, and it can be efficiently implemented on modern computing architectures. In particular, GVFA can be very efficiently implemented on architectures with hardware support for fused multiply-add operations.
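The verification step described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration only: the function name `gvfa_check`, the tolerance `tol`, and the relative comparison used to absorb round-off are assumed details, not taken from the paper. Each iteration costs O(n²) because only matrix-vector products are formed, never the full product A×B.

```python
import numpy as np

def gvfa_check(A, B, C, k=1, tol=1e-8, rng=None):
    """Sketch of the Gaussian variant of Freivalds' algorithm (GVFA):
    project A @ B and C onto a Gaussian random vector and compare the
    two projections, repeating k times."""
    rng = np.random.default_rng() if rng is None else rng
    n = B.shape[1]
    for _ in range(k):
        w = rng.standard_normal(n)   # Gaussian random projection vector
        lhs = A @ (B @ w)            # (A x B) w without forming A x B
        rhs = C @ w
        # Relative comparison to tolerate floating-point round-off;
        # the tolerance choice here is illustrative.
        if np.linalg.norm(lhs - rhs) > tol * max(np.linalg.norm(rhs), 1.0):
            return False             # discrepancy found: A x B != C
    return True                      # all k projections agree
```

In exact arithmetic a single Gaussian projection already detects any fault outside a measure-zero set; iterating with `k > 1` further drives down the false-positive probability in floating point.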

MSC 2010: 65F99; 65C05; 62P99

Award Identifier / Grant number: 1066471

Funding statement: This work was partially supported by National Science Foundation grant 1066471 for Yaohang Li, and Hao Ji acknowledges support from an ODU Modeling and Simulation Fellowship. Michael Mascagni’s contribution to this paper was partially supported by the National Institute of Standards and Technology (NIST) during his sabbatical. The mention of any commercial product or service in this paper does not imply an endorsement by NIST or the Department of Commerce.

Acknowledgements

We would like to thank Dr. Stephan Olariu for his valuable suggestions on the manuscript.


Received: 2020-03-31
Accepted: 2020-09-16
Published Online: 2020-10-08
Published in Print: 2020-12-01

© 2020 Walter de Gruyter GmbH, Berlin/Boston
