
Outlier Detection by means of Monte Carlo Estimation including resistant Scale Estimation

  • Christian Marx
Published/Copyright: May 1, 2015

Abstract

The identification of outliers in measurement data is hindered if they are present in leverage points as well as in the rest of the data. A promising method for their identification is Monte Carlo estimation (MCE), which is the subject of the present investigation. In MCE, subsamples are generated at random in order to find subsamples without leverage outliers and with few (or no) non-leverage outliers. The number of subsamples required to generate several such subsamples with a given probability is derived. Each generated subsample is rated based on the residuals resulting from an adjustment. A simulation shows that a least squares adjustment is suitable. For the rating of the subsamples, the sum of squared residuals is used as a measure of fit. It is argued that this (unweighted) sum is also appropriate if the data have unequal weights. An investigation of the robustness of a final Bayes estimation that uses the result of the Monte Carlo search as prior information reveals that this estimation is inappropriate. Furthermore, the case of an unknown variance factor is considered. A simulation of different scale estimators for the variance factor shows that they are impractical. A new resistant scale estimator is introduced, which is based on a generalisation of the median absolute deviation. Taking into account the results of these investigations, a new procedure for MCE that includes a scale estimation is proposed. Finally, this method is tested by simulation. MCE turns out to be more reliable in identifying outliers than a conventional resistant estimation method.
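The search described in the abstract can be illustrated with a minimal sketch. It is not the procedure proposed in the article. It assumes a simple binomial model for the number of subsamples (each random subsample is independently "clean" with probability p_clean), a straight-line model fitted by least squares, the unweighted sum of squared residuals as the rating, and the standard median absolute deviation (with the usual factor 1.4826) in place of the generalised scale estimator the article introduces. All function and parameter names are hypothetical.

```python
import math
import random
import statistics


def required_subsamples(p_clean, probability, n_good=1):
    """Smallest N such that at least n_good clean subsamples occur among
    N independent draws with the given probability (binomial assumption)."""
    n = n_good
    while True:
        # Probability of drawing fewer than n_good clean subsamples in n draws.
        p_fewer = sum(
            math.comb(n, k) * p_clean**k * (1.0 - p_clean) ** (n - k)
            for k in range(n_good)
        )
        if 1.0 - p_fewer >= probability:
            return n
        n += 1


def fit_line(points):
    """Least squares fit of y = a + b*x to (x, y) pairs; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    return my - b * mx, b


def monte_carlo_search(points, n_subsamples, subsample_size, rng):
    """Return the fit of the random subsample with the smallest
    (unweighted) sum of squared residuals."""
    best_ssr, best_fit = math.inf, None
    for _ in range(n_subsamples):
        sub = rng.sample(points, subsample_size)
        a, b = fit_line(sub)
        ssr = sum((y - a - b * x) ** 2 for x, y in sub)
        if ssr < best_ssr:
            best_ssr, best_fit = ssr, (a, b)
    return best_fit


def flag_outliers(points, fit, k=3.0):
    """Indices whose residual against the fit exceeds k times the
    standard MAD scale (an assumption, not the article's estimator)."""
    a, b = fit
    residuals = [y - a - b * x for x, y in points]
    scale = 1.4826 * statistics.median(abs(r) for r in residuals)
    return [i for i, r in enumerate(residuals) if abs(r) > k * scale]
```

A typical call would first choose the number of subsamples with required_subsamples, then pass a seeded random.Random instance to monte_carlo_search so that the search is reproducible, and finally apply flag_outliers to all observations using the returned fit. The subsample size must exceed the number of unknowns (two for a line), otherwise every subsample fits exactly and the rating cannot discriminate.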

Received: 2014-12-21
Accepted: 2015-3-30
Published Online: 2015-5-1
Published in Print: 2015-6-1

© 2015 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 23.11.2025 from https://www.degruyterbrill.com/document/doi/10.1515/jag-2014-0029/html