Local performance evaluation of AI-algorithms with the generalized spatial recall index

  • Patrick Müller
  • Alexander Braun

Published/Copyright: May 24, 2023

Abstract

We have developed a novel metric to gauge the performance of artificial intelligence (AI) or machine learning (ML) algorithms, called the Spatial Recall Index (SRI). The novelty is the spatial resolution of a standard performance indicator: a Recall value is assigned to each individual pixel. This yields a distribution of the performance of a given AI algorithm at the resolution of the images in the dataset. While the mathematical basis has already been presented before, here we demonstrate the usage on more datasets and present in-depth application examples. We examine both the MS COCO and the Berkeley Deep Drive (BDD) datasets, using a state-of-the-art object detection algorithm. The BDD dataset is degraded using a physically realistic lens model in which the optical performance varies over the field of view, as it would for a real camera. This study highlights the usefulness of the SRI, as every real image is taken through such optics. A generalization, the generalized spatial recall index (GSRI), is introduced, from which we derive SRI_A, which weights by object area, and SRI_risk, which is intended for autonomous driving. Finally, these metrics are compared.
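
To make the construction behind the SRI concrete, the following is a minimal sketch, not the authors' reference implementation, of how a per-pixel recall map could be accumulated over a dataset. The function name `spatial_recall_index`, the data layout, and the use of axis-aligned ground-truth boxes with a per-object "detected" flag are assumptions made purely for illustration.

```python
import numpy as np

def spatial_recall_index(image_shape, annotations):
    """Accumulate a per-pixel recall map (SRI) over a dataset (illustrative sketch).

    image_shape: (height, width) of the images in the dataset.
    annotations: iterable over images; each item is a list of
        ((x0, y0, x1, y1), detected) tuples, where the box is an
        axis-aligned ground-truth box in pixel coordinates and
        `detected` is True if the detector matched that object.
    """
    h, w = image_shape
    tp = np.zeros((h, w), dtype=np.int64)  # per-pixel true-positive counts
    fn = np.zeros((h, w), dtype=np.int64)  # per-pixel false-negative counts
    for objects in annotations:
        for (x0, y0, x1, y1), detected in objects:
            target = tp if detected else fn
            target[int(y0):int(y1), int(x0):int(x1)] += 1
    total = tp + fn
    # Recall per pixel; pixels never covered by a ground-truth object stay NaN.
    return np.where(total > 0, tp / np.maximum(total, 1), np.nan)

# Hypothetical usage with two 100 x 100 images:
anns = [
    [((10, 10, 50, 50), True), ((60, 20, 90, 80), False)],
    [((12, 8, 48, 52), True), ((60, 20, 90, 80), True)],
]
sri_map = spatial_recall_index((100, 100), anns)
```

In this sketch, the area-weighted SRI_A and the risk-oriented SRI_risk would replace the uniform per-object increment with object-specific weights; the exact weighting follows the definitions given in the paper.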

Zusammenfassung

Wir haben eine neuartige Metrik zur Bewertung der Leistung von Algorithmen der künstlichen Intelligenz (KI) oder des maschinellen Lernens (ML) entwickelt, den sogenannten Spatial Recall Index (SRI). Neu ist dabei die räumliche Auflösung eines Standard-Leistungsindikators, da jedem einzelnen Pixel ein Recall-Wert zugewiesen wird. Es lässt sich eine Leistungsverteilung eines bestimmten KI-Algorithmus mit der Bildauflösung im Datensatz erzeugen. Während die mathematischen Grundlagen bereits zuvor vorgestellt wurden, demonstrieren wir hier die Anwendung auf weiteren Datensätzen und vertiefen die Anwendungsbeispiele. Wir untersuchen sowohl den MS COCO- als auch den Berkeley Deep Drive-Datensatz (BDD) unter Verwendung eines state-of-the-art Objekterkennungsalgorithmus. Der BDD-Datensatz wird mit einem physikalisch-realistischen Linsenmodell degradiert, bei dem die optische Leistung über das Sichtfeld variiert, wie es bei einer echten Kamera der Fall wäre. Diese Studie unterstreicht die Nützlichkeit des SRI, da jedes Bild mit einer realistischen Optik aufgenommen wurde. Es wird eine Verallgemeinerung, der GSRI, eingeführt, aus dem wir SRI_A (Gewichtung mit der Objektfläche) und SRI_risk für das autonome Fahren ableiten. Schließlich werden diese Metriken miteinander verglichen.


Corresponding author: Patrick Müller, University of Applied Sciences Düsseldorf, Germany, E-mail:

About the authors

Patrick Müller

Patrick Müller received his M.Sc. in 2018. His Master’s thesis examined the influence of a point spread function model on digital image processing algorithms. He is currently pursuing his doctorate at Hochschule Düsseldorf and the University of Siegen, focusing on the application of optical models to digital images, their assessment, and their correlation with the performance of computer vision algorithms.

Alexander Braun

Alexander Braun received his diploma in physics, with a focus on laser fluorescence spectroscopy, from the University of Göttingen in 2001. His PhD research in quantum optics and quantum computing was carried out at the University of Hamburg, resulting in a doctorate from the University of Siegen in 2007. He started working as an optical designer for camera-based ADAS at the company Kostal, where he later became responsible for the optical quality of series mass production. In 2013 he became a professor of physics at the University of Applied Sciences Düsseldorf, where he now researches optical metrology and optical models for simulation in the context of autonomous driving. He is a member of DPG, SPIE, IS&T and VDI, participates in standardization efforts at IEEE (P2020) and VDI (FA 8.13), and currently serves on the advisory board for the AutoSens conference and for VDI Optical Technologies (Fachbeirat 8).

  1. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: None declared.

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.


Received: 2023-02-16
Accepted: 2023-04-26
Published Online: 2023-05-24
Published in Print: 2023-07-27

© 2023 Walter de Gruyter GmbH, Berlin/Boston
