Machine learning in sensor identification for industrial systems
Lucas Weber and Richard Lenz
Abstract
This paper explores the potential and limitations of machine learning for sensor signal identification in complex industrial systems. The objective is a tool to assist engineers in finding the correct inputs to digital twins and simulations from a set of unlabeled sensor signals. A naive end-to-end machine learning approach is usually not applicable to this task, as it would require many comparable industrial systems to learn from. We present a semi-structured approach that uses observations from the manual classification of time series and combines different algorithms to partition the set of signals into smaller groups of signals that share common characteristics. Using a real-world dataset from several power plants, we evaluate our solution for scaling-invariant measurement identification and functional relationship inference using change-point correlations.
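The idea of relating signals through change-point correlations can be illustrated with a small sketch. It is not the authors' implementation and does not use the changepoynt package. The change score (difference of adjacent window means), the window length, and the names `change_score` and `change_point_correlation` are assumptions chosen for illustration only. Each signal is z-normalized first, so the score does not depend on the offset or scaling of the raw measurement.

```python
import numpy as np


def change_score(signal, window=20):
    """Toy change score (an assumption, not the paper's algorithm):
    absolute difference between the mean of the next `window` samples
    and the mean of the previous `window` samples."""
    x = np.asarray(signal, dtype=float)
    std = x.std()
    # z-normalize so that offset and scaling of the measurement drop out
    x = (x - x.mean()) / std if std > 0 else x - x.mean()
    # prefix sums let us read any window sum in constant time
    csum = np.concatenate(([0.0], np.cumsum(x)))
    score = np.zeros(len(x))
    for t in range(window, len(x) - window + 1):
        before = (csum[t] - csum[t - window]) / window
        after = (csum[t + window] - csum[t]) / window
        score[t] = abs(after - before)
    return score


def change_point_correlation(a, b, window=20):
    """Pearson correlation between the change scores of two signals."""
    sa, sb = change_score(a, window), change_score(b, window)
    if sa.std() == 0 or sb.std() == 0:
        return 0.0  # a signal without any change carries no information
    return float(np.corrcoef(sa, sb)[0, 1])


# Synthetic signals (hypothetical data): a and b change at the same time
# but differ in sign, offset and scale; c changes at a different time.
n = 600
a = np.zeros(n)
a[200:] = 1.0
b = 1000.0 - 50.0 * a
c = np.zeros(n)
c[400:] = 1.0

corr_ab = change_point_correlation(a, b)
corr_ac = change_point_correlation(a, c)
print(f"a vs b: {corr_ab:.3f}, a vs c: {corr_ac:.3f}")
```

In this toy setting, signals that change together receive a high correlation regardless of their units, while signals that change at unrelated times do not. A matrix of such pairwise values could then serve as input to a clustering step that groups functionally related signals.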
About the authors

Dipl.-Ing. Lucas Weber studied Electrical Engineering and Information Technology at the Technical University of Dresden (Dipl.-Ing. 2019). Currently, he is a research assistant at the chair of Evolutionary Data Management in the Computer Science Department of the FAU Erlangen. His research interests include pattern recognition and signal processing for robust machine learning systems as well as knowledge discovery and mining in time series.

Prof. Dr. Richard Lenz is Professor for Evolutionary Data Management in the Computer Science Department at FAU Erlangen. He leads research on evolutionary information systems, healthcare information systems, data quality and integration, and document and workflow management. Professor Lenz studied computer science in Kaiserslautern. He received his Ph.D. from the University of Erlangen in 1997 with a thesis on “Adaptive Data Replication in Distributed Systems”. From 1997 to 2007 he was a research assistant and later a substitute professor of medical informatics at the University of Marburg, where he also received his habilitation for practical and applied computer science with his work on “Evolutionary Information Systems in Healthcare”.
Acknowledgments
We would like to thank Christian Lauth, Jan Weustink, Burkhard Frese, and Frank Schneider from SIEMENS Energy for their valuable insights into the engineering process for power plant optimization. These insights were essential in understanding the requirements and challenges for time series classification.
- Research ethics: Not applicable.
- Author contributions: The authors have accepted responsibility for the entire content of this manuscript and approved its submission. L.W. developed the methods and carried out the experiments. R.L. supervised the project and took part in discussions and idea finding. Both L.W. and R.L. contributed to the final version of the manuscript.
- Competing interests: The authors state no conflict of interest.
- Research funding: This research was funded by SIEMENS Energy AG in the context of Project SIML.
- Data availability: The algorithms for change point detection have been published in a public GitHub repository and as a Python package (https://github.com/Lucew/changepoynt).
 
© 2023 Walter de Gruyter GmbH, Berlin/Boston
Articles in the same Issue
- Frontmatter
 - Editorial
 - Machine learning applications
 - Contributions to a thematic issue
 - Machine learning and cyber security
 - Artificial intelligence for molecular communication
 - Machine learning in run-time control of multicore processor systems
 - Machine learning in sensor identification for industrial systems
 - Wildfire prediction for California using and comparing Spatio-Temporal Knowledge Graphs
 - Machine learning in computational literary studies
 - Machine learning in AI Factories – five theses for developing, managing and maintaining data-driven artificial intelligence at large scale
 