Efficiently determining the effect of data set size on autoencoder-based metamodels for structural design optimization
Fabian Schneider, Timm J. Peter, Ralph Jörg Hellmig and Oliver Nelles
Abstract
The size of the training data set significantly impacts the quality of data-driven models. However, creating the data set is also one of the largest expenses in the model development process. This is especially true for metamodels designed for structural optimization: the data must first be generated using computationally intensive finite element simulations. It is therefore essential to understand how the amount of available data influences model quality in order to find a good trade-off between computing costs and expected model quality. This relationship is analyzed for an autoencoder-based metamodel approach applied to a forming process. By applying a suitable method of instance selection, also known as subset selection, most of the data that would otherwise be required can be saved and the computational costs can be reduced to a minimum.
Summary
The size of the training data set has a decisive influence on the quality of data-driven models. However, creating the data set is also one of the largest cost items in the development process. This applies in particular to metamodels that are built for structural optimization. The data must first be generated by computationally intensive finite element simulations. For practical application, it is therefore important to know the influence of the amount of available data in order to weigh computing costs against the expected model quality. Using a forming process as an example, this relationship is determined for a metamodel approach based on an autoencoder. By applying a suitable data set selection, the majority of the data that would otherwise be required can be saved and the computing costs can be reduced to a minimum.
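To illustrate the idea of instance (subset) selection mentioned in the abstract, the following is a minimal sketch, not the authors' method. The abstract does not state which selection algorithm is used, so a generic greedy farthest-point (maximin) rule is assumed here. The function name `greedy_maximin_subset`, the random candidate designs, and the subset sizes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def greedy_maximin_subset(X, k, start=0):
    """Select k row indices of X with a greedy farthest-point rule.

    Assumption: this is a generic space-filling heuristic used for
    illustration; it is not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of rows")
    selected = [start]
    # Distance of every candidate to its nearest already-selected point.
    min_dist = np.linalg.norm(X - X[start], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))  # candidate farthest from the subset
        selected.append(nxt)
        dist_to_new = np.linalg.norm(X - X[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected


# Hypothetical candidate design points (e.g. geometry parameters).
rng = np.random.default_rng(0)
designs = rng.uniform(size=(200, 3))

# Nested subsets of growing size: only these designs would need an
# expensive simulation before training and evaluating the metamodel.
subsets = {k: greedy_maximin_subset(designs, k) for k in (10, 25, 50)}
for k, idx in subsets.items():
    print(k, len(idx))
```

Because the rule is greedy, each smaller subset is a prefix of the larger ones, so simulations already run for a small subset can be reused when the data set size is increased.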
About the authors

Fabian Schneider graduated with a Master of Science degree from the University of Siegen in 2020. He joined the working group Automatic Control – Mechatronics of Prof. Nelles as a research assistant. His research focuses on metamodeling for optimization tasks and design of experiments.

Timm Julian Peter graduated with a Master of Science degree from the University of Siegen in 2018. He joined the working group Automatic Control – Mechatronics of Prof. Nelles as a research assistant. His research focuses on data set selection and on linear and nonlinear system identification.

Ralph Jörg Hellmig is an apl. Professor at the University of Siegen in the Department of Mechanical Engineering and the institute of material science. He received his doctoral degree in 2000 from the Technical University of Clausthal. His key research topics are material-based questions in the field of joining technology and corrosion.

Oliver Nelles is a Professor at the University of Siegen in the Department of Mechanical Engineering and holds the chair of Automatic Control – Mechatronics. He received his doctoral degree in 1999 from the Technical University of Darmstadt. His key research topics are nonlinear system identification, design of experiments, metamodeling, and local model networks.
- Research ethics: Not applicable.
- Informed consent: Not applicable.
- Author contributions: The authors have accepted responsibility for the entire content of this manuscript and approved its submission.
- Use of Large Language Models, AI and Machine Learning Tools: A machine-learning-based spell checker was used to improve the language.
- Conflict of interest: All other authors state no conflict of interest.
- Research funding: None declared.
- Data availability: Not applicable.
© 2025 Walter de Gruyter GmbH, Berlin/Boston
Articles in the same Issue
- Frontmatter
- Editorial
- Selected contributions from the workshops “Computational Intelligence” in 2023 and 2024
- Methods
- Nonlinear system categorization for structural data mining with state space models
- Incorporation of structural properties of the response surface into oblique model trees
- Takagi-Sugeno based model reference control for wind turbine systems in frequency containment scenarios
- On autoregressive deep learning models for day-ahead wind power forecasts with irregular shutdowns due to redispatching
- Applications
- Efficiently determining the effect of data set size on autoencoder-based metamodels for structural design optimization
- Kalibriermodellerstellung und Merkmalsselektion für die mikromagnetische Materialcharakterisierung mittels maschineller Lernverfahren
- Investigating quality inconsistencies in the ultra-high performance concrete manufacturing process using a search-space constrained non-dominated sorting genetic algorithm II
- EAP4EMSIG – enhancing event-driven microscopy for microfluidic single-cell analysis