5 Navigating the landscape of automated data preprocessing: an in-depth review of automated machine learning platforms
Abderahim Salhi
Abstract
Effective data preprocessing plays a pivotal role in the performance of a machine learning (ML) pipeline, influencing the accuracy and overall effectiveness of the final outcomes. With the growing prominence of automated machine learning (AutoML), data preprocessing has become even more significant. This chapter is an ongoing effort to survey the use of AutoML platforms for data preprocessing. The focus is on how AutoML tools help streamline the construction and training phases of ML models. Our examination covers a range of tasks and subtasks, providing insights into the ways AutoML tools can be leveraged across the spectrum of ML processes.
Chapters in this book
- Frontmatter
- Preface
- Contents
Methods and instrumentation
- 1 Identifying and estimating outliers in time series with nonstationary mean through multiobjective optimization method
- 2 Using the intentionally linked entities (ILE) database system to create hypergraph databases with fast and reliable relationship linking, with example applications
- 3 Rapid and automated determination of cluster numbers for high-dimensional big data: a comprehensive update
- 4 Canonical correlation analysis and exploratory factor analysis of the four major centrality metrics
- 5 Navigating the landscape of automated data preprocessing: an in-depth review of automated machine learning platforms
- 6 Generating random XML
Applications and case studies
- 7 Exploring autism risk: a deep dive into graph neural networks and gene interaction data
- 8 Leveraging ChatGPT and table arrangement techniques in advanced newspaper content analysis for stock insights
- 9 An experimental study on road surface classification
- 10 RNN models for evaluating financial indices: examining volatility and demand-supply shifts in financial markets during COVID-19
- 11 Topological methods for vibration feature extraction
- 12 Dyna-SPECTS: DYNAmic enSemble of Price Elasticity Computation models using Thompson Sampling in e-commerce
- 13 Creating a metadata schema for reservoirs of data: a systems engineering approach
- 14 Implementation and evaluation of an eXplainable artificial intelligence to explain the evaluation of an assessment analytics algorithm for freetext exams in psychology courses in higher education to attest QBLM-based competencies
- 15 Toward a skill-centered qualification ontology supporting data mining of human resources in knowledge-based enterprise process representations
- Index