Data-Driven Methods and AI in Engineering Design: A Systematic Literature Review Focusing on Challenges and Opportunities

Reading time: 6 minutes

📝 Abstract

The increasing availability of data and advances in computational intelligence have accelerated the adoption of data-driven methods (DDMs) in product development. However, their integration into product development remains fragmented. This fragmentation stems from uncertainty, in particular a lack of clarity on which types of DDMs to use and when to employ them across the product development lifecycle. A necessary first step toward addressing this is to investigate the use of DDMs in engineering design by identifying which methods are being used, at which development stages, and for what applications. This paper presents a PRISMA systematic literature review. The V-model was adopted as the product development framework and simplified into four stages: system design, system implementation, system integration, and validation. A structured search across Scopus, Web of Science, and IEEE Xplore (2014–2024) retrieved 1,689 records. After screening, 114 publications underwent full-text analysis. Findings show that machine learning (ML) and statistical methods dominate current practice, whereas deep learning (DL), though still less common, exhibits a clear upward trend in adoption. Supervised learning, clustering, regression analysis, and surrogate modeling are prevalent in the system design, implementation, and integration stages, but contributions to validation remain limited. Key challenges in existing applications include limited model interpretability, poor cross-stage traceability, and insufficient validation under real-world conditions. The review also highlights key limitations and opportunities, such as the need for interpretable hybrid models. It is a first step toward design-stage guidelines; a follow-up synthesis should map computer-science algorithms to engineering design problems and activities.


📄 Content

The rapidly growing wave of digitalization is causing a major transformation in engineering design. With the development of connected sensors, industrial IoT systems, and cyber-physical systems, the volume, variety, and velocity of data created throughout the product life cycle are constantly reaching new heights. This “data overload” has caused a paradigm shift in the way various engineering disciplines define problems, draw insights, and develop solutions (Li et al. 2022; Knödler et al. 2023). Simultaneously, exponential advances in computational power, driven not only by improvements in hardware performance but also by algorithmic advances in, for example, machine learning (ML), deep learning (DL), and high-dimensional optimization, now form the core toolkit of modern engineering analytics (Bach et al. 2017b). The abundance of available data, combined with powerful and affordable infrastructure for data processing, facilitates the widespread adoption of data-driven methods (DDMs) from conceptual design and requirements engineering to validation and optimization (Shabestari et al. 2019; Gay et al. 2021). Despite this widespread adoption, considerable diversity in how DDMs are understood, and uncertainty in how they are applied, remain. In the following, we summarize the understanding of DDMs and the alignment of their application to the V-model (VDI/VDE 2206, 2021) as the basis for deriving the research gap.

In the literature, DDMs are defined in several distinct ways: as leveraging empirical data (Maslyaev et al. 2020; Talal et al. 2020), as the process of data-driven innovation drawn from big data (Luo 2023), as a combination of data-analytics techniques (Ayensa-Jiménez et al. 2019), or as information for, and enhancement of, decision making, modelling, and analysis (Nosck et al. 2023). Broader definitions describe DDMs as methods to extract knowledge from data to support modeling, decision making, prediction, and optimization (Wu 2024; Payrebrune et al. 2024). Furthermore, Knödler et al. (2023) frame them as analytical systems capable of adapting dynamically to incoming data. Another line of research defines DDMs by how they learn from data: these studies infer empirical relationships from large datasets (Mosallam et al. 2015; Villarejo et al. 2016) using machine learning and related computational intelligence algorithms (Mount et al. 2016). Gao et al. (2013) argue that DDMs can operate with limited prior process knowledge when they rely on, for example, signal processing and large-scale analytics. Building on this view, researchers emphasize a reduced dependency on domain-specific background knowledge when prior theoretical knowledge is limited and system complexity is high (Mount et al. 2016; How et al. 2019). This orientation is consistent with a software-engineering view that distinguishes data-driven Learners, which infer behavior from data and are slow to train but fast to run, from model-based Solvers, which compute with explicit models and are quick to start but slower to run (Geffner 2018). Thus, once trained, DDMs may produce outputs faster than methods that depend on explicit models (Geffner 2018). Finally, a widely cited and general definition, originating in ML research by Jordan and Mitchell (2015) and adopted in this publication, describes DDMs as methods that derive insights or control actions directly from data, without relying on traditional engineering models. Traditional engineering models can, for example, be based on physical knowledge and assumptions, as well as analytical formulations. In summary, there is no standardized definition of DDMs in engineering design. This diversity also reflects the growing recognition of DDMs not only as tools to handle big data, but as critical enablers of modern, adaptive, and scalable engineering design solutions.
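The Learner/Solver distinction can be made concrete with a minimal sketch. Assuming a textbook cantilever-beam example (the material constants, function names, and sample values below are illustrative assumptions, not taken from the reviewed papers), a model-based Solver evaluates an explicit physics formula, while a data-driven Learner infers the same input–output relationship purely from observed pairs, without access to the underlying model parameters:

```python
# Hypothetical sketch of Geffner's (2018) Learner/Solver distinction.
# All constants and sample values are illustrative assumptions.

E = 210e9    # Young's modulus of steel [Pa] (assumed)
I = 8.33e-9  # second moment of area [m^4] (assumed)
L = 1.0      # beam length [m] (assumed)

def solver_deflection(force_n: float) -> float:
    """Model-based Solver: explicit physics (cantilever tip deflection)."""
    return force_n * L**3 / (3 * E * I)

# "Training data": observations the Learner sees. Here they are generated
# from the physics model; in practice they would be measurements.
forces = [10.0, 20.0, 40.0, 80.0]
deflections = [solver_deflection(f) for f in forces]

# Learner: least-squares fit of deflection = k * force. It never sees
# E, I, or L -- it infers the relationship entirely from the data.
k = sum(f * d for f, d in zip(forces, deflections)) / sum(f * f for f in forces)

def learner_deflection(force_n: float) -> float:
    """Data-driven Learner: fitted coefficient, fast to evaluate."""
    return k * force_n

# Once trained, the Learner reproduces the Solver on an unseen input.
print(abs(learner_deflection(55.0) - solver_deflection(55.0)) < 1e-9)
```

Training (here, the least-squares fit) carries the upfront cost; at inference time the Learner produces outputs directly from data-derived coefficients, mirroring the "slow to train, fast to run" characterization above.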

Within engineering design, a variety of models describe its processes and stages (Wynn and Clarkson 2018). One of the most widely recognized process models is the V-model. Its initial idea was introduced in software development by Boehm (1979) and later taken up by Bröhl and Dröschel (1995). It provides a systematic guideline for product development, encompassing stages from requirements analysis to verification and validation. For mechanical and mechatronic systems, this requires the integration of mechanical, electronic, and software components, necessitating a multidisciplinary approach. This has led to elaborations such as the VDI 2206 guideline, a V-model-based standard for the development of mechatronic and cyber-physical systems that considers the different domains of engineering design, shown in Figure 1 (VDI/VDE 2206, 2021). Recently, development frameworks such as the Double-V and Triple-V models (Li et al. 2019b) and the AI4PD ontology (Gerschütz et al. 2023a) have emerged to systematize the integration of data, models, and product development lifecycles.
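The simplified four-stage V-model framing used in this review can be captured as a small lookup structure. The mapping below is an illustrative summary distilled from the review's coarse findings (which DDM families are reported as prevalent per stage), not an exhaustive taxonomy; stage and method names follow the abstract:

```python
# Illustrative summary (an assumption-level distillation of the review's
# findings, not an exhaustive taxonomy) of the simplified V-model stages
# and the DDM families reported as prevalent in each.

V_MODEL_STAGES = (
    "system design",
    "system implementation",
    "system integration",
    "validation",
)

# Supervised learning, clustering, regression analysis, and surrogate
# modeling dominate the first three stages; contributions to the
# validation stage remain limited.
_PREVALENT = ["supervised learning", "clustering",
              "regression analysis", "surrogate modeling"]

PREVALENT_DDMS = {
    "system design": list(_PREVALENT),
    "system implementation": list(_PREVALENT),
    "system integration": list(_PREVALENT),
    "validation": [],  # few reported applications in the surveyed corpus
}

def ddms_for(stage: str) -> list:
    """Look up the DDM families reported as prevalent for a given stage."""
    return PREVALENT_DDMS.get(stage, [])
```

Such a table is only a starting point; the follow-up synthesis proposed in the abstract would refine it into a mapping from concrete algorithms to design problems and activities.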

This content is AI-processed based on ArXiv data.
