Fig 7 - uploaded by Max Mowbray
Industrial data science workflow based on the IBM cross-industry standard process for data mining (CRISP-DM).
Source publication
In the literature, machine learning (ML) and artificial intelligence (AI) applications tend to start with examples that are irrelevant to process engineers (e.g. classification of images as cats or dogs, house pricing, types of flowers, etc.). However, process engineering principles are also based on pseudo-empirical correlations and heuristi...
Contexts in source publication
Context 1
... response time (months to years). 33 Machine learning takes advantage of this vast amount of historical data for the following industrial applications: condition (predictive) monitoring, quality prediction, process control, optimization, and scheduling. 6 Before implementation and industrialization, a diagnostic study is often conducted (see Fig. 7 and 8), utilizing ML to accelerate the understanding and discovery of the root cause, which perhaps does not need a complex solution to be ...
Context 2
... or wrong assumptions despite the amount of data stored or knowledge available. During the first phase, which can be called diagnostics, it is common to iterate through several data and modeling steps until the problem and potential solution are better understood. Diagnostics corresponds to the beginning of any industrial application (see Fig. 7). Industrial data science can accelerate the process of identifying which tags (sensors) help explain the problem while capturing nonlinearities via data-driven modeling techniques (see Fig. 9). The general idea is always to start with simpler, more interpretable tree-based models for screening, followed by more complex ...
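The screening step described above can be sketched with a small example, assuming scikit-learn is available; the tag names and data below are synthetic, not from any real plant:

```python
# Sketch of tag screening: rank process tags (sensors) by importance
# using an interpretable tree ensemble. Tags and data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
tags = ["T_reactor", "P_feed", "F_steam", "pH_out", "level_tank"]
X = rng.normal(size=(500, len(tags)))
# The quality target depends nonlinearly on two tags; the rest are noise.
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.normal(size=500)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
ranking = sorted(zip(tags, model.feature_importances_), key=lambda t: -t[1])
for tag, imp in ranking:
    print(f"{tag:12s} {imp:.3f}")
```

The ranking surfaces the two informative tags, which would then be passed on to more complex models or to engineers for physical interpretation.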
Context 3
... the use of these tools for root-cause analysis. [64][65][66][67][68] An anomaly that propagates and diverges through the process causes a higher-priority set of alarms than those created by unusual operations. Graph analysis can be used in this regard [69][70][71] to include the topology of the plant and the relations among operating units (see Fig. 17). This approach can cover anomalous operations across the entire plant and reduce the number of false positives. This is a similar line of thinking to the use of knowledge graphs for complex analyses, which are able to provide an integrated view of macro-, meso-, and microscale processes. 72 ...
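A minimal sketch of this topology-aware alarm ranking, using only the standard library; the plant graph and unit names are invented for illustration:

```python
# Toy sketch: an anomaly that can propagate to many downstream units is
# ranked above one in an isolated unit. Graph and names are hypothetical.
from collections import deque

plant = {  # directed edges: material flows from key to listed units
    "feed_tank": ["reactor"],
    "reactor": ["separator"],
    "separator": ["product_tank", "recycle"],
    "recycle": ["reactor"],
    "product_tank": [],
    "utility_pump": [],  # not connected to the main train
}

def downstream(unit, graph):
    """Units reachable from `unit` by following process flow (BFS)."""
    seen, queue = set(), deque([unit])
    while queue:
        for nxt in graph[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

alarms = ["reactor", "utility_pump"]
priority = {a: len(downstream(a, plant)) for a in alarms}
print(priority)  # the reactor alarm outranks the isolated pump alarm
```

Real implementations would weight edges by flow rates and residence times, but the ranking principle is the same.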
Context 4
... bootstrap Fig. 26 Uncertainty can be estimated by comparing a model (or sample statistic) with its simulated distribution using resampling techniques. For example, the slope obtained in a linear model can be compared to a distribution of the same parameter generated by resampling the training data. Adapted from ref. 107 with permission. Fig. 27 Figurative description of the Bayesian approach to expressing modeling uncertainty in neural networks. The top two subplots show the covariance between two parameter distributions in the first and second layers of the network, respectively. The bottom subplot demonstrates the generation of a predictive distribution by Monte Carlo sampling ...
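The resampling idea in Fig. 26 can be sketched as follows, assuming NumPy; the data are synthetic:

```python
# Bootstrap sketch: compare a fitted slope against the distribution of
# slopes obtained by resampling the training data with replacement.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=2.0, size=x.size)

slope = np.polyfit(x, y, 1)[0]             # slope on the full data set
boot = []
for _ in range(2000):
    idx = rng.integers(0, x.size, x.size)  # resample with replacement
    boot.append(np.polyfit(x[idx], y[idx], 1)[0])
boot = np.array(boot)

lo, hi = np.percentile(boot, [2.5, 97.5])  # 95% bootstrap interval
print(f"slope={slope:.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

The same recipe applies to any sample statistic, which is what makes resampling attractive for industrial data where distributional assumptions are hard to justify.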
Context 5
... approach to training ANNs is provided by the Bayesian learning paradigm. Bayesian neural networks (BNN) share the same topology as conventional neural networks, but instead of point estimates for the parameters, they have a distribution over parameters (Fig. 27). Treating the network parameters as random variables then allows for the generation of a predictive distribution (given a model input) via the Monte Carlo method. Similarly, Bayesian extensions to other models such as support vector machines (SVMs) 111 ...
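A figurative sketch of the Monte Carlo step, assuming NumPy; the tiny architecture and the hand-picked weight distributions below are hypothetical, standing in for a learned posterior:

```python
# Figurative BNN sketch (cf. Fig. 27): weights are Gaussian random
# variables, so Monte Carlo sampling of the weights turns a single
# input into a predictive distribution.
import numpy as np

rng = np.random.default_rng(2)
# "Posterior" over parameters, fixed by hand here: mean and std per weight.
W1_mu, W1_sd = np.array([[1.0, -0.5]]), np.array([[0.1, 0.1]])     # 1 -> 2
W2_mu, W2_sd = np.array([[0.8], [1.2]]), np.array([[0.1], [0.1]])  # 2 -> 1

def sample_prediction(x):
    """One forward pass with weights drawn from their distributions."""
    W1 = rng.normal(W1_mu, W1_sd)
    W2 = rng.normal(W2_mu, W2_sd)
    h = np.tanh(x @ W1)
    return (h @ W2).item()

x = np.array([[0.7]])
preds = np.array([sample_prediction(x) for _ in range(5000)])
print(f"predictive mean={preds.mean():.3f}, std={preds.std():.3f}")
```

In a trained BNN the weight distributions come from variational inference or MCMC rather than being set by hand, but the predictive-distribution mechanics are exactly these.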
Context 6
... is often large, which poses difficulty for RL algorithms, as there are many discontinuities in the 'reward landscape'. Further, there are typically many operations that a given unit can process, and given the nature of RL (i.e. using a functional parameterization of a control policy), it is not clear how best to select controls. Fig. 37 and 38 show one idea proposed in recent work 221 and a corresponding schedule generated for the case study detailed ...
Context 7
... governed by standard operating procedures (SOPs) (i.e. the requirement for cleaning times, the presence of precedence constraints, etc.). These SOPs essentially define logic rules, f_SOP, that govern the way in which the plant is operated and the set of operations one could schedule in units at time t, given the current state of the plant, x_t (see Fig. 37a). As a result, one can often pre-identify the controls that innately satisfy the constraints defined by the SOPs and implement a rounding policy, f_r, to alter the control predicted by the policy function and select one of those available controls (see Fig. 37b). Perhaps the largest downside of this approach is that derivative-free ...
Context 8
... one could schedule in units at time t, given the current state of the plant, x_t (see Fig. 37a). As a result, one can often pre-identify the controls that innately satisfy the constraints defined by the SOPs and implement a rounding policy, f_r, to alter the control predicted by the policy function and select one of those available controls (see Fig. 37b). Perhaps the largest downside of this approach is that it is most suited to derivative-free RL algorithms. These algorithms are particularly effective when the effective dimensionality of the problem is low, but are known to become less efficacious when the effective dimensionality of the parameter space is large (as may ...
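The rounding idea can be sketched as follows; `f_sop` and `f_r` mirror the notation above, but the control set and the cleaning rule are invented for illustration:

```python
# Sketch of the rounding policy f_r (cf. Fig. 37b): the policy emits a
# continuous control, which is snapped to the nearest control in the set
# allowed by the SOP logic at the current time index.
import numpy as np

def f_sop(state):
    """Hypothetical SOP logic: returns the controls currently allowed."""
    # e.g. the unit must be cleaned before task 3 becomes available
    allowed = {0, 1, 2, 4} if state["needs_cleaning"] else {0, 1, 2, 3, 4}
    return sorted(allowed)

def f_r(raw_control, allowed):
    """Rounding policy: nearest allowed control to the policy output."""
    allowed = np.asarray(allowed)
    return int(allowed[np.argmin(np.abs(allowed - raw_control))])

state = {"needs_cleaning": True}
raw = 2.8                       # continuous output of the policy function
u = f_r(raw, f_sop(state))
print(u)  # 3 is forbidden here, so the prediction rounds to 2
```

Because the feasible set is pre-identified, every control the agent actually executes satisfies the SOP constraints by construction, at the cost of a non-differentiable mapping (hence the reliance on derivative-free RL noted above).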
Context 9
... chain optimization. The operation of supply chains is subject to inherent uncertainty as derived from market mechanisms (i.e. supply and demand), 223 transportation, supply chain structure and the interactions that take place between organizations, and various other exogenous uncertainties (such as global weather and humanitarian events). 224 Fig. 37 Handling control constraints innately in RL-based chemical production scheduling via identification of transformations of the control prediction through standard operating procedures (i.e. precedence and disjunctive constraints and requirements for unit cleaning). a) Augmenting the decision-making process by identifying the set of ...
Context 10
... by identifying the set of controls which satisfy the logic provided by standard operating procedure at each time index, and b) implementation of a rounding policy to ensure that RL control selection satisfies the associated logic. Fig. 38 Solving a MILP problem via RL to produce an optimal production schedule via the framework displayed in Fig. 37. A discrete time interval is equivalent to 0.5 days in this study. Due to the large uncertainties that exist within supply chains, there is an effort to ensure that organizational behavior is more cohesive and coordinated with other operators within the chain. For example, graph neural networks (GNNs) 226,227 have been applied to help ...
Citations
... AI involves several methodological domains, such as reasoning, knowledge representation, solution search, and, among them, the basic paradigm of machine learning (ML). In the last few years, especially since the introduction of AlphaGo, ML has been greatly developed in the field of industrial chemistry and chemical engineering, greatly helping the development of pharmaceuticals and fine chemicals and reducing time and cost [3][4][5]. So far, much of the literature has summarized the application of machine learning algorithms in the chemical industry (Figure 2) [6]. ...
With the development of Industry 4.0, artificial intelligence (AI) is gaining increasing attention for its performance in solving particularly complex problems in industrial chemistry and chemical engineering. Therefore, this review provides an overview of the application of AI techniques, in particular machine learning, in chemical design, synthesis, and process optimization over the past years. In this review, the focus is on the application of AI for structure-function relationship analysis, synthetic route planning, and automated synthesis. Finally, we discuss the challenges and future of AI in making chemical products.
... The data is used for illustration after errors and repetitions have been cleaned. Data representation is essential for better algorithm performance; it converts the raw data into a form suitable for the algorithm [4]. ML is classified into two main types: supervised and unsupervised learning approaches. ...
... To address these issues, advanced analytical tools are available for data automation, such as screening models (i.e., AutoML). Advanced monitoring systems have become a new standard in the manufacturing environment, capable of flagging abnormal behavior, listing correlated factors, and allowing engineers to visualize process data [4]. Another challenge is associated with the collection and analytics of data generated by complex processes. ...
The field of machine learning has proven to be a powerful approach in smart manufacturing and processing in the chemical and process industries. This review provides a systematic overview of the current state of artificial intelligence and machine learning and their applications in the textile, nuclear power plant, fertilizer, water treatment, and oil and gas industries. Moreover, this study reveals the currently dominant machine learning methods, pre- and post-processing of models, and the increased utilization of machine learning for fault detection, prediction, optimization, quality control, and maintenance in these sectors. In addition, this review gives insight into the actual benefits and impact of each method, and the complications in their extensive deployment. Finally, the current state, challenges, and future developments in terms of algorithm and infrastructure aspects are highlighted.
... Beyond this application, ML models can also be used as surrogate models for complex scale-up models (e.g. by replacing costly simulations in computational fluid dynamics) [102]. Though literature in the field of ML for bioprocess scale-up is still scarce, we anticipate that methods will be evolving quickly, potentially using the field of chemical engineering as a blueprint (e.g., [103]). ...
Fostered by novel analytical techniques, digitalization, and automation, modern bioprocess development provides large amounts of heterogeneous experimental data, containing valuable process information. In this context, data-driven methods like machine learning (ML) approaches have great potential to rationally explore large design spaces while exploiting experimental facilities most efficiently. Herein we demonstrate how ML methods have been applied so far in bioprocess development, especially in strain engineering and selection, bioprocess optimization, scale-up, monitoring, and control of bioprocesses. For each topic, we will highlight successful application cases, current challenges, and point out domains that can potentially benefit from technology transfer and further progress in the field of ML.
... Gekko uses numeric solvers such as the Interior Point Optimizer (IPOPT) [2] and Advanced Process Optimizer (APOPT) [3] among others to solve these complex problems. Using first and second derivative information from the provided algebraic equations in the problem statement, Gekko solves a range of different optimization problems, and has been used in various applications such as nuclear waste glass formulation [4], mosquito population control strategies [5], small module nuclear reactor design [6], ammonia production from wind power [7], smart transportation systems [8], chemical and process industries [9], smart-grid electric vehicle charging [10], optimization of high-altitude solar aircraft [11], model predictive control of sucker-rod pumping [12], and LNG-fueled ship design optimization [13]. Although Gekko solves differential and algebraic equations, it is unable to solve problems with functions that do not have derivative information available. ...
Gekko is an optimization suite in Python that solves optimization problems involving mixed-integer, nonlinear, and differential equations. The purpose of this study is to integrate common Machine Learning (ML) algorithms such as Gaussian Process Regression (GPR), support vector regression (SVR), and artificial neural network (ANN) models into Gekko to solve data-based optimization problems. Uncertainty quantification (UQ) is used alongside ML for better decision making. These methods include ensemble methods, model-specific methods, conformal predictions, and the delta method. An optimization problem involving nuclear waste vitrification is presented to demonstrate the benefit of ML in this field. ML models are compared against the current partial quadratic mixture (PQM) model in an optimization problem in Gekko. GPR with conformal uncertainty was chosen as the best substitute model, as it had a lower mean squared error of 0.0025 compared to 0.018 and more confidently predicted a higher waste loading of 37.5 wt% compared to 34 wt%. The example problem shows that these tools can be used in similar industry settings where easier use and better performance are needed over classical approaches. Future work with these tools includes expanding them with other regression models and UQ methods, and exploring other optimization problems or dynamic control.
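As an illustration of one of the UQ methods mentioned (conformal prediction), here is a minimal split-conformal sketch with NumPy on synthetic data; it is not the Gekko implementation from the study:

```python
# Split conformal prediction sketch: residuals on a held-out calibration
# set give a distribution-free interval around any point predictor.
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 400)
y = 3.0 * x + rng.normal(scale=1.0, size=x.size)

# Split: fit on the first half, calibrate on the second half.
fit, cal = slice(0, 200), slice(200, 400)
coef = np.polyfit(x[fit], y[fit], 1)
predict = np.poly1d(coef)

# Conformal quantile of absolute calibration residuals (90% coverage).
resid = np.abs(y[cal] - predict(x[cal]))
q = np.quantile(resid, 0.9 * (1 + 1 / resid.size))

x_new = 5.0
lo, hi = predict(x_new) - q, predict(x_new) + q
print(f"90% interval at x=5: ({lo:.2f}, {hi:.2f})")
```

The appeal of this method, as in the study above, is that the coverage guarantee holds regardless of which point predictor (linear model, GPR, ANN) sits underneath.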
... Process industries nowadays collect large amounts of process data from sensors and machines, which can be exploited for various software-based innovations, in particular when combined with additional data sources such as the enterprise resource planning (ERP) system. Applications include, among others, advanced analytics such as root cause analysis for defects and anomaly detection, process control, planning optimizations such as predictive maintenance, and, in general, improved observability of the processes based on visualizations and real-time simulations (Mowbray et al., 2022). ...
... Digitalization and data science, including AI/ML technology, are sweeping various fields of science and engineering, and data-based methods are being implemented on unprecedented scales; see (Khalil et al., 2021) and (Mowbray et al., 2022; Sircar et al., 2021) for reviews of data science in various fields of engineering. Data science is a vast and dynamic field with great versatility. ...
Geothermal heat pump (GHP) systems have been established as a proven technology for cooling and heating residential, public and commercial buildings. There is a geothermal solution to the ambitious goal of decarbonizing space heating and cooling, which is contingent on the successful deployment of GHP technology. This in turn requires accurate site characterization, sound design methodologies, effective control logic, and short- and long-term (life-cycle) performance analysis and optimization. In this article, we review the aforementioned aspects of vertical closed-loop GHPs, specifically focusing on the important role of the subsurface. The basics of GHP technology are introduced along with relevant trends and statistics. GHPs are compared with similar technologies such as air source heat pumps (ASHP), along with the effects of deployment on the grid peak load. We then review the common system architectures and the growing trend toward deeper boreholes and the drivers behind it. Various methods for design, sizing, and simulation of GHPs are introduced along with software tools common in research and industry. We then move to subsurface characterization, drilling and well construction of vertical boreholes. Long-term performance monitoring of GHP systems is an important source of information for model validation and engineering design and has been garnering increasing attention recently. Data science is another rapidly growing field whose methods are increasingly utilized in GHP applications. The environmental aspect of GHPs is briefly reviewed. Finally, concluding remarks summarize the review and highlight the potential of petroleum engineering expertise and methods in GHP applications.
... Advanced techniques for model interpretation can be applied to summary data (e.g. SHAP 77,84,85 ). However, for batch process data a natural step will be to analyze the subset of tags using FPCA. ...
Batch processes show several sources of variability, from raw materials' properties to initial and evolving conditions that change during the different events in the manufacturing process. In this chapter, we will illustrate with an industrial example how to use machine learning to reduce this apparent excess of data while maintaining the relevant information for process engineers. Two common use cases will be presented: 1) AutoML analysis to quickly find correlations in batch process data, and 2) trajectory analysis to monitor and identify anomalous batches leading to process control improvements.
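A minimal sketch of the second use case (trajectory analysis), assuming NumPy; the batch data and the anomaly are synthetic:

```python
# Trajectory-analysis sketch: project batch trajectories onto their
# principal components and flag the batch farthest from the normal cloud.
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 60)
# 30 batches follow a nominal temperature trajectory plus noise...
batches = np.array([np.sin(np.pi * t) + 0.05 * rng.normal(size=t.size)
                    for _ in range(30)])
# ...and the last batch deviates mid-run.
batches[-1, 30:] += 0.8

mean = batches.mean(axis=0)
U, S, Vt = np.linalg.svd(batches - mean, full_matrices=False)
scores = U[:, :2] * S[:2]        # scores on the first two components
dist = np.linalg.norm(scores - scores[:-1].mean(axis=0), axis=1)
flagged = int(np.argmax(dist))
print(f"most anomalous batch: {flagged}")
```

In practice a control limit (e.g. Hotelling's T-squared) would replace the simple arg-max, but the projection-and-distance logic is the core of batch trajectory monitoring.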
... Nowadays, ML algorithms are well known and have been applied in many fields including Chemical Engineering. For instance, ML has been used in predictive analysis for modeling process operations (i.e., crystallization, absorption, distillation, gasification, dry reforming, etc.) (Damour et al., 2010; Velásco-Mejía et al., 2016; Kharitonova et al., 2019; Singh et al., 2007; Pandey et al., 2016; Azzam et al., 2018; Bagheri et al., 2019) and for predicting thermodynamic properties of different fluids (Liu et al., 2019), as well as hybrid modeling of chemical reactors (Ammar et al., 2021); a wide range of industrial applications of such models can be found in Mowbray et al. (2022), Lee et al. (2018) and Trinh et al. (2021). ML has had an important impact on Chemical Engineering practice, because these models are powerful and flexible tools that can describe chemical systems in real time and are relatively easy to implement into existing systems for monitoring, controlling and predicting the outputs of unit operations (Kakkar et al., 2021). ...
To boost process operation, modern chemical technology can require detailed mathematical descriptions of such complex, interacting, and nonlinear systems, especially when experiments or pilot plant data are lacking. In some cases, these process models are formulated in terms of partial differential equations, which in turn are hard to solve due to the high demand on computational resources. However, recent access to large data sets has made it possible to address the simulation of these complex chemical systems with machine learning schemes. This new approach has notable strengths over traditional methods, such as flexibility, relatively easy implementation, and fast performance. The proposal is to build surrogate models that approximate system behavior by making use of massive data. Nonetheless, one of the principal drawbacks of these methods is the lack of understanding and the inherent uncertainty associated with them. This paper explores the capability of different machine learning techniques for modeling chemical processes with different nonlinear behaviors. Furthermore, to handle the uncertainty in the models and interpret the confidence of the results, a probabilistic Gaussian machine learning framework was leveraged.
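A minimal sketch of the probabilistic Gaussian framework idea, assuming scikit-learn; the response surface below is synthetic, not a specific chemical system:

```python
# Gaussian process surrogate sketch: the model returns both a prediction
# and its uncertainty, which grows away from the training data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(5)
X = rng.uniform(0, 5, 25).reshape(-1, 1)    # sparse "experiments"
y = np.exp(-X[:, 0]) * np.sin(2 * X[:, 0])  # synthetic nonlinear response

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-4)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

X_new = np.array([[2.5], [10.0]])           # inside vs far outside the data
mean, std = gp.predict(X_new, return_std=True)
print(f"inside data:  {mean[0]:.3f} +/- {std[0]:.3f}")
print(f"extrapolated: {mean[1]:.3f} +/- {std[1]:.3f}")
```

The widening predictive standard deviation outside the sampled region is exactly the interpretable uncertainty signal that motivates probabilistic surrogates over plain point predictors.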
... AI can also be used as a tool to support decision making in fundamental research and the practical production of chemicals. In addition, at the process level, Mowbray et al. (2022) explain the fundamentals of machine learning (ML) and data science, and how they can be linked to process and industrial engineering. ...
Within the European Green Deal, the Chemicals Strategy for Sustainability (CSS) (EC, 2020a) identified a number of actions to reduce negative impacts on human health and the environment associated with chemicals, materials, products and services commercialised or introduced onto the EU market. In particular, the ambition of the CSS is to phase out the most harmful substances and substitute, as far as possible, all other substances of concern, and otherwise minimise their use and track them. This objective requires novel approaches to analysing and comparing, across all life cycle stages, effects, releases and emissions for specific chemicals, materials, products and services, and move towards zero-pollution for air, water, soil and biota.