## No full-text available


Article

Parameter estimation of reaction kinetics from spectroscopic data remains an important and challenging problem. This study describes a unified framework to address this challenge, based on maximum likelihood principles, nonlinear optimization techniques, and collocation methods for solving the differential equations involved. To solve the overall parameter estimation problem, we first develop an iterative optimization-based procedure to estimate the variances of the noise in the system variables (e.g., concentrations) and in the spectral measurements. Once these variances are estimated, we determine the concentration profiles and kinetic parameters simultaneously. From the properties of the nonlinear programming solver and the solution sensitivity, we also obtain the covariance matrix and standard deviations of the estimated kinetic parameters. The proposed approach is demonstrated on seven case studies that include both simulated and actual experimental data, and our numerical results compare well with the multivariate curve resolution alternating least squares (MCR-ALS) approach.
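The "simultaneous" idea — fitting kinetic parameters directly to the spectral data matrix D ≈ C(k)·S rather than resolving concentrations first — can be sketched in a few lines. The snippet below is an illustrative toy, not the authors' implementation: a first-order reaction A → B, a hypothetical three-wavelength spectrum, and a grid search over the rate constant k, with the pure-component spectra S recovered by linear least squares at each candidate k (variable projection).

```python
import numpy as np

def concentrations(k, t):
    """Concentration profiles for first-order A -> B with rate constant k."""
    cA = np.exp(-k * t)
    return np.column_stack([cA, 1.0 - cA])  # shape (n_times, 2)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 50)
S_true = np.array([[1.0, 0.2, 0.0],   # hypothetical pure spectrum of A (3 wavelengths)
                   [0.1, 0.8, 0.5]])  # hypothetical pure spectrum of B
k_true = 1.3
D = concentrations(k_true, t) @ S_true + 0.01 * rng.standard_normal((50, 3))

def sse(k):
    """Separable least squares: for fixed k, the best S is a linear fit."""
    C = concentrations(k, t)
    S, *_ = np.linalg.lstsq(C, D, rcond=None)
    return np.sum((D - C @ S) ** 2)

k_grid = np.linspace(0.5, 2.5, 401)
k_hat = k_grid[np.argmin([sse(k) for k in k_grid])]
print(f"estimated k = {k_hat:.2f} (true {k_true})")
```

The paper's framework exploits the same separable structure, but discretizes the ODEs by collocation and solves everything as one nonlinear program rather than by grid search.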


... In developing kinetic models, it approximates the solid dissolution process and deals with multiple stages at different reactor temperatures. Moreover, variances, parameters, and concentration and absorbance profiles are estimated for the process stages using the approach presented by Chen et al. [3]. The application of these concepts yields realistic profiles as well as reliable kinetic parameter values. ...

... To address these shortcomings, Chen et al. [3] ... However, this requires the particle behavior to be modeled, and most of these approaches ... Weiss [15] applied this approach to general drug dissolution processes with homogeneous and heterogeneous particles, using probabilistic concepts as well as random effects in dissolution. ...

... To address the limitations of multivariate curve resolution techniques, Chen et al. [3] proposed a simultaneous approach that solves the full parameter estimation problem within an optimization framework, which includes eq. (7) along with a kinetic model of the form ...

Laboratory and process measurements from spectroscopic instruments are ubiquitous in pharmaceutical processes, and using the data directly can pose a number of challenges for kinetic model building. Moreover, scaling up from laboratory to industrial level requires predictive models with accurate parameter values. This means that process identification implies not only kinetic parameter estimation, but also the identification of the absorbing species and the estimation of variances for both the data and the parameters. A recently developed, open-source toolkit, KIPET [1,2], addresses these topics and provides an alternative to standard parameter estimation packages, in particular for spectroscopic data problems. Moreover, batch processes commonly used in the chemical and pharmaceutical industries involve multiple stages to carry out synthesis operations in a step-by-step manner, often dealing with heterogeneous mixtures, wide operating temperature ranges, and constant additions and removals of product and waste. Such cases require novel modeling approaches, as the structure of the kinetic model may vary with time, with model switches that are state dependent. This study presents a new modeling approach and methodology that deals with these practical issues. In developing kinetic models, it approximates the solid dissolution process and handles multiple stages at different reactor temperatures. Moreover, variances, parameters, and concentration and absorbance profiles are estimated for the process stages using the approach presented by Chen et al. [3]. The application of these concepts yields realistic profiles as well as reliable kinetic parameter values. The outcomes of this work show that KIPET is a useful toolkit for pharmaceutical processes, with capabilities for handling challenging kinetic modeling problems.

... However, it is often challenging to estimate reaction kinetics directly from it. Recent advances in obtaining kinetic parameter estimates from spectroscopic data based on large-scale nonlinear programming (NLP), maximum likelihood principles, and discretization on finite elements lead to increased speed and efficiency (Chen et al., 2016). These new techniques have great potential for widespread use in parameter estimation. ...

... In addition, we propose a new variance estimation technique based on maximum likelihood derivations for unknown covariances from two sample populations. This new variance estimation technique is compared to the previously proposed iterative-heuristics-based algorithm of Chen et al. (2016) for distinguishing between variances of the noise in model variables and in the spectral measurements. We demonstrate the new techniques on a variety of example problems, with sample code, to show the utility of the approach and its ease of use. ...

... Fig. 3: The iterative variance estimation algorithm implemented in KIPET, as derived in Chen et al. [11]. Fig. 4: The solution obtained after parameter estimation for Example 5.1 when the Chen et al. [11] variance estimation algorithm is used. ...

Multivariate spectroscopic data is increasingly abundant in the chemical and pharmaceutical industries. However, it is often challenging to estimate reaction kinetics directly from it. Recent advances in obtaining kinetic parameter estimates from spectroscopic data based on large-scale nonlinear programming (NLP), maximum likelihood principles, and discretization on finite elements have led to increased speed and efficiency (Chen et al., 2016). These new techniques have great potential for widespread use in parameter estimation, but they are currently limited to relatively small problem sizes. In this work, we extend KIPET, the open-source package for estimating reaction kinetics directly from spectra or concentration data, for use with multiple experimental datasets, or multisets (Schenk et al., 2020). Through a detailed initialization scheme, and by taking advantage of large-scale nonlinear programming techniques and problem structure, we are able to solve large problems obtained from multiple experiments simultaneously. The enhanced KIPET package can solve problems in which multiple experiments contain different reactants and kinetic models, or different dataset sizes with shared or unshared individual species' spectra, and it can quickly obtain confidence intervals based on the NLP sensitivities. In addition, we propose a new variance estimation technique based on maximum likelihood derivations for unknown covariances from two sample populations. This new technique is compared with the previously proposed iterative, heuristics-based algorithm of Chen et al. (2016) for distinguishing between the variances of the noise in the model variables and in the spectral measurements. We demonstrate the new techniques on a variety of example problems, with sample code, to show the utility and ease of use of the approach. We also extend the curve-fitting problem to cases where concentration data are given directly and kinetic parameters must be estimated across multiple experimental datasets.
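The core task the variance estimation algorithms address — separating measurement noise from model error — can be illustrated with a simplified sketch. The numbers below are invented: replicate differences isolate the measurement variance, and the remaining residual variance against the model is attributed to model error. This is a conceptual illustration only, not the paper's maximum likelihood derivation.

```python
# Two replicate measurement sequences of the same underlying signal
# (illustrative values). Differences between replicates isolate the
# measurement noise; residuals against the model prediction contain
# both measurement noise and model error.
model_pred = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3]
rep1 = [1.10, 0.88, 0.60, 0.57, 0.49, 0.32]
rep2 = [1.03, 0.81, 0.68, 0.52, 0.43, 0.38]

# Var(rep1 - rep2) = 2 * sigma_meas^2 for independent noise.
diffs = [a - b for a, b in zip(rep1, rep2)]
var_meas = sum(d * d for d in diffs) / (2 * len(diffs))

# Total residual variance of the averaged replicates vs. the model.
avg = [(a + b) / 2 for a, b in zip(rep1, rep2)]
resid = [y - m for y, m in zip(avg, model_pred)]
var_total = sum(r * r for r in resid) / len(resid)

# Averaging two replicates halves the measurement variance;
# the remainder is attributed to model error.
var_model = max(var_total - var_meas / 2, 0.0)
print(var_meas, var_model)
```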

... In Section 2, we describe the Monte Carlo approach from [4], provide background on the flexibility test and flexibility index problems, and then present the proposed approaches for computing the probabilistic design space with extensions to the flexibility analysis concepts. In Section 3, we demonstrate the approach on a small case study as well as on the industrial Michael addition reaction case provided by Eli Lilly and Company [27]. These case studies are used to compare the effectiveness of the new approaches with the Monte Carlo simulation-based approach. ...

... We first consider a simple reaction case provided by Chen et al. (2016) [27]. The reaction kinetics may be described as follows: ...


To increase manufacturing flexibility and system understanding in pharmaceutical development, the FDA launched the quality by design (QbD) initiative. Within QbD, the design space is the multidimensional region (of the input variables and process parameters) in which product quality is assured. Given the high cost of extensive experimentation, there is a need for computational methods to estimate the probabilistic design space, accounting for interactions between critical process parameters and critical quality attributes, as well as for model uncertainty. In this paper, we propose two algorithms that extend the flexibility test and flexibility index formulations to replace simulation-based analysis and identify the probabilistic design space more efficiently. The effectiveness and computational efficiency of these approaches are demonstrated on a small example and an industrial case study.
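The flexibility test/index machinery referenced above can be illustrated with a tiny example. For linear constraints and no recourse, the worst case over a hyperrectangle of parameter deviations occurs at a vertex, so the flexibility index — the largest uniform scaling δ of the deviation box that stays feasible — can be found by bisection with a vertex check. The constraints and nominal point below are hypothetical, not taken from the paper's case studies.

```python
from itertools import product

# Illustrative linear constraints g_j(theta) <= 0 (hypothetical numbers).
def constraints(th1, th2):
    return [th1 + th2 - 3.0,        # g1
            th1 - 2.0 * th2 - 1.0,  # g2
            -th1 - 0.5]             # g3

theta_N = (1.0, 1.0)  # nominal point
dtheta = (1.0, 1.0)   # expected deviations

def feasible_everywhere(delta):
    """Flexibility test: for linear g, checking the box vertices suffices."""
    for s1, s2 in product((-1.0, 1.0), repeat=2):
        th1 = theta_N[0] + delta * s1 * dtheta[0]
        th2 = theta_N[1] + delta * s2 * dtheta[1]
        if max(constraints(th1, th2)) > 0.0:
            return False
    return True

# Flexibility index F: largest delta passing the test (bisection).
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if feasible_everywhere(mid) else (lo, mid)
F = lo
print(f"flexibility index F = {F:.3f}")
```

Here the binding constraint is g1, which first becomes active at δ = 0.5; with nonlinear constraints or recourse (control) variables the inner problem is no longer a vertex check, which is what the MINLP and bi-level formulations in these papers address.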

... A new approach was proposed by Chen et al. (2016) which is a more comprehensive and holistic unified framework for reaction kinetic parameter estimation based on maximum likelihood principles and collocation methods. This approach forms the backbone of the KIPET (Kinetic Parameter Estimation Toolkit) software package, which will be introduced in the next section. ...

... The rest of this paper will focus on the mathematical details of the unified framework of Chen et al. (2016) followed by a description of the software implementation in KIPET. A tutorial example is then presented before the conclusion and future work. ...

... In addition to introducing a framework to directly estimate kinetic parameters from spectral data, Chen et al. (2016) were the first to propose a numerical procedure to estimate the measurement and model variances. ...

... KIPET seeks to overcome the limitations of standard parameter estimation packages by applying a unified optimization framework based on maximum likelihood principles and large-scale nonlinear programming strategies to estimation problems that involve systems of nonlinear differential algebraic equations (DAEs). The software builds on recent advances proposed by Chen et al. (2016) and makes their original framework accessible to practitioners and academics. The package includes tools for data preprocessing, estimability analysis, and determination of parameter confidence levels for a variety of problem types. ...

... More details of these sequential approaches are presented by Ruckebusch and Blanchet (2013) and Golshan et al. (2016). To overcome these disadvantages, an entirely new approach was presented by Chen et al. (2016). The proposed alternative to these hard- and soft-modeling approaches is to obtain the reaction kinetic parameters simultaneously with the curve resolution. ...

... The software is mainly used for model discrimination and includes a number of different objective function formulations for users to choose from, as well as tools for estimability analysis based on the rival models and datasets, but it cannot deal directly with spectra. Given the lack of software for treating kinetic parameter estimation problems from spectra, the proliferation of multivariate data collection techniques for batch reaction systems, and the relatively new and highly adaptable formulation of Chen et al. (2016), there is an opportunity to create a new, multi-use tool that can be easily extended and adapted to perform a wide array of parameter estimation strategies. Many industrial practitioners combine the commercial tools above with a large number of custom codes in different modeling environments to solve their problems. ...

This paper presents KIPET (Kinetic Parameter Estimation Toolkit), an open-source toolbox for the determination of kinetic parameters from a variety of experimental datasets, including spectra and concentrations. KIPET seeks to overcome the limitations of standard parameter estimation packages by applying a unified optimization framework based on maximum likelihood principles and large-scale nonlinear programming strategies to estimation problems that involve systems of nonlinear differential algebraic equations (DAEs). The software builds on recent advances proposed by Chen et al. (2016) and makes their original framework accessible to practitioners and academics. The package includes tools for data preprocessing, estimability analysis, and determination of parameter confidence levels for a variety of problem types. In addition, KIPET introduces informative wavelength selection to improve the lack of fit. All these features are implemented in Python with the algebraic modeling package Pyomo. KIPET exploits the flexibility of Pyomo to formulate and discretize the dynamic optimization problems that arise in the parameter estimation algorithms. The optimization problems are solved with the nonlinear solver IPOPT, and confidence intervals are obtained through either sIPOPT or a newly developed tool, k_aug. The capabilities and ease of use of KIPET are demonstrated with a number of examples.

... Yet, the approach relies on prior knowledge of the number of clusters (representing species). Although MCR recovers the number of species and their corresponding latent spectra and concentrations, spectral resolution without a priori knowledge of the reaction system has been formulated using a mixed-integer nonlinear programming approach [194], owing to the rotational and intensity ambiguities in MCR [195]. ...

... JNMF, in tandem with probabilistic graphical models, reduces the reliance on prior knowledge of the reaction system while developing inferential models to generate reaction hypotheses. This framework is a precursor to the development of kinetic models that could be used to control the composition of complex mixtures [194], facilitating advances in reaction engineering using principles of process systems engineering [111]. In this work, data from Fourier transform infrared (FTIR) spectroscopy and proton nuclear magnetic resonance (¹H-NMR) spectroscopy of the products of the thermal conversion of Cold Lake bitumen [196,197] are mined to develop reaction pathways using machine learning tools. ...

Processing of complex feedstocks for the production of value-added chemicals and fuels is industrially important. The lack of a priori knowledge of the innumerable species and of the reaction pathways governing their conversion has posed challenges to monitoring these processes. Although data-driven models have been used, their lack of interpretability and of an end-to-end modeling framework has limited the efficiency of diagnostic decisions in process monitoring. On the other hand, systems in which mechanistic knowledge of the species and their reactions is obtained from first-principles simulations face computational challenges in deploying such models for process design. This thesis focuses on two aspects: (i) developing inferential machine learning models to enhance the interpretability of data-driven models, and (ii) developing predictive machine learning models to reduce the computational cost of first-principles simulations in modeling chemical systems.
The first aspect, developing inferential machine learning models, focuses on the identification of species and reaction pathways and on kinetic parameter estimation from spectroscopic data of the system, with application to the visbreaking of bitumen. Spectroscopic curve resolution methods that are structure-preserving, interpretable, and jointly parse data from multiple sensors to extract latent features for species identification are presented with an increasing degree of sophistication: (i) self-modeling multivariate curve resolution (SMCR); (ii) joint non-negative matrix factorization (JNMF), a data-fusion analogue of SMCR in which regularization constraints act like chemical information sieves to handle complementary, orthogonal, and redundant features in the latent factorization of multi-sensor data; and (iii) joint non-negative tensor factorization (JNTF), a structure-preserving higher-order analogue of JNMF. Next, Bayesian structure learning among the extracted spectral features is used to causally infer plausible reaction pathways, which are validated by domain knowledge. Finally, the latent factorization and causal inference models are used as an engine to interpret the modes identified by training hidden semi-Markov models on spectra. This captures the time scales and dynamics of reaction mechanisms with changing temperatures, enabling real-time monitoring of reactive systems purely from spectroscopic data. Projections of spectroscopic data onto the temporal mode of data collection via latent factorization are interpreted as concentrations. Kinetic models constrained by physical laws and by the reaction adjacency matrix deduced from the Bayesian network structure are implemented using chemical neural ODEs trained on the temporal concentrations. The prediction accuracy is seen to depend on the ability of the latent factorization to handle process noise.
The second aspect, training predictive machine learning models, focuses on reducing not only the computational cost of ab initio molecular dynamics (AIMD) simulations of chemical systems, but also the cost of developing such models. This is demonstrated with application to the transglycosylation of cellobiose, to assess whether the solvent molecules reorganize significantly in going from the reactant to the product configurations. A self-supervised 3D convolutional neural network autoencoder is trained to extract features from the reactant and product simulation trajectories, and the probability distributions over the differences between these features are used to assess whether solvent reorganization is significant. Cellobiose systems at lower temperatures are found to reorganize to a greater extent than those at higher temperatures, consistent with the decrease in the activation free energy barrier as temperature increases. The similarity between the reactant configuration features of other chemical systems and those extracted from the cellobiose systems is then used to infer the extent of reorganization in the product profiles, without explicitly running AIMD simulations for those systems.

... Parameter estimation methods that explicitly require the integral of the ODE model can be categorized as sequential or simultaneous approaches. In the sequential approach (Hwang and Seinfeld, 1972; Kim et al., 1991; Bilardello et al., 1993), the optimization problem is solved separately from the numerical solution of the ODE model, whereas in the simultaneous approach, the parameter estimation problem is solved together with the differential equation model, which is converted into algebraic equations (Chen et al., 2016; De et al., 2013). Collocation approaches for parameter estimation have been demonstrated by Chen et al. (2016), Villadsen (1982), and Tjoa and Biegler (1991), and an ANN implementation is used by Dua and Dua (2011) for simultaneous parameter estimation. ...

... In the sequential approach (Hwang and Seinfeld, 1972; Kim et al., 1991; Bilardello et al., 1993), the optimization problem is solved separately from the numerical solution of the ODE model, whereas in the simultaneous approach, the parameter estimation problem is solved together with the differential equation model, which is converted into algebraic equations (Chen et al., 2016; De et al., 2013). Collocation approaches for parameter estimation have been demonstrated by Chen et al. (2016), Villadsen (1982), and Tjoa and Biegler (1991), and an ANN implementation is used by Dua and Dua (2011) for simultaneous parameter estimation. However, the above-mentioned approaches are computationally expensive and may not converge in a reasonable time. ...

This work presents a study comparing two discretization methods for solving parameter estimation problems using multiparametric programming. In our earlier work, parameter estimation using multiparametric programming was presented, where model parameters were obtained as an explicit function of measurements. In that method, the nonlinear ordinary differential equation (ODE) model was discretized using the explicit Euler method to obtain algebraic equations. A square system of parametric nonlinear algebraic equations was then obtained by formulating the optimality conditions, and these equations were solved symbolically to obtain the model parameters as explicit functions of the measurements. Thus, the online computational burden of solving optimization problems for parameter estimation is replaced by simple function evaluations. In this work, we use the implicit Euler method to discretize the nonlinear ODE model and compare it with the explicit Euler method for parameter estimation using multiparametric programming. The complexity of the explicit parametric functions, the accuracy of the parameter estimates, and the effect of the step size are discussed.
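The contrast between the two discretizations is easy to see in a one-step case. For first-order decay dc/dt = -k·c, each Euler scheme yields the rate constant as an explicit function of two consecutive measurements — the essence of the multiparametric idea, shown here noise-free and with illustrative numbers rather than the paper's DC motor model.

```python
import math

# First-order decay dc/dt = -k*c, sampled at two consecutive times.
k_true, h = 0.8, 0.1
c0 = 2.0
c1 = c0 * math.exp(-k_true * h)  # "measurement" at t0 + h (noise-free here)

# Discretize, then solve the resulting algebraic equation for k:
# explicit Euler:  c1 = c0*(1 - k*h)    =>  k = (c0 - c1) / (h * c0)
# implicit Euler:  c1 = c0 / (1 + k*h)  =>  k = (c0 - c1) / (h * c1)
k_explicit = (c0 - c1) / (h * c0)
k_implicit = (c0 - c1) / (h * c1)
print(k_explicit, k_implicit)  # the two estimates bracket the true k
```

Both expressions are explicit functions of the measurements (c0, c1), so no online optimization is needed; their bias shrinks as the step size h decreases, with the explicit form underestimating and the implicit form overestimating k for decaying profiles.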

Estimating the kinetics of chemical reactions is a crucial step prior to designing a robust, controllable, and safe production process. Spectroscopic measurements are widely used for kinetic analysis. However, there may be unwanted contributions in the measured spectra, arising from instrumental variations (such as baseline shift or distortion) or from the presence of inert absorbing interferences with no kinetic behavior. Here, the kinetic estimation problem with time-invariant contributions is studied in depth, and we derive conditions under which the estimation accuracy is not affected by these contributions. Moreover, kinetic parameter estimation and separation of time-invariant contributions can be performed simultaneously under proper conditions. We also propose an approach for kinetic parameter estimation based on spectra with time-variant contributions. Finally, a novel unified framework is developed for kinetic parameter estimation when there is no prior information on the unwanted contributions.

Design space definition is one of the key issues in pharmaceutical research and development. Flexibility index and design centering are two complementary ways to estimate a candidate design space. In this study, we first propose a novel flexibility index formulation based on a direction search method, which applies to design spaces of any shape. Next, we propose two design centering methods. The vertex direction search method is first developed as a single-level optimization model, which is rigorous for convex regions. Then, based on the proposed flexibility index model, a derivative-free optimization (DFO) method is developed for solving the bi-level optimization models involved in design centering problems; it is applicable not only to convex but also to nonconvex problems. To find near-global solutions, Latin hypercube sampling (LHS) is used to generate multiple starting points for the DFO solver. The solution yields the optimal nominal point, i.e., the candidate point with the largest flexibility index. Several case studies demonstrate the performance of the proposed methods.

Inferring the reaction pathways underlying the processing of complex feeds, using noisy data from spectral sensors that may contain information regarding molecular mechanisms, is challenging. This is tackled by a...

Parameter estimation is important in many process models used in control system design, chemistry, and other engineering applications involving statistical observation models. The focus of this work is a nonlinear ordinary differential equation (ODE) system, a direct current (DC) motor model, whose parameters are estimated using multiparametric programming. The ODE model is discretized using Euler's method to obtain algebraic equations, and the model parameters of the DC motor are derived as explicit functions of the measurements. The applicability of the proposed method and the accuracy of the parameter estimation are demonstrated.

Many model-based online process monitoring and control applications rely on state estimation techniques that use noisy process data to update states, thereby ensuring that imperfect model predictions are consistent with process behavior. Techniques for tuning state estimators are reviewed, and their effectiveness and limitations are summarized in this article. A new simultaneous parameter and estimator tuning (SPET) methodology is proposed, in which parameter estimation techniques for stochastic differential equations (SDEs) are used to simultaneously estimate measurement-error covariances and model-error covariances along with the model parameters. The resulting information is then used to compute state-estimator tuning information. This study shows how SPET can be used, along with historical dynamic process data, to obtain reliable tuning information. The proposed methodology is tested using a nonlinear two-state continuous stirred-tank reactor (CSTR) model with simulated data. Comparisons are made with a more conventional approach that uses weighted least squares (WLS) to estimate fixed model parameters and autocovariance least squares (ALS) to estimate extended Kalman filter (EKF) tuning factors. The main benefit of the proposed approach is that fixed model parameters and EKF tuning factors are estimated simultaneously, resulting in significant improvements to state estimates and online model predictions.
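The role of covariance tuning in a state estimator can be sketched with a scalar Kalman filter. This is an illustrative toy, not the SPET methodology: it only shows that with a badly chosen model variance Q the filter over-trusts the noisy measurements and the state estimate degrades, which is the motivation for estimating these covariances from data.

```python
import random

random.seed(1)

# Scalar random-walk state with noisy measurements; the Kalman gain
# is determined by the assumed model variance Q and measurement variance R.
def kalman_1d(ys, Q, R, x0=0.0, P0=1.0):
    x, P, est = x0, P0, []
    for y in ys:
        P = P + Q              # predict (state modeled as a random walk)
        K = P / (P + R)        # gain set by the variance tuning
        x = x + K * (y - x)    # measurement update
        P = (1 - K) * P
        est.append(x)
    return est

true_x = 1.0
ys = [true_x + random.gauss(0, 0.5) for _ in range(200)]

# A filter tuned to the actual noise (small Q, R = 0.25) tracks far better
# than one whose inflated Q makes it chase every measurement.
good = kalman_1d(ys, Q=1e-4, R=0.25)
bad = kalman_1d(ys, Q=10.0, R=0.25)
err_good = sum((e - true_x) ** 2 for e in good) / len(good)
err_bad = sum((e - true_x) ** 2 for e in bad) / len(bad)
print(err_good, err_bad)
```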

The importance of Design Space (DS) definition lies in the assurance of quality, a key goal of Quality by Design, and in broadening from a single acceptable operating point to a collection of feasible operating regions. Properly defining the limits of the design space is therefore vital to provide manufacturing flexibility and quality assurance. In this work, we propose new MINLP reformulations of the extended flexibility analysis, in which we distinguish between process and model parameters, to efficiently calculate a new flexibility index for the process parameters. This index defines a hyperrectangular operating region within the design space of a pharmaceutical process in terms of the process parameters and accounts for uncertainty in the model parameters, described by hyperrectangular and ellipsoidal sets. We illustrate the application of these techniques on several examples.

Universities worldwide are establishing drug discovery centers to facilitate translation of exciting new human disease biology into therapeutic modalities. Drug hunting activities are typically focused on lead finding (high‐throughput screening) coupled with some measure of chemical and pharmacokinetic optimization. Ideally, the research yields novel, selective drug‐like molecules suitable for in vivo proof‐of‐concept studies and preclinical drug target validation. Preclinical activities are increasingly conducted in partnership with a pharmaceutical company seeking to access and supplement their drug development pipeline. Perpetually striving to gain a competitive edge and enhance efficiency, productivity, and profitability, the pharma industry is simultaneously experimenting with open‐ and crowd‐source platforms and innovation incubators. Both enterprises, therefore, benefit from each other. This article is composed of a series of contributions (vignettes) from eight academic centers in the United States, the United Kingdom, Sweden, and Japan and one US‐based pharmaceutical company (Table 2). The perspectives cover a range of topics including the rise of academic drug discovery and public–private partnerships; mission, objectives, and evolution of a particular center; resourcing; performance metrics; strategic and tactical lessons learned; attributes of successful projects; and open innovation initiatives. The accounts are punctuated with case studies to illustrate collective inventive capabilities.

Continuous manufacturing of ibuprofen (a widely used analgesic) offers many advantages and is a fertile field in both industry and academia, since ibuprofen not only effectively treats rheumatic and other chronic, painful diseases, but also shows great potential in dental applications. As a central element of operability analysis, flexibility analysis provides a quantitative assessment of the capability to guarantee feasible operation in the face of variations in uncertain parameters. In this paper, we focus on calculating the flexibility index for the continuous ibuprofen manufacturing process. We update existing state-of-the-art formulations, which traditionally lead to a max-max-max optimization problem, to compute the flexibility index in a more tractable manner. The advantages of the modified method regarding model size and computational CPU time are examined in four cases. In addition to identifying the flexibility index without considering control variables, we also investigate the effects of different combinations of control variables on flexibility, revealing the benefits of taking recourse actions into account. The results of these systematic investigations are expected to provide a solid basis for the subsequent control system design and optimal operation of continuous ibuprofen manufacturing.

Design space definition is one of the key parts in pharmaceutical research and development. In this article, we propose a novel solution strategy to explicitly describe the design space without recourse decisions. First, to smooth the boundary, the Kreisselmeier-Steinhauser (KS) function is applied to aggregate all inequality constraints. Next, for creating a surrogate polynomial model of the KS function, we focus on finding sampling points on the boundary of KS space. After performing Latin hypercube sampling (LHS), two methods are presented to efficiently expand the boundary points, i.e., line projection to the boundary through any two feasible LHS points and perturbation around the adaptive sampling points. Finally, a symbolic computation method, cylindrical algebraic decomposition, is applied to transform the surrogate model into a series of explicit and triangular subsystems, which can be converted to describe the KS space. Two case studies show the efficiency of the proposed algorithm.
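The Kreisselmeier-Steinhauser aggregation step can be sketched directly. For constraints g_i(x) ≤ 0, KS(x) = (1/ρ)·ln Σ_i exp(ρ·g_i(x)) is a smooth, conservative upper bound on max_i g_i(x) that tightens as ρ grows; the constraints and evaluation point below are illustrative, not from the paper's case studies.

```python
import math

# Three inequality constraints g_i(x) <= 0 (illustrative choices).
def g(x):
    return [x[0] ** 2 + x[1] ** 2 - 4.0,  # stay inside a circle
            x[0] - 1.5,
            -x[1]]

def ks(x, rho=50.0):
    """Kreisselmeier-Steinhauser aggregate: smooth upper bound on max g_i.
    The max-shifted form avoids overflow in exp() for large rho."""
    gs = g(x)
    m = max(gs)
    return m + math.log(sum(math.exp(rho * (gi - m)) for gi in gs)) / rho

x = (1.0, 0.5)
print(max(g(x)), ks(x))  # KS >= max, and the gap shrinks as rho grows
```

Because KS is smooth and its zero level set closely tracks the true feasible boundary, it is a convenient single function to sample and fit with a surrogate polynomial, as in the boundary-sampling procedure described above.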

In many studies, kinetic parameter estimation from spectroscopic data is performed with the absorbing species known beforehand, as this provides a straightforward link between the reaction models and the spectroscopic data. In practice, however, the absorbing species are generally unknown and are only estimated based on professional experience and prior knowledge of the kinetic reaction. In this work, we propose an optimization strategy with both continuous and discrete decision variables in order to estimate kinetic parameters from spectroscopic data with unknown absorbing species. Our approach also includes an estimability analysis for the kinetic parameters based on the Gram-Schmidt orthogonalization procedure, along with covariance estimation. Four case studies demonstrate the effectiveness of the approach: the first and second use simulated data to illustrate the approach with known solutions, while the third and fourth are based on actual experimental spectroscopy datasets.

We introduce a flexible, open source implementation that provides the optimal sensitivity of solutions of nonlinear programming (NLP) problems, and is adapted to a fast solver based on a barrier NLP method. The program, called sIPOPT, evaluates the sensitivity of the Karush–Kuhn–Tucker (KKT) system with respect to perturbation parameters. It is paired with the open-source IPOPT NLP solver and reuses matrix factorizations from the solver, so that sensitivities to parameters are determined with minimal computational cost. Aside from estimating sensitivities for parametric NLPs, the program provides approximate NLP solutions for nonlinear model predictive control and state estimation. These are enabled by pre-factored KKT matrices and a fix-relax strategy based on Schur complements. In addition, reduced Hessians are obtained at minimal cost, and these are particularly effective for approximating covariance matrices in parameter and state estimation problems. The sIPOPT program is demonstrated on four case studies to illustrate all of these features.

A general method for estimating reaction rate constants of chemical reactions using ultraviolet-visible (UV-vis) spectroscopy is presented. The only requirement is that some of the chemical components involved be spectroscopically active. The method uses the combination of spectroscopic measurements and techniques from numerical mathematics and chemometrics. Therefore, the method can be used in cases where a large spectral overlap of the individual reacting absorbing species is present. No knowledge about molar absorbances of individual reacting absorbing species is required for quantification. The reaction rate constants and the individual spectra of the reacting absorbing species of the two-step consecutive reaction of 3-chlorophenylhydrazonopropane dinitrile with 2-mercaptoethanol were estimated simultaneously from UV-vis recorded spectra in time. The results obtained were excellent.
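For a two-step consecutive reaction A → B → C with first-order steps, the concentration profiles have a closed form, which is what rate-constant estimation methods like this one fit against; the rate constants below are purely illustrative:

```python
import math

def consecutive_profiles(k1, k2, t, cA0=1.0):
    # Closed-form first-order kinetics for A -> B -> C (valid for k1 != k2).
    cA = cA0 * math.exp(-k1 * t)
    cB = cA0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    cC = cA0 - cA - cB                      # mass balance closes the system
    return cA, cB, cC

cA, cB, cC = consecutive_profiles(0.3, 0.1, 5.0)
print(cA, cB, cC)  # the three concentrations always sum to cA0
```

Under Beer's law, each measured spectrum is then a concentration-weighted sum of the pure-component spectra, which is why the rate constants and the component spectra can be estimated jointly.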

Line search methods are proposed for nonlinear programming using Fletcher and Leyffer's filter method [Math. Program., 91 (2002), pp. 239--269], which replaces the traditional merit function. Their global convergence properties are analyzed. The presented framework is applied to active set sequential quadratic programming (SQP) and barrier interior point algorithms. Under mild assumptions it is shown that every limit point of the sequence of iterates generated by the algorithm is feasible, and that there exists at least one limit point that is a stationary point for the problem under consideration. A new alternative filter approach employing the Lagrangian function instead of the objective function with identical global convergence properties is briefly discussed.

Practical large-scale mathematical programming involves more than just the application of an algorithm to minimize or maximize an objective function. Before any optimizing routine can be invoked, considerable effort must be expended to formulate the underlying model and to generate the requisite computational data structures. AMPL is a new language designed to make these steps easier and less error-prone. AMPL closely resembles the symbolic algebraic notation that many modelers use to describe mathematical programs, yet it is regular and formal enough to be processed by a computer system; it is particularly notable for the generality of its syntax and for the variety of its indexing operations. We have implemented an efficient translator that takes as input a linear AMPL model and associated data, and produces output suitable for standard linear programming optimizers. Both the language and the translator admit straightforward extensions to more general mathematical programs that incorporate nonlinear expressions or discrete variables.

We present a primal-dual interior-point algorithm with a filter line-search method for nonlinear programming. Local and global convergence properties of this method were analyzed in previous work. Here we provide a comprehensive description of the algorithm, including the feasibility restoration phase for the filter method, second-order corrections, and inertia correction of the KKT matrix. Heuristics are also considered that allow faster performance. This method has been implemented in the IPOPT code, which we demonstrate in a detailed numerical study based on 954 problems from the CUTEr test set. An evaluation is made of several line-search options, and a comparison is provided with two state-of-the-art interior-point codes for nonlinear programming.

An updated version of the graphical user-friendly interface related to the Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) algorithm is presented. This GUI works under the MATLAB® environment and includes recently published advances of this algorithm linked to the implementation of additional constraints, such as kinetic hard-modeling and correlation (calibration), as well as constraints linked to model structure for multiset and multiway data analysis, such as the possibility to use fully or partially multilinear models (trilinear or quadrilinear) to describe the data set. In addition, a step has been included to allow the preliminary subspace maximum likelihood projection to decrease noise propagation effects in case of large non-homoscedastic uncertainties, and the possibility of direct selection of number of components and of initial estimates. Finally, a number of options to present and handle the output information have been added, such as the display of data fitting evolution, improvement in the display of loading profiles in different modes for multi-way data, refolding MCR scores into 2D distribution maps for hyperspectral images and the internal connection to the MCR-Bands GUI, previously designed for the assessment of the extent and location of ambiguities in the MCR resolved profiles. Different examples of use of this updated interface are given in this work.

Multivariate curve resolution techniques are powerful tools to extract from sequences of spectra of a chemical reaction system the number of independent chemical components, their associated spectra, and the concentration profiles in time. Usually, these solutions are not unique because of the so‐called rotational ambiguity. In the present work, we reduce the non‐uniqueness by enforcing the consistency of the computed concentration profiles with a given kinetic model. Traditionally, the kinetic modeling is realized in a separate step, which follows the multivariate curve resolution procedure. In contrast to this, we consider a hybrid approach that combines the model‐free curve resolution technique with the model‐based kinetic modeling in an overall optimization. For a two‐component model problem, the range of possible solutions is analyzed, and its reduction to a single, unique solution by means of the hybrid kinetic modeling is shown. The algorithm reduces the rotational ambiguity and improves the quality of the kinetic fitting. Numerical results are also presented for a multi‐component catalytic reaction system that obeys the Michaelis–Menten kinetics. Copyright © 2012 John Wiley & Sons, Ltd.

Kinetic modeling of batch reactions monitored by in situ spectroscopy has been shown to be a helpful method for developing a complete understanding of reaction systems. Much work has been carried out to demonstrate the ability to model dissolution, reaction, and crystallization processes separately; however, little has been performed in terms of combining all of these into one comprehensive kinetic model. This paper demonstrates the integration of models of dissolution, temperature-dependent solubility, and unseeded crystallization driven by cooling into a comprehensive kinetic model describing the evolution of a slurry reaction monitored by in situ attenuated total reflectance ultraviolet–visible spectroscopy. The model estimates changes in the volume of the dissolved fraction of the slurry by use of the partial molar volume of the dissolved species that change during the course of reagent addition, dissolution, reaction, and crystallization. The comprehensive model accurately estimates concentration profiles of dissolved and undissolved components of the slurry and, thereby, the degree of undersaturation and supersaturation necessary for estimation of the rates of dissolution and crystallization. Results were validated across two subsequent batches via offline high-performance liquid chromatography measurements. Copyright © 2014 John Wiley & Sons, Ltd.

The paper is concerned with parametric models for populations of curves; i.e. models of the form yi(x) = f(θi; x) + error, i = 1, 2, …, n. The shape invariant model f(θi; x) = θ0i + θ1i g([x − θ2i]/θ3i) is introduced. If the function g(x) is known, then the θi may be estimated by nonlinear regression. If g(x) is unknown, then the authors propose an iterative technique for simultaneous determination of the best g(x) and θi. Generalizations of the shape invariant model to curve resolution are also discussed. Several applications of the method are also presented.
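The shape invariant model is simple to state in code; the Gaussian shape function below is only an example stand-in for g, and the parameter values are invented:

```python
import math

def shape_invariant(theta, g, x):
    # y = theta0 + theta1 * g((x - theta2) / theta3):
    # vertical shift/scale (theta0, theta1) and horizontal shift/scale (theta2, theta3)
    t0, t1, t2, t3 = theta
    return t0 + t1 * g((x - t2) / t3)

gauss = lambda u: math.exp(-0.5 * u * u)   # example shape function g

# At x = theta2 the argument of g is 0, so y = theta0 + theta1 * g(0).
print(shape_invariant((0.1, 2.0, 5.0, 1.5), gauss, 5.0))
```

Fitting then amounts to estimating the four theta values per curve by nonlinear regression, with g either fixed or refined iteratively as the abstract describes.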

The paper presents a method for resolving additive mixtures of overlapping curves by combining nonlinear regression and principal component analysis. The method can be applied to spectroscopy, chromatography, etc. The method makes use of the postulated chemical reaction, and allows one to check the reaction and estimate chemical rate and equilibrium constants.

This paper presents a method for determining the shapes of two overlapping functions f1(x) and f2(x) from an observed set of additive mixtures, {αif1(x) + βif2(x); i = 1, …, n}, of the two functions. This type of problem arises in the fields of spectrophotometry, chromatography, kinetic model building, and many others. The methods described by this paper are based on the use of principal component techniques, and produce two bands of functions, each of which contains one of the unknown, underlying functions. Under certain mild restrictions on the fj(x), each band reduces to a single curve, and the fj(x) are completely determined by the analysis.

This work is mainly oriented to give an overview of the progress of multivariate curve resolution methods in the last 5 years. Conceived as a review that combines theory and practice, it presents the basics needed to understand the use, prospects, and limitations of this family of chemometric methods, together with the latest trends in theoretical contributions and in the field of analytical applications.

A novel spectroscopic method was developed to measure mixing time and the results compared to the well-known and adopted conductivity method. Spectra of non-reactive tracers were collected using UV/Visible (UV-VIS) fibre optic probes coupled with a McPherson spectrograph. The latter could detect a full spectrum at a maximum acquisition rate of 100 spectra s−1 using a full vertical binning acquisition mode and 41.7 spectra s−1 when using a multitrack acquisition mode. The spectra acquired were processed using a Savitzky–Golay smoothing filter algorithm and analysed, to give mixing time values, according to the rational method proposed by Ruszkowski (1994). Experiments were carried out by placing the fibre optic probes in stirred vessels with working volumes ranging from 3 L to 20 L. Traditional Rushton turbines and 45° angled pitched blade turbines were used for the experiments. The mixing time values, θ95, obtained with the spectroscopic technique were compared with those obtained using a conductivity technique and with a correlation available in the literature (Nienow, 1997).

A novel approach mixing the qualities of hard-modelling and soft-modelling methods is proposed to analyse kinetic data monitored spectrometrically. Taking as a basis the Multivariate Curve Resolution–Alternating Least Squares method (MCR–ALS), which obtains the pure concentration profiles and spectra of all absorbing species present in the raw measurements by using typical soft-modelling constraints, a new hard constraint is introduced to force some or all the concentration profiles to fulfill a kinetic model, which is refined at each iterative cycle of the optimisation process. This modification of MCR–ALS drastically decreases the rotational ambiguity associated with the kinetic profiles obtained using exclusively soft-modelling constraints. The optional inclusion of some or all the absorbing species into the kinetic model allows the successful treatment of data matrices whose instrumental response is not exclusively due to the chemical components involved in the kinetic process, an impossible scenario for classical hard-modelling approaches. Moreover, the possible distinct constraint of each of the matrices in a three-way data set allows for the simultaneous analysis of kinetic runs with diverse kinetic models and rate constants. Thus, the introduction of model-based and model-free features in the treatment of kinetic data sets yields more satisfactory results than the application of pure hard- or pure soft-modelling approaches. Simulated and real examples are used to confirm this statement.
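The alternating least squares core of such MCR methods can be sketched for a single component, with nonnegativity imposed by clipping (a crude stand-in for the constrained least squares steps used in real MCR-ALS implementations; the data matrix below is synthetic):

```python
def rank1_als(D, iters=50):
    # Alternate least-squares updates of a concentration profile c and a
    # spectrum s so that D ~ outer(c, s), clipping negatives to enforce
    # the nonnegativity (soft-modelling) constraint.
    nr, nc = len(D), len(D[0])
    s = [1.0] * nc
    for _ in range(iters):
        ss = sum(v * v for v in s)
        c = [max(sum(D[i][j] * s[j] for j in range(nc)) / ss, 0.0)
             for i in range(nr)]
        cc = sum(v * v for v in c)
        s = [max(sum(D[i][j] * c[i] for i in range(nr)) / cc, 0.0)
             for j in range(nc)]
    return c, s

D = [[4.0, 5.0], [8.0, 10.0], [12.0, 15.0]]  # outer([1, 2, 3], [4, 5])
c, s = rank1_als(D)
print([[ci * sj for sj in s] for ci in c])   # reconstructs D (c and s only fixed up to a shared scaling)
```

The scaling indeterminacy visible here (c can grow while s shrinks) is the one-component analogue of the rotational ambiguity the abstract discusses; the kinetic hard constraint is what pins the profiles down.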

The presence of rotation ambiguities and unique solutions in Multivariate Curve Resolution (MCR) chemometric methods is discussed in detail. Using recently proposed graphical approaches to display the bands and areas of feasible solutions in a subspace of reduced dimensions, the results obtained by different MCR methods are compared. These results show that, in the presence of rotation ambiguities and under a particular set of constraints, the solutions obtained by the different MCR methods can differ among themselves and also from the true solution, depending on the initial estimates and on the applied algorithm. In the absence of rotational ambiguities, all MCR methods should give the same unique solution, which should be equal to the true one. Many of the MCR methods proposed in the literature, like MCR-ALS, RFA, MCR-FMIN, or MCR-BANDS, are confirmed to give a valid solution within the band or area of feasible solutions. On the contrary, and according to the results of this study, in its present implementation the minimum volume simplex analysis (MVSA) method can give unfeasible solutions when resolving bilinear data systems with more than two components, because it only applies non-negativity constraints to concentration profiles and not to spectral profiles.

The parameters of ecological models are usually estimated through numerical search algorithms. Determining confidence boundaries for the parameter values obtained in such a way is a problem of great practical importance. In this paper a method is proposed to estimate such regions in two ways, based on either the Hessian matrix or the Fisher Information Matrix (FIM). There is a conceptual difference in the two approximations: the FIM approach is based on the sensitivity trajectories, whereas the Hessian expansion depends on the shape of the error functional. From a comparison between the two approaches, a discriminating method is obtained to detect inaccurate estimation results. The Hessian and FIM approaches differ by the second derivative terms of the output function. This difference is used to assess the success of the estimation, because the two methods yield the same confidence estimate only if the search terminates at the optimal parameter value. The method is demonstrated with reference to a pair of widely used dynamics: the Monod kinetics and the Richards logistic function applied to algal growth. It is shown that in both cases this method compares favourably with the residual correlation analysis and appears to have more discriminatory power.
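For a one-parameter model, the FIM-based confidence estimate collapses to a scalar; the sketch below uses a simple exponential decay with invented sampling times and noise level, not the Monod or Richards models of the paper:

```python
import math

def fim_std(k, times, sigma):
    # Fisher information for y(t) = exp(-k t): the sum of squared sensitivities
    # dy/dk = -t * exp(-k t), divided by the noise variance. The parameter
    # standard deviation is the square root of the inverse information.
    fim = sum((t * math.exp(-k * t)) ** 2 for t in times) / sigma ** 2
    return math.sqrt(1.0 / fim)

print(fim_std(0.2, [1, 2, 5, 10, 20], 0.01))  # more/better-placed samples shrink this
```

The Hessian-based estimate adds second-derivative terms to this expression, which is exactly the difference the paper exploits to detect a search that stopped away from the optimum.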

Multivariate curve resolution (MCR) is a widespread methodology for the analysis of process data in many different application fields. This article intends to propose a critical review of the recently published works. Particular attention will be paid to situations requiring advanced and tailored applications of multivariate curve resolution, dealing with improvements in preprocessing methods, multi-set data arrangements, tailored constraints, issues related to non-ideal noise structure, and deviations from linearity. These analytical issues are tackling the limits of applicability of MCR methods and, therefore, they can be considered as the most challenging ones.

The introduction of fast scanning and diode-array spectrophotometers facilitates the acquisition of large series of absorption spectra as a function of reaction time (kinetics), elution time (chromatography), or added reagent (equilibrium investigations). It is important to develop appropriate programs that are able to handle the wealth of data and to extract all information. In this contribution a new application of factor analysis in a nonlinear least-squares fitting algorithm is presented. By prior factor analysis of the raw multichannel data, essential savings in memory allocation and computing time are achieved. Separation of the linear and the nonlinear parameters is accomplished and, specifically, the computation of the Jacobian is dramatically enhanced by exploiting the orthonormality of the eigenvectors. The performance of the new algorithm is compared with established programs for examples of complex equilibrium studies.

A completely model-free method for the resolution of overlapping chromatographic peaks is presented. Evolving factor analysis enhances the power of classical factor analysis by exploiting the additional information contained in the response data through the intrinsic order of the elution time. The results are the elution profiles and the normalized spectra of the components.

A practical and accessible introduction to numerical methods for stochastic differential equations is given. The reader is assumed to be familiar with Euler's method for deterministic differential equations and to have at least an intuitive feel for the concept of a random variable; however, no knowledge of advanced probability theory or stochastic processes is assumed. The article is built around 10 MATLAB programs, and the topics covered include stochastic integration, the Euler-Maruyama method, Milstein's method, strong and weak convergence, linear stability, and the stochastic chain rule.
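A minimal Euler-Maruyama loop for geometric Brownian motion, one of the standard examples in this tutorial's setting (the drift, volatility, and step count are arbitrary illustration values):

```python
import math, random

def euler_maruyama(mu, sigma, x0, T, n, seed=0):
    # dX = mu*X dt + sigma*X dW, advanced with Gaussian increments dW ~ N(0, dt)
    rng = random.Random(seed)
    dt = T / n
    x = x0
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(dt))
        x += mu * x * dt + sigma * x * dW
    return x

print(euler_maruyama(0.05, 0.2, 1.0, 1.0, 500))
# With sigma = 0 the scheme reduces to Euler's method for x' = mu*x:
print(euler_maruyama(0.05, 0.0, 1.0, 1.0, 500))  # close to exp(0.05)
```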

In this chapter, we present a detailed review of fundamental concepts, theory, definitions, and methods that are used in nonlinear regression analysis of pressure and rate transient data. Nonlinear parameter estimation coupled with statistical methods is simply referred to as nonlinear regression analysis. It has become a standard analysis procedure for interpreting pressure-transient data in the last two decades because, unlike the conventional graphical methods used in pressure-transient data analysis, it allows one to quantify the uncertainty in the final estimated formation parameters and model uniqueness in the presence of noisy (or inexact) data and uncertainty about the true, but unknown, reservoir/well model. The parameter estimation method based on maximum likelihood estimation (MLE) is introduced and compared with least-squares estimation (LSE), the most widely used and known estimation method in petroleum engineering. Although MLE is not widely used in petroleum engineering, it is, by far, the most commonly used method of parameter estimation in the statistics literature. Associating uncertainty with measurements by careful construction of the objective functions is best done within the concept of maximum likelihood, particularly when history matching multiple pressure transient data sets, e.g., multi-well systems and interval pressure transient tests. The objective functions incorporating available prior information into parameter estimation within the framework of Bayesian methodology are given. Nonlinear parameter estimation for overdetermined and underdetermined problems is also covered. The chapter includes minimization methods and algorithms (while constraining the parameters within a feasible region) that can be used for minimizing various objective functions arising from MLE and LSE formulations with or without prior information. Computation of various statistics (e.g., 95% confidence intervals for parameters, correlation coefficients for parameters, standard deviation of residuals, and root-mean-square (RMS) errors) for assessing the uncertainty in estimated parameters and goodness of fit is provided. Finally, a number of example applications are presented using both real field and synthetic pressure transient tests to illustrate the use of nonlinear regression analysis based on maximum likelihood and least-squares estimation.

A new multivariate curve resolution method is presented and tested with data of various levels of complexity. Rotational and intensity ambiguities and the effect of selectivity on resolution are the focus. Analysis of simulated data provides the general guidelines concerning the conditions for uniqueness of a solution for a given problem. Multivariate curve resolution is extended to the analysis of three-way data matrices. The particular case of three-way data where only one of the orders is common between slices is studied in some detail.

There is an increasing need for new techniques for the understanding, monitoring and the control of batch processes. Spectroscopy is now becoming established as a means of obtaining real-time, high-quality chemical information at frequent time intervals and across a wide range of industrial applications. In this article, the role of spectroscopy for batch process monitoring is discussed in terms of both current and potential advances. The emphasis is on how to handle the measured data to extract maximum information for improved process performance and efficiency. In particular, the use of spectroscopy for statistical process monitoring is detailed and considered as complementary to the use of engineering process data. A case study of the ultraviolet-visible monitoring of a first-order biochemical conversion reaction is described, as well as the advantages of spectroscopy for process fault detection and diagnosis. Future prospects for the use of on-line spectroscopy are also discussed.

One of the major applications of factor analysis in the chemical literature, self-modeling curve resolution (SMCR), is covered in this review, including a historical account of the methods derived from Lawton and Sylvestre's original method. Papers treating the theory or applications of SMCR are included. Qualitative and quantitative applications are described where appropriate.

We present new ideas underlying a self-modelling factor analytical method that allows one to extract pure component spectra and the associated concentration profiles from a set of spectroscopic measurements. The usefulness of the method is demonstrated and compared with established tools on model problems and on a system from catalytic hydroformylation by rhodium complexes, both with overlapping component spectra. Self-modelling methods tend to minimize the overlap of the recovered spectra, which can result in an unwanted distortion of the spectra and concentration profiles. For strongly overlapping spectra, a penalty condition on a specific singular value of the absorptivity matrix factor and a global decomposition approach are appropriate tools to construct improved factorizations. Copyright © 2010 John Wiley & Sons, Ltd.

We present convergence rates for the error between the direct transcription solution and the true solution of an unconstrained optimal control problem. The problem is discretized using collocation at Radau points (aka Gauss-Radau or Legendre-Gauss-Radau quadrature). The precision of Radau quadrature is the highest after Gauss (aka Legendre-Gauss) quadrature, and it has the added advantage that the end point is one of the abscissas where the function to be integrated is evaluated. We analyze convergence from a Nonlinear Programming (NLP)/matrix algebra perspective. This enables us to predict the norms of various constituents of a matrix that is "close" to the KKT matrix of the discretized problem. We present the convergence rates for the various components, for a sufficiently small discretization size, as functions of the discretization size and the number of collocation points. We illustrate this using several test examples. This also leads to an adjoint estimation procedure, given the Lagrange multipliers for the large-scale NLP.
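For the linear test equation, one step of the two-stage Radau IIA collocation scheme can be written out by hand (the 2x2 stage system is solved by Cramer's rule; the step size and decay rate below are arbitrary):

```python
import math

def radau2_step(lam, y0, h):
    # 2-stage Radau IIA (order 3) for y' = lam*y; collocation points c = (1/3, 1),
    # so the right end of the step is itself an abscissa.
    a = [[5.0 / 12.0, -1.0 / 12.0], [3.0 / 4.0, 1.0 / 4.0]]  # Butcher matrix
    b = [3.0 / 4.0, 1.0 / 4.0]
    # Stage equations k_i = lam*(y0 + h*sum_j a[i][j]*k_j)  =>  M k = lam*y0
    m11 = 1.0 - h * lam * a[0][0]; m12 = -h * lam * a[0][1]
    m21 = -h * lam * a[1][0];      m22 = 1.0 - h * lam * a[1][1]
    det = m11 * m22 - m12 * m21
    r = lam * y0
    k1 = (r * m22 - m12 * r) / det          # Cramer's rule on the 2x2 system
    k2 = (m11 * r - m21 * r) / det
    return y0 + h * (b[0] * k1 + b[1] * k2)

print(radau2_step(-1.0, 1.0, 0.1), math.exp(-0.1))  # third-order accurate step
```

The same stage structure, stacked over many finite elements and embedded in an NLP, is the direct transcription scheme whose convergence the paper analyzes.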

In this paper, a method for characterizing an industrially significant reaction using chemometrics, fiber-optic UV/visible spectroscopy and a single fiber transmission probe is reported. Aliquots of the reaction mixture were also taken at constant intervals for off-line HPLC analysis. HPLC peak areas were used to develop multivariate calibrations for the real-time determination of product and consumption of reactants. Composition profiles and pure component spectra of the reactant mixture, intermediate, and product were estimated using automatic window factor analysis (WFA), a type of self-modeling curve resolution (SMCR), without the aid of referee methods of analysis or standards. Window edges were automatically refined by a new iterative process that uses a robust adaptive noise threshold in the stopping criterion. Strong evidence for the formation of a reactive intermediate was detected and characterized by SMCR that could not be detected by HPLC. Eight replicate runs over a period of 3 months demonstrated that the SMCR results were reproducible. Robust smoothing of the SMCR profiles with locally weighted scatter plot smoothing (LOWESS) was used to construct control charts for detecting upsets in the batch reaction caused by the introduction of small amounts of water. Residuals (smooth–unsmoothed) outside control limits (3×MAD, median absolute deviation of residuals from pre-run batches) were used to detect small, sudden process upsets.

Following on the popularity of dynamic simulation for process systems, dynamic optimization has been identified as an important task for key process applications. In this study, we present an improved algorithm for simultaneous strategies for dynamic optimization. This approach addresses two important issues for dynamic optimization. First, an improved nonlinear programming strategy is developed based on interior point methods. This approach incorporates a novel filter-based line search method as well as preconditioned conjugate gradient method for computing search directions for control variables. This leads to a significant gain in algorithmic performance. On a dynamic optimization case study, we show that nonlinear programs (NLPs) with over 800,000 variables can be solved in less than 67 CPU minutes. Second, we address the problem of moving finite elements through an extension of the interior point strategy. With this strategy we develop a reliable and efficient algorithm to adjust elements to track optimal control profile breakpoints and to ensure accurate state and control profiles. This is demonstrated on a dynamic optimization for two distillation columns. Finally, these algorithmic improvements allow us to consider a broader set of problem formulations that require dynamic optimization methods. These topics and future trends are outlined in the last section.

Evolving factor analysis (EFA) is a general method for the analysis of multivariate data having an intrinsic order. Examples are data produced by many hyphenated techniques, such as high-performance liquid chromatography with photodiode array detection (HPLC-DAD) and the study of complex equilibria by ultraviolet spectrometry as a function of pH. EFA relies on an intrinsic order of the data and on only a few assumptions, such as nonnegativity of concentrations and the validity of Beer's law. The method can be applied to curve resolution and the assessment of peak purity in different disciplines of analytical chemistry. A didactic example from HPLC-DAD is used to illustrate the method. Possible limitations of EFA are also discussed.

The continuing development of modern instrumentation means an increasing amount of data is being delivered in less time. As a consequence, it is crucial that research into techniques for the analysis of large data sets continues. However, even more crucial is that once developed these techniques are disseminated to the wider chemical community. In this tutorial, all the steps involved in the fitting of a chemical model, based on reaction kinetics, to measured multiwavelength spectroscopic data are presented. From postulation of the chemical model and derivation of the appropriate differential equations, through to calculating the concentration profiles and, using non-linear regression, fitting of the rate constants of the model to measured multiwavelength data. The benefits of using multiwavelength data are both discussed and demonstrated. A number of real examples where the described techniques are applied to real measurements are also given.
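The key computational trick in such fitting is separability: for each trial set of rate constants, the linear (spectral) coefficients have a closed-form least-squares solution. A minimal single-wavelength sketch with synthetic, noise-free data and a simple grid search over the rate constant:

```python
import math

def fit_first_order(times, y, k_grid):
    # Separable least squares: for each trial rate constant k, the linear
    # amplitude a in y(t) = a*exp(-k t) is solved in closed form, and the
    # best (k, a) pair minimizes the sum of squared errors.
    best = None
    for k in k_grid:
        basis = [math.exp(-k * t) for t in times]
        a = sum(b * yi for b, yi in zip(basis, y)) / sum(b * b for b in basis)
        sse = sum((yi - a * b) ** 2 for b, yi in zip(basis, y))
        if best is None or sse < best[0]:
            best = (sse, k, a)
    return best[1], best[2]

times = [0.0, 1.0, 2.0, 4.0, 8.0]
y = [math.exp(-0.5 * t) for t in times]          # noise-free data, true k = 0.5
k_hat, a_hat = fit_first_order(times, y, [i * 0.01 for i in range(1, 101)])
print(k_hat, a_hat)
```

With multiwavelength data the scalar amplitude becomes a matrix of pure-component spectra, solved by linear least squares at each iteration of the nonlinear regression, which is the benefit the tutorial demonstrates.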

Application of multivariate curve resolution to second order data from hyphenated liquid chromatography with spectrometric diode array detection is shown. Chromatographic analysis of samples giving unresolved mixtures produces different data structures depending on the reproducibility of the elution process: (a) second order data where elution peaks of the same component in the different chromatographic runs have the same shape and appear at exactly the same elution times (synchronized); (b) second order data where elution peaks of the same component in the different chromatographic runs appear at different elution times (non-synchronized) although they are still of the same shape; and (c) second order data where elution peaks of the same component in the different chromatographic runs have different shapes and appear at different elution times. Multivariate curve resolution is easily adapted to analyze all these situations taking advantage in every case of the particular data structure. Multivariate curve resolution is also easily adapted to those situations where second order data has not a complete trilinear structure.

Most algorithms for the least-squares estimation of non-linear parameters have centered about either of two approaches. On the one hand, the model may be expanded as a Taylor series and corrections to the several parameters calculated at each iteration on the assumption of local linearity. On the other hand, various modifications of the method of steepest-descent have been used. Both methods not infrequently run aground, the Taylor series method because of divergence of the successive iterates, the steepest-descent (or gradient) methods because of agonizingly slow convergence after the first few iterations. In this paper a maximum neighborhood method is developed which, in effect, performs an optimum interpolation between the Taylor series method and the gradient method, the interpolation being based upon the maximum neighborhood in which the truncated Taylor series gives an adequate representation of the nonlinear model. The results are extended to the problem of solving a set of nonlinear algebraic equations.
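This interpolation between the Taylor-series (Gauss-Newton) step and the gradient step is what is now called Levenberg-Marquardt damping. A one-parameter toy sketch (the model, data, and damping schedule are invented for illustration, not the paper's algorithm verbatim):

```python
import math

def lm_fit_rate(times, y, k0=1.0, lam=1e-3, iters=50):
    # One-parameter Levenberg-Marquardt for y(t) = exp(-k t).
    # Large lam -> short, gradient-like steps; small lam -> the Gauss-Newton step.
    def sse(kk):
        return sum((yi - math.exp(-kk * t)) ** 2 for t, yi in zip(times, y))
    k = k0
    for _ in range(iters):
        m = [math.exp(-k * t) for t in times]
        J = [-t * mi for t, mi in zip(times, m)]              # d(model)/dk
        Jtr = sum(j * (yi - mi) for j, yi, mi in zip(J, y, m))
        JtJ = sum(j * j for j in J)
        step = Jtr / (JtJ + lam)
        if sse(k + step) < sse(k):      # step helped: accept, trust the model more
            k, lam = k + step, lam / 10.0
        else:                           # step hurt: reject, damp toward gradient
            lam *= 10.0
    return k

times = [0.5, 1.0, 2.0, 4.0]
y = [math.exp(-0.7 * t) for t in times]   # synthetic data, true k = 0.7
print(lm_fit_rate(times, y))
```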

We address the problem of simultaneous model structure determination and parameter estimation in infrared spectroscopy. For given measurements of concentrations (C) and absorbances (A), we seek to find the constant of analogy (Θ) in reverse Beer's law (C=ΘA). Two approaches are described and compared in this paper. Both utilize Akaike's information criterion (AIC) to obtain an estimate of the constant. The first method is frequently used in practice and requires the iterative solution of mixed-integer convex quadratic optimization problems. The second method is a novel one that requires the solution of a single mixed-integer nonconvex nonlinear program for which we develop a global optimization algorithm. Computational results demonstrate that the latter approach provides better solutions for all of the eleven problems solved in this paper. Our computational experiments also reveal the importance of bounding the errors and number of model parameters when minimizing AIC.
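The least-squares form of AIC commonly used for this kind of model-structure selection is one line; the SSE values and parameter counts below are made up to show the trade-off:

```python
import math

def aic(sse, n, p):
    # Akaike's information criterion for least squares: n*ln(SSE/n) + 2p.
    # Lower is better; each extra parameter must buy enough SSE reduction.
    return n * math.log(sse / n) + 2 * p

print(aic(1.0, 50, 2))  # lean model
print(aic(0.9, 50, 6))  # richer model: the small SSE gain does not pay for 4 extra parameters
```

Minimizing this criterion over both the discrete structure choice (which entries of Θ are nonzero) and the continuous parameter values is what makes the problem a mixed-integer nonlinear program.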

A line search method is proposed for nonlinear programming using Fletcher and Leyffer’s filter method [R. Fletcher and S. Leyffer, Math. Program. 91, No. 2 (A), 239–269 (2002; Zbl 1049.90088)], which replaces the traditional merit function. A simple modification of the method proposed in a companion paper [SIAM J. Optim. 16, No. 1, 1–31 (2005; Zbl 1114.90128)] introducing second order correction steps is presented. It is shown that the proposed method does not suffer from the Maratos effect, so that fast local convergence to second order sufficient local solutions is achieved.
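The core idea a filter replaces the merit function with is a dominance test: a trial point is judged by the pair (constraint violation, objective) rather than a weighted sum. The following is a minimal sketch of that acceptance logic under assumed notation (theta for constraint violation, f for objective, gamma for a small margin); it is not the paper's full line-search algorithm.

```python
def acceptable(filter_entries, theta, f, gamma=1e-5):
    """A trial point (theta, f) is acceptable to the filter if, against
    every stored pair (theta_j, f_j), it sufficiently reduces either the
    constraint violation or the objective (a dominance test with margin)."""
    return all(theta <= (1.0 - gamma) * theta_j or f <= f_j - gamma * theta_j
               for theta_j, f_j in filter_entries)

def add_to_filter(filter_entries, theta, f):
    """Insert (theta, f) and discard any entries it dominates."""
    kept = [(tj, fj) for tj, fj in filter_entries
            if not (theta <= tj and f <= fj)]
    kept.append((theta, f))
    return kept

flt = [(1.0, 5.0), (0.5, 7.0)]
ok = acceptable(flt, 0.2, 6.0)       # reduces violation against both entries
rejected = acceptable(flt, 1.0, 5.0) # dominated by the first entry
flt2 = add_to_filter(flt, 0.4, 6.0)  # dominates and replaces (0.5, 7.0)
```

Because no penalty weight must be chosen, a full step can be accepted whenever it improves either measure, which is what allows the second-order correction steps to avoid the Maratos effect.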

A new algorithm for self-modeling curve resolution (SMCR) that yields improved results by incorporating soft constraints is described. The method uses least squares penalty functions to implement constraints in an alternating least squares algorithm, including nonnegativity, unimodality, equality, and closure constraints. By using least squares penalty functions, soft constraints are formulated rather than hard constraints. Significant benefits are obtained using soft constraints, especially in the form of fewer distortions due to noise in resolved profiles. Soft equality constraints can also be used to introduce incomplete or partial reference information into SMCR solutions. Four different examples demonstrating application of the new method are presented, including resolution of overlapped HPLC-DAD peaks, flow injection analysis data, and batch reaction data measured by UV/visible and near-infrared spectroscopy (NIR). Each example was selected to show one aspect of the significant advantages of soft constraints over traditionally used hard constraints. The introduction of incomplete or partial reference information into self-modeling curve resolution models is also described. The method offers a substantial improvement in the ability to resolve time-dependent concentration profiles from mixture spectra recorded as a function of time.
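The difference between a hard and a soft nonnegativity constraint in alternating least squares can be sketched as follows. Instead of clipping negative entries to zero, each half-step shrinks them via the closed-form minimizer of a least-squares penalty. This is a simplified illustration of the penalty idea, not the published algorithm; the two-component synthetic data and the penalty weight mu are invented for the demo.

```python
import numpy as np

def soft_nonneg(M, mu):
    """Soft nonnegativity: negative entries are shrunk by 1/(1+mu), the
    closed-form minimizer of ||x - a||^2 + mu*||min(x, 0)||^2 per entry.
    Unlike hard clipping, small negative values survive attenuated."""
    return np.where(M >= 0, M, M / (1.0 + mu))

def als_soft(D, n_comp, mu=10.0, n_iter=200, seed=0):
    """Alternating least squares D ~= C @ S.T, applying the soft
    nonnegativity penalty to both factors after each half-step."""
    rng = np.random.default_rng(seed)
    S = np.abs(rng.standard_normal((D.shape[1], n_comp)))
    for _ in range(n_iter):
        C = soft_nonneg(D @ np.linalg.pinv(S.T), mu)
        S = soft_nonneg((np.linalg.pinv(C) @ D).T, mu)
    return C, S

# Synthetic first-order reaction: two components, three "wavelengths"
t = np.linspace(0.0, 1.0, 40)
C_true = np.column_stack([np.exp(-3.0 * t), 1.0 - np.exp(-3.0 * t)])
S_true = np.array([[1.0, 0.2, 0.0], [0.1, 0.8, 1.0]]).T
D = C_true @ S_true.T
C_hat, S_hat = als_soft(D, 2)
```

Raising mu drives the solution toward the hard-constrained one, while a moderate mu lets noise-induced slightly-negative values pass through with less distortion of the resolved profiles.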

Near-infrared spectroscopy (NIRS) is a fast and non-destructive analytical method. Associated with chemometrics, it becomes a powerful tool for the pharmaceutical industry. Indeed, NIRS is suitable for analysis of solid, liquid and biotechnological pharmaceutical forms. Moreover, NIRS can be implemented during pharmaceutical development, in production for process monitoring or in quality control laboratories. This review focuses on chemometric techniques and pharmaceutical NIRS applications. The following topics are covered: qualitative analyses, quantitative methods and on-line applications. Theoretical and practical aspects are described with pharmaceutical examples of NIRS applications.

Estimating rate constants and pure UV-vis spectra of a two-step reaction using trilinear models

- Bijlsma