Book · PDF available

Engineering Design Optimization using Calculus Level Methods: A Casebook Approach

Authors:
  • Optimal Designs Enterprise

Abstract

Engineers in industry wanted to ‘tweak’ their parameters, so this textbook was written to show how simple such ‘tweaking’ becomes, in problems ranging from algebraic equations through differential equations, when using a Calculus-level language like PROSE or FortranCalculus. FortranCalculus (FC) is available on the web. Automatic Differentiation (AD) and operator overloading were the key technologies that allowed numerical methods, now called solvers, to be stored in an FC library. A user invokes a solver by naming it in the ‘by’ clause of a ‘find’ statement. Want to switch solvers? Just change the solver name (e.g. from ‘Ajax’ to ‘Jupiter’) and you are ready to try a different numerical method! It is that easy to code. (See the FortranCalculus manual for suggestions on which solver to use for a given problem.)
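The pattern the abstract describes, numerical methods stored in a library and selected by name, so that switching methods means changing one word, can be sketched in plain Python. This is an illustration of the idea only, not FortranCalculus code: the solver names ‘Ajax’ and ‘Jupiter’ come from the abstract, but the toy implementations below are stand-ins.

```python
# Sketch of a name-keyed solver library, echoing FC's "find ... by <solver>"
# idea. 'Ajax' and 'Jupiter' are solver names from the abstract; these
# minimization routines are illustrative stand-ins, not FC's actual methods.

def ajax(f, x, steps=200, h=1e-6, lr=0.1):
    """Toy gradient-descent solver using a finite-difference gradient."""
    for _ in range(steps):
        grad = (f(x + h) - f(x - h)) / (2 * h)
        x -= lr * grad
    return x

def jupiter(f, x, steps=60, step=1.0):
    """Toy coordinate-search solver: halve the step when stuck."""
    for _ in range(steps):
        for cand in (x - step, x + step):
            if f(cand) < f(x):
                x = cand
                break
        else:
            step *= 0.5
    return x

SOLVERS = {"Ajax": ajax, "Jupiter": jupiter}

def find(objective, x0, by):
    """Minimize `objective` from x0 using the solver named in `by`."""
    return SOLVERS[by](objective, x0)

# Same problem, two solvers -- only the name in the 'by' argument changes.
f = lambda x: (x - 3.0) ** 2
print(find(f, 0.0, by="Ajax"))     # close to 3.0
print(find(f, 0.0, by="Jupiter"))  # close to 3.0
```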
... Adjoint computation is a numerical method for computing the gradient of a function, which may be complex. This method is at the core of many scientific applications, from climate and ocean modeling [3] to oil refining [15]. In addition, the structure of the underlying dependence graph also underlies the backpropagation step of machine learning [57], and the models considered in this manuscript are therefore based on it. ...
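The adjoint/backpropagation connection the snippet draws can be made concrete with a minimal reverse-mode automatic differentiation sketch: a forward pass records the dependence graph, then a reverse sweep propagates adjoints through it. The class and function names below are illustrative, not taken from any cited work.

```python
# Minimal reverse-mode AD: operator overloading builds a dependence graph
# during the forward pass; a reverse sweep then accumulates each node's
# adjoint (the partial derivative of the output with respect to that node).

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # list of (parent Var, local partial)
        self.adjoint = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

def backward(output):
    """Reverse sweep: accumulate adjoints in reverse topological order."""
    order, seen = [], set()
    def visit(v):
        if id(v) not in seen:
            seen.add(id(v))
            for p, _ in v.parents:
                visit(p)
            order.append(v)
    visit(output)
    output.adjoint = 1.0
    for v in reversed(order):
        for parent, local_grad in v.parents:
            parent.adjoint += v.adjoint * local_grad

# f(x, y) = x*y + x  =>  df/dx = y + 1, df/dy = x
x, y = Var(2.0), Var(3.0)
f = x * y + x
backward(f)
print(f.value, x.adjoint, y.adjoint)  # 8.0 4.0 2.0
```

The checkpointing question discussed below arises because this reverse sweep needs the forward values (here `other.value` inside `__mul__`), so they must either be stored or recomputed.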
Thesis
Artificial Intelligence is a field that has received a lot of attention recently. Its success is due to advances in Deep Learning, a sub-field that groups together machine learning methods based on neural networks. These neural networks have proven effective at solving very complex problems in many domains. However, their effectiveness depends on a number of factors: the architecture of the model, its size, how and where the training is performed... Most studies indicate that larger models are more likely to achieve the smallest error, but they are also more difficult to train. The main challenges are insufficient computational power and the limited memory of the machines: if the model is too large, training can take a long time (days or even months), or in the worst case the model cannot even fit in memory. During training, it is necessary to store the weights (model parameters), the activations (intermediate computed data) and the optimizer states.
This situation offers several ways to deal with memory problems, depending on their origin. Training can be distributed across multiple resources of the computing platform, and different parallelization techniques suggest different ways of dividing the memory load. In addition, data structures that remain inactive for a long period can be temporarily offloaded to a larger storage space and retrieved later (offloading strategies). Furthermore, activations that are computed anew at each iteration can be deleted and recomputed several times within it (rematerialization strategies). Memory-saving strategies usually induce a time overhead with respect to direct execution, so optimization problems must be solved to choose the best approach for each strategy. In this manuscript, we formulate and analyze optimization problems for various methods that reduce the memory consumption of the training process.
In particular, we focus on rematerialization, activation offloading and pipelined model parallelism strategies; for each of them we design optimal solutions under a set of assumptions. Finally, we propose a fully functional tool called rotor that combines activation offloading and rematerialization and can be applied to training in PyTorch, allowing it to process big models that would otherwise not fit into memory.
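The memory-for-time trade that rematerialization exploits can be quantified with a simple model. The sketch below uses the well-known square-root checkpointing heuristic (keep one activation every k layers, recompute each segment during the backward pass) purely as an illustration of the trade-off; it is not the thesis's optimal solution or rotor's algorithm, and the cost units (one unit per stored or recomputed activation) are an assumption of this sketch.

```python
# Illustrative memory/compute model for a chain of n layers.
import math

def store_all(n):
    """Keep every activation: peak memory n, no recomputation."""
    return {"peak_memory": n, "extra_forward_steps": 0}

def sqrt_checkpointing(n):
    """Keep a checkpoint every k ~ sqrt(n) layers; recompute each segment
    once during the backward pass (roughly one extra full forward)."""
    k = max(1, round(math.sqrt(n)))
    return {"peak_memory": math.ceil(n / k) + k,   # checkpoints + one segment
            "extra_forward_steps": n}              # each layer rerun once

print(store_all(100))           # peak memory 100, no recomputation
print(sqrt_checkpointing(100))  # peak memory 20, one extra forward pass
```

For 100 layers the peak memory drops from 100 stored activations to about 20, at the cost of roughly one extra forward pass, which is the overhead-versus-memory tension the optimization problems above formalize.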
... Computation of adjoints is at the core of many scientific applications, from climate and ocean modeling [Adcroft et al. 2008] to oil refining [Brubaker 2016]. In addition, the structure of the underlying dependence graph also underlies the backpropagation step of machine learning [Kukreja et al. 2018a]. ...
Article
We study the problem of checkpointing strategies for adjoint computation on synchronous hierarchical platforms, specifically computational platforms with several levels of storage with different writing and reading costs. When reversing a large adjoint chain, choosing which data to checkpoint, and where, is a critical decision for the overall performance of the computation. We introduce H-Revolve, an optimal algorithm for this problem. We make it available in a public Python library, along with implementations of several state-of-the-art algorithms for the variant of the problem with two levels of storage. We provide a detailed description of how this library can be used in adjoint computation software in the field of automatic differentiation or backpropagation. Finally, we evaluate the performance of H-Revolve and other checkpointing heuristics through an extensive simulation campaign.
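A feel for the underlying optimization can be had from the classic single-level version of the problem, which REVOLVE-style algorithms solve (H-Revolve extends the analysis to hierarchical storage with per-level read/write costs; the simplified model below is a sketch, not the paper's algorithm). With `c` checkpoint slots and a chain of length `l`, the minimal number of forward-step re-executions satisfies a small dynamic program: place the next checkpoint after `j` steps and recurse on both sides.

```python
# Sketch of the classic single-level checkpointing recurrence behind
# REVOLVE-style adjoint reversal. opt(l, c) = minimal number of forward
# steps executed to reverse a chain of length l with c checkpoint slots.
# Storage read/write costs are ignored here (unlike in H-Revolve).
from functools import lru_cache

@lru_cache(maxsize=None)
def opt(l, c):
    if l == 0:
        return 0
    if l == 1:
        return 1
    if c == 1:
        # A single checkpoint: replay from the start for every reversed step.
        return l * (l + 1) // 2
    # Try every position j for the next checkpoint, recurse on both sides:
    # j forward steps to reach it, then reverse the tail with c-1 slots
    # and the head with all c slots.
    return min(j + opt(l - j, c - 1) + opt(j, c) for j in range(1, l))

# More checkpoint slots never increase the cost.
print(opt(10, 2), opt(10, 3), opt(10, 5))
```

The quadratic blow-up at `c == 1` versus near-linear cost with ample checkpoints is exactly the storage/recomputation trade-off that checkpointing strategies navigate.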