TransformerMPC: Accelerating Model Predictive Control via Transformers

Vrushabh Zinage¹, Ahmed Khalil¹, Efstathios Bakolas¹

Abstract: In this paper, we address the problem of reducing the computational burden of Model Predictive Control (MPC) for real-time robotic applications. We propose TransformerMPC, a method that enhances the computational efficiency of MPC algorithms by leveraging the attention mechanism in transformers for both online constraint removal and better warm start initialization. Specifically, TransformerMPC accelerates the computation of optimal control inputs by selecting only the active constraints to be included in the MPC problem, while simultaneously providing a warm start to the optimization process. This approach ensures that the original constraints are satisfied at optimality. TransformerMPC is designed to be seamlessly integrated with any MPC solver, irrespective of its implementation. To guarantee constraint satisfaction after removing inactive constraints, we perform an offline verification to ensure that the optimal control inputs generated by the MPC solver meet all constraints. The effectiveness of TransformerMPC is demonstrated through extensive numerical simulations on complex robotic systems, achieving up to a 35× improvement in runtime without any loss in performance. Videos and code are available at this website¹,².

I. INTRODUCTION

We consider the problem of improving the computational

efﬁciency of Model Predictive Control (MPC) algorithms

for general nonlinear systems under non-convex constraints.

MPC is one of the most popular frameworks used in robotic

systems for embedded optimal control with constraints,

enabling them to operate autonomously in various real-

world situations with applications ranging from legged and

humanoid robots [1], [2] to quadrotors [3], [4], swarms of

spacecraft [5], and ground robots [6], [7], to name but a few.

However, they are generally computationally expensive as

their implementation relies on solving constrained optimal

control problems (OCPs). Well-established MPC solvers are

usually computationally efﬁcient but often restrict themselves

to OCPs with convex quadratic cost functions subject to

linear dynamics and constraints giving rise to (constrained)

quadratic programs (QPs) [8]–[11]. On the other hand, non-

linear MPC solvers [12]–[14] can handle general nonlinear

dynamics with both convex and non-convex constraints but

are generally computationally expensive.

Many approaches for reducing the computational burden of MPC have been proposed in the literature [9], [15]. Most of these methods, however, focus on the computational aspects of solving the underlying optimization problems. MPC solvers are generally classified into three categories: interior-point methods, augmented-Lagrangian / ADMM methods, and active-set methods. Interior-point solvers [16]–[24] offer robust convergence but are usually challenging to warm start, making them less ideal for MPC problems. Augmented-Lagrangian and ADMM solvers [4], [8], [10], [15], [25] are prevalent in robotic applications due to their fast convergence. However, both of these classes of methods are computationally efficient when restricted to quadratic programs (QPs) and usually do not scale well when applied to general nonlinear systems with non-convex constraints. By contrast, nonlinear MPC [14], [26], which accounts for nonlinear dynamics and constraints, typically results in non-convex optimization problems, making it more challenging to find an optimal solution. Active-set methods [11], [27]–[29], which focus on constraint removal to improve computational efficiency, allow for easy warm starts and can be fast if the active set is correctly identified, though they suffer from combinatorial worst-case time complexity [30] and are usually restricted to QPs. On the other hand, explicit MPC methods pre-compute optimal control laws offline, enabling faster online implementation [31]. However, the lookup tables required in explicit MPC can become exponentially large for systems with more than a few states and inputs [32].

[Fig. 1: bar chart. x-axis: solver (cvxopt, daqp, ecos, highs, piqp, proxqp, qpoases); y-axis: average computational time (in 10^-2 s); series: solver alone and TransformerMPC-augmented solver. Speedup labels, in extracted order: 6.8x, 1.74x, 1.55x, 3.11x, 2.42x, 1.45x, 8.19x.]

Fig. 1: Average computational time (in 10^-2 s) for various solvers on the Upkie wheeled biped robot with and without TransformerMPC augmentation, showing that TransformerMPC significantly reduces solver computation time.

¹Vrushabh Zinage, Ahmed Khalil, and Efstathios Bakolas are with the Department of Aerospace Engineering and Engineering Mechanics, University of Texas at Austin. vrushabh.zinage@utexas.edu, akhalil@utexas.edu, bakolas@austin.utexas.edu

¹Website: https://transformer-mpc.github.io/

²Code will be made available after final submission.

Recently, a new line of research [33]–[40] has brought

together the best of both classical optimization-based meth-

ods and learning-based methods to rapidly and efﬁciently

generate control inputs for OCPs. The transition towards

incorporating learning-based methods into robotic systems

is motivated by three main factors. First, the computational

demands of deploying trained machine learning models dur-

ing inference are minimal and likely align with the limited

computational resources available on most robotic systems.

Second, learning-based methods hold potential for addressing

control problems that involve complex, multi-stage processes

with potentially non-convex cost functions. Finally, learning-

based methods can continuously reﬁne the model of the

system by incorporating new data over time, leading to

improved control performance.

arXiv:2409.09266v1 [cs.RO] 14 Sep 2024

[Fig. 2(a): block diagram of the proposed approach (TransformerMPC). Block labels: Transformer $\Pi^c_{\theta_c^\star}$ (learned $\theta_c^\star$); Transformer $\Pi^w_{\theta_w^\star}$ (learned $\theta_w^\star$, warm start); Reduced MPC (3); optimal control sequence $u^\star_{[k,k+N-1]_d}$; Verifier (Yes / No); MPC (2); System $x_{k+1} = f(x_k, u_k)$. Input annotation: propagate the current state $x_k$ by $K (\le N)$ steps using (1), giving $x_{k+K}$ (or other parameters); MPC horizon: $N$.]

Fig. 2: TransformerMPC improves the computational efficiency of the MPC framework by incorporating a transformer-based attention mechanism to determine active constraints as well as a better warm start initialization, and it can be integrated with any state-of-the-art MPC solver. Additionally, a verifier step checks whether the optimal control sequence synthesized using (3) satisfies all the constraints of the original MPC (2). Note that TransformerMPC is applicable to general nonlinear MPC problems as well.

Inspired by these approaches, we use the attention mechanism [41] from transformers to identify active constraints as well as to compute better warm start initial conditions, enabling integration with existing MPC solvers [16]–[24] or learning-based optimal controllers [33]–[37]. Unlike most methods that are specific to QPs, our approach applies to general nonlinear MPC problems as well. The contributions of this paper are as follows:

1) We introduce TransformerMPC, a method that utilizes

the attention mechanism in transformers for efﬁcient

online inactive constraint removal in MPC as well as

for warm starting. This accelerates the computation

of optimal control inputs by selecting a subset of

constraints that are active at optimality in the MPC

problem while ensuring that the solution maintains the

original constraint satisfaction properties.

2) Our approach is agnostic to the speciﬁc implementa-

tion of the optimization problem, allowing it to be in-

tegrated with any state-of-the-art MPC solver. Further-

more, for QP-based MPC problems, we demonstrate

how the transformer’s active constraint predictions can

be used to solve the MPC problem analytically.

3) To ensure constraint satisfaction after

TransformerMPC removes inactive constraints,

we perform ofﬂine veriﬁcation to ensure that the

optimal control sequence computed by the MPC

solver post-removal satisﬁes all the constraints.

4) We further enhance the computational efﬁciency for

nonlinear MPC by combining multiple nonlinear /

non-convex constraints into a single smooth constraint

using log-sum-exp functions, accelerated via GPU par-

allelization of the summation operation.

The paper is organized as follows. Section II discusses

the preliminaries and the problem statement followed by our

proposed approach in Section III. Finally, we discuss the

results in Section IV followed by some concluding remarks

in Section V.

II. PRELIMINARIES AND PROBLEM FORMULATION

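Consider the discrete-time system given by

$$x_{k+1} = f(x_k, u_k), \quad x_0 = x^0, \tag{1}$$

where $x_k \in \mathcal{X} \subset \mathbb{R}^n$ and $u_k \in \mathcal{U} \subset \mathbb{R}^m$ are the state and control input at time step $k$, respectively, $x^0$ is the initial state, $f : \mathcal{X} \times \mathcal{U} \to \mathcal{X}$ is continuously differentiable, and $\mathcal{X}$ and $\mathcal{U}$ are compact sets. The receding horizon MPC problem can be formulated as: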

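$$
\begin{aligned}
u^\star_{[k,k+N-1]_d} = \operatorname*{argmin}_{u_k,\dots,u_{k+N-1}} \quad & \sum_{i=0}^{N-1} \ell(x_{k+i}, u_{k+i}) && \text{(2a)} \\
\text{subject to} \quad & x_{k+i+1} = f(x_{k+i}, u_{k+i}), && \text{(2b)} \\
& x_{k+i} \in \mathcal{X}, \ \forall i \in [0, N-1]_d, && \text{(2c)} \\
& u_{k+i} \in \mathcal{U}, \ \forall i \in [0, N-1]_d, && \text{(2d)} \\
& x_{k+N} \in \mathcal{X}_f, && \text{(2e)}
\end{aligned}
$$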

where $u^\star_{[k,k+N-1]_d} = [u^\star_k, \dots, u^\star_{k+N-1}]^T$ is the optimal control sequence, $N$ is the time horizon, $\ell(x_k, u_k)$ is the stage cost function, $\mathcal{X}$ and $\mathcal{U}$ are the feasible sets for the state and control input, respectively, and $\mathcal{X}_f$ is the terminal constraint set. We assume that the sets $\mathcal{X}$ and $\mathcal{U}$ are characterized by sequences of smooth functions as $\mathcal{X} := \{x \mid g_j(x) \le 0, \ \forall j \in [0, N_{\mathcal{X}}]_d\}$ and $\mathcal{U} := \{u \mid h_j(u) \le 0, \ \forall j \in [0, N_{\mathcal{U}}]_d\}$, respectively. The notation $[i, j]_d$ ($i \le j$) represents the set of integers $\{i, i+1, \dots, j\}$.

A constraint $g_j(x) \le 0$ or $h_j(u) \le 0$ is considered active at a given time step if the equality holds at optimality, i.e., $g_j(x^\star) = 0$ or $h_j(u^\star) = 0$. An active constraint directly affects the current optimization solution, potentially limiting the feasible region of the state or control input. Conversely, a constraint is considered inactive if it is strictly satisfied, i.e., $g_j(x^\star) < 0$ or $h_j(u^\star) < 0$. Inactive constraints do not restrict the current solution and, therefore, have no immediate impact on the feasible region at that time step. Consequently, the active constraint set $\mathcal{C}_{\text{active}}$ is defined as the set of constraints where the equality holds at a given time step, i.e., $\mathcal{C}_{\text{active}} = \{g_j(x^\star) = 0 \mid j \in [0, N_{\mathcal{X}}]_d\} \cup \{h_j(u^\star) = 0 \mid j \in [0, N_{\mathcal{U}}]_d\}$. These constraints restrict the feasible set of states or control inputs. The inactive constraint set $\mathcal{C}_{\text{inactive}}$ includes the constraints where strict inequalities hold, i.e., $\mathcal{C}_{\text{inactive}} = \{g_j(x^\star) < 0 \mid j \in [0, N_{\mathcal{X}}]_d\} \cup \{h_j(u^\star) < 0 \mid j \in [0, N_{\mathcal{U}}]_d\}$, which do not impact the optimization at that time step. Furthermore, let the indices corresponding to inactive state constraints, inactive input constraints, active state constraints, and active input constraints be given by $\mathcal{I}^x_{\text{inactive}} = \{j \mid g_j(x^\star) < 0\}$, $\mathcal{I}^u_{\text{inactive}} = \{j \mid h_j(u^\star) < 0\}$, $\mathcal{I}^x_{\text{active}} = \{j \mid g_j(x^\star) = 0\}$, and $\mathcal{I}^u_{\text{active}} = \{j \mid h_j(u^\star) = 0\}$, respectively. At each time step $k$, the first control input $u^\star_k$ from the optimal sequence $u^\star_{[k,k+N-1]_d}$ is applied to (1), and the problem is solved again at the next time step.
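The active/inactive split defined above can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function name `split_active_inactive`, the tolerance `tol` (a numerical stand-in for the exact condition $g_j(x^\star) = 0$), and the sample constraint values are all assumptions made for the example.

```python
import numpy as np

def split_active_inactive(constraint_values, tol=1e-6):
    """Return (active, inactive) index arrays for constraints g_j(x*) <= 0.

    A constraint is treated as active when |g_j(x*)| <= tol, which replaces
    the exact condition g_j(x*) = 0 in floating point (tol is an assumption).
    """
    values = np.asarray(constraint_values, dtype=float)
    active = np.flatnonzero(np.abs(values) <= tol)
    inactive = np.flatnonzero(values < -tol)
    return active, inactive

# Hypothetical constraint values g_j(x*) evaluated at an optimal state.
g_at_optimum = [0.0, -0.3, -1e-9, -2.5]
active_idx, inactive_idx = split_active_inactive(g_at_optimum)

# Binary labels: 1 for an active constraint, 0 for an inactive one.
labels = np.zeros(len(g_at_optimum), dtype=int)
labels[active_idx] = 1
```

The binary vector `labels` has the same 0/1 form that a classifier over constraints would be trained to predict.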

A. Problem Statement

We now formally state the problem:

Problem 1: Given the system dynamics (1), and parame-

ters of the MPC problem instance that uniquely characterize

the optimal solution, the goal is to predict the set of active

constraints at optimality.

Note that predicting active and inactive constraints is gener-

ally non-trivial, especially for nonlinear MPC problems, as it

requires one to know the optimal solution to (2). In addition,

most active set methods [11], [27]–[29] from the literature

focus on constraint removal, but are mainly restricted to QPs with linear constraints and suffer from combinatorial worst-case time complexity [30]. In contrast, Problem 1 considers general MPC problems with nonlinear dynamics and convex or non-convex constraints.

III. PROPOSED APPROACH

In this section, we first present the reduced MPC problem (3) and show that the optimal control sequence synthesized by solving (3) coincides with the one obtained by solving the original MPC problem (2). We next discuss our proposed approach, TransformerMPC. The overall approach involves two main phases: a learning phase and an execution phase. In the learning phase, the transformer is trained on a dataset containing various MPC problem instances (2), where each instance is characterized by parameters such as initial conditions $x_0$, reference trajectories $\{x^{\text{ref}}_k, \dots, x^{\text{ref}}_{k+N}\}$, etc. The transformer learns to map these parameters to the corresponding active constraints $\mathcal{C}_{\text{active}}$ (Section III-B). Another transformer model

is used for better warm start initialization that predicts

a control sequence close to the optimal control sequence

(Section III-C). In the execution phase, these trained models

are used for real-time prediction of the active constraints for

a given set of parameters as well as for better warm start

initialization (Section III-D). This allows the MPC solver

to ignore all the inactive constraints and focus solely on the

active constraints, thereby reducing the computational burden

while ensuring that the control objectives are met. Next, we

consider the special case of linear MPC problems where

the costs are quadratic, and the dynamics and constraints

are linear, thereby reducing the problem to a QP. Given

the transformer’s constraint predictions, we show that the

linear MPC problem can be solved analytically (Section III-

E). Finally, we propose an approach to further accelerate

nonlinear MPC problems by combining multiple non-convex

constraints into a single smooth constraint using log-sum-

exp functions, with GPU parallelization thereby enhancing

computational efﬁciency (Section III-F).

A. Constraint removal and simpliﬁed MPC problem

We first reformulate the original MPC optimal control problem (2) into a reduced, simpler problem by eliminating constraints that are identified as inactive at the optimal solution. Note that an inactive constraint in (2) does not affect the optimal solution, meaning the optimal control sequence $u^\star_{[k,k+N-1]_d}$ from (2) remains unchanged if the constraint is excluded from (2).

Lemma 1: Let $x_0 \in \mathcal{X}$ be an arbitrary initial condition. Assume that $\mathcal{I}^x_{\text{active}}, \mathcal{I}^u_{\text{active}}$ and $\mathcal{I}^x_{\text{inactive}}, \mathcal{I}^u_{\text{inactive}}$ represent the indices of the active and inactive constraints of (2), respectively. Then, the optimal control sequence $u^\star_{[k,k+N-1]_d}$ is a solution to

$$
\begin{aligned}
\operatorname*{minimize}_{u_k,\dots,u_{k+N-1}} \quad & \sum_{i=0}^{N-1} \ell(x_{k+i}, u_{k+i}) && \text{(3a)} \\
\text{subject to} \quad & x_{k+i+1} = f(x_{k+i}, u_{k+i}), \ i \in [0, N-1]_d, && \text{(3b)} \\
& g_j(x) \le 0, \ \forall j \in \mathcal{I}^x_{\text{active}}, && \text{(3c)} \\
& h_j(u) \le 0, \ \forall j \in \mathcal{I}^u_{\text{active}}, && \text{(3d)}
\end{aligned}
$$

if and only if it is also a solution to problem (2).

[Fig. 3: two transformer pipelines, each consisting of a linear encoder, position encoder, transformer encoder, and linear decoder. (a) $\Pi^c_{\theta_c}$ (inactive constraint removal): inputs $\{x^{\text{ref},i}_k\}_{i=1}^{N_c}$ and $\{x^i_0\}_{i=1}^{N_c}$; output a binary vector (0: inactive constraint, 1: active constraint). (b) $\Pi^w_{\theta_w}$ (warm start): inputs $\{x^{\text{ref},i}_k\}_{i=1}^{N_w}$ and $\{x^i_k\}_{i=1}^{N_w}$; output $\tilde{u}^\star_{[k,k+N-1]_d} \approx u^\star_{[k,k+N-1]_d}$.]

Fig. 3: For training both transformers, the input corresponds to problem parameters, such as the initial state $x_0$ and reference trajectories $x^{\text{ref}}$, that uniquely characterize the optimal solution. The transformer for inactive constraint removal, $\Pi^c_{\theta_c}$, returns the set of active constraints at optimality, while the transformer for warm starting, $\Pi^w_{\theta_w}$, returns a better initial guess for an optimal control sequence.

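Proof: The necessary conditions for optimality of problem (2) can be expressed using its Lagrangian, which is given by

$$
\mathcal{L}_{u^\star_{[k,k+N-1]_d}} = \sum_{i=0}^{N-1} \ell(x_{k+i}, u_{k+i}) + \sum_{j \in \mathcal{I}^x_{\text{active}} \cup \mathcal{I}^x_{\text{inactive}}} \mu_j g_j(x) + \sum_{j \in \mathcal{I}^u_{\text{active}} \cup \mathcal{I}^u_{\text{inactive}}} \lambda_j h_j(u), \tag{4}
$$

where $\lambda_j \ge 0$ and $\mu_j \ge 0$. Given that the constraints indexed by $j \in \mathcal{I}^x_{\text{inactive}}$ for $g_j(x)$ and $j \in \mathcal{I}^u_{\text{inactive}}$ for $h_j(u)$ are assumed to be inactive, it follows from complementary slackness that $\mu_j = 0$ and $\lambda_j = 0$ for $j \in \mathcal{I}^x_{\text{inactive}}$ and $j \in \mathcal{I}^u_{\text{inactive}}$, respectively. Therefore,

$$
\mathcal{L}_{u^\star_{[k,k+N-1]_d}} = \sum_{i=0}^{N-1} \ell(x_{k+i}, u_{k+i}) + \sum_{j \in \mathcal{I}^x_{\text{active}}} \mu_j g_j(x) + \sum_{j \in \mathcal{I}^u_{\text{active}}} \lambda_j h_j(u). \tag{5}
$$

Since (4) reduces to (5), the KKT conditions based on $\mathcal{L}_{u^\star_{[k,k+N-1]_d}}$ are the same for the reduced MPC problem (3) and the original MPC problem (2).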

Remark 1: The number of constraints removed by solving (3) instead of (2) is $|\mathcal{I}^x_{\text{inactive}} \cup \mathcal{I}^u_{\text{inactive}}|$, where $|S|$ is the cardinality of the finite set $S$. Note that, as stated in Lemma 1, the optimal control sequence is not affected by solving the simplified MPC problem (3).
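A toy example can make the statement of Lemma 1 concrete. The sketch below is not an MPC problem: it assumes a separable box-constrained least-squares problem (minimize $\tfrac{1}{2}\|y - c\|^2$ subject to $lo \le y \le hi$), chosen only because its solution is available in closed form. The function name `project_with_bounds` and all numbers are hypothetical.

```python
import numpy as np

def project_with_bounds(c, lo, hi, keep_lo, keep_hi):
    """Minimize 0.5*||y - c||^2 subject only to the kept bounds.

    The problem is separable, so each coordinate is clipped only by the
    bounds whose keep-mask entry is True.
    """
    y = np.asarray(c, dtype=float).copy()
    y = np.where(keep_hi, np.minimum(y, hi), y)
    y = np.where(keep_lo, np.maximum(y, lo), y)
    return y

c = np.array([2.0, -0.2, -3.0, 0.5])
lo, hi = -np.ones(4), np.ones(4)
keep_all = np.ones(4, dtype=bool)

# Full problem: all 8 bound constraints are enforced.
y_full = project_with_bounds(c, lo, hi, keep_all, keep_all)

# Constraints that hold with equality at the optimum are the active ones.
active_hi = np.isclose(y_full, hi)
active_lo = np.isclose(y_full, lo)

# Reduced problem: only the active constraints are kept.
y_reduced = project_with_bounds(c, lo, hi, active_lo, active_hi)
num_removed = 8 - int(active_hi.sum() + active_lo.sum())
```

Here only two of the eight bounds are active, so six constraints are dropped and the reduced problem returns the same minimizer as the full one.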

Remark 2: For a QP, the computational complexity scales as $O(M^3)$ with the number of constraints $M$ when solved using established interior point methods, assuming the number of variables is fixed [42]. Consequently, if the set of active constraints $\mathcal{C}_{\text{active}}$ is known a priori, only $|\mathcal{I}^x_{\text{active}} \cup \mathcal{I}^u_{\text{active}}|$ constraints remain, and the computational complexity of solving the QP reduces to $O(|\mathcal{I}^x_{\text{active}} \cup \mathcal{I}^u_{\text{active}}|^3)$.

B. Learning framework for identiﬁcation of active con-

straints

The dataset $\mathcal{D}$ consists of MPC problem instances, each characterized by its parameters, i.e., $\mathcal{D} = \{x^i_0, x^{\text{ref},i}_k, \dots, x^{\text{ref},i}_{k+N}, \dots\}_{i=1}^{N_c}$, where the superscript $i$ denotes the $i$th parameter set. The corresponding output labels mark the active constraints $\mathcal{C}^i_{\text{active}}$ with a value of 1 and the inactive constraints $\mathcal{C}^i_{\text{inactive}}$ with a value of 0. The transformer model is then trained on this dataset to learn a mapping from the input parameters to the set of active constraints. Mathematically, let $\mathcal{P}_c = \{x_0, x^{\text{ref}}_k, \dots, x^{\text{ref}}_{k+N}, \dots\}$ represent the input parameters for a given MPC problem instance. The transformer model learns a function $\Pi^c_{\theta_c}$ such that $\Pi^c_{\theta_c}(\mathcal{P}_c) = \mathcal{C}_{\text{active}}$, where $\theta_c$ are the parameters of the transformer model $\Pi^c_{\theta_c}$ (Fig. 3a). The transformer architecture (Fig. 3a) uses the attention mechanism to weigh the importance of different elements within $\mathcal{P}_c$ when predicting the active constraints. Specifically, the attention mechanism computes a set of attention weights $A \in \mathbb{R}^{n \times n}$ as follows:

$$A = \operatorname{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right), \tag{6}$$

where $Q \in \mathbb{R}^{n \times d_k}$ represents the query matrix, $K \in \mathbb{R}^{n \times d_k}$ the key matrix, and $d_k$ is the dimensionality of the key vectors. The output of the attention mechanism, known as the context vector $C \in \mathbb{R}^{n \times d_v}$, is given by $C = AV$, where $V \in \mathbb{R}^{n \times d_v}$ is the value matrix. Each row of $C$ is a weighted sum of the value vectors, with weights determined by the attention mechanism:

$$c_i = \sum_{j=1}^{n} \operatorname{softmax}\left(\frac{\langle q_i, k_j \rangle}{\sqrt{d_k}}\right) \cdot v_j, \tag{7}$$

where $\langle \cdot, \cdot \rangle$ denotes the dot product, $q_i$ is the $i$th row of $Q$, $k_j$ is the $j$th row of $K$, $v_j$ is the $j$th row of $V$, $c_i$ is the $i$th row of $C$, and the softmax is taken over the index $j$. The context vector $C$ thus captures the information in the input parameters $\mathcal{P}_c$ that is most relevant for predicting the active constraints.

For training the transformer model, we use the Mean Squared

Error (MSE) loss function that measures the difference

between the predicted and actual active constraints. Once the

transformer model is trained, it can be used during online

operation of MPC to predict the set of active constraints

Cactive based on the current parameters, Pc. This prediction

allows the MPC solver to focus only on the active constraints,

signiﬁcantly reducing the computational complexity of the

optimization problem. Constraints not predicted to be active,

denoted as Cinactive, are removed from the problem formu-

lation. Thus, the proposed approach streamlines the MPC

process by predicting and removing inactive constraints in

real-time, leading to faster computation of optimal control

inputs without sacriﬁcing constraint satisfaction.
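The attention computation in (6) and (7) can be sketched in a few lines of NumPy. This is only an illustration of scaled dot-product attention on random matrices; the dimensions, the random inputs, and the helper names `softmax_rows` and `attention` are assumptions, and the sketch omits the learned projections, multiple heads, and the encoder/decoder layers of a full transformer.

```python
import numpy as np

def softmax_rows(scores):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=1, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: returns weights A and context C = A V."""
    d_k = K.shape[1]
    A = softmax_rows(Q @ K.T / np.sqrt(d_k))  # attention weights, as in (6)
    C = A @ V                                  # row i of C is c_i, as in (7)
    return A, C

# Hypothetical sizes: n tokens, key dimension d_k, value dimension d_v.
rng = np.random.default_rng(0)
n, d_k, d_v = 5, 8, 4
Q = rng.normal(size=(n, d_k))
K = rng.normal(size=(n, d_k))
V = rng.normal(size=(n, d_v))
A, C = attention(Q, K, V)
```

Each row of `A` is a probability distribution over the `n` input elements, so each row of `C` is a convex combination of the rows of `V`.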

C. Transformer-Based Warm Start Initialization

In addition to using transformers for inactive constraint removal, we leverage their capabilities to warm start the MPC problem as well. Towards this goal, we use another transformer model $\Pi^w_{\theta_w}$ to predict an initial guess for the optimal control inputs, $\tilde{u}_{[k,k+N-1]_d} \approx u^\star_{[k,k+N-1]_d}$. Consequently, the state trajectories $\{x_{k+1}, \dots, x_{k+N}\}$ can also be computed, given the initial state $x_k$ at each time step $k$. By learning from the optimal solutions of previous MPC problems, the transformer $\Pi^w_{\theta_w}$ can provide a starting point that is closer to the final solution. Let $\mathcal{Z}_w = \{x^i_k, x^{\text{ref},i}_k, \dots, x^{\text{ref},i}_{k+N}\}_{i=1}^{N_w}$ represent the set of current system states and reference trajectories. The transformer model $\Pi^w_{\theta_w}$ (Fig. 3b) learns a mapping from $\mathcal{Z}_w$ to the warm start guess, i.e., $\Pi^w_{\theta_w}(\mathcal{Z}_w) \to \tilde{u}_{[k,k+N-1]_d}$, where $\tilde{u}_{[k,k+N-1]_d}$ is the approximate initial guess for the optimal control sequence (that is expected to be close to $u^\star_{[k,k+N-1]_d}$). This warm start solution is then used as the initial input for the MPC solver. Note that the warm start provided by the transformer complements the online constraint removal strategy. After the constraints have been reduced by the learned model $\Pi^c_{\theta_c^\star}$, the MPC solver uses the transformer’s

warm start initialization to further speed up the convergence

process. This dual approach ensures that the MPC problem

is solved efﬁciently, both by reducing the number of inactive

constraints as well as initiating the optimization process

closer to the optimal solution via transformers.

D. Execution phase

In the execution phase, given the set of parameters $\mathcal{P}_c = \{x_0, x^{\text{ref}}_k, \dots, x^{\text{ref}}_{k+N}, \mathcal{C}\}$ that characterize an MPC problem, the trained transformer models, $\Pi^c_{\theta_c^\star}$ and $\Pi^w_{\theta_w^\star}$, are deployed in real-time to predict the active constraints, $\mathcal{C}_{\text{active}}$, and improve the warm start initialization, respectively. These active

constraints are then added as constraints to the simpliﬁed

MPC problem. The MPC problem is then solved iteratively

via any state-of-the-art MPC solver as shown in Fig. 2a. This

reformulated problem is computationally less intensive than

solving the complete MPC problem with all constraints. The

optimal solution obtained from the simpliﬁed MPC problem

is veriﬁed (see Fig. 2a) to ensure it satisﬁes all the constraints

of the original MPC. If any constraints are violated, the

original MPC is solved to generate the optimal control inputs.
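The predict, solve, verify, and fall-back flow of the execution phase can be sketched as below. This is a schematic under stated assumptions, not the authors' code: `predict_active`, `predict_warm_start`, `solve`, and `satisfies_all` are hypothetical callables standing in for the two trained transformers, an arbitrary MPC solver, and the verifier, and the scalar toy problem exists only to make the sketch runnable.

```python
def transformer_mpc_step(params, constraints, predict_active,
                         predict_warm_start, solve, satisfies_all):
    """One execution-phase step: reduce, solve, verify, fall back if needed.

    Returns (u, used_reduced); used_reduced is False when the verifier
    rejected the reduced solution and the original problem was solved.
    """
    active = predict_active(params)            # indices predicted active
    u_init = predict_warm_start(params)        # warm start guess
    reduced = [constraints[j] for j in active]
    u = solve(reduced, u_init)                 # reduced MPC (3)
    if satisfies_all(u, constraints):          # check against original MPC (2)
        return u, True
    return solve(constraints, u_init), False   # fallback: original MPC (2)


# Toy stand-in problem: minimize (u - target)^2 subject to u <= b_j.
target = 3.0
upper_bounds = [5.0, 2.0, 4.0]

def solve(bounds, u_init):
    # Closed-form minimizer of the toy problem; u_init is unused here.
    return min([target] + list(bounds))

def satisfies_all(u, bounds):
    return all(u <= b + 1e-9 for b in bounds)

# Correct prediction (constraint 1 is the active one) vs. a wrong prediction.
good = transformer_mpc_step(None, upper_bounds, lambda p: [1], lambda p: 0.0,
                            solve, satisfies_all)
bad = transformer_mpc_step(None, upper_bounds, lambda p: [0], lambda p: 0.0,
                           solve, satisfies_all)
```

With a correct prediction the reduced solution passes verification; with a wrong one the verifier triggers the fallback, so both calls return the same control input and differ only in the cost paid to obtain it.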

E. Analytical solution to QPs after transformer-based inactive constraint removal

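In this section, we consider a special case of a general MPC problem with convex quadratic cost and linear dynamics, i.e., a QP. Consider the following QP

$$
\begin{aligned}
\text{minimize} \quad & \tfrac{1}{2} y^T Q y + p^T y && \text{(8a)} \\
\text{subject to} \quad & Ay = b, \quad Cy \le d, && \text{(8b)}
\end{aligned}
$$

where $y$ is the augmented optimization variable obtained after converting the linear MPC into a QP [9]. If the active constraints are predicted accurately by the learned transformer model $\Pi^c_{\theta_c^\star}$ (where $\theta_c^\star$ are learned parameters), then the simplified QP with only equality constraints is given by

$$
\begin{aligned}
\text{minimize} \quad & \tfrac{1}{2} y^T Q y + p^T y && \text{(9a)} \\
\text{subject to} \quad & Ey = f, && \text{(9b)}
\end{aligned}
$$

where $E = [A^T, C_1^T]^T$ stacks the equality constraints and the active inequality constraints, $f = [b^T, d_1^T]^T$, and $C_1$ and $d_1$ are such that the inequality constraint $Cy \le d$ is divided into active constraints, $C_1 y = d_1$, and inactive constraints, $C_2 y < d_2$.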

Lemma 2: Assuming that $Q$ is a positive definite (symmetric) matrix and $E$ a full row rank matrix, the analytical solution to (9) is given by

$$y^\star = Q^{-1}\left(E^T \left(E Q^{-1} E^T\right)^{-1} \left(E Q^{-1} p + f\right) - p\right).$$

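Proof: The Lagrangian for (9) is given by:

$$L(y, \lambda) = \tfrac{1}{2} y^T Q y + p^T y + \lambda^T (Ey - f), \tag{10}$$

where $\lambda$ is the vector of Lagrange multipliers associated with the equality constraints. At optimality, the gradient of $L$ with respect to $y$ and $\lambda$ is zero, i.e., $\nabla_y L = Qy + p + E^T \lambda = 0$ and $\nabla_\lambda L = Ey - f = 0$, which can be written in a compact form as

$$
\begin{bmatrix} Q & E^T \\ E & 0 \end{bmatrix}
\begin{bmatrix} y \\ \lambda \end{bmatrix}
=
\begin{bmatrix} -p \\ f \end{bmatrix}. \tag{11}
$$

The solution to this system provides the optimal $y^\star$ and $\lambda^\star$. Specifically, $y^\star$ is given by $y^\star = -Q^{-1} p - Q^{-1} E^T \lambda^\star$, where $\lambda^\star$ is computed by substituting this expression into $E y^\star = f$ and solving $E Q^{-1} E^T \lambda^\star = -(E Q^{-1} p + f)$. Finally, after substituting $\lambda^\star$ in the expression for $y^\star$, we get

$$y^\star = Q^{-1}\left(E^T \left(E Q^{-1} E^T\right)^{-1} \left(E Q^{-1} p + f\right) - p\right).$$

Consequently, the result follows.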
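As a numerical sanity check, the closed-form expression of Lemma 2 can be compared against a direct solve of the KKT system (11). The sketch below assumes a small random problem with $Q$ positive definite and $E$ of full row rank; the function name `solve_equality_qp` and the problem data are illustrative, and linear solves are used in place of explicit inverses.

```python
import numpy as np

def solve_equality_qp(Q, p, E, f):
    """Closed-form minimizer of 0.5*y'Qy + p'y subject to Ey = f (Lemma 2)."""
    Qinv_p = np.linalg.solve(Q, p)        # Q^{-1} p
    Qinv_Et = np.linalg.solve(Q, E.T)     # Q^{-1} E^T
    S = E @ Qinv_Et                       # E Q^{-1} E^T
    rhs = E @ Qinv_p + f                  # E Q^{-1} p + f
    return Qinv_Et @ np.linalg.solve(S, rhs) - Qinv_p

# Hypothetical problem data: 4 variables, 2 equality constraints.
rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))
Q = M @ M.T + 4.0 * np.eye(4)             # symmetric positive definite
p = rng.normal(size=4)
E = rng.normal(size=(2, 4))               # full row rank (almost surely)
f = rng.normal(size=2)

y_closed = solve_equality_qp(Q, p, E, f)

# Reference: solve the KKT system (11) directly.
kkt = np.block([[Q, E.T], [E, np.zeros((2, 2))]])
sol = np.linalg.solve(kkt, np.concatenate([-p, f]))
y_kkt = sol[:4]
```

Both routes should agree to numerical precision, and the result satisfies the equality constraint $Ey = f$.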

Remark 3: If the optimal solution $y^\star$ satisfies all the constraints in the original QP (8), it will be applied to the robotic system. Otherwise, one must solve the original QP to generate the optimal control inputs. For large-scale QP problems, the matrix inversion $(E Q^{-1} E^T)^{-1}$ can be further accelerated through the use of GPUs [43], [44].

F. Accelerating Nonlinear MPC (NMPC) problems after

inactive constraint removal

After the inactive constraints have been removed from (2) by the learned model $\Pi^c_{\theta_c^\star}$, NMPC problems can be further accelerated by combining multiple nonlinear and non-convex constraints, (3c) and (3d), into a single smooth constraint function using log-sum-exp expressions (a smooth approximation of the maximum operation), i.e.,

$$g_{\text{comb}}(x, u) := \log\left(\sum_{j \in \mathcal{I}^x_{\text{active}}} e^{\beta g_j(x)} + \sum_{j \in \mathcal{I}^u_{\text{active}}} e^{\beta h_j(u)}\right) \le 0$$

for some $\beta > 0$. By leveraging GPU parallelization (for instance, using the cupy [45] package) for the summation within the log-sum-exp function, significant runtime improvements can be achieved, which is particularly beneficial for real-time MPC applications. It can be shown that the set $S_c = \{(x, u) \mid g_{\text{comb}}(x, u) \le 0\}$ is a subset of the constraint set $S = \{(x, u) \mid g_i(x) \le 0, \ h_j(u) \le 0, \ i \in \mathcal{I}^x_{\text{active}}, \ j \in \mathcal{I}^u_{\text{active}}\}$, so enforcing $g_{\text{comb}}(x, u) \le 0$ is sufficient for (3c) and (3d), and that as $\beta$ tends to infinity, the set $S_c$ tends to $S$. Furthermore, this approach scales effectively with an increasing number of constraints, offering a promising solution for large-scale, nonlinear MPC problems that require rapid computation. Note that other functions, such as Mellowmax or p-norm functions, can also be used instead of log-sum-exp functions.
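The behavior of the combined constraint can be checked numerically. Because the log-sum-exp of a set of values is never smaller than their maximum, $g_{\text{comb}} \le 0$ implies that every individual constraint value is non-positive, while for small $\beta$ some points that satisfy every constraint are still rejected. The sketch below uses NumPy on the CPU; the function name `g_comb` and the constraint values are hypothetical, and moving the same array operations to a GPU array library is left as an assumption rather than shown.

```python
import numpy as np

def g_comb(constraint_values, beta=10.0):
    """log(sum_j exp(beta * g_j)), computed with a max-shift for stability."""
    z = beta * np.asarray(constraint_values, dtype=float)
    z_max = z.max()
    return z_max + np.log(np.exp(z - z_max).sum())

# Hypothetical stacked constraint values [g_j(x), h_j(u)] at three points.
well_inside = [-0.5, -0.2, -1.0]    # all constraints satisfied with margin
near_boundary = [-0.01, -0.02]      # satisfied, but close to the boundary
violating = [0.1, -1.0]             # first constraint violated

inside_ok = g_comb(well_inside, beta=10.0) <= 0.0
near_low_beta = g_comb(near_boundary, beta=10.0) <= 0.0      # conservative
near_high_beta = g_comb(near_boundary, beta=1000.0) <= 0.0   # tighter
violating_ok = g_comb(violating, beta=10.0) <= 0.0
```

The near-boundary point is rejected at $\beta = 10$ but accepted at $\beta = 1000$, which illustrates the trade-off in choosing $\beta$: larger values make the combined constraint less conservative but also less smooth.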

IV. RESULTS

In this section, we compare our proposed approach, Trans-

formerMPC, with recent baseline methods for solving MPC

problems. Through our numerical experiments, we aim to

answer the following questions (i) what is the average

reduction in the number of inactive constraints observed

using our approach? (ii) what is the overall decrease in

computational time of TransformerMPC compared with the

baseline methods? (iii) how do other approaches, such as Multi-Layer Perceptron (MLP), random forest [46], gradient boosting [47], and Support Vector Machine (SVM) [48], compare with our proposed transformer architecture? The average computational time (averaged over 500 MPC problems) for our approach is computed as $\alpha t_{\text{RMPC}} + (1 - \alpha)(t_{\text{RMPC}} + t_{\text{MPC}})$, where $\alpha \in [0, 1]$ is the fraction of problem instances for which the active constraints are predicted correctly, and $t_{\text{RMPC}}$ and $t_{\text{MPC}}$ are the times taken to compute optimal control inputs for the reduced MPC (RMPC) (3) and the MPC (2), respectively. For instance, for 100 MPC problem instances, if the learned transformer predicts the active constraints for 90 problems correctly, then $\alpha = 90/100 = 0.9$. We compare our proposed approach

TransformerMPC with interior point methods [20], [21],

[23], active set methods [11], [28], proximal interior method

[23] as well as an augmented Lagrangian method [10]. For

our experiments, we consider three realistic MPC problems

that are common in the robotics community. The ﬁrst is

that of balancing an Upkie wheeled biped robot (15 states and 7 control inputs) [49], [50]. Second, we consider the

problem of stabilizing a Crazyﬂie quadrotor (12 states and 4

control inputs) from a random initial condition to a hovering

position [4]. Finally, we consider the problem of stabilizing

an Atlas humanoid robot (58 states and 29 control inputs)

balancing on one foot [9]. All benchmarking experiments

were performed on a desktop equipped with an Intel(R)

Core(TM) i9-10900K CPU @ 3.70GHz and an NVIDIA

RTX A4000 GPU with 16 GB of GDDR6 memory.
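The average-time formula above can be evaluated directly. In the sketch below, only $\alpha = 0.9$ comes from the worked example in the text; the two timing values are hypothetical placeholders, not measurements from the paper, and the function name `average_time` is an assumption.

```python
def average_time(alpha, t_rmpc, t_mpc):
    """Expected per-problem time: alpha*t_rmpc + (1 - alpha)*(t_rmpc + t_mpc).

    A fraction alpha of problems is solved by the reduced MPC alone; the
    rest pay for the reduced solve and then for the original MPC as well.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * t_rmpc + (1.0 - alpha) * (t_rmpc + t_mpc)

alpha = 90 / 100        # 90 of 100 active sets predicted correctly
t_rmpc = 0.1            # hypothetical reduced-MPC solve time
t_mpc = 1.0             # hypothetical original-MPC solve time

t_avg = average_time(alpha, t_rmpc, t_mpc)
speedup = t_mpc / t_avg
```

The expression simplifies to $t_{\text{RMPC}} + (1 - \alpha)\, t_{\text{MPC}}$, which makes explicit that the reduced solve is always paid and the full solve is paid only on mispredictions.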

A. Average percentage reduction in the number of inactive

constraints

Figure 4 illustrates the percentage reduction in the total

number of inactive constraints for the three robotic systems

considered. TransformerMPC achieves signiﬁcant inactive

constraint reductions across all systems, with an 89.0%

reduction for the wheeled biped, 93.6% for the quadrotor,

and 95.8% for the humanoid. These results demonstrate the

effectiveness of TransformerMPC in simplifying the MPC

problem by removing inactive constraints, particularly for

high-dimensional and complex systems such as the wheeled

biped and the Atlas humanoid robot, thereby enhancing

computational efﬁciency.

[Fig. 4: bar chart of the % reduction in inactive constraints: wheeled biped 89, quadrotor 93.6, humanoid 95.81.]

Fig. 4: Average reduction in the total number of inactive constraints for the three robotic systems.

B. Average reduction of computational time

Figures 1 and 5 illustrate the average computational time

for various solvers when applied to three different systems:

an Upkie wheeled biped robot, a Crazyﬂie quadrotor, and

an Atlas humanoid robot. For the wheeled biped robot (see

Fig. 1), the TransformerMPC signiﬁcantly reduces compu-

tational time across all solvers. Notably, for CVXOPT (in

10−2s), there is a reduction from 1.1549 to 0.1698 (6.8x

improvement), and qpOASES (in 10−2s) from 1.3923 to

0.1698 (8.19x improvement). Even solvers with initially

lower computational times, such as DAQP and ProxQP (in

10−2s), beneﬁt from TransformerMPC, with reductions to

0.0174 and 0.0227, respectively.

[Fig. 5: bar chart. x-axis: solver (cvxopt, daqp, ecos, highs, piqp, proxqp, qpoases); y-axis: average computational time (in 10^-2 s); series: solver alone and TransformerMPC-augmented solver. Speedup labels, in extracted order: 6.88x, 1.2x, 1.15x, 5.25x, 2.85x, 2.05x, 1.55x.]

Fig. 5: Average computational time of the proposed TransformerMPC compared with solvers for steering the quadrotor to a hover position from random initial conditions.

Furthermore, for the quadrotor (see Fig. 5), TransformerMPC demonstrates consistent performance improvements across all solvers. CVXOPT’s runtime (in 10^-2 s) decreases from 2.8996 to 0.4241 (6.88x improvement), while qpOASES (in 10^-2 s) shows a reduction from 2.80854 to 1.80683 (1.55x improvement). HiGHS (in 10^-2 s) also shows a large gain, with the runtime dropping from 2.7345 to 0.5208 (5.25x improvement).

[Fig. 6: bar chart. x-axis: solver (cvxopt, daqp, ecos, highs, piqp, proxqp, qpoases); y-axis: average computational time (in 10^-2 s); series: solver alone and TransformerMPC-augmented solver. Speedup labels, in extracted order: 34.9x, 2.52x, 3.92x, 6.50x, 6.53x, 1.96x, 8.79x.]

Fig. 6: Average computational time of the proposed TransformerMPC for balancing the Atlas humanoid robot on one foot, compared with recent baseline methods.

Finally, for the problem of Atlas balancing on one foot

(see Fig. 6), CVXOPT runtime (in 10−2s) has a reduction

from 0.95 to 0.0272 (34.9x improvement), while qpOASES

(in 10−2s) sees an 8.79x reduction from 1.221 to 0.1389.

Other solvers like PIQP and HiGHS also demonstrate sub-

stantial improvements of 6.53x and 6.50x, respectively.

Even for lower-time solvers like DAQP and ProxQP, Trans-

formerMPC achieves notable gains, reducing the time by

2.52x and 1.96x, respectively. This reduction indicates that

TransformerMPC effectively enhances solver performance

by warm-starting and removing inactive constraints, mak-

ing it real-time implementable for robotic systems such

as wheeled biped robots, quadrotors, and humanoid robots

where computational efﬁciency is critical.

C. Comparison with other learned models for active constraint

prediction

For these numerical experiments, we employed an 80-20 train-test split of the dataset, using 80% of the data for training the models and 20% for testing. Fig. 7 presents the prediction accuracy on test data for different learning models in predicting inactive constraints across the three robotic systems. Among the models, the learned transformer model $\Pi^{c}_{\theta^{\star}_{c}}$ consistently achieves the highest accuracy, with 89% for the wheeled biped robot, 92% for the quadrotor, and 95.8% for the humanoid. In contrast, the other learned models, including MLP, Random Forest [46], Gradient Boosting [47], Support Vector Machine (SVM) [48], and Logistic Regression, show significantly lower accuracy across all systems. These results highlight the efficacy of the learned transformer model in accurately predicting inactive constraints.
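The evaluation protocol described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the helper names `train_test_split` and `accuracy_percent` are hypothetical, the shuffle seed is arbitrary, and accuracy is assumed here to be the fraction of constraint labels (inactive or active) predicted correctly, which the text does not define explicitly.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(
    samples: Sequence[T], train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the samples and split them into train and test parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def accuracy_percent(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Percentage of labels predicted correctly (assumed accuracy definition)."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


if __name__ == "__main__":
    train, test = train_test_split(range(100))
    print(len(train), len(test))
    # True marks a constraint labelled inactive, False one labelled active.
    print(accuracy_percent([True, False, True, True], [True, False, False, True]))
```

With 100 samples the split yields 80 training and 20 test samples, and the toy label vectors agree on three of four entries.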

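[Figure 7: grouped bar chart. x-axis: robotic system (wheeled biped, quadrotor, humanoid); y-axis: % accuracy on test data, 0 to 100; legend: Transformer (Ours), MLP, Random forest, Gradient boosting, SVM, Logistic regression. Transformer (Ours) bar labels: 89, 92, 95.8. Remaining bar labels (model assignment not recoverable): 3.3, 0.5, 12.5, 3.3, 24.5, 8.3, 18, 12.5, 15.8, 35.5, 41.6, 10, 43.5, 42.9.]

Fig. 7: Percentage accuracy for inactive constraint removal via different learned models on test data.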

V. CONCLUSIONS

In this paper, we addressed the problem of improving the computational efficiency of Model Predictive Control (MPC) by introducing a transformer-based approach for online inactive constraint removal as well as warm start initialization. Our approach reduces the computational burden by retaining only the subset of constraints that the learned transformer model predicts to be active. We synthesized optimal control inputs by solving the MPC problem with this reduced set of constraints and verified the efficacy of our approach on complex robotic systems, demonstrating significant improvements in performance and feasibility for real-time applications over other state-of-the-art MPC solvers. Future work includes the implementation of TransformerMPC on resource-constrained robotic systems as well as extending our approach to multi-agent systems.
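The idea of solving with a reduced constraint set and then checking the result against the original constraints can be illustrated on a toy problem. The Python sketch below is not the authors' implementation: it uses a one-dimensional quadratic program with a closed-form solution instead of an MPC solver, the predicted mask is supplied by hand rather than by a transformer, and the fallback to the full problem when the check fails is an assumption added for illustration (the paper describes an offline verification). All function names are hypothetical, and warm-starting is not covered.

```python
from typing import List, Sequence, Tuple

Constraint = Tuple[float, float]  # (a, b) encodes the inequality a * x <= b


def solve_1d_qp(target: float, constraints: Sequence[Constraint]) -> float:
    """Minimize (x - target)^2 subject to a * x <= b for every (a, b)."""
    lo, hi = float("-inf"), float("inf")
    for a, b in constraints:
        if a > 0:
            hi = min(hi, b / a)   # x <= b / a
        elif a < 0:
            lo = max(lo, b / a)   # dividing by a < 0 flips the sign: x >= b / a
        elif b < 0:
            raise ValueError("constraint 0 * x <= b with b < 0 is infeasible")
    if lo > hi:
        raise ValueError("constraint set is infeasible")
    return min(max(target, lo), hi)  # project the target onto [lo, hi]


def satisfies_all(x: float, constraints: Sequence[Constraint], tol: float = 1e-9) -> bool:
    """Check x against every original constraint, kept or removed."""
    return all(a * x <= b + tol for a, b in constraints)


def solve_with_predicted_active(
    target: float, constraints: Sequence[Constraint], keep_mask: Sequence[bool]
) -> Tuple[float, bool]:
    """Solve with only the kept constraints; fall back to the full set if needed.

    Returns the solution and a flag telling whether the reduced solve sufficed.
    """
    reduced: List[Constraint] = [c for c, keep in zip(constraints, keep_mask) if keep]
    x = solve_1d_qp(target, reduced)
    if satisfies_all(x, constraints):
        return x, True
    return solve_1d_qp(target, constraints), False  # assumed fallback


if __name__ == "__main__":
    cons = [(1.0, 1.0), (1.0, 5.0), (-1.0, 3.0), (1.0, 10.0)]
    print(solve_with_predicted_active(4.0, cons, [True, False, False, False]))
    print(solve_with_predicted_active(4.0, cons, [False, True, True, True]))
```

In the first call the mask keeps the only active constraint, so the reduced solve already satisfies every original constraint. In the second call the active constraint is wrongly dropped, the check fails, and the sketch re-solves with the full set.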

REFERENCES

[1] S. Katayama, M. Murooka, and Y. Tazaki, “Model predictive control

of legged and humanoid robots: models and algorithms,” Advanced

Robotics, vol. 37, no. 5, pp. 298–315, 2023.

[2] S. Hong, J.-H. Kim, and H.-W. Park, “Real-time constrained nonlinear model predictive control on SO(3) for dynamic legged locomotion,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 3982–3989.

[3] A. Didier, A. Parsi, J. Coulson, and R. S. Smith, “Robust adaptive

model predictive control of quadrotors,” in 2021 European Control

Conference (ECC). IEEE, 2021, pp. 657–662.

[4] K. Nguyen, S. Schoedel, A. Alavilli, B. Plancher, and Z. Manchester,

“Tinympc: Model-predictive control on resource-constrained micro-

controllers,” in 2024 IEEE International Conference on Robotics and

Automation (ICRA), 2024, pp. 1–7.

[5] D. Morgan, S.-J. Chung, and F. Y. Hadaegh, “Model predictive con-

trol of swarms of spacecraft using sequential convex programming,”

Journal of Guidance, Control, and Dynamics, vol. 37, no. 6, pp. 1725–

1740, 2014.

[6] T. P. Nascimento, C. E. Dórea, and L. M. G. Gonçalves, “Nonholonomic mobile robots’ trajectory tracking model predictive control: a survey,” Robotica, vol. 36, no. 5, pp. 676–696, 2018.

[7] S. Yu, M. Hirche, Y. Huang, H. Chen, and F. Allgöwer, “Model predictive control for autonomous ground vehicles: a review,” Autonomous Intelligent Systems, vol. 1, pp. 1–17, 2021.

[8] B. Stellato, G. Banjac, P. Goulart, A. Bemporad, and S. Boyd, “Osqp:

An operator splitting solver for quadratic programs,” Mathematical

Programming Computation, vol. 12, no. 4, pp. 637–672, 2020.

[9] A. L. Bishop, J. Z. Zhang, S. Gurumurthy, K. Tracy, and Z. Manch-

ester, “Relu-qp: A gpu-accelerated quadratic programming solver for

model-predictive control,” in 2024 IEEE International Conference on

Robotics and Automation (ICRA). IEEE, 2024, pp. 13 285–13 292.

[10] A. Bambade, S. El-Kazdadi, A. Taylor, and J. Carpentier, “Prox-qp:

Yet another quadratic programming solver for robotics and beyond,”

in RSS 2022-Robotics: Science and Systems, 2022.

[11] H. J. Ferreau, C. Kirches, A. Potschka, H. G. Bock, and M. Diehl,

“qpoases: A parametric active-set algorithm for quadratic program-

ming,” Mathematical Programming Computation, vol. 6, pp. 327–363,

2014.

[12] F. Fiedler, B. Karg, L. Lüken, D. Brandner, M. Heinlein, F. Brabender, and S. Lucia, “do-mpc: Towards fair nonlinear and robust model predictive control,” Control Engineering Practice, vol. 140, p. 105676, 2023.

[13] J. A. E. Andersson, J. Gillis, G. Horn, J. B. Rawlings, and M. Diehl,

“CasADi – A software framework for nonlinear optimization and

optimal control,” Mathematical Programming Computation, vol. 11,

no. 1, pp. 1–36, 2019.

[14] V. Azhmyakov and J. Raisch, “Convex control systems and convex

optimal control problems with constraints,” IEEE Transactions on

Automatic Control, vol. 53, no. 4, pp. 993–998, 2008.

[15] T. A. Howell, B. E. Jackson, and Z. Manchester, “Altro: A fast

solver for constrained trajectory optimization,” in 2019 IEEE/RSJ

International Conference on Intelligent Robots and Systems (IROS).

IEEE, 2019, pp. 7674–7679.

[16] M. ApS, “Mosek optimization toolbox for matlab,” User’s Guide and

Reference Manual, Version, vol. 4, no. 1, 2019.

[17] L. Gurobi Optimization, “Gurobi optimizer reference manual (2020),”

2023.

[18] G. Frison and M. Diehl, “Hpipm: a high-performance quadratic

programming framework for model predictive control,” IFAC-

PapersOnLine, vol. 53, no. 2, pp. 6563–6569, 2020.

[19] P. J. Goulart and Y. Chen, “Clarabel: An interior-point solver for conic

programs with quadratic objectives,” arXiv preprint arXiv:2405.12762,

2024.

[20] M. Andersen, J. Dahl, and L. Vandenberghe, “Cvxopt: Convex opti-

mization,” Astrophysics Source Code Library, pp. ascl–2008, 2020.

[21] A. Domahidi, E. Chu, and S. Boyd, “Ecos: An socp solver for

embedded systems,” in 2013 European control conference (ECC).

IEEE, 2013, pp. 3071–3076.

[22] A. G. Pandala, Y. Ding, and H.-W. Park, “qpswift: A real-time sparse

quadratic program solver for robotic applications,” IEEE Robotics and

Automation Letters, vol. 4, no. 4, pp. 3355–3362, 2019.

[23] R. Schwan, Y. Jiang, D. Kuhn, and C. N. Jones, “Piqp: A proximal

interior-point quadratic programming solver,” in 2023 62nd IEEE

Conference on Decision and Control (CDC). IEEE, 2023, pp. 1088–

1093.

[24] A. Wächter and L. T. Biegler, “On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming,” Mathematical programming, vol. 106, pp. 25–57, 2006.

[25] B. Hermans, A. Themelis, and P. Patrinos, “Qpalm: A proximal

augmented lagrangian method for nonconvex quadratic programs,”

Mathematical Programming Computation, vol. 14, no. 3, pp. 497–541,

2022.

[26] B. Lautenschlager, K. Kruppa, and G. Lichtenberg, “Convexity prop-

erties of the model predictive control problem for subclasses of mul-

tilinear time-invariant systems,” IFAC-PapersOnLine, vol. 48, no. 23,

pp. 148–153, 2015.

[27] D. Goldfarb and A. Idnani, “A numerically stable dual method for

solving strictly convex quadratic programs,” Mathematical program-

ming, vol. 27, no. 1, pp. 1–33, 1983.

[28] J. Hall, I. Galabova, L. Gottwald, and M. Feldmeier, “Highs–high

performance software for linear optimization,” 2023.

[29] D. Arnström, A. Bemporad, and D. Axehill, “A dual active-set solver for embedded quadratic programming using recursive $LDL^T$ updates,” IEEE Transactions on Automatic Control, vol. 67, no. 8, pp. 4362–4369, 2022.

[30] J. Nocedal and S. J. Wright, Numerical optimization. Springer, 1999.

[31] A. Bemporad, M. Morari, V. Dua, and E. N. Pistikopoulos, “The ex-

plicit linear quadratic regulator for constrained systems,” Automatica,

vol. 38, no. 1, pp. 3–20, 2002.

[32] A. Alessio and A. Bemporad, “A survey on explicit model predictive

control,” Nonlinear Model Predictive Control: Towards New Challeng-

ing Applications, pp. 345–369, 2009.

[33] G. Shi, X. Shi, M. O’Connell, R. Yu, K. Azizzadenesheli, A. Anand-

kumar, Y. Yue, and S.-J. Chung, “Neural lander: Stable drone landing

control using learned dynamics,” in 2019 international conference on

robotics and automation (icra). IEEE, 2019, pp. 9784–9790.

[34] M. O’Connell, G. Shi, X. Shi, K. Azizzadenesheli, A. Anandkumar, Y. Yue, and S.-J. Chung, “Neural-fly enables rapid learning for agile flight in strong winds,” Science Robotics, vol. 7, no. 66, p. eabm6597, 2022.

[35] J. Briden, T. Gurga, B. J. Johnson, A. Cauligi, and R. Linares, “Improving computational efficiency for powered descent guidance via transformer-based tight constraint prediction,” in AIAA SCITECH 2024 Forum, 2024, p. 1760.

[36] T. Salzmann, E. Kaufmann, J. Arrizabalaga, M. Pavone, D. Scara-

muzza, and M. Ryll, “Real-time neural mpc: Deep learning model

predictive control for quadrotors and agile robotic platforms,” IEEE

Robotics and Automation Letters, vol. 8, no. 4, pp. 2397–2404, 2023.

[37] T. Guffanti, D. Gammelli, S. D’Amico, and M. Pavone, “Transformers

for trajectory optimization with application to spacecraft rendezvous,”

in 2024 IEEE Aerospace Conference. IEEE, 2024, pp. 1–13.

[38] M. Klaučo, M. Kalúz, and M. Kvasnica, “Machine learning-based warm starting of active set methods in embedded model predictive control,” Engineering Applications of Artificial Intelligence, vol. 77, pp. 1–8, 2019.

[39] L. Schwenkel, M. Gharbi, S. Trimpe, and C. Ebenbauer, “Online

learning with stability guarantees: A memory-based warm starting for

real-time mpc,” Automatica, vol. 122, p. 109247, 2020.

[40] S. W. Chen, T. Wang, N. Atanasov, V. Kumar, and M. Morari, “Large

scale model predictive control with neural networks and primal active

sets,” Automatica, vol. 135, p. 109947, 2022.

[41] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” arXiv preprint arXiv:1706.03762, 2017.

[42] Y. Nesterov and A. Nemirovskii, Interior-point polynomial algorithms

in convex programming. SIAM, 1994.

[43] P. Benner, P. Ezzatti, E. S. Quintana-Ortí, and A. Remón, “Matrix inversion on cpu–gpu platforms with applications in control theory,” Concurrency and Computation: Practice and Experience, vol. 25, no. 8, pp. 1170–1182, 2013.

[44] P. Ezzatti, E. S. Quintana-Orti, and A. Remon, “High performance

matrix inversion on a multi-core platform with several gpus,” in 2011

19th International Euromicro Conference on Parallel, Distributed and

Network-Based Processing. IEEE, 2011, pp. 87–93.

[45] R. Nishino and S. H. C. Loomis, “Cupy: A numpy-compatible library for nvidia gpu calculations,” 31st conference on neural information processing systems, vol. 151, no. 7, 2017.

[46] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp.

5–32, 2001.

[47] J. H. Friedman, “Greedy function approximation: a gradient boosting

machine,” Annals of statistics, pp. 1189–1232, 2001.

[48] C. Cortes and V. Vapnik, “Support vector networks,” Machine Learn-

ing, 1995.

[49] A. Bambade, F. Schramm, S. E. Kazdadi, S. Caron, A. Taylor, and J. Carpentier, “PROXQP: an Efficient and Versatile Quadratic Programming Solver for Real-Time Robotics Applications and Beyond,” Sep. 2023, working paper or preprint. [Online]. Available: https://inria.hal.science/hal-04198663

[50] S. Caron, A. Zaki, P. Otta, D. Arnström, J. Carpentier, F. Yang, and P.-A. Leziart, “qpbenchmark: Benchmark for quadratic programming solvers available in Python,” 2024. [Online]. Available: https://github.com/qpsolvers/qpbenchmark