
A Majorization-Minimization Approach to

Design of Power Transmission Networks

Jason K. Johnson and Michael Chertkov

Abstract—We propose an optimization approach to design

cost-effective electrical power transmission networks. That is,

we aim to select both the network structure and the line

conductances (line sizes) so as to optimize the trade-off between

network efficiency (low power dissipation within the transmis-

sion network) and the cost to build the network. We begin with

a convex optimization method based on the paper “Minimizing

Effective Resistance of a Graph” [Ghosh, Boyd & Saberi]. We

show that this (DC) resistive network method can be adapted to

the context of AC power flow. However, that does not address

the combinatorial aspect of selecting network structure. We

approach this problem as selecting a subgraph within an over-

complete network, posed as minimizing the (convex) network

power dissipation plus a non-convex cost on line conductances

that encourages sparse networks where many line conductances

are set to zero. We develop a heuristic approach to solve

this non-convex optimization problem using: (1) a continuation

method to interpolate from the smooth, convex problem to

the (non-smooth, non-convex) combinatorial problem, (2) the

majorization-minimization algorithm to perform the necessary

intermediate smooth but non-convex optimization steps. Ulti-

mately, this involves solving a sequence of convex optimization

problems in which we iteratively reweight a linear cost on line

conductances to fit the actual non-convex cost. Several examples

are presented which suggest that the overall method is a good

heuristic for network design. We also consider how to obtain

sparse networks that are still robust against failures of lines

and/or generators.

I. INTRODUCTION

The power grid of today was not systematically planned

but grew in a piecemeal fashion. In spite of this it is

largely reliable, arguably among the greatest engineering

achievements of the 20th century. However, this status quo

is now challenged with increased demand and stress on the

aging network leading to extremely costly and growing-in-

scale blackouts and operational problems. A shift towards

renewable sources of energy will further stress the grid as

these resources are intermittent and thus not reliable in the

traditional sense. These changes emphasize the importance

of incorporating new and extending existing infrastructure

in a systematic way. In this paper we present a proof-of-principle study suggesting an efficient algorithmic approach for optimal or near-optimal power grid design.

J. Johnson and M. Chertkov are both with the Center for Nonlinear Studies and Theoretical Division T-4 of Los Alamos National Laboratory, Los Alamos, NM 87544. M. Chertkov is also affiliated with the New Mexico Consortium, Los Alamos, NM 87544. jasonj@lanl.gov, chertkov@lanl.gov

A. Motivation

A key challenge in updating and extending the power grid is determining where to place new transmission, generation and storage facilities or, in some cases, how to design a new grid from scratch. Specifically, the present theoretical study was motivated by the national challenge of integrating renewables into operation of the existing US grid. Renewable generation sources, such as wind and solar, are intermittent. Moreover, regions where wind is plentiful often lack adequate transmission lines. Effective and reliable exploitation of renewables requires planning. The National Renewable Energy Laboratory’s (NREL) WinDS project [13], [14] is an excellent first step; however, it does not account for power flow stability or grid resiliency. A study of the WinDS solution performed at LANL [15] found that it often results in an infeasible electric grid, suggesting there is a problem in generating globally optimal solutions that accommodate intermittent renewable generation.

Our paper develops an approach towards the challenging

problem of planning cost-effective and robust extensions of

the power grid to accommodate growing demand and long-

term addition of renewables. Our approach may also provide

a starting point for practical planning approaches such as the

one proposed in [15].

B. Related Work

The initial inspiration for our approach was the convex

network optimization methods of Ghosh, Boyd and Saberi

[10]. Building on earlier work [5], they consider the problem

of minimizing the total resistance of an electrical network

subject to a linear budget on line conductances, where they

interpret the total resistance metric as the expected power dis-

sipation within the network under a random current model.

We extend their work by also selecting the network structure.

We impose sparsity on that structure in a manner similar

to a number of methods that modify a convex optimization

problem by adding some non-convex regularization to obtain

sparser solutions, such as in compressed sensing [6]–[8]

or edge-preserving image restoration [12]. The method of

Candes et al [6] is especially relevant to our approach.

They recommend the majorization-minimization algorithm

[11] as a heuristic approach to sparsity-favoring non-convex

optimization.

Another important element of our approach is that we

follow a similar strategy as in the graduated non-convexity

algorithm [1] in that we solve a sequence of optimization

problems that interpolates from a convex relaxation of the

actual non-convex problem. A somewhat similar approach

has been used to obtain sparse transport networks [2].

arXiv:1004.2285v1 [math.OC] 13 Apr 2010


C. Our Contributions

• We adapt the convex network optimization approach

of Ghosh et al [10] to design power transmission

networks by demonstrating how AC power flow (to

first order approximation) can be modeled by a (DC)

resistive network model and specializing Ghosh et al’s

random current model and linear cost on lines to fit our

application.

• We propose a non-convex, discontinuous generalization

of this problem that more strongly encourages sparsity

in the network solution by adding a fixed cost for each

(non-zero conductance) line. We develop a heuristic

method for solving this latter non-convex optimization

problem using the following ideas:

1) We use a continuous relaxation of the non-convex,

combinatorial problem that arises by replacing the

discontinuous step-function by a smoothed proxy

with parameter γ allowing interpolation between

the tractable convex optimization problem (large

γ) and the intractable non-convex, combinatorial

optimization (γ = 0).

2) We use the majorization-minimization algorithm

to heuristically solve the necessary non-convex

optimization steps of this procedure by iteratively

linearizing the (concave) smoothed step function.

• Lastly, we extend all these methods by designing net-

works that are robust against the failures of a small

number of lines and/or generators. Essentially, this is

done by replacing the convex power-dissipation metric

by the worst-case power dissipation after removing

some k lines and/or generators.

The paper is structured as follows: (Section II-A) reviews

the resistive network model; (II-B) discusses how AC power

flow is modeled by a DC resistive network; (III) presents the

convex network optimization problem; (IV) presents the non-

convex extension to enforce sparsity; (V) presents robust net-

work design; (VI) indicates a number of potential extensions

of our method and other challenging open questions.

II. TECHNICAL PRELIMINARIES

The optimization approach developed in Section III is

based on the resistive network model explained in Section

II-A. We also show (Section II-B) that a modification of the effective resistive network is adequate for the standard AC power flow model when considered in the leading DC approximation.

A. Resistive Network Model

We give a brief introduction to electrical networks [3], [9]. Let $G$ denote a graph with node set $N = \{1,\dots,n\}$ and $m$ (undirected) edges $\{i,j\} \in G \subset 2^N$. We assign edge weights $\theta_{ij} \equiv \theta_{ji} \ge 0$ for all $\{i,j\} \in G$ ($\theta_{ij} = 0$ for all non-edges $\{i,j\} \notin G$). Regarded as a resistive network, the edges $\{i,j\} \in G$ represent the lines of the network with $\theta_{ij}$ being the conductance (inverse resistance) of a line. We also use $\ell \in G$ to index lines of the network. We define the conductance matrix $K(\theta) \in \mathbb{R}^{N \times N}$ of the network by $K(\theta) = \sum_{\{i,j\} \in G} \theta_{ij} (e_i - e_j)(e_i - e_j)^T$ where $e_i \in \mathbb{R}^n$ are the standard basis vectors. This is the edge-weighted graph Laplacian of $G$ based on line conductances. Thus,

$$K_{ij}(\theta) = \begin{cases} -\theta_{ij}, & i \ne j \\ \sum_{k \ne i} \theta_{ik}, & i = j \end{cases} \tag{1}$$

One may also write $K(\theta) = A\,\mathrm{Diag}(\theta)\,A^T$ where $\mathrm{Diag}(\theta) = \sum_{\ell} \theta_\ell e_\ell e_\ell^T \in \mathbb{R}^{m \times m}$ is a diagonal matrix and $A \in \mathbb{R}^{n \times m}$ is the incidence matrix of $G$ with columns $a_\ell = \pm(e_i - e_j)$ for each edge $\ell = \{i,j\}$.

Let $b \in \mathbb{R}^N$ represent the vector of injected currents: nodes with $b_i > 0$ are sources, those with $b_i < 0$ are sinks and $b_i = 0$ for transmission nodes. In the resistive network, these represent currents being injected into (or drawn from) each node by an external source. Given $K$ and $b$, we obtain the (relative) electrical potential among the nodes $u \in \mathbb{R}^N$ by solving the linear system of equations:

$$Ku = b \tag{2}$$
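The construction of $K(\theta)$ in (1) can be illustrated numerically. The following is a minimal sketch, assuming 0-based node labels and an arbitrary 4-node example graph; the helper names `incidence_matrix` and `conductance_matrix` are hypothetical and chosen only for illustration.

```python
import numpy as np

def incidence_matrix(n, edges):
    """Return the n x m incidence matrix A with column a_l = e_i - e_j for edge l = (i, j)."""
    A = np.zeros((n, len(edges)))
    for l, (i, j) in enumerate(edges):
        A[i, l] = 1.0
        A[j, l] = -1.0
    return A

def conductance_matrix(A, theta):
    """K(theta) = A Diag(theta) A^T, the edge-weighted graph Laplacian."""
    return (A * theta) @ A.T   # broadcasting scales column l of A by theta_l

# Illustrative example (assumed values): a square 0-1-2-3 with one diagonal 0-2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
theta = np.array([1.0, 2.0, 1.0, 0.5, 3.0])
A = incidence_matrix(4, edges)
K = conductance_matrix(A, theta)

# Entries follow (1): off-diagonal K_ij = -theta_ij, diagonal K_ii = sum_k theta_ik.
print(K)
```

The sign chosen for each column of `A` does not affect `K`, since each column enters only through the product $a_\ell a_\ell^T$.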

We observe the following properties of the conductance matrix (assuming connected $G$ and non-zero $\theta$):

• $K$ is a symmetric positive semi-definite matrix: $u^T K u \ge 0$ for all $u \in \mathbb{R}^n$. As we will see later, this represents the fact that power dissipation is non-negative.

• $K$ has a single zero eigenvalue associated to the “ones” eigenvector: $K\mathbf{1} = 0$. This indicates that for $b = 0$ we must have uniform electric potential.

• For any eigenvector $Ku = \lambda u$ except $\mathbf{1}$ it holds that $\mathbf{1}^T u = 0$ and $\lambda > 0$.

It is required that the total injected current is zero, $\mathbf{1}^T b = \sum_{i \in N} b_i = 0$, so that (2) can be satisfied. Then, there is a one-dimensional space of solutions of the form $\{u' + c\mathbf{1} \mid c \in \mathbb{R}\}$ for any $u'$ solving $Ku' = b$, that is, the solution is uniquely determined up to an overall additive shift of the electric potentials. There are several approaches one might use to “regularize” the problem of computing $u$ such that the solution becomes unique. Here, we require that $\sum_i u_i = 0$, obtained by solving the $n \times n$ system of equations $K'u = b$ based on the invertible matrix $K' = K + \mathbf{1}\mathbf{1}^T$. One may check that $K'\mathbf{1} = n\mathbf{1}$ and all other eigenvalues and eigenvectors of $K'$ are the same as for $K$. The regularized solution to (2) is then given by $u = K^+ b$ where $K^+ \triangleq (K + \mathbf{1}\mathbf{1}^T)^{-1}$.

Current flow within the network is then determined by the electric potential $u$ and Ohm’s law: the current flow from $i$ to $j$ is $b_{ij} = \theta_{ij}(u_i - u_j)$. Since $\theta_{ij} = \theta_{ji}$, it of course holds that $b_{ji} = -b_{ij}$. One may verify that $b_i + \sum_{k \ne i} b_{ki} = 0$ for all $i$ (current is conserved at each node). The total power loss over the network (due to resistive heating of the lines) is given by:

$$L = \sum_{ij \in G} \theta_{ij}(u_i - u_j)^2 = u^T K u \tag{3}$$

Substitution of $u = K^+ b$ into this equation gives $L = b^T K^+ K K^+ b = b^T K^+ b$. If we fix the graph structure $G$ and the loads $b$, then the power loss becomes a function of the conductances $L(\theta) = b^T (K(\theta) + \mathbf{1}\mathbf{1}^T)^{-1} b$. It is simple to generalize the power loss objective to account for random fluctuations of the load $b$. For a random current the expected power loss is:

$$L(\theta) = \langle b^T K^+(\theta)\, b \rangle = \langle \mathrm{Tr}(K^+(\theta)\, b b^T) \rangle = \mathrm{Tr}(K^+(\theta) \langle b b^T \rangle) = \mathrm{Tr}(K^+(\theta) B) \tag{4}$$

where we have defined the matrix $B \triangleq \langle b b^T \rangle$, which is a sufficient statistic of the random current model for the purpose of computing the expected power loss. Importantly, $L(\theta)$ is a convex function, which is the basis for convex network optimization methods [5], [10].
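As a quick numerical check of the relations in this subsection, the sketch below solves for the regularized potentials and evaluates the power loss in three ways: line by line as in (3), as the quadratic form $b^T K^+ b$, and as the trace in (4) for the deterministic special case $B = bb^T$. The triangle network and its values are illustrative assumptions, and the helper name `regularized_inverse` is hypothetical.

```python
import numpy as np

def regularized_inverse(K):
    """K^+ = (K + 1 1^T)^{-1}, as defined in the text."""
    n = K.shape[0]
    return np.linalg.inv(K + np.ones((n, n)))

# Illustrative 3-node triangle network (0-based labels, assumed conductances).
edges = [(0, 1), (1, 2), (0, 2)]
theta = np.array([2.0, 1.0, 0.5])
n = 3
A = np.zeros((n, len(edges)))
for l, (i, j) in enumerate(edges):
    A[i, l], A[j, l] = 1.0, -1.0
K = (A * theta) @ A.T
K_plus = regularized_inverse(K)

b = np.array([1.0, -0.25, -0.75])                # injected currents, summing to zero
u = K_plus @ b                                   # potentials, normalized so sum(u) = 0
flows = theta * (A.T @ u)                        # Ohm's law: theta_ij (u_i - u_j) per line
loss_lines = np.sum(theta * (A.T @ u) ** 2)      # equation (3), summed over lines
loss_quad = b @ K_plus @ b                       # b^T K^+ b
loss_trace = np.trace(K_plus @ np.outer(b, b))   # Tr(K^+ B) with B = b b^T
```

Here `A @ flows` recovers `b`, which is the current-conservation statement at each node.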

B. DC Approximation to AC Power Flow

The existing power grid uses AC voltages and currents, generally described in terms of complex amplitudes, and lines with complex impedances, in contrast to the real currents and positive conductances of the resistive network setting. In spite of this difference, the resistive network framework can be used to approximate the AC system [16].

Indeed, (3) still holds in the case of AC flows if $(u_i - u_j)^2$ is replaced by $|U_i - U_j|^2$, where $U_j$ is a complex potential at the node $j$ of the network and $K$ now stands for the real part of the network admittance matrix, also called the network (AC) conductance matrix [16]. In a healthy AC flow the voltage magnitude is stabilized to a constant (unity in the rescaled power units). In the so-called DC approximation, where this stabilization is assumed ideal, $U_j = \exp(i\varphi_j)$ where real $\varphi_j$ is the phase of the potential and $i^2 = -1$. Susceptance of a transmission power line, defined as the imaginary part of the line admittance, is normally an order of magnitude larger than the respective real part (conductance of the line). Then the DC-approximation of the AC Kirchhoff equations, with the conductance completely ignored, becomes

$$p = \tilde{K}\varphi, \tag{5}$$

where $p$ is the vector of real power (with its components being production/consumption at the graph nodes), and $\tilde{K}$ is the imaginary part of the network admittance matrix, also called the network susceptance matrix, and (5) thus accounts only for the lossless transfer of real power over the network, $\sum_i p_i = 0$. We note that $\tilde{K}$ has all the same essential properties of $K$ listed in Section II-A.

Substituting (5) into the aforementioned expression for the power losses over the network and keeping only the leading DC-approximation terms (first order in the conductance-to-susceptance ratio) one arrives at an expression for losses

$$L = \tfrac{1}{2}\, p^T \tilde{K}^+ K \tilde{K}^+ p, \tag{6}$$

where $\tilde{K}^+ \triangleq (\tilde{K} + \mathbf{1}\mathbf{1}^T)^{-1}$. We assume that the conductance-to-susceptance ratio, $\mu$, is kept constant for all the lines, i.e. $\tilde{K} = (1/\mu)K$. Then, the only difference between the DC-approximation model and the basic resistive network model will consist in this additional re-scaling factor, whose particular value is in any case irrelevant to the network optimization discussed in Section III. In particular, this translation from the resistive network model to the DC-approximation of the AC-flow model means that (4) turns into $L(\theta) = \tfrac{\mu^2}{2}\mathrm{Tr}(K^+ B)$ where $B$ characterizes the random (real) power flow through the network.

The main conclusion of this subsection is that with proper (and trivial) rescaling the resistive network model is completely adequate to describe losses in the leading order DC-approximation of the AC-flow model of the power grid. Therefore, with the understanding that we have neglected reactive power flows, we may without loss of generality work with the resistive network model in the remainder of the paper.
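The rescaling of (6) can be checked in a few lines. The following is a short derivation under the stated assumptions ($\tilde{K} = K/\mu$, connected $G$, balanced injections $\mathbf{1}^T p = 0$), using only the definitions above; the auxiliary vectors $x$ and $y$ are introduced here for illustration. Let $x = \tilde{K}^+ p$ and $y = K^+ p$, so that

$$\left(\tfrac{1}{\mu}K + \mathbf{1}\mathbf{1}^T\right)x = p, \qquad \left(K + \mathbf{1}\mathbf{1}^T\right)y = p.$$

Multiplying both equations on the left by $\mathbf{1}^T$ and using $\mathbf{1}^T K = 0$ and $\mathbf{1}^T p = 0$ gives $\mathbf{1}^T x = \mathbf{1}^T y = 0$, hence $Kx = \mu p$ and $Ky = p$. Then $K(x - \mu y) = 0$ with $\mathbf{1}^T(x - \mu y) = 0$, which forces $x = \mu y$ because the null space of $K$ is spanned by $\mathbf{1}$. Substituting into (6):

$$L = \tfrac{1}{2}\,x^T K x = \tfrac{\mu^2}{2}\,y^T K y = \tfrac{\mu^2}{2}\,p^T K^+ K K^+ p = \tfrac{\mu^2}{2}\,p^T K^+ p,$$

and averaging over random $p$ with $B = \langle p p^T \rangle$ gives $L(\theta) = \tfrac{\mu^2}{2}\mathrm{Tr}(K^+(\theta) B)$.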

III. CONVEX NETWORK OPTIMIZATION

In this section we develop the main convex optimization

method we use to design electric power transmission net-

works. This involves optimizing the line conductances for

a given graph to minimize the expected power loss subject

to a linear constraint (alternatively, adding a linear penalty)

on the vector of line conductances. This is a generalization

of the convex optimization problem posed in [10], which

inspired the approach of this paper. The main contribution

of this section is in adapting their problem formulation

to design electric power transmission networks. In later

sections, we also use this convex optimization method as the

core engine within an iterative method for performing non-

convex network optimization with the aim of discovering

good sparse network structures.

A. The Network Optimization Problem

First, we state the general form of the convex optimization

problem that we consider, and provide further details in the

following subsections. As discussed in Section II-A, we are

given a graph G of n nodes and m edges. The statistics

of currents (power flows in the DC-approximation of AC

system) through the network are described by an n × n

matrix B. Our aim is to assign the line conductances θ to

balance the competing objectives of (1) maximizing network

efficiency (minimizing the expected power dissipation within

the network) and (2) minimizing the cost of building the

network with conductances θ.

We now specify a simple linear cost model on the line conductances. We model the cost (say, in dollars) of building the network as $\alpha^T\theta = \sum_\ell \alpha_\ell \theta_\ell$. The coefficients of this cost objective may be set as $\alpha_\ell = c\,g^{-1} s_\ell^2$ where $c$ is the price of copper (per unit volume), $g$ is the conductivity of copper and $s_\ell$ is the total length of line $\ell$. Then, $\alpha^T\theta$ represents the total cost of copper needed to build the network with topology $G$, lines of length $s_\ell$ and conductances $\theta_\ell$. This follows as the conductance of a line of length $s_\ell$ and cross-sectional area $a_\ell$ is $\theta_\ell = g\,a_\ell\,s_\ell^{-1}$. Hence, the volume of a line is $s_\ell a_\ell = s_\ell (g^{-1} s_\ell \theta_\ell) = g^{-1} s_\ell^2 \theta_\ell$ and the cost of a line is $c\,g^{-1} s_\ell^2 \theta_\ell = \alpha_\ell \theta_\ell$. Note that the problem of optimizing line conductances is essentially the same as line sizing due to the linear correspondence between conductance and cross-sectional area.
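The cost coefficient can be illustrated with a few lines of arithmetic. This is a minimal sketch with made-up numbers in arbitrary units (not real copper data); the function names are hypothetical.

```python
def line_cost_coefficient(length, copper_price, conductivity):
    """alpha_l = c * s_l**2 / g: cost per unit conductance for a line of length s_l."""
    return copper_price * length ** 2 / conductivity

def line_volume(length, conductance, conductivity):
    """Copper volume s_l * a_l, with cross-section a_l = theta_l * s_l / g."""
    area = conductance * length / conductivity
    return length * area

# Illustrative values only (arbitrary units).
c, g = 3.0, 6.0          # price per unit volume, conductivity
s, theta = 2.0, 5.0      # line length, line conductance

alpha = line_cost_coefficient(s, c, g)        # 3 * 2^2 / 6 = 2.0
cost_from_volume = c * line_volume(s, theta, g)
print(alpha, alpha * theta, cost_from_volume)
```

Because $\alpha_\ell$ grows with the square of the length, doubling the length of a line at fixed conductance quadruples its copper cost.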

Given G, B and α one may then select the line con-

ductances θ to make the network as efficient as possible

(minimizing the expected power loss due to resistive heating

of the lines) subject to a linear constraint that the total cost

of building the network must be no greater than a specified

budget C:

$$\begin{array}{ll} \text{minimize} & L(\theta) \\ \text{subject to} & \theta \ge 0 \\ & \alpha^T\theta \le C \end{array}$$

This is essentially the same as the convex optimization problem posed in [10]. The total resistance metric that they considered is recovered by setting $B$ equal to the identity matrix. This was interpreted as the expected power loss under a Gaussian random current model $b \sim \mathcal{N}(0, I)$ (modulo a projection to enforce the constraint $\mathbf{1}^T b = 0$). Equivalently, one may replace the budget constraint by a linear penalty on network cost, solving the convex optimization problem:

$$\min_{\theta \ge 0} \left\{ L(\theta) + \lambda\,\alpha^T\theta \right\}$$

The parameter $\lambda > 0$ is a Lagrange multiplier enforcing the budget constraint (the two problems are equivalent for corresponding values of $C$ and $\lambda$). Alternatively, we may set $\lambda^{-1} = pT$ where $p$ is the cost of power generation and $T$ is the expected operational lifetime of the network. Then, the solution of the penalized optimization problem yields the most cost-effective network design, minimizing the sum of the cost to build the network and the cost to operate the network over its operational lifetime. In the remainder of the paper, we focus on this latter form of the problem, setting $\lambda = 1$ (redefining $\alpha \to \lambda\alpha$).

Our main contribution in the remainder of the section is to further tailor this problem to the setting of electric power transmission by appropriate definition of $B$.
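The interpretation $\lambda^{-1} = pT$ can be made explicit with a one-line calculation. As an illustrative assumption (not spelled out in the text), take $p$ to be a constant price per unit of dissipated energy and $L(\theta)$ the average dissipated power over the lifetime $T$, with no discounting. The total lifetime cost is then

$$\underbrace{\alpha^T\theta}_{\text{build}} + \underbrace{pT\,L(\theta)}_{\text{operate}} = pT\left(L(\theta) + \lambda\,\alpha^T\theta\right), \qquad \lambda = (pT)^{-1},$$

so minimizing the total cost over $\theta \ge 0$ is equivalent to the penalized problem, since the positive factor $pT$ does not change the minimizer.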

B. Single-Generator Formulation

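First, we address the simplest case of optimizing a network with multiple independent random loads supplied by a single generator at a specified location. For non-generator nodes we specify the mean load $\bar{b}_i = \langle b_i \rangle < 0$ and the standard deviation $\sigma_i = \langle (b_i - \bar{b}_i)^2 \rangle^{1/2}$. At transmission nodes we set $\bar{b}_i = 0$ and $\sigma_i = 0$. At the generator node we must have $b_0 = -\sum_{i \ne 0} b_i$ to satisfy the constraint $\sum_i b_i = 0$. Then, the overall random load matrix $B = \langle b b^T \rangle$ is given by:

$$B = \begin{pmatrix} \left(\sum_{i \ne 0} \bar{b}_i\right)^2 + \sum_{i \ne 0} \sigma_i^2 & -\mathbf{1}^T(\bar{b}\bar{b}^T + \Sigma) \\ -(\bar{b}\bar{b}^T + \Sigma)\mathbf{1} & \bar{b}\bar{b}^T + \Sigma \end{pmatrix}$$

where $\bar{b}$ is the vector of mean loads at the non-generator nodes and $\Sigma = \mathrm{Diag}(\sigma^2)$ is the diagonal covariance matrix of non-generator loads (with diagonal entries $\sigma_i^2$). One could also use a general covariance matrix $\Sigma$ if cross-correlations among the consumers are known (e.g., induced by hidden variables such as the time, season or environmental factors such as temperature).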
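The block structure of $B$ can be assembled directly from the load statistics. The sketch below assumes independent loads (diagonal $\Sigma$ with entries $\sigma_i^2$) and places the generator at index 0; the function name `single_generator_B` and the example values are hypothetical.

```python
import numpy as np

def single_generator_B(mean_load, std_load):
    """Second-moment matrix B = <b b^T>, with node 0 the single generator.

    mean_load, std_load: arrays over the non-generator nodes (zero entries at
    transmission nodes). Loads are assumed independent, so Sigma is diagonal.
    """
    mean_load = np.asarray(mean_load, dtype=float)
    Sigma = np.diag(np.asarray(std_load, dtype=float) ** 2)
    M = np.outer(mean_load, mean_load) + Sigma     # <b b^T> over non-generator nodes
    ones = np.ones(len(mean_load))
    n = len(mean_load) + 1
    B = np.empty((n, n))
    B[0, 0] = ones @ M @ ones                      # (sum of means)^2 + sum of variances
    B[0, 1:] = -(ones @ M)
    B[1:, 0] = -(M @ ones)
    B[1:, 1:] = M
    return B

# Illustrative example: two consumer nodes and one transmission node.
b_bar = np.array([-1.0, -1.0, 0.0])
sigma = np.array([1.0 / 3.0, 1.0 / 3.0, 0.0])
B = single_generator_B(b_bar, sigma)
```

Every row of `B` sums to zero, which reflects the constraint $\sum_i b_i = 0$ holding for each realization of the loads.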

C. Multiple-Generator Formulation

Next, we consider the case that there are two or more

generators within the network. One could consider explicitly

modeling the full matrix B, including both power consump-

tion and generation. However, for controlled power genera-

tors this is not realistic because the response of generators

to meet demand will surely depend on the network itself

(being designed) and moreover is adaptive to fluctuations

in the spatial distribution of demand. To provide a more

realistic model of power generation, we will assume that

power generation is always chosen optimally in response

to demand and network configuration. That is to say, for any given demand $b_c$ the generation $b_g$ is chosen to minimize $b^T K^+(\theta)\, b$ where $b = (b_c, b_g)$, subject to the constraint $\mathbf{1}^T b_g = -\mathbf{1}^T b_c$. Then, averaging the optimized power loss over the distribution of $b_c$ leads to a new convex objective $\hat{L}(\theta)$ that we may use in the convex network optimization problem.

Although this may at first appear to be more complicated,

it turns out there is a simple trick that allows us to transform

it back to the problem we have already considered. Let $G'$

be an augmented representation of the network in which

we include one auxiliary node 0, considered as a virtual

generator, and where we add auxiliary lines connecting this

virtual generator to each of the real generator nodes of G.

Now, we may apply the formulation of Section III-B to this

augmented model, where the virtual generator is treated as

the only generator and the actual generator nodes of G are

now treated simply as transmission nodes. By setting the

conductance of virtual lines to infinity, we can allow current

(power) to flow freely without dissipation from the virtual

generator to the real generators. Thus, solving for power

flows in this augmented network model uses the optimal

flow (minimizing power dissipation) and is equivalent to

optimizing the power generation in the original model. We

omit technical proofs, which essentially involve showing that

the current flow described by Kirchhoff’s laws is efficient.

Finally, rather than actually setting the virtual lines to have

infinite conductance, we can make the cost of conductance on

these lines negligible in comparison to real lines, so that the

virtual lines are assigned very large conductances (relative to

real lines) in the solution of the convex network optimization

problem.
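The virtual-generator construction can be tried on a toy case. The sketch below is an illustration under assumed values: a 5-node path with generators at both ends, a large but finite conductance standing in for the "infinite" virtual lines, and a hypothetical helper `solve_potentials`. For this example the loss-minimizing dispatch can be worked out by hand (generator 1 supplies 2 units, generator 5 supplies 1 unit), and the currents on the virtual lines reproduce it closely.

```python
import numpy as np

def solve_potentials(n, edges, theta, b):
    """Solve (K + 1 1^T) u = b for the zero-mean potentials u."""
    A = np.zeros((n, len(edges)))
    for l, (i, j) in enumerate(edges):
        A[i, l], A[j, l] = 1.0, -1.0
    K = (A * np.asarray(theta)) @ A.T
    return np.linalg.solve(K + np.ones((n, n)), b)

# Assumed example: path 1-2-3-4-5 with unit conductances, generators at nodes
# 1 and 5, consumers at nodes 2 and 3, node 4 a transmission node.
# Node 0 is the auxiliary virtual generator.
real_edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
real_theta = [1.0, 1.0, 1.0, 1.0]
generators = [1, 5]
virtual_theta = 1e4                     # large value standing in for infinity

edges = real_edges + [(0, g) for g in generators]
theta = real_theta + [virtual_theta] * len(generators)

b = np.zeros(6)
b[2], b[3] = -2.0, -1.0                 # demand at the consumer nodes
b[0] = -b[1:].sum()                     # the virtual generator supplies all demand

u = solve_potentials(6, edges, theta, b)
# Implied output of each real generator = current on its virtual line (Ohm's law).
generation = {g: virtual_theta * (u[0] - u[g]) for g in generators}
print(generation)
```

With a finite virtual conductance the match to the hand-computed dispatch is only approximate; the gap shrinks as `virtual_theta` grows, at the price of a worse-conditioned linear system.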

D. Convex Optimization Algorithm

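In this section we briefly describe the method we use to solve the convex network optimization problem. The main technical results one needs are formulas for the gradient vector and Hessian matrix of the expected power loss $L(\theta)$. Generalizing the derivations of [10], one obtains:

$$\nabla L(\theta) = -\tfrac{1}{2}\,\mathrm{diag}\!\left(A^T K^+(\theta)\, B\, K^+(\theta) A\right)$$

$$\nabla^2 L(\theta) = \left(A^T K^+(\theta) A\right) \circ \left(A^T K^+(\theta)\, B\, K^+(\theta) A\right)$$

where $\circ$ denotes the element-wise (Hadamard) product. These expressions correspond to the normalization $L(\theta) = \tfrac{1}{2}\mathrm{Tr}(K^+(\theta)B)$ of Section II-B (with $\mu = 1$); under $L(\theta) = \mathrm{Tr}(K^+(\theta)B)$ as in (4), both right-hand sides are multiplied by 2.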
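The gradient and Hessian expressions can be verified against finite differences. The sketch below is an illustration, not the authors' implementation: it uses the normalization $L(\theta) = \mathrm{Tr}(K^+(\theta)B)$ of (4), for which the gradient is $-\mathrm{diag}(A^T K^+ B K^+ A)$ and the Hessian is $2\,(A^T K^+ A) \circ (A^T K^+ B K^+ A)$, i.e. twice the expressions for the half-trace normalization. The function name `loss_grad_hess` and the triangle example are assumptions.

```python
import numpy as np

def loss_grad_hess(A, theta, B):
    """L = Tr(K^+ B) with K^+ = (A Diag(theta) A^T + 1 1^T)^{-1}, plus gradient and Hessian."""
    n = A.shape[0]
    K_plus = np.linalg.inv((A * theta) @ A.T + np.ones((n, n)))
    P = A.T @ K_plus @ A                  # A^T K^+ A
    Q = A.T @ K_plus @ B @ K_plus @ A     # A^T K^+ B K^+ A
    loss = np.trace(K_plus @ B)
    grad = -np.diag(Q)
    hess = 2.0 * P * Q                    # '*' is the element-wise (Hadamard) product
    return loss, grad, hess

# Illustrative triangle network: edges (0,1), (1,2), (0,2); one source, two sinks.
A = np.array([[1.0, 0.0, 1.0],
              [-1.0, 1.0, 0.0],
              [0.0, -1.0, -1.0]])
theta = np.array([1.0, 2.0, 0.5])
b = np.array([1.0, -0.4, -0.6])
B = np.outer(b, b)

loss, grad, hess = loss_grad_hess(A, theta, B)

# Central finite differences as an independent check of both formulas.
eps = 1e-6
fd_grad = np.zeros(3)
fd_hess = np.zeros((3, 3))
for k in range(3):
    step = np.zeros(3)
    step[k] = eps
    loss_p, grad_p, _ = loss_grad_hess(A, theta + step, B)
    loss_m, grad_m, _ = loss_grad_hess(A, theta - step, B)
    fd_grad[k] = (loss_p - loss_m) / (2 * eps)
    fd_hess[:, k] = (grad_p - grad_m) / (2 * eps)
```

The Hessian is a Hadamard product of two positive semi-definite matrices, so it is itself positive semi-definite, consistent with the convexity of $L(\theta)$.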


Similar to [5], [10], we enforce the non-negativity constraint $\theta \ge 0$ using the log-barrier function [4]:

$$\min_{\theta > 0} \left\{ L(\theta) + \alpha^T\theta - \zeta \sum_{\ell \in G} \log\theta_\ell \right\}$$

The solution of this modified problem will always be strictly positive. One may obtain a close approximation to the optimal solution of the original problem for sufficiently small values of $\zeta$ [5]. Efficient algorithms start by (approximately) solving this problem for a large value of $\zeta$ and then iteratively updating the solution for a decreasing sequence of $\zeta$ values. It is straightforward, using the formulas above, to implement Newton’s method with backtracking line search to minimize this convex objective function [4].

We remark on one technical difficulty we have encountered. Using the formula $L(\theta) = \mathrm{Tr}(K^+(\theta)B)$, it may not always be possible to make $\zeta$ arbitrarily small. This is due to numerical difficulties with computing $L(\theta)$ when the graph $G$ becomes effectively disconnected due to many $\theta$’s going to zero. The matrix $K(\theta) + \mathbf{1}\mathbf{1}^T$ becomes singular in such cases, so that the formula for $L(\theta)$ should really be reformulated with respect to a subgraph of $G$ with non-zero conductances. However, these technical difficulties may be avoided by not letting $\zeta$ become so small that $K(\theta) + \mathbf{1}\mathbf{1}^T$ becomes numerically singular. In future work, it may be desirable to develop a robust way of computing $L(\theta)$ in such cases so that $\zeta$ can be made arbitrarily small.

E. Demonstrations

We now describe our first set of demonstrations, based on four examples that we revisit in later sections. All examples have essentially the same graph topology but different configurations of demand and generation nodes. The graph $G$ comprises a $w \times w$ grid of nodes (with $w = 9$ or $10$) and has lines between nearest and second-nearest neighbors of the grid (resulting in vertical/horizontal edges between nearest neighbors and diagonal edges between second-nearest neighbors). Transmission, consumption and generation nodes are respectively marked as black dots, blue dots and red dots. We set $\alpha = 1$ on horizontal/vertical edges and $\alpha = 2$ on diagonal edges. We have $\bar{b} = -1$ and $\sigma = \tfrac{1}{3}$ at consumer nodes; and $\bar{b} = \sigma = 0$ at transmission nodes. Fig. 1 shows the result of solving the convex network optimization in four examples. In the multiple generator case (seen at lower-right), we do not show the virtual generator or the lines to this generator. We observe that:

• The solution is somewhat sparse in these examples (it does not use all edges of $G$) but is not as sparse as possible (it is not a minimal tree/forest needed to connect consumers to generators).

• It is not necessary that all transmission nodes are involved in the solution, as shown by the example seen at the lower-left of the figure.

• With multiple generators (lower-right), the power transmission network may become disconnected, with each generator serving a particular subset of nearby consumer nodes.

Fig. 1. Illustration of globally optimal network designs in the convex network optimization method. The strength of a line (conductance) is indicated by the darkness of the drawn edge, such that zero-conductance lines are not seen.
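The log-barrier scheme of Section III-D can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the implementation used for Fig. 1: it uses the normalization $L(\theta) = \mathrm{Tr}(K^+(\theta)B)$, a fixed geometric schedule for $\zeta$, and a simple Armijo backtracking rule; the function name `barrier_newton`, the tolerances and the 3-node path example are all hypothetical choices.

```python
import numpy as np

def barrier_newton(A, B, alpha,
                   zetas=(1.0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6),
                   tol=1e-10, max_iter=100):
    """Minimize Tr(K^+ B) + alpha^T theta - zeta * sum(log theta) for decreasing zeta."""
    n, m = A.shape
    ones = np.ones((n, n))

    def objective(theta, zeta):
        K_plus = np.linalg.inv((A * theta) @ A.T + ones)
        return np.trace(K_plus @ B) + alpha @ theta - zeta * np.sum(np.log(theta))

    theta = np.ones(m)                          # strictly positive starting point
    for zeta in zetas:                          # warm start from the previous solution
        for _ in range(max_iter):
            K_plus = np.linalg.inv((A * theta) @ A.T + ones)
            P = A.T @ K_plus @ A
            Q = A.T @ K_plus @ B @ K_plus @ A
            grad = -np.diag(Q) + alpha - zeta / theta
            hess = 2.0 * P * Q + np.diag(zeta / theta ** 2)
            step = -np.linalg.solve(hess, grad)
            decrement = -grad @ step            # squared Newton decrement
            if decrement < tol:
                break
            # Backtracking: keep theta strictly positive and require sufficient decrease.
            t, f0, accepted = 1.0, objective(theta, zeta), False
            while t > 1e-12 and not accepted:
                candidate = theta + t * step
                if (np.all(candidate > 0)
                        and objective(candidate, zeta) <= f0 - 0.25 * t * decrement):
                    theta, accepted = candidate, True
                t *= 0.5
            if not accepted:
                break                           # line search stalled; go to the next zeta
    return theta

# Illustrative example: path 0-1-2, unit current from node 0 to node 2, alpha = (1, 1).
# Here L = 1/theta_1 + 1/theta_2, so the unconstrained optimum is theta = (1, 1).
A = np.array([[1.0, 0.0],
              [-1.0, 1.0],
              [0.0, -1.0]])
b = np.array([1.0, 0.0, -1.0])
B = np.outer(b, b)
alpha = np.array([1.0, 1.0])
theta_opt = barrier_newton(A, B, alpha)
print(theta_opt)
```

The schedule stops at $\zeta = 10^{-6}$ rather than pushing further, in line with the remark above that very small $\zeta$ can make $K(\theta) + \mathbf{1}\mathbf{1}^T$ numerically singular when lines are driven toward zero conductance.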

IV. SELECTING NETWORK STRUCTURE

In this section we present a non-convex generalization of

the approach taken in the preceding section. As we have seen,

the convex optimization method does not always produce

sparse solutions, i.e., it will typically use most of the edges

of the graph G. In practical applications, we expect that such

solutions are undesirable, as we would like to use the sim-

plest network (with as few edges as possible) that is sufficient

to meet power transmission requirements. Towards this end,

we reformulate the network cost part of our optimization

objective so as to favor solutions with fewer edges. However,

this then gives a non-convex optimization problem, which is

generally intractable to solve exactly. Hence, we develop a

heuristic approach to find good solutions of this non-convex

optimization problem. Using the majorization-minimization

algorithm, we are able to approximately solve the non-

convex problem by instead solving a sequence of convex

optimization problems. Moreover, each convex optimization

problem will be of the form solved in Section III, with

the vector α being iteratively modified. Thus, the methods

of Section III provide an optimization engine for the non-

convex optimization method developed in this section.

A. Sparsity-Favoring Network Cost

In practice, we may also require that the network should be sparse. We formulate this by adding a cost on all lines with non-zero conductance so as to encourage solutions with as few lines as possible:

$$\min_{\theta \ge 0} \left\{ L(\theta) + \alpha^T\theta + \beta^T\phi(\theta) \right\}$$

where $\phi(t)$ is the unit-step function, $\phi(0) = 0$ and $\phi(t) = 1$ for all $t > 0$, which is applied element-wise to $\theta$ such that $\beta^T\phi(\theta) = \sum_\ell \beta_\ell \phi(\theta_\ell)$. Note that $\phi(t)$ is non-convex (in fact, it is concave on $t \ge 0$) and discontinuous at $t = 0$. Whereas the linear cost $\alpha^T\theta$ essentially represents the cost of copper needed to build the network, the non-convex