Self-concordant barriers
Here is the general definition of a self-concordant barrier (SCB).[2]: Def.3.1.1
Let C be a convex closed set in Rn with a non-empty interior. Let f be a function from interior(C) to R. Let M>0 be a real parameter. We say that f is a M-self-concordant barrier for C if it satisfies the following:
1. f is a self-concordant function on interior(C).
2. For every point x in interior(C), and any direction h in Rn, let gh be the function f restricted to the direction h, that is: gh(t) = f(x+t*h). Then the one-dimensional function gh should satisfy the following differential inequality:
.
Constructing SCBs
Due to the importance of SCBs in interior-point methods, it is important to know how to construct SCBs for various domains.
In theory, it can be proved that every closed convex domain in Rn has a self-concordant barrier with parameter O(n). But this “universal barrier” is given by some multivariate integrals, and it is too complicated for actual computations. Hence, the main goal is to construct SCBs that are efficiently computable.[4]: Sec.9.2.3.3
SCBs can be constructed from some basic SCBs, that are combined to produce SCBs for more complex domains, using several combination rules.
Substitution rule
Let G be a closed convex domain in Rn, and g an M-SCB for G. Let x = Ay+b be an affine mapping from Rk to Rn with its image intersecting the interior of G. Let H be the inverse image of G under the mapping: H = {y in Rk | Ay+b in G}. Let h be the composite function h(y) := g(Ay+b). Then, h is an M-SCB for H.[2]: Prop.3.1.1
For example, take n=1, G the positive half-line, and
. For any k, let a be a k-element vector and b a scalar. Let H = {y in Rk | aTy+b ≥ 0} = a k-dimensional half-space. By the substitution rule,
is a 1-SCB for H. A more common format is H = {x in Rk | aTx ≤ b}, for which the SCB is
.
The substitution rule can be extended from affine mappings to a certain class of "appropriate" mappings,[2]: Thm.9.1.1 and to quadratic mappings.[2]: Sub.9.3
Cartesian product rule
For all i in 1,...,m, let Gi be a closed convex domains in Rni, and let gi be an Mi-SCB for Gi. Let G be the cartesian product of all Gi. Let g(x1,...,xm) := sumi gi(xi). Then, g is a SCB for G, with parameter sumi Mi.[2]: Prop.3.1.1
For example, take all Gi to be the positive half-line, so that G is the positive orthant
. Let
is an m-SCB for G.
We can now apply the substitution rule. We get that, for the polytope defined by the linear inequalities ajTx ≤ bj for j in 1,...,m, if it satisfies Slater's condition, then
is an m-SCB. The linear functions
can be replaced by quadratic functions.
Intersection rule
Let G1,...,Gm be closed convex domains in Rn. For each i in 1,...,m, let gi be an Mi-SCB for Gi, and ri a real number. Let G be the intersection of all Gi, and suppose its interior is nonempty. Let g := sumi ri*gi. Then, g is a SCB for G, with parameter sumi ri*Mi.[2]: Prop.3.1.1
Therefore, if G is defined by a list of constraints, we can find a SCB for each constraint separately, and then simply sum them to get a SCB for G.
For example, suppose the domain is defined by m linear constraints of the form ajTx ≤ bj, for j in 1,...,m. Then we can use the Intersection rule to construct the m-SCB
(the same one that we previously computed using the Cartesian product rule).
SCBs for epigraphs
The epigraph of a function f(x) is the area above the graph of the function, that is,
. The epigraph of f is a convex set if and only if f is a convex function. The following theorems present some functions f for which the epigraph has an SCB.
Let g(t) be a 3-times continuously differentiable concave function on t>0, such that
is bounded by a constant (denoted 3*b) for all t>0. Let G be the 2-dimensional convex domain:
Then, the function f(x,t) = -ln(f(t)-x) - max[1,b2]*ln(t) is a self-concordant barrier for G, with parameter (1+max[1,b2]).[2]: Prop.9.2.1
Examples:
- Let g(t) = t1/p, for some p≥1, and b=(2p-1)/(3p). Then
has a 2-SCB. Similarly,
has a 2-SCB. Using the Intersection rule, we get that
has a 4-SCB.
- Let g(t)=ln(t) and b=2/3. Then
has a 2-SCB.
We can now construct a SCB for the problem of minimizing the p-norm:
, where vj are constant scalars, uj are constant vectors, and p>0 is a constant. We first convert it into minimization of a linear objective:
, with the constraints:
for all j in [m]. For each constraint, we have a 4-SCB by the affine substitution rule. Using the Intersection rule, we get a (4n)-SCB for the entire feasible domain.
Similarly, let g be a 3-times continuously differentiable convex function on the ray x>0, such that:
for all x>0. Let G be the 2-dimensional convex domain: closure({ (t,x) in R2: x>0, t ≥ g(x) }). Then, the function f(x,t) = -ln(t-f(x)) - max[1,b2]*ln(x) is a self-concordant barrier for G, with parameter (1+max[1,b2]).[2]: Prop.9.2.2
Examples:
- Let g(x) = x−p, for some p>0, and b=(2+p)/3. Then
has a 2-SCB.
- Let g(x)=x ln(x) and b=1/3. Then
has a 2-SCB.
SCBs for cones
- For the second order cone
, the function
is a self-concordant barrier.
- For the cone of positive semidefinite of m*m symmetric matrices, the function
is a self-concordant barrier.
- For the quadratic region defined by
where
where
is a positive semi-definite symmetric matrix, the logarithmic barrier
is self-concordant with 
- For the exponential cone
, the function
is a self-concordant barrier.
- For the power cone
, the function
is a self-concordant barrier.
History
As mentioned in the "Bibliography Comments"[5] of their 1994 book,[6] self-concordant functions were introduced in 1988 by Yurii Nesterov[7][8] and further developed with Arkadi Nemirovski.[9] As explained in[10] their basic observation was that the Newton method is affine invariant, in the sense that if for a function
we have Newton steps
then for a function
where
is a non-degenerate linear transformation, starting from
we have the Newton steps
which can be shown recursively
.
However, the standard analysis of the Newton method supposes that the Hessian of
is Lipschitz continuous, that is
for some constant
. If we suppose that
is 3 times continuously differentiable, then this is equivalent to
for all 
where
. Then the left hand side of the above inequality is invariant under the affine transformation
, however the right hand side is not.
The authors note that the right hand side can be made also invariant if we replace the Euclidean metric by the scalar product defined by the Hessian of
defined as
for
. They then arrive at the definition of a self concordant function as
.
Properties
Convex conjugate
If
is self-concordant, then its convex conjugate
is also self-concordant.[6][11]
Non-singular Hessian
If
is self-concordant and the domain of
contains no straight line (infinite in both directions), then
is non-singular.
Conversely, if for some
in the domain of
and
we have
, then
for all
for which
is in the domain of
and then
is linear and cannot have a maximum so all of
is in the domain of
. We note also that
cannot have a minimum inside its domain.
Applications
Among other things, self-concordant functions are useful in the analysis of Newton's method. Self-concordant barrier functions are used to develop the barrier functions used in interior point methods for convex and nonlinear optimization. The usual analysis of the Newton method would not work for barrier functions as their second derivative cannot be Lipschitz continuous, otherwise they would be bounded on any compact subset of
.
Self-concordant barrier functions
- are a class of functions that can be used as barriers in constrained optimization methods
- can be minimized using the Newton algorithm with provable convergence properties analogous to the usual case (but these results are somewhat more difficult to derive)
- to have both of the above, the usual constant bound on the third derivative of the function (required to get the usual convergence results for the Newton method) is replaced by a bound relative to the Hessian
Minimizing a self-concordant function
A self-concordant function may be minimized with a modified Newton method where we have a bound on the number of steps required for convergence. We suppose here that
is a standard self-concordant function, that is it is self-concordant with parameter
.
We define the Newton decrement
of
at
as the size of the Newton step
in the local norm defined by the Hessian of
at 
![{\displaystyle \lambda _{f}(x)=\langle f''(x)[f''(x)]^{-1}f'(x),[f''(x)]^{-1}f'(x)\rangle ^{1/2}=\langle [f''(x)]^{-1}f'(x),f'(x)\rangle ^{1/2}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/ebdbe02bad547b15d7693147b0318bc981a97e8d)
Then for
in the domain of
, if
then it is possible to prove that the Newton iterate
![{\displaystyle x_{+}=x-[f''(x)]^{-1}f'(x)}](https://wikimedia.org/api/rest_v1/media/math/render/svg/322071437533f8751f1457a642861d1afd834280)
will be also in the domain of
. This is because, based on the self-concordance of
, it is possible to give some finite bounds on the value of
. We further have

Then if we have

then it is also guaranteed that
, so that we can continue to use the Newton method until convergence. Note that for
for some
we have quadratic convergence of
to 0 as
. This then gives quadratic convergence of
to
and of
to
, where
, by the following theorem. If
then


with the following definitions



If we start the Newton method from some
with
then we have to start by using a damped Newton method defined by
![{\displaystyle x_{k+1}=x_{k}-{\frac {1}{1+\lambda _{f}(x_{k})}}[f''(x_{k})]^{-1}f'(x_{k})}](https://wikimedia.org/api/rest_v1/media/math/render/svg/ca8d4a1ca3d9fac68d5fb62ba65cff0b40a11a54)
For this it can be shown that
with
as defined previously. Note that
is an increasing function for
so that
for any
, so the value of
is guaranteed to decrease by a certain amount in each iteration, which also proves that
is in the domain of
.