
Proving properties of matrices over $\mathbb{Z}_2$

Michael Soltys

March 14, 2012

Abstract

We prove assorted properties of matrices over $\mathbb{Z}_2$, and outline the complexity of the concepts required to prove these properties. The goal of this line of research is to establish the proof complexity of matrix algebra. It also offers a different approach to linear algebra: a formal one, consisting of algebraic manipulations according to the axioms of a ring, rather than the traditional semantic approach via linear transformations.

Keywords: Proof complexity, matrix identities, Frege and extended Frege.

1 Introduction

We are interested in the proof complexity of matrix algebra over the field of two elements $\mathrm{GF}(2)$. In particular, we examine the identity $AB = I \rightarrow BA = I$, which has been proposed as a candidate for separating the Frege and extended Frege propositional proof systems. We investigate the properties of matrices that can be proven over the simplest of fields, with the hope that understanding these properties will yield a low-complexity proof of $AB = I \rightarrow BA = I$.

All matrices are considered to be over the field of two elements $\{0,1\}$; in the literature this field is denoted $\mathbb{Z}_2$ or $\mathrm{GF}(2)$. Let $M_{n\times m}(F)$ be the set of $n\times m$ matrices over a field $F$; let $M(n) := M_{n\times n}(\mathbb{Z}_2)$, i.e., $M(n)$ is the set of all square $n\times n$ matrices over $\mathbb{Z}_2$. If the size of a matrix $A$ is not specified, we assume $A \in M(n)$. We use $A_{ij}$ to denote entry $(i,j)$ of the matrix $A$.

We let $A^t$ denote the transpose of matrix $A$, i.e., the matrix whose $(i,j)$ entry is entry $(j,i)$ of $A$. Let $I_n, 0_n$ be the identity matrix and zero matrix, respectively, in $M(n)$.

Given a matrix $A \in M(n)$, we often find it useful to represent $A$ in terms of its principal minor, denoted $M_A$, as follows:
\[
A = \begin{pmatrix} a & R_A \\ S_A & M_A \end{pmatrix},
\]
where $a$ is the top-left entry of $A$, i.e., $a = A_{11}$, and
\[
R_A = \begin{pmatrix} A_{12} & A_{13} & \dots & A_{1n} \end{pmatrix}, \qquad
S_A = \begin{pmatrix} A_{21} & A_{31} & \dots & A_{n1} \end{pmatrix}^t. \tag{1}
\]


The results in this paper can be interpreted in various extensions of the theory LA, for example LAP or ∃LA. The theory LA is capable of rudimentary ring reasoning about matrices; the theory LAP adds matrix powering, $P(A,i) = A^i$; and ∃LA permits induction on $\Sigma_1^B$-formulas, which are formulas with existential quantification over matrices of bounded size. Over $\mathbb{Z}_2$, LA corresponds to $\mathbf{AC}^0(2)$, LAP to $\mathbf{NC}^2$ (in fact, slightly weaker), and ∃LA corresponds to polynomial-time reasoning. See section 13, the appendix, and [SC04] for more details.

Alternatively, we can employ theories corresponding to the complexity class $\oplus\mathbf{L} = \mathbf{AC}^0(\mathrm{det}_2)$, as defined in [CF10], where the theory is called V⊕L. The advantage of these theories over the LA family is that LA is field independent, and it is not clear that LA can put to use the fact that $\mathbb{Z}_2$ is a particularly simple field.

2 Matrix identities

A motivation for this paper is to understand the complexity of the concepts required to prove:
\[
AB = I \rightarrow BA = I, \qquad \text{(‘Inverse Identity’)}
\]
and related matrix identities (see [SC04]), and to understand the proof complexity of combinatorial matrix algebra in general. That is, we would like to know the complexity of the concepts required to reason about basic linear algebra, and about the combinatorial applications of matrix algebra, as presented, for example, in [BR91].

There are two main motivations for this line of research. First, in reverse mathematics we are interested in the weakest logical theories capable of formalizing linear algebra. But mainly, Cook proposed the ‘Inverse Identity’ as a candidate for separating the Frege and extended Frege propositional proof systems; this is one of the principal open questions in theoretical computer science.

Lemma 1 If $AB = I$ and we can show that $A$ has some left-inverse, then $BA = I$; furthermore, this can be shown in LA. In other words,
\[
\mathrm{LA} \vdash (AB = I \land \exists C\,(CA = I)) \rightarrow BA = I.
\]

Proof: $AB = I$ implies $ABA = A$, so $A(BA - I) = 0$; so if $A$ has some left inverse, call it $C$, then $CA(BA - I) = 0$, so $BA = I$.
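
As a quick sanity check (not part of the formal development), here is a small Python sketch of lemma 1's algebra on a random invertible matrix over $\mathbb{Z}_2$; numpy, the helper names, and the adjugate shortcut used to produce $B$ are our own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_invertible(n):
    """Sample 0/1 matrices until one is invertible over Z2 (odd determinant)."""
    while True:
        A = rng.integers(0, 2, size=(n, n))
        if round(np.linalg.det(A)) % 2 == 1:
            return A

n = 6
A = random_invertible(n)
I = np.eye(n, dtype=int)
# Recover an inverse mod 2 from the integer adjugate: for an integer matrix,
# adj(A) = det(A) * inv(A) is integral, and det(A) is odd here.
B = np.round(np.linalg.det(A) * np.linalg.inv(A)).astype(int) % 2

assert np.array_equal((A @ B) % 2, I)            # AB = I over Z2
# Lemma 1's algebra: AB = I gives A(BA + I) = ABA + A = A + A = 0
# (over Z2, BA - I = BA + I) ...
assert not ((A @ ((B @ A + I) % 2)) % 2).any()
# ... and a left inverse C of A (here C = B works) then forces BA = I.
assert np.array_equal((B @ A) % 2, I)
print("lemma 1 checked on a random invertible A over Z2")
```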

3 Powers and products

Over the field $\mathbb{Z}_2$, $c + c = 0$, and hence, for any $A \in M(n)$, we have that $A + A = 0_n$. Lemmas 2 and 3 come from [Cob58]. The lemmas in this section can be shown in LAP augmented by the index function $y = 2^x$, which we define as $\exp(0) = 1$ and $\exp(i+1) = 2 \cdot \exp(i)$. This permits repeated squaring of a matrix, polynomially many times of course.

Lemma 2 $(I+A)^{2^i} = I + A^{2^i}$.

Proof: $(I+A)^2 = (I+A)(I+A) = I + A + A + A^2 = I + A^2$; the general case follows by induction on $i$.

Lemma 3 $(I+A)^{2^i} = I \iff A^{2^i} = 0$.

Proof: $A^{2^i} = 0 \iff I + A^{2^i} = I \iff (I+A)^{2^i} = I$, where the last step follows by lemma 2.

Lemma 4 If $(I+A)^{2^i - 1} = I$ and $AB = I$, then $A^{2^i - 1} = I$.

Proof: Suppose that $(I+A)^{2^i - 1} = I$; multiply both sides by $(I+A)$ to obtain $(I+A)^{2^i} = (I+A)$, and using lemma 2 we have $I + A^{2^i} = I + A$, and so
\[
A^{2^i} = A \;\Rightarrow\; A^{2^i}B = AB \;\Rightarrow\; A^{2^i - 1} = I.
\]

Lemma 5 $AB = BA \iff (I+A)(I+B) = (I+B)(I+A)$.

Lemma 6 If $A, B$ are inverses, i.e., $AB = BA = I$, then $(A+B)^{2^i} = A^{2^i} + B^{2^i}$.

Proof: First note that $(I+A)(I+B) = I + A + B + AB = I + A + B + I = A + B$; similarly $(I+B)(I+A) = I + B + A + BA = B + A$, and in particular $(I+A)(I+B) = (I+B)(I+A)$. Therefore:
\begin{align*}
(A+B)^{2^i} &= (I+A)^{2^i}(I+B)^{2^i} \\
&\overset{(*)}{=} (I+A^{2^i})(I+B^{2^i}) \\
&= I + A^{2^i} + B^{2^i} + A^{2^i}B^{2^i} \\
&= I + A^{2^i} + B^{2^i} + I \\
&= A^{2^i} + B^{2^i},
\end{align*}
where the $(*)$ equality follows from lemma 2.
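
The lemmas of this section are easy to test numerically. The sketch below (our own numpy rendering; `mpow` plays the role of the repeated squaring enabled by exp) checks lemmas 2 and 6 on random matrices, obtaining an inverse pair by raising an invertible $A$ to $|\mathrm{GL}(n,2)| - 1$, which inverts it by Lagrange's theorem:

```python
import numpy as np

rng = np.random.default_rng(1)

def mpow(A, e):
    """A^e over Z2 by repeated squaring."""
    R = np.eye(A.shape[0], dtype=int)
    P = A % 2
    while e:
        if e & 1:
            R = (R @ P) % 2
        P = (P @ P) % 2
        e >>= 1
    return R

def random_invertible(n):
    while True:
        A = rng.integers(0, 2, size=(n, n))
        if round(np.linalg.det(A)) % 2 == 1:
            return A

n, i = 5, 4
I = np.eye(n, dtype=int)

# Lemma 2: (I + A)^(2^i) = I + A^(2^i) for an arbitrary A
A = rng.integers(0, 2, size=(n, n))
assert np.array_equal(mpow((I + A) % 2, 2**i), (I + mpow(A, 2**i)) % 2)

# Lemma 6 needs mutual inverses: B = A^(|GL(n,2)| - 1) inverts an
# invertible A, since the order of A divides |GL(n,2)|.
A = random_invertible(n)
gl_order = 1
for k in range(n):
    gl_order *= 2**n - 2**k
B = mpow(A, gl_order - 1)
assert np.array_equal((A @ B) % 2, I) and np.array_equal((B @ A) % 2, I)
assert np.array_equal(mpow((A + B) % 2, 2**i),
                      (mpow(A, 2**i) + mpow(B, 2**i)) % 2)
print("lemmas 2 and 6 verified")
```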

4 Idempotence, nilpotence and zero-divisors

A matrix $A$ is idempotent if $A^2 = A$; it is nilpotent if $A^i = 0$ for some $i > 0$; and it is a right-zero-divisor (respectively, left-zero-divisor) if there exists a matrix $C \neq 0$ such that $AC = 0$ (respectively, $CA = 0$). If $A$ is a right-zero-divisor then it is also a left-zero-divisor, but the $C$ might differ; for example, if
\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \quad
C = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}, \quad
D = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix},
\]
then $AC = 0$, but $CA \neq 0$, while $DA = 0$.


If $AB = I$ then LA proves that $BA$ is idempotent, and also that neither $A$ can be a left-zero-divisor nor $B$ a right-zero-divisor; for example, if $AB = I$ and $BC = 0$, then $ABC = 0$, so $C = 0$.

By lemma 2, if $A$ is idempotent, so is $(I+A)$.

If $AB = I$, then LAP shows that neither $A$ nor $B$ can be nilpotent: suppose $A^i = 0$; then $A^iB^i = 0$, but $A^iB^i = I$, a contradiction (and symmetrically for $B$).
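
The claims of this section are easy to exercise numerically. In the sketch below (our own; numpy assumed) the $2 \times 3$ pair $A_2, B_2$ is a hypothetical non-square witness, chosen because for square matrices over a field $AB = I$ already forces $BA = I$, making the idempotence of $BA$ trivial:

```python
import numpy as np

A = np.array([[1, 1], [0, 0]])
C = np.array([[1, 0], [1, 0]])
D = np.array([[0, 1], [0, 1]])
Z = np.zeros((2, 2), dtype=int)

# The zero-divisor example from the text: AC = 0 but CA != 0, while DA = 0.
assert np.array_equal((A @ C) % 2, Z)
assert not np.array_equal((C @ A) % 2, Z)
assert np.array_equal((D @ A) % 2, Z)

# If AB = I then BA is idempotent: (BA)^2 = B(AB)A = BA.
A2 = np.array([[1, 0, 0], [0, 1, 0]])      # 2x3
B2 = np.array([[1, 0], [0, 1], [1, 1]])    # 3x2, and A2 @ B2 = I_2 over Z2
assert np.array_equal((A2 @ B2) % 2, np.eye(2, dtype=int))
BA = (B2 @ A2) % 2
assert np.array_equal((BA @ BA) % 2, BA)   # idempotent, though BA != I_3
print("section 4 examples check out")
```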

5 Symmetric matrices

Let $A$ be a symmetric matrix, i.e., $A = A^t$; that is, $A$ equals its transpose. We say that $x, y$ are $A$-orthogonal if $x^tAy = 0$; we sometimes write this as $\langle x, y\rangle_A = x^tAy$.

Lemma 7 If $A$ is a symmetric matrix, then $V = \{x \in \mathbb{Z}_2^n : \langle x, x\rangle_A = 0\}$ is a vector space.

Proof: If $A$ is symmetric then:
\[
\langle x, y\rangle_A = x^tAy = (x^tAy)^t = y^tA^t(x^t)^t = y^tAx = \langle y, x\rangle_A.
\]
Then, if $x, y \in V$:
\begin{align*}
\langle x+y, x+y\rangle_A &= (x+y)^tA(x+y) \\
&= x^tAx + y^tAx + x^tAy + y^tAy \\
&= \langle x, x\rangle_A + \langle y, x\rangle_A + \langle x, y\rangle_A + \langle y, y\rangle_A,
\end{align*}
and $\langle x, x\rangle_A = \langle y, y\rangle_A = 0$ since we assumed that $x, y \in V$, and $\langle y, x\rangle_A + \langle x, y\rangle_A = 0$ as well, since $\langle y, x\rangle_A = \langle x, y\rangle_A$.

Lemma 8 LA proves that if $A$ is symmetric, and $AB = I$, then $BA = I$, and furthermore $B$ is also symmetric.

Proof: If $AB = I$ then $I = I^t = (AB)^t = B^tA^t = B^tA$, as $A$ is symmetric. By lemma 1 we have $BA = I$. On the other hand, from $B^tA = I$ we have $B^tAB = B$ and so $B^t = B$, and hence $B$ is also symmetric.

Let $d(A)$ denote the diagonal of a symmetric matrix $A$, i.e.,
\[
d(A) = [\,A_{11}\ A_{22}\ \dots\ A_{nn}\,].
\]
In an unpublished note, Filmus ([Fil10]) makes an interesting observation, which we present as lemma 9. We give a simplified proof, however, in the style of Gaussian Elimination.

Lemma 9 ∃LA proves that for all symmetric $A$, $\exists v$ such that $Av = d(A)$.


Proof: The proof is by induction on $n$, where $A \in M(n)$. For $n = 1$ let $v = 1$. For $n > 1$ let
\[
A = \begin{pmatrix} a & X \\ X^t & M \end{pmatrix},
\]
and consider the following cases. If $X = 0$ then by the induction hypothesis $\exists v_M$ such that $Mv_M = d(M)$, where $M$ is the principal submatrix of $A$ (and $M$ is also symmetric if $A$ is symmetric); so let $v = [1\ v_M]$, and then $Av = d(A)$. If $X \neq 0$ and $a = 1$ then let
\[
C = \begin{pmatrix} 1 & 0 \\ X^t & I_{n-1} \end{pmatrix},
\]
and observe that
\begin{align*}
A' = CAC^t
&= \begin{pmatrix} 1 & 0 \\ X^t & I_{n-1} \end{pmatrix}
\begin{pmatrix} a & X \\ X^t & M \end{pmatrix}
\begin{pmatrix} 1 & X \\ 0 & I_{n-1} \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ X^t & I_{n-1} \end{pmatrix}
\begin{pmatrix} a & aX + X \\ X^t & X^tX + M \end{pmatrix} \\
&= \begin{pmatrix} a & aX + X \\ aX^t + X^t & aX^tX + X^tX + X^tX + M \end{pmatrix} \\
&= \begin{pmatrix} a & aX + X \\ aX^t + X^t & aX^tX + M \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & X^tX + M \end{pmatrix},
\end{align*}
where we remind the reader that over $\mathbb{Z}_2$, $x + x = 0$, so $X^tX + X^tX = 0$, and since $a = 1$, $aX + X = X + X = 0$. Using the induction hypothesis we know that $\exists w$ such that $A'w = d(A')$, i.e., $CAC^tw = d(CAC^t)$, and since $CC = I_n$ (over $\mathbb{Z}_2$), we have
\[
A(C^tw) = C\,d(CAC^t)
= C \begin{pmatrix} 1 \\ d(X^tX + M) \end{pmatrix}
= C \begin{pmatrix} 1 \\ X^t + d(M) \end{pmatrix}
= \begin{pmatrix} 1 \\ X^t + X^t + d(M) \end{pmatrix}
= \begin{pmatrix} 1 \\ d(M) \end{pmatrix},
\]
where $d(X^tX + M) = X^t + d(M)$ follows (over $\mathbb{Z}_2$) from the fact that the diagonal of $X^tX$ equals $[\,x_1x_1\ \ x_2x_2\ \dots\ x_{n-1}x_{n-1}\,]$, which in turn is just $X$ since $x_ix_i = x_i$. Since $a = 1$, we obtain $A(C^tw) = d(A)$, and so letting $v = C^tw$ we are done showing that in the case $X \neq 0$ and $a = 1$, $\exists v$ such that $Av = d(A)$.

If, on the other hand, $X \neq 0$ and $a = 0$, there are two possibilities. First, $d(M) = 0$, in which case $v = 0$ and $Av = 0 = d(A)$. Otherwise there is some diagonal entry of $M$, say $M_{ii}$, which is not equal to zero; let $P_i$ be the identity matrix with row 1 and row $i$ permuted. Then we can repeat the argument for the second case with $A' = (CP_i)A(CP_i)^t$, since $A' = C(P_iAP_i^t)C^t$, which has the effect of bringing $M_{ii} \neq 0$ (and hence $M_{ii} = 1$) to position $(1,1)$ of $A$.
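
The proof of lemma 9 is constructive and translates directly into a recursive procedure. Here is a sketch of it in Python (our own rendering, with hypothetical helper names; numpy assumed):

```python
import numpy as np

def diag_preimage(A):
    """Given symmetric A over Z2, return v with A v = d(A) (mod 2),
    following the case analysis in the proof of lemma 9."""
    A = np.asarray(A) % 2
    n = A.shape[0]
    if n == 1:
        return np.array([1])                 # base case: A = [a], v = [1]
    a, X, M = A[0, 0], A[0, 1:], A[1:, 1:]
    if not X.any():                          # case X = 0: recurse on the minor
        return np.concatenate(([1], diag_preimage(M)))
    if a == 1:                               # case X != 0, a = 1
        C = np.eye(n, dtype=int)
        C[1:, 0] = X                         # C = [[1, 0], [X^t, I]]
        Ap = (C @ A @ C.T) % 2               # A' = C A C^t = [[1,0],[0, X^tX + M]]
        w = np.concatenate(([1], diag_preimage(Ap[1:, 1:])))
        return (C.T @ w) % 2                 # v = C^t w
    d = np.diag(M)                           # case X != 0, a = 0
    if not d.any():                          # d(M) = 0, so d(A) = 0 and v = 0
        return np.zeros(n, dtype=int)
    i = 1 + int(np.argmax(d))                # a diagonal 1 of M, as an index of A
    P = np.eye(n, dtype=int)
    P[[0, i]] = P[[i, 0]]                    # permutation bringing it to (1,1)
    vp = diag_preimage((P @ A @ P.T) % 2)
    return (P.T @ vp) % 2

# check on random symmetric matrices over Z2
rng = np.random.default_rng(2)
for _ in range(100):
    n = int(rng.integers(1, 9))
    B = rng.integers(0, 2, size=(n, n))
    A = (B + B.T) % 2                        # symmetric with zero diagonal
    A[np.diag_indices(n)] = rng.integers(0, 2, size=n)
    v = diag_preimage(A)
    assert np.array_equal((A @ v) % 2, np.diag(A)), "lemma 9 failed"
print("lemma 9 verified on random symmetric matrices")
```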

Symmetric matrices over finite fields have been considered in [Mac69] where, in section I, the author shows the following interesting result, originally due to A. A. Albert ([Alb38]):

Theorem 1 If $A \in M(n)$ is an invertible symmetric matrix over $\mathrm{GF}(2^m)$ then $A$ can be factored in the form $A = M^tM$ if and only if $d(A) \neq 0$, i.e., some entry on the diagonal of $A$ is not zero.

The proof of theorem 1, as presented in [Mac69], can be formalized in ∃LA.

6 Trace

The proof of lemma 10 can be formalized in LAP with exp (as defined in section 3), while lemma 11 can (obviously) be formalized in LA.

Lemma 10 $\mathrm{tr}(A) = \mathrm{tr}(A^{2^i})$ for all $i$.

Proof: Note that
\[
\begin{pmatrix} a & R_A \\ S_A & M_A \end{pmatrix}^2
= \begin{pmatrix} a^2 + R_AS_A & aR_A + R_AM_A \\ aS_A + M_AS_A & S_AR_A + M_A^2 \end{pmatrix},
\]
so
\begin{align*}
\mathrm{tr}(A^2) &= a^2 + R_AS_A + \mathrm{tr}(S_AR_A + M_A^2) \\
&= a + R_AS_A + \mathrm{tr}(S_AR_A) + \mathrm{tr}(M_A^2) \qquad (\text{using } a^2 = a \text{ over } \mathbb{Z}_2) \\
&= a + R_AS_A + R_AS_A + \mathrm{tr}(M_A^2),
\end{align*}
and again $R_AS_A + R_AS_A = 0$, and $\mathrm{tr}(M_A^2) = \mathrm{tr}(M_A)$ by induction; an induction that can be carried out in LA. On the other hand, the full claim $\mathrm{tr}(A) = \mathrm{tr}(A^{2^i})$ can be carried out in LAP.

Lemma 11 $\mathrm{tr}(A^tA) = \mathrm{tr}(AA^t) = \sum_{i,j} A_{ij}$.

Proof: $\mathrm{tr}(A^tA) = \sum_i (A^tA)_{ii} = \sum_{i,j} A^t_{ij}A_{ji} = \sum_{i,j} A_{ji}A_{ji} = \sum_{i,j} A_{ij}$.

In fact, from the proof of lemma 11 we see that $(A^tA)_{ii}$ is the sum of the elements in column $i$ of $A$ (and, similarly, $(AA^t)_{ii}$ is the sum of the elements in row $i$).
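
Both trace lemmas are easy to confirm numerically; a minimal check, assuming numpy (the helper `mpow` is our own repeated-squaring routine):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 7
A = rng.integers(0, 2, size=(n, n))

def mpow(A, e):
    """A^e over Z2 by repeated squaring."""
    R = np.eye(A.shape[0], dtype=int)
    P = A % 2
    while e:
        if e & 1:
            R = (R @ P) % 2
        P = (P @ P) % 2
        e >>= 1
    return R

# Lemma 10: tr(A) = tr(A^(2^i)) mod 2, for several i
for i in range(6):
    assert np.trace(A) % 2 == np.trace(mpow(A, 2**i)) % 2

# Lemma 11: tr(A^t A) = tr(A A^t) = sum of all entries, mod 2
assert np.trace(A.T @ A) % 2 == np.trace(A @ A.T) % 2 == A.sum() % 2
print("lemmas 10 and 11 verified")
```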


7 Annihilating polynomials

We say that $p(x) \neq 0$ is an annihilating polynomial of $A$ if $p(A) = 0$. Of course, the annihilating polynomial par excellence of any matrix is its characteristic polynomial; this is the famous Cayley-Hamilton theorem, which can be shown in ∃LA (see [SC04]).

Lemma 12 LA proves that $p(A)^2 = p(A^2)$.

Proof: $p(A)^2 = \left(\sum_i a_iA^i\right)^2 = \sum_{i,j} a_ia_jA^{i+j} = \sum_i a_i^2A^{2i}$, where the terms with $i \neq j$ vanish since $a_ia_jA^{i+j} + a_ja_iA^{j+i} = 0$; so $p(A)^2 = \sum_i a_i(A^i)^2 = p(A^2)$.

It follows from lemma 12 that $p(A)^{2^i} = p(A^{2^i})$, and that this can be shown in LAP with exp. The following lemma is an immediate consequence of the previous one.

Lemma 13 If $p(x)$ is an annihilating polynomial for $A$, then $p(x)$ is also an annihilating polynomial for $A^{2^i}$, for all $i$.

Proof: Suppose $p(A) = 0$. Then $0 = p(A)^{2^i} = p(A^{2^i})$.

Lemma 14 If $AB = I$ and $p(x)$ is an annihilating polynomial for $A$, then we can prove, in LAP, that $BA = I$. That is,
\[
\mathrm{LAP} \vdash (AB = I \land p \neq 0 \land p(A) = 0) \rightarrow BA = I.
\]

Proof: Suppose that $p(A) = a_0I + a_1A + \dots + a_kA^k = 0$. As $p \neq 0$, let $a_i$ be the non-zero coefficient of smallest index; so, in effect,
\[
p(A) = a_iA^i + a_{i+1}A^{i+1} + \dots + a_kA^k.
\]
Consider
\begin{align*}
0 = p(A)B^i &= a_iI + a_{i+1}A + \dots + a_kA^{k-i} \\
&= I + A(a_{i+1}I + \dots + a_kA^{k-i-1}) \\
&= I + (a_{i+1}I + \dots + a_kA^{k-i-1})A,
\end{align*}
i.e., $(a_{i+1}I + \dots + a_kA^{k-i-1})$ is the (two-sided) inverse of $A$ (note that $a_i = 1$, since over $\mathbb{Z}_2$ a non-zero coefficient equals 1, and that $0 = I + X$ gives $X = I$). So $AB = I$ implies $ABA = A$, which implies $A(BA - I) = 0$; now using the inverse of $A$ we obtain $BA - I = 0$ and so $BA = I$.
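
Lemma 14's proof is constructive. The sketch below (our own rendering, assuming numpy) takes the characteristic polynomial mod 2 as the annihilating polynomial, which Cayley-Hamilton justifies, and recovers a two-sided inverse of $A$ from it; computing the characteristic polynomial via the floating-point `np.poly` and rounding is a shortcut that is adequate for small matrices:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5

def random_invertible(n):
    while True:
        A = rng.integers(0, 2, size=(n, n))
        if round(np.linalg.det(A)) % 2 == 1:
            return A

A = random_invertible(n)
I = np.eye(n, dtype=int)

# Characteristic polynomial coefficients mod 2; np.poly returns them
# highest-degree-first, so flip so that c[j] is the coefficient of x^j.
c = np.round(np.poly(A).real).astype(int) % 2
c = c[::-1]

def poly_at(coeffs, A, start):
    """Evaluate sum_{j >= start} coeffs[j] * A^(j - start) over Z2."""
    R, Apow = np.zeros_like(A), np.eye(A.shape[0], dtype=int)
    for j in range(start, len(coeffs)):
        R = (R + coeffs[j] * Apow) % 2
        Apow = (Apow @ A) % 2
    return R

assert not poly_at(c, A, 0).any()       # p(A) = 0 (Cayley-Hamilton mod 2)

# Lemma 14: with i the least index where c[i] = 1 (here i = 0, since
# c[0] = det(A) = 1), Q = c[i+1] I + ... + c[k] A^(k-i-1) inverts A.
i = int(np.flatnonzero(c)[0])
Q = poly_at(c, A, i + 1)
assert np.array_equal((Q @ A) % 2, I) and np.array_equal((A @ Q) % 2, I)
print("two-sided inverse recovered from an annihilating polynomial")
```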

8 Pigeonhole principle and counting

Let PHP denote the Pigeonhole principle. In this section we present three proofs of the ‘Inverse Identity’ that can be formalized in LAP augmented with PHP over sets of exponential size. What is interesting about these "counting arguments" is that they dispense with linear algebra in proving the ‘Inverse Identity’; they rely on the finiteness of the underlying field, and on basic ring properties of matrix addition and multiplication. As such, they offer only limited proof-complexity insight into the ‘Inverse Identity’.


Proof I of the ‘Inverse Identity’

This is a simple proof of the ‘Inverse Identity’, which extends easily to any finite field. It uses PHP over sets of size $2^{n^2}$ for matrices in $M(n)$.

Consider the sequence $I, A, A^2, A^3, \dots, A^{2^{n^2}}$. Since $|M(n)| = 2^{n^2}$, it follows by PHP that there exist $0 \leq i < j \leq 2^{n^2}$ such that $A^i = A^j$; but then, using $AB = I$, we have $A^iB^i = A^jB^i \Rightarrow I = A^{j-i}$, where $j - i > 0$, and so $A^{j-i-1}$ is a (left and right) inverse of $A$, and using lemma 1 we are done.
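
Proof I translates into a straightforward search. A demonstration for a small random invertible $A$ (our own sketch, assuming numpy; `mpow` avoids the integer overflow that exact powering would cause):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4

def mpow(A, e):
    """A^e over Z2 by repeated squaring."""
    R = np.eye(A.shape[0], dtype=int)
    P = A % 2
    while e:
        if e & 1:
            R = (R @ P) % 2
        P = (P @ P) % 2
        e >>= 1
    return R

def random_invertible(n):
    while True:
        A = rng.integers(0, 2, size=(n, n))
        if round(np.linalg.det(A)) % 2 == 1:
            return A

A = random_invertible(n)
I = np.eye(n, dtype=int)

# Walk I, A, A^2, ... until a repeat (PHP guarantees one within 2^(n^2)
# steps); a repeat A^i = A^j with i < j yields A^(j-i) = I.
seen, P, k = {}, I.copy(), 0
while P.tobytes() not in seen:
    seen[P.tobytes()] = k
    P = (P @ A) % 2
    k += 1
i, j = seen[P.tobytes()], k
inv = mpow(A, j - i - 1)               # A^(j-i-1) is a two-sided inverse
assert np.array_equal((A @ inv) % 2, I) and np.array_equal((inv @ A) % 2, I)
print(f"A^{j-i} = I, so A^{j-i-1} inverts A")
```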

Proof II of the ‘Inverse Identity’

Let $\Phi : M(n) \longrightarrow M(m)$ be a mapping.

Lemma 15 If $n > m$, then $\exists Y \in M(m)$ such that
\[
|\{X \in M(n) : \Phi(X) = Y\}| \geq 2^{(n-m)^2}.
\]

Proof: Let $S_\Phi(Y) := \{X \in M(n) : \Phi(X) = Y\}$. Since
\[
M(n) = \bigcup_{Y \in M(m)} S_\Phi(Y),
\]
it follows that $|M(n)| \leq |M(m)| \cdot \max\{|S_\Phi(Y)| : Y \in M(m)\}$, and so we have that $2^{(n-m)^2} \leq 2^{n^2 - m^2} \leq \max\{|S_\Phi(Y)| : Y \in M(m)\}$.

Suppose that $AB = I$, and let $\Phi_A : M(n+1) \longrightarrow M(n)$ be the mapping defined as follows:
\[
\Phi_A(C) := c_0A + c_1A^2 + \dots + c_{(n+1)^2-1}A^{(n+1)^2-1}, \tag{2}
\]
where $A$ is assumed to be $n \times n$, and entry $c_j$ is $C_{1+\mathrm{div}(j,n),\,\mathrm{rem}(j,n)}$, i.e., given $q, r$ such that $j = qn + r$ where $0 \leq r < n$, then $c_j$ is entry $C_{q+1,r}$. By lemma 15 there is a $Y \in M(n)$ such that there exist $C \neq C'$ mapping to it, i.e., $\Phi_A(C) = \Phi_A(C')$, and so $\Phi_A(C) + \Phi_A(C') = 0$. Therefore we obtain an annihilating polynomial for $A$ of degree at most $(n+1)^2 - 1$; by lemma 14 we are done.

Proof III of the ‘Inverse Identity’

We define $\Phi_A(C)$ as in (2), and we present a variation of the argument in proof II. Since $|M(n+1)| = 2^{(n+1)^2}$, one of the following two must hold:
\begin{align*}
|\{C \in M(n+1) : (\Phi_A(C))_{11} = 0\}| &\geq 2^{(n+1)^2}/2, \\
|\{C \in M(n+1) : (\Phi_A(C))_{11} = 1\}| &\geq 2^{(n+1)^2}/2.
\end{align*}
We pick the set for which it is true; we now repeat the argument with the $C$'s in that set and $(\Phi_A(C))_{12}$. Repeating this once for each of the $n^2$ entries of $\Phi_A(C)$ leaves at least $2^{(n+1)^2 - n^2} = 2^{2n+1} \geq 2$ matrices, so we end up with $C \neq C'$ such that $\Phi_A(C) = \Phi_A(C')$, and once again obtain an annihilating polynomial for $A$.


9 Gaussian Elimination

We give a proof of the ‘Inverse Identity’ based on the Gaussian elimination algorithm; this proof can be formalized in ∃LA, and hence it is a polynomial-time proof. In fact, in [Sol02, TS05] it has been shown that polysize extended Frege can prove the correctness of the Gaussian elimination procedure (over $\mathbb{Z}_2$, and over bigger fields as well). ∃LA allows induction over formulas asserting the existence of matrices; such formulas can express the existence of the row and column operations necessary to carry out the Gaussian elimination algorithm. Hence ∃LA proves the correctness of Gaussian elimination (every matrix can be put in upper triangular form), and this in turn can be employed to prove the ‘Inverse Identity’.

Suppose that $AB = I$; we show that $A$ has some left-inverse (lemma 1). Recall the definition of $S_A$ in (1). If $S_A$ is zero then we repeat the argument inductively on $M_AM_B = I_{n-1}$.

Otherwise, if $S_A \neq 0$, let $P$ be a permutation matrix defined as follows: if $a = 1$, $P = I_n$; if $a = 0$, then $P$ swaps the first row of $A$ with a row whose first entry is non-zero; $P$ is just the identity matrix with the corresponding rows swapped. Also let
\[
C = \begin{pmatrix} 1 & 0 \\ S_{PA} & I_{n-1} \end{pmatrix},
\]
that is, $C$ is the identity matrix where the first column, except for the top entry, is replaced by the corresponding entries of $PA$. Observe that $CC = PP = I_n$. Then
\begin{align*}
AB = I_n &\Rightarrow (CP)AB(PC) = (CP)I_n(PC) \\
&\Rightarrow (CPA)(BPC) = C(PP)C = CC = I_n,
\end{align*}
and
\[
CPA = \begin{pmatrix} 1 & 0 \\ S_{PA} & I_{n-1} \end{pmatrix}
      \begin{pmatrix} 1 & R_{PA} \\ S_{PA} & M_{PA} \end{pmatrix}
    = \begin{pmatrix} 1 & R_{PA} \\ 0 & S_{PA}R_{PA} + M_{PA} \end{pmatrix}.
\]
We now repeat the argument inductively on $M_{CPA}M_{BPC} = I_{n-1}$. Notice that this proof is intrinsically sequential: at each step we construct a new matrix, and we need the previous steps to do that.
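
As a concrete companion to this argument, here is a standard Gauss-Jordan inversion over $\mathbb{Z}_2$ (our own rendering, assuming numpy; it performs full elimination rather than the paper's exact inductive construction on minors):

```python
import numpy as np

def gf2_inverse(A):
    """Invert A over Z2 by Gauss-Jordan elimination on [A | I].
    Returns the inverse, or None if A is singular over Z2."""
    A = np.asarray(A) % 2
    n = A.shape[0]
    M = np.concatenate([A, np.eye(n, dtype=int)], axis=1)
    for col in range(n):
        # find a pivot row (the role of the permutation P in the text)
        pivots = np.flatnonzero(M[col:, col]) + col
        if len(pivots) == 0:
            return None                     # singular
        p = int(pivots[0])
        M[[col, p]] = M[[p, col]]
        # clear the rest of the column (the role of the matrix C in the text)
        for r in range(n):
            if r != col and M[r, col] == 1:
                M[r] = (M[r] + M[col]) % 2
    return M[:, n:]

rng = np.random.default_rng(6)
A = rng.integers(0, 2, size=(6, 6))
B = gf2_inverse(A)
if B is None:
    print("A is singular over Z2")
else:
    I = np.eye(6, dtype=int)
    assert np.array_equal((A @ B) % 2, I) and np.array_equal((B @ A) % 2, I)
    print("two-sided inverse found by Gaussian elimination")
```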

10 Quasi-triangular matrices

In this section we explore the following question: what is the weakest condition on matrices $A, B$ that allows a proof of the ‘Inverse Identity’ in LA? We already know from lemma 8 that if $A, B$ are symmetric matrices then LA proves the ‘Inverse Identity’ for such matrices. Here, inspired by the Gaussian elimination proof in the previous section, we define the notion of a "quasi-triangular pair" of matrices.


Given $A \in M(n)$, let $M_{A,i}$ be its $i$-th minor. That is, $M_{A,0} = A$, $M_{A,1} = M_A$, and for $i \geq 2$, $M_{A,i} = M_{M_{A,i-1}}$; i.e., $M_{A,i}$ is $A$ with the first $i$ rows and columns removed. Let $a_i, R_{A,i}, S_{A,i}$ be defined as follows: $a_i = (M_{A,i-1})_{11} = A_{ii}$, and $R_{A,i} = R_{M_{A,i-1}}$ and $S_{A,i} = S_{M_{A,i-1}}$. We say that a matrix $A$ is quasi-triangular if for all $i$ we have that $R_{A,i}$ is zero or $S_{A,i}$ is zero; it is clear how this generalizes the notion of a triangular matrix.

We define recursively what it means for a matrix $A$ to be a quasi-transpose of a matrix $B$. If $A, B \in M(1)$, then $A$ is a quasi-transpose of $B$ if $A = B$. For $n > 1$, $A$ is a quasi-transpose of $B$ if $A_{11} = B_{11}$, either ($R_A = R_B$ and $S_A = S_B$) or ($R_A = S_B^t$ and $S_A = R_B^t$), and $M_A$ is a quasi-transpose of $M_B$.

Lemma 16 The matrix $A$ is quasi-triangular if $A$ is the quasi-transpose of a triangular matrix.

Finally, we say that matrices $(A, B)$ are a quasi-triangular pair if for all $i$ at least one of $\{R_{A,i}, S_{A,i}, R_{B,i}, S_{B,i}\}$ is zero. Observe that if $(A, B)$ is a quasi-triangular pair then so is $(M_A, M_B)$; indeed, the point of this definition is to find as weak a condition on $A, B$ as possible ensuring that if $AB = I_n$ then $M_AM_B = I_{n-1}$, which allows induction on LA formulas and consequently an LA proof of the ‘Inverse Identity’.

Lemma 17 LA proves that if $(A, B)$ is a quasi-triangular pair, and $AB = I$, then $BA = I$.

Proof: Suppose that $A, B \in M(n)$, and they are a quasi-triangular pair. We prove that LA $\vdash AB = I \rightarrow BA = I$ by induction on $n$, by cases on the definition of a "quasi-triangular pair."
\[
AB = \begin{pmatrix} a & R_A \\ S_A & M_A \end{pmatrix}
     \begin{pmatrix} b & R_B \\ S_B & M_B \end{pmatrix}
   = \begin{pmatrix} ab + R_AS_B & aR_B + R_AM_B \\ bS_A + M_AS_B & S_AR_B + M_AM_B \end{pmatrix} \tag{3}
\]

Case 1: $S_A = 0$ or $R_B = 0$. Then we can see from equation (3) that
\[
AB = \begin{pmatrix} ab + R_AS_B & aR_B + R_AM_B \\ bS_A + M_AS_B & M_AM_B \end{pmatrix},
\]
since $S_AR_B = 0 \iff S_A = 0 \lor R_B = 0$, and since $AB = I_n$, it follows that $M_AM_B = I_{n-1}$; given that $(A, B)$ is a quasi-triangular pair, so is $(M_A, M_B)$, and hence by induction $M_BM_A = I_{n-1}$.

Suppose now that $S_A = 0$. Then
\[
BA = \begin{pmatrix} ba & bR_A + R_BM_A \\ aS_B & S_BR_A + I_{n-1} \end{pmatrix},
\]
and from (3) we see that $0 = bS_A + M_AS_B = M_AS_B$, and since $M_BM_A = I_{n-1}$ it follows that $S_B = 0$, so $ab = 1$ and so $ba = 1$. Finally,
\[
bR_A + R_BM_A = R_A + R_BM_A = R_A + R_AM_BM_A = R_A + R_A = 0.
\]
Therefore $BA = I_n$. The case where $R_B = 0$ is symmetric.

Case 2: $S_B = 0$ or $R_A = 0$. Then, since $ab + R_AS_B = 1$, it follows that $ab = 1$, and so $a = b = 1$. Therefore
\[
I_n = AB = \begin{pmatrix} 1 & R_B + R_AM_B \\ S_A + M_AS_B & S_AR_B + M_AM_B \end{pmatrix},
\]
thus $M_AS_B = S_A$ and $R_AM_B = R_B$, and hence
\[
M_A\underbrace{S_BR_A}_{=0}M_B + M_AM_B = I_{n-1},
\]
and so $M_AM_B = I_{n-1}$. By induction we conclude that $M_BM_A = I_{n-1}$; checking the blocks of $BA$ as in case 1, we are done.
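
The quasi-triangular-pair condition is mechanical to check. Below is a sketch of a checker, together with a demonstration of lemma 17 on an upper-triangular $A$ (for which every $S_{A,i} = 0$, so any partner $B$ qualifies); the code and helper names are our own, assuming numpy:

```python
import numpy as np

def blocks(A, i):
    """R and S of the i-th minor of A (zero-based): first row and first
    column of A[i:, i:] with the corner entry removed."""
    Mi = A[i:, i:]
    return Mi[0, 1:], Mi[1:, 0]

def is_quasi_triangular_pair(A, B):
    n = A.shape[0]
    for i in range(n - 1):
        RA, SA = blocks(A, i)
        RB, SB = blocks(B, i)
        if all(v.any() for v in (RA, SA, RB, SB)):
            return False        # all four blocks non-zero at this level
    return True

rng = np.random.default_rng(7)
n = 6
A = np.triu(rng.integers(0, 2, size=(n, n)))
np.fill_diagonal(A, 1)          # unit diagonal, so A is invertible over Z2
# invert by back-substitution on [A | I]
M = np.concatenate([A, np.eye(n, dtype=int)], axis=1)
for col in range(n - 1, -1, -1):
    for r in range(col):
        if M[r, col] == 1:
            M[r] = (M[r] + M[col]) % 2
B = M[:, n:]
I = np.eye(n, dtype=int)

assert np.array_equal((A @ B) % 2, I)
assert is_quasi_triangular_pair(A, B)
assert np.array_equal((B @ A) % 2, I)   # lemma 17's conclusion
print("quasi-triangular pair with AB = I satisfies BA = I")
```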

11 Lanczos algorithm

The “Block Lanczos Algorithm for Finding Dependencies over a Finite Field” was invented by Peter L. Montgomery ([Mon95]). Recall that if $AB = I$, then from basic algebraic manipulations we obtain that $A(BA - I) = 0$; thus, if we could show that $A$ has any left-inverse, we would be able to show the ‘Inverse Identity’. As symmetric matrices have a lot of nice properties, perhaps one could work with the symmetric matrix $\hat{A} = A^tA$ instead of working with $A$. Note that showing that $\hat{A}$ has a left-inverse would still prove, from $A(BA - I) = 0$, that $BA = I$: this is because $A(BA - I) = 0$ implies $A^tA(BA - I) = 0$, and so $\hat{A}(BA - I) = 0$. This section suggests a connection between the “Block Lanczos Algorithm” and our ‘Inverse Identity’.

Let $A$ be a symmetric matrix and suppose that we have a set of $m$ vectors $\{w_1, w_2, \dots, w_m\}$ satisfying the following three conditions:
\[
\begin{aligned}
& w_i^tAw_i \neq 0, \quad 1 \leq i \leq m, \\
& w_i^tAw_j = 0, \quad i \neq j, \\
& \exists X\,(AW = WX),
\end{aligned} \tag{4}
\]
where the second condition says that the $w_i$'s are $A$-orthogonal, as defined in section 5, and where in the last condition $W = \begin{pmatrix} w_1 & w_2 & \dots & w_m \end{pmatrix}$, i.e., $W$ is the matrix whose columns are the $w_i$'s. Note that saying $\exists X\,(AW = WX)$ is equivalent to stating that for all $w_i$, $Aw_i \in \mathrm{span}\{w_1, w_2, \dots, w_m\}$. Suppose further that $\exists y\,(b = Wy)$, and define the vector $v$ as follows:
\[
v := \sum_{i=1}^m \frac{w_i^tb}{w_i^tAw_i}\,w_i. \tag{5}
\]

Theorem 2 LA $\vdash [(4) \land \exists y\,(b = Wy) \land (5)] \rightarrow Av = b$.


Proof: We prove the theorem in the case $F = \mathbb{Z}_2$. Over the field $\mathbb{Z}_2$ we have the implication $w_i^tAw_i \neq 0 \Rightarrow w_i^tAw_i = 1$, and hence definition (5) can be restated as $v := \sum_{i=1}^m (w_i^tb)w_i$. Then:
\begin{align*}
W^t(Av - b) &= W^t\Big(A\sum_{i=1}^m (w_i^tb)w_i - b\Big) \\
&\overset{(*)}{=} \sum_{i=1}^m (w_i^tb)\,W^tAw_i - W^tb \\
&\overset{(**)}{=} \sum_{i=1}^m (w_i^tb)(w_i^tAw_i)e_i - W^tb \\
&= \sum_{i=1}^m (w_i^tb)e_i - W^tb = W^tb - W^tb = 0,
\end{align*}
where $(*)$ and $(**)$ follow from (4), since $w_j^tAw_i = 0$ for $j \neq i$ and $w_i^tAw_i = 1$, respectively.

Thus we obtain $W^t(Av - b) = 0$, and hence, for any $n \times m$ matrix $X$, $(WX)^t(Av - b) = 0$. By (4), $\exists X\,(AW = WX)$, and so $(AW)^t(Av - b) = 0$, and since $A$ is symmetric it follows that
\[
W^tA(Av - b) = 0. \tag{6}
\]
By the assumptions, $Av - b$ lies in the column span of $W$ (both $Av$ and $b$ do, by (4) and $b = Wy$), so $\exists y\,(Av - b = Wy)$, i.e., $Av - b = \sum_{i=1}^m y_iw_i$. Multiplying on the left by $w_i^tA$ and using (6), we conclude, for every $i$, that $y_i = 0$. Hence $Av - b = 0$ and so $Av = b$.

Note that given (4) and (5) and $\exists y\,(b = Wy)$, we have $Av = b$ over any field $F$, finite or infinite.

Suppose that $AB = I$, and consider the matrix $\hat{A} = A^tA$. Represent $B$ as $\begin{pmatrix} b_1 & b_2 & \dots & b_n \end{pmatrix}$, where $b_i$ is the $i$-th column of $B$. Then observe:
\[
b_i^t\hat{A}b_j = b_i^tA^tAb_j = (Ab_i)^t(Ab_j) = e_i^te_j =
\begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j, \end{cases}
\]
and thus $B$ satisfies the first two Lanczos conditions for $\hat{A} = A^tA$ given in (4). The third Lanczos condition is equivalent to $BA = I$.
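
These observations can be tested numerically. Taking $W = B$ and $\hat{A} = A^tA$ for an invertible $A$, definition (5) collapses over $\mathbb{Z}_2$ to $v = WW^tb$, and theorem 2 predicts $\hat{A}v = b$ for every $b$ (every $b$ equals $Wy$ for some $y$, since $W$ is invertible). A sketch, with our own helper names and numpy assumed:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 6

def random_invertible(n):
    while True:
        A = rng.integers(0, 2, size=(n, n))
        if round(np.linalg.det(A)) % 2 == 1:
            return A

def gf2_inverse(A):
    """Gauss-Jordan inversion over Z2 (same routine as in section 9);
    assumes A is invertible."""
    n = A.shape[0]
    M = np.concatenate([A % 2, np.eye(n, dtype=int)], axis=1)
    for col in range(n):
        p = int(np.flatnonzero(M[col:, col])[0]) + col
        M[[col, p]] = M[[p, col]]
        for r in range(n):
            if r != col and M[r, col] == 1:
                M[r] = (M[r] + M[col]) % 2
    return M[:, n:]

A = random_invertible(n)
B = gf2_inverse(A)
Ahat = (A.T @ A) % 2               # the symmetric matrix \hat{A} = A^t A
W = B                              # its columns b_i satisfy conditions 1 and 2

# conditions 1 and 2 of (4) amount to W^t Ahat W = I over Z2
assert np.array_equal((W.T @ Ahat @ W) % 2, np.eye(n, dtype=int))

b = rng.integers(0, 2, size=n)
v = (W @ ((W.T @ b) % 2)) % 2      # definition (5) over Z2: v = W W^t b
assert np.array_equal((Ahat @ v) % 2, b % 2)   # theorem 2: Ahat v = b
print("theorem 2 verified for Ahat = A^t A with W = B")
```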

12 Open questions

If $AB = I$, can we show that $A^tA$ (which is symmetric) is also invertible? Since LA $\vdash AB = I \rightarrow A(BA - I) = 0$, it follows that $(A^tA)(BA - I) = 0$ is provable from $AB = I$ in LA. Once we have a left-inverse for $A^tA$ we would have $BA = I$. It would be interesting to see if Lanczos' algorithm can be of help here.

Matrices over $\mathbb{Z}_2$ are often employed as adjacency matrices; that is, given a graph $G = ([n], E)$, where $E \subseteq [n] \times [n]$, it can be represented by $A_G \in M(n)$ as follows: $(i,j) \in E \iff A_{ij} = 1$. But matrices over $\mathbb{Z}_2$ can also be seen as incidence matrices: given a collection of $n$ elements and a collection $X_1, X_2, \dots, X_m$ of subsets of $[n]$, set $A_{ik} = 1 \iff i \in X_k$, where $A \in M_{n\times m}(\mathbb{Z}_2)$. An interesting paper examining incidence matrices is [Rys60].


13 Appendix

The logical theory LA is strong enough to prove the ring properties of matrices, such as $A(BC) = (AB)C$ and $A + B = B + A$, but weak enough so that the theorems of LA translate into propositional tautologies with short Frege proofs. LA has three sorts of object: indices (i.e., natural numbers), ring elements, and matrices, where the corresponding variables are denoted $i, j, k, \dots$; $a, b, c, \dots$; and $A, B, C, \dots$, respectively. The semantics assumes that objects of type ring come from a fixed but arbitrary ring (for the purpose of this paper we are only interested in $\mathbb{Z}_2$, which is a field), and objects of type matrix have entries from that ring.

Terms and formulas are built from the following function and predicate symbols, which together comprise the language $\mathcal{L}_{LA}$:
\[
\begin{aligned}
& 0_{index},\ 1_{index},\ +_{index},\ *_{index},\ -_{index},\ \mathrm{div},\ \mathrm{rem}, \\
& 0_{ring},\ 1_{ring},\ +_{ring},\ *_{ring},\ -_{ring},\ {}^{-1},\ \mathrm{r},\ \mathrm{c},\ \mathrm{e},\ \Sigma, \\
& \leq_{index},\ =_{index},\ =_{ring},\ =_{matrix},\ \mathrm{cond}_{index},\ \mathrm{cond}_{ring}
\end{aligned} \tag{7}
\]

The intended meaning should be clear, except in the case of $-_{index}$, cut-off subtraction, defined as $i - j = 0$ if $i < j$. For a matrix $A$: $\mathrm{r}(A), \mathrm{c}(A)$ are the numbers of rows and columns in $A$; $\mathrm{e}(A,i,j)$ is the ring element $A_{ij}$ (where $A_{ij} = 0$ if $i = 0$ or $j = 0$ or $i > \mathrm{r}(A)$ or $j > \mathrm{c}(A)$); and $\Sigma(A)$ is the sum of the elements in $A$. Also, $\mathrm{cond}(\alpha, t_1, t_2)$ is interpreted as "if $\alpha$ then $t_1$ else $t_2$," where $\alpha$ is a formula all of whose atomic sub-formulas have the form $m \leq n$ or $m = n$, where $m, n$ are terms of type index, and $t_1, t_2$ are terms either both of type index or both of type ring. The subscripts $index$, $ring$, and $matrix$ are usually omitted, since they ought to be clear from the context.

We use $n, m$ for terms of type index, $t, u$ for terms of type ring, and $T, U$ for terms of type matrix. Terms of all three types are constructed from variables and the symbols above in the usual way, except that terms of type matrix are either variables $A, B, C, \dots$ or $\lambda$-terms $\lambda ij\langle m, n, t\rangle$. Here $i$ and $j$ are variables of type index bound by the $\lambda$ operator, intended to range over the rows and columns of the matrix. Also, $m, n$ are terms of type index not containing $i, j$ (representing the numbers of rows and columns of the matrix), and $t$ is a term of type ring (representing the matrix element in position $(i,j)$).
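
To make the constructed-matrix notation concrete, here is a toy (entirely unofficial) Python reading of $\lambda ij\langle m, n, t\rangle$, with 1-based indices as in LA; the function names are our own:

```python
import numpy as np

def lam(m, n, t):
    """Toy semantics for the matrix term lambda ij<m, n, t>: an m x n
    matrix over Z2 whose (i, j) entry is t(i, j), with 1-based i, j."""
    return np.array([[t(i, j) % 2 for j in range(1, n + 1)]
                     for i in range(1, m + 1)])

# examples: the identity I_4, and the transpose term used below in the
# appendix, A^t = lambda ij<c(A), r(A), A_ji>
I4 = lam(4, 4, lambda i, j: 1 if i == j else 0)
A = lam(3, 4, lambda i, j: (i + j) % 2)
At = lam(4, 3, lambda i, j: A[j - 1, i - 1])   # shift to 0-based storage
assert np.array_equal(At, A.T)
```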

Atomic formulas have the forms $m \leq n$, $m = n$, $t = u$, $T = U$, where the three occurrences of $=$ formally have subscripts $index$, $ring$, $matrix$, respectively. General formulas are built from atomic formulas using the propositional connectives $\neg, \lor, \land$ and the quantifiers $\forall, \exists$.

13.1 Axioms and rules of LA

For each axiom listed below, every legal substitution of terms for free variables is an axiom of LA. Note that in a $\lambda$-term $\lambda ij\langle m, n, t\rangle$ the variables $i, j$ are bound. Substitution instances must respect the usual rules which prevent free variables from being caught by the binding operator $\lambda ij$. The bound variables $i, j$ may be renamed to any new distinct pair of variables.


13.1.1 Equality Axioms

These are the usual equality axioms, generalized to apply to the three-sorted theory LA. Here $=$ can be any of the three equality symbols, and $x, y, z$ are variables of any of the three sorts (as long as the formulas are syntactically correct). In A4, the symbol $f$ can be any of the non-constant function symbols of LA. However, A5 applies only to $\leq$, since this is the only predicate symbol of LA other than $=$.

A1 $x = x$
A2 $x = y \rightarrow y = x$
A3 $(x = y \land y = z) \rightarrow x = z$
A4 $x_1 = y_1, \dots, x_n = y_n \rightarrow fx_1 \dots x_n = fy_1 \dots y_n$
A5 $i_1 = j_1, i_2 = j_2, i_1 \leq i_2 \rightarrow j_1 \leq j_2$

13.1.2 Axioms for indices

These are the axioms that govern the behavior of index elements. The index elements are used to access the entries of matrices, and so we need to define some basic number-theoretic operations.

A6 $i + 1 \neq 0$
A7 $i * (j + 1) = (i * j) + i$
A8 $i + 1 = j + 1 \rightarrow i = j$
A9 $i \leq i + j$
A10 $i + 0 = i$
A11 $i \leq j \lor j \leq i$
A12 $i + (j + 1) = (i + j) + 1$
A13 $[i \leq j \land j \leq i] \rightarrow i = j$
A14 $i * 0 = 0$
A15 $[i \leq j \land i + k = j] \rightarrow j - i = k$
A16 $\neg(i \leq j) \rightarrow j - i = 0$
A17 $[\alpha \rightarrow \mathrm{cond}(\alpha, i, j) = i] \land [\neg\alpha \rightarrow \mathrm{cond}(\alpha, i, j) = j]$

13.1.3 Axioms for a ring

These are the axioms that govern the behavior of ring elements: addition and multiplication, as well as additive inverses. We do not need multiplicative inverses.

A18 $0 \neq 1 \land a + 0 = a$
A19 $a + (-a) = 0$
A20 $1 * a = a$
A21 $a + b = b + a$
A22 $a * b = b * a$
A23 $a + (b + c) = (a + b) + c$
A24 $a * (b * c) = (a * b) * c$
A25 $a * (b + c) = a * b + a * c$
A26 $[\alpha \rightarrow \mathrm{cond}(\alpha, a, b) = a] \land [\neg\alpha \rightarrow \mathrm{cond}(\alpha, a, b) = b]$

13.1.4 Axioms for matrices

Axiom A27 states that $\mathrm{e}(A,i,j)$ is zero when $i, j$ are outside the size of $A$. Axiom A28 defines the behavior of constructed matrices. Axioms A29-A32 define the function $\Sigma$ recursively, by first defining it for row vectors, then column vectors (using $A^t := \lambda ij\langle \mathrm{c}(A), \mathrm{r}(A), A_{ji}\rangle$), and then in general using the decomposition (8). Finally, axiom A33 takes care of empty matrices.

A27 $(i = 0 \lor \mathrm{r}(A) < i \lor j = 0 \lor \mathrm{c}(A) < j) \rightarrow \mathrm{e}(A,i,j) = 0$
A28 $\mathrm{r}(\lambda ij\langle m,n,t\rangle) = m \land \mathrm{c}(\lambda ij\langle m,n,t\rangle) = n \land [1 \leq i \land i \leq m \land 1 \leq j \land j \leq n] \rightarrow \mathrm{e}(\lambda ij\langle m,n,t\rangle, i, j) = t$
A29 $\mathrm{r}(A) = 1 \land \mathrm{c}(A) = 1 \rightarrow \Sigma(A) = \mathrm{e}(A,1,1)$
A30 $\mathrm{r}(A) = 1 \land 1 < \mathrm{c}(A) \rightarrow \Sigma(A) = \Sigma(\lambda ij\langle 1, \mathrm{c}(A) - 1, A_{ij}\rangle) + A_{1\,\mathrm{c}(A)}$
A31 $\mathrm{c}(A) = 1 \rightarrow \Sigma(A) = \Sigma(A^t)$
A32 $1 < \mathrm{r}(A) \land 1 < \mathrm{c}(A) \rightarrow \Sigma(A) = \mathrm{e}(A,1,1) + \Sigma(R(A)) + \Sigma(S(A)) + \Sigma(M(A))$
A33 $\mathrm{r}(A) = 0 \lor \mathrm{c}(A) = 0 \rightarrow \Sigma(A) = 0$

where
\[
\begin{aligned}
R(A) &:= \lambda ij\langle 1, \mathrm{c}(A) - 1, \mathrm{e}(A, 1, j + 1)\rangle, \\
S(A) &:= \lambda ij\langle \mathrm{r}(A) - 1, 1, \mathrm{e}(A, i + 1, 1)\rangle, \\
M(A) &:= \lambda ij\langle \mathrm{r}(A) - 1, \mathrm{c}(A) - 1, \mathrm{e}(A, i + 1, j + 1)\rangle.
\end{aligned} \tag{8}
\]

13.1.5 Rules for LA

In addition to all the axioms just presented, LA has two rules: matrix equality and induction.

Matrix equality rule

From the three premises:

1. $\mathrm{e}(T,i,j) = \mathrm{e}(U,i,j)$
2. $\mathrm{r}(T) = \mathrm{r}(U)$
3. $\mathrm{c}(T) = \mathrm{c}(U)$

we conclude $T = U$.

The only restriction is that the variables $i, j$ may not occur free in $T = U$; other than that, $T$ and $U$ can be arbitrary matrix terms. Our semantics implies that $i$ and $j$ are implicitly universally quantified in the first premise. The rule allows us to conclude $T = U$, provided that $T$ and $U$ have the same numbers of rows and columns, and corresponding entries are equal.


Induction rule
\[
\frac{\alpha(i) \rightarrow \alpha(i+1)}{\alpha(0) \rightarrow \alpha(n)}
\]
Here $\alpha(i)$ is any formula, $n$ is any term of type index, and $\alpha(n)$ indicates that $n$ is substituted for free occurrences of $i$ in $\alpha(i)$ (similarly for $\alpha(0)$).

This completes the description of LA. We finish this section by observing the substitution property in the lemma below. We say that a formula $S'$ of LA is a substitution instance of a formula $S$ of LA provided that $S'$ results by substituting terms for free variables of $S$. Of course, each term must have the same sort as the variable it replaces, and bound variables must be renamed as appropriate.

Lemma 18 Every substitution instance of a theorem of LA is a theorem of LA.

This follows by straightforward induction on LA proofs. The base case follows from the fact that every substitution instance of an LA axiom is an LA axiom.

References

[Alb38] A. Adrian Albert. Symmetric and alternating matrices in an arbitrary field, I. Transactions of the American Mathematical Society, 43(3):386–436, May 1938.

[BR91] Richard A. Brualdi and Herbert J. Ryser. Combinatorial Matrix Theory. Cambridge University Press, 1991.

[CF10] Stephen Cook and Lila Fontes. Formal theories for linear algebra. Presented at the Federated Logic Conference, 2010.

[Cob58] S. M. Cobb. On powers of matrices with elements in the field of integers modulo 2. The Mathematical Gazette, 42(342):267–271, December 1958.

[Fil10] Yuval Filmus. Range of symmetric matrices over GF(2). Unpublished note, University of Toronto, January 2010.

[Mac69] Jessie MacWilliams. Orthogonal matrices over finite fields. The American Mathematical Monthly, 76(2):152–164, February 1969.

[Mon95] Peter L. Montgomery. A block Lanczos algorithm for finding dependencies over GF(2). In EUROCRYPT, pages 106–120, 1995.

[Rys60] H. J. Ryser. Matrices of zeros and ones. Bulletin of the American Mathematical Society, 66(6):442–464, February 1960.

[SC04] Michael Soltys and Stephen A. Cook. The complexity of derivations of matrix identities. Annals of Pure and Applied Logic, 130(1–3):277–323, December 2004.

[Sol02] Michael Soltys. Extended Frege and Gaussian elimination. Bulletin of the Section of Logic, 31(4):1–17, 2002.

[TS05] Neil Thapen and Michael Soltys. Weak theories of linear algebra. Archive for Mathematical Logic, 44(2):195–208, 2005.