SOME THOUGHTS ON ECONOMY WITHIN LINGUISTICS *
(Algumas Observações sobre Economia dentro da Lingüística)
Juan URIAGEREKA
(University of Maryland at College Park)
ABSTRACT: One of the cornerstones of Chomsky’s Minimalist Program is the
role played by economy. This paper discusses different ways in which
Chomsky’s notion of economy in linguistics can be understood, given current
views on dynamic systems and, in particular, on evolution in biological
systems.
KEY WORDS: Economy in Linguistics, Minimalism, Exaptation, Dynamic
Systems
RESUMO: Um dos pontos principais do Programa Minimalista de Chomsky é
o papel desempenhado pela noção de economia. Este trabalho discute
várias maneiras como essa noção de economia em lingüística pode ser en-
tendida em face de recentes concepções sobre sistemas dinâmicos e, em
particular, sobre evolução nos sistemas biólogicos.
PALAVRAS-CHAVE: Economia em Lingüística, Minimalismo, Exaptação, Siste-
mas Dinâmicos
1. Three (more or less) reasonable takes on Minimalism
I’ll start with a quote by Stephen Jay Gould, who I take to have
understood the significance of generative grammar when accounting
for faculty psychology within the confines of evolution. “The traits,” he
writes in (1991: 59), “that Chomsky (1986) attributes to language –
universality of the generative grammar, lack of ontogeny, . . . highly
peculiar and decidedly non-optimal structure, formal analogy to other
attributes, including our unique numerical faculty with its concept of
discrete infinity – fit far more easily with an exaptive, rather than an
adaptive, explanation.” By an exaptation Gould means an individual
feature that did not emerge adaptively for its current purpose, but was
co-opted by the individual; for example, for Gould brain size is not the
consequence of “intelligence” related to language; rather, the brain got
big for whatever reason (e.g. circulatory benefits), which somehow
caused linguistic competence.
* The research behind this note was partly funded by NSF grant SBR9601559. I am indebted
to Elena Herburger, Jairo Nunes, Carlos Otero, and Phil Resnik for specific comments on an
earlier draft.
Another thinker who eloquently presents a view like Gould’s is
David Berlinski, who writes in (1986: 130): “mathematicians thought
they might explain the constraints of grammar on such grounds as
effectiveness or economy. . . In fact, the rules of English grammar appear
to owe nothing to any principles of economy in design.” This line is
well-known from the work of, in particular, Jerry Fodor, who has
systematically built arguments against an unstructured mind,
constructivism, connectionism, gradualism, or behaviorism, on the basis
of clever, yet sub-optimal quirks of language that linguists have found.
Such comments emphasizing the “decidedly non-optimal structure”
of language or “grammar ... ow[ing] nothing to ... economy in design”
seem at odds with Chomsky’s Minimalist program. I can only see four
possible routes one can take in light of that. The most reasonable one is
that the Minimalist Program is just too good to be true; maybe we, linguists,
have planted the elegance that we now harvest, unaware of our acts. Even
Chomsky constantly admits that the program is partly a bold speculation,
rather than a specific hypothesis. So it could just be all wrong.
A second possibility is that, right though the program may be, it is
not meant to square its basic tenets with the Gouldian rhetoric behind
them. Suppose that the computational system of human language is in
some interesting sense optimal; why this should be is the obvious
question. Schoemaker (1991) analyzes optimality as a possible
organizing principle of nature, instantiated in terms of least action in
physics, entropy in chemistry, survival of the fittest in biology, and utility
maximization in economics, among the more or less “natural” sciences.
If we play in this key, one could try to argue that economy in linguistics
should reduce to survival of the fittest.
Such a view is reasonably defended by important scientists, perhaps
most popularly by Pinker in his 1994 best-seller:
Selection could have ratcheted up language abilities by favoring the speakers
in each generation that the hearers could best decode, and the hearers who
could best decode the speakers. . . Grammars of intermediate complexity. . .
could have symbols with a narrower range, rules that are less reliably applied,
modules with fewer rules. . . I suspect that evolving humans lived in a world in
which language was woven into the intrigues of politics, economics, technology,
family, sex, and friendship that played key roles in individual reproductive
success. (p. 365-9)
Of course, this adaptation story still raises non-trivial questions about
what is selective about, say, having thousands of languages, or a parser
which doesn’t decipher some simple grammatical sentences like the
mouse the cat the dog bit chased left, or why parsable and sound sentences
– like *who do you think that left – are ungrammatical to begin with.
Nonetheless, the view is very reasonable, and would take (generic)
Minimalism as significant evidence: if you claim language has evolved
adaptively, you expect it to show its success right up its sleeve.
A third possibility (for why Minimalism) could be that all this
linguistic elegance is nothing but mathematical clout. In one sense, this
is trivially acceptable, if interpreted in purely methodological terms.
One can be interested in language for its value as a tool to better a logical
system, an algebraic apparatus, a computer program, or some such thing.
There is meaning to the notion of (often various ways of) optimizing a
function, and it could just be that linguistic functions (whatever those
turn out to be) happen to be an interesting sub-case of that. Notice, this
isn’t saying much about language as a natural object, but one doesn’t
have to. However, many linguists and philosophers do not take any of
this as just methodological (cf. Richard Montague’s famous (1974: 188)
rejection of “the contention that an important theoretical difference exists
between formal and natural languages”). That poses very different questions.
The most immediate issue is what underlies formal and natural
languages, which from the evolutionary perspective we are now
considering should be what has evolved, and furthermore optimally, by
hypothesis. As Partee (1996: 26) notes, “[o]ne can easily understand
Chomsky’s negativity towards Montague’s remark that he failed to see
any interest in syntax other than as preliminary to semantics.” But
Montague’s perspective is reasonable, particularly if, as he thought, syntax
should be homomorphic with semantics; what would be the naturalistic
(evolutionary) point in having both syntax and semantics evolve as
separate systems that (presumably, then) get connected? It is more sound
to expect one of these systems to piggy-back on the other, as the
aerodynamical structure of a wing allegedly piggy-backs on its flying
function. To the extent that there is a function to language, it surely
must have to do with such things as communicating truths or denoting
referents – semantic stuff. Clearly, from this perspective, the third view
reduces to the second: syntax is just the epiphenomenological form that
semantic function has met in the course of evolution.
Less reasonably, I suppose, view number three stays distinct if one
goes metaphysical, and claims that English as a formal language actually
exists out there in some primitive sense. I don’t understand anything
about metaphysics, so I won’t venture much beyond this point; but I
guess this view is also entertained, and I imagine once one is in that
world of logic, mathematics, or ideas, to find that they should be elegant
a priori – Platonic, after all – is perhaps not that amazing.
Judging from his writings, none of these alternatives is what
Chomsky is entertaining, which moves me to the fourth possibility I
said I see – the most unreasonable of all.
2. A different take
Chomsky (1995) declares it “of considerable importance that we
can at least formulate [minimalist] questions today, and even approach
them in some areas with a degree of success” (p. 9). He furthermore
takes the matter to be far reaching, daring to claim that if on track “a
rich and exciting future lies ahead for the study of language and related
disciplines.” This isn’t just rhetoric; later on (p. 169) Chomsky admits
that “[s]ome basic properties of language are unusual among biological
systems, notably the property of discrete infinity, . . . that the language
faculty is nonredundant, in that particular phenomena are not
‘overdetermined’ by principles of language, . . . [and] the role of
‘principles of economy’.” To the remark about biological oddity, he
appends that the basic linguistic properties are “more like what one
expects to find (for unexplained reasons) in the study of the inorganic
world.” None of these ideas should be taken lightly. “There is,” Chomsky
thinks, “good reason to believe that [such considerations] are fundamental
to the design of language, if properly understood.”
The design of language? What could that possibly mean if not any
of the things just discussed? This is the main question that will concern
me here.
Of course, the emergence of language is not the only difficulty for
standard optimistic stories in the theory of evolution. Also to be
explained are incredible convergences in organ structures without
any shared functions (e.g. Fibonacci patterns in cordate skins, mollusc
shells, jellyfish tentacle arrangements, plant phyllotaxis, microtubules
within cytoskeletons in every eukaryotic cell), the very concept of
speciation (how do you go from a mutant to an array of them that stay
close enough to matter for reproduction?), or of the individual (what makes a
prokaryotic cell turn into a eukaryotic one, in the process subsuming
nucleus, mitochondria, organelles, presumably through symbiosis with
other micro-organisms, and then start aggregating into what we now see?),
not to speak of the competence underlying the observable behavior
of animals – including, in some, altruism.
To all those questions, the standard answer is the Neo-Darwinian
synthesis (of Darwinism and neo-Mendelian genetics) which in its
highlights speaks of selfish genes using individuals as their mere vehicles
for survival and reproduction (Dawkins 1987). How one goes from a
couple of (our) selfish genes to our exchanging thoughts this very minute,
nobody knows.
These matters have a long history within biology, and have now
been taken up again by researchers dissatisfied with the party line. This is, for
instance, what Goodwin (1994: xiii) has to say about the pioneering On
Growth and Form, D’Arcy Thompson’s 1917 classic:
[H]e single-handedly defines the problem of biological form in mathematical
terms and re-establishes the organism as the dynamic vehicle of biological
emergence. Once this is included in an extended view of the living process, the
focus shifts from inheritance and natural selection to creative emergence as the
central quality of the evolutionary process. And, since organisms are primary
loci of this distinctive quality of life, they become again the fundamental units
of life, as they were for Darwin. Inheritance and natural selection . . . become
parts of a more comprehensive dynamical theory of life which is focused on
the dynamics of emergent processes.
Of course, the devil is in the details, and one wants to know what is
meant by the dynamics of emergent processes. I’ll spare the reader the
specifics, but I would like to give at least some sketch of the general
picture from Stuart Kauffman’s work (1995: 18):
[M]uch of the order seen in development arises almost without regard for how
the networks of interacting genes are strung together. Such order is robust and
emergent, a kind of collective crystallization of spontaneous structure. ... Here
is spontaneous order that selection then goes on to mold. ... Examples that we
shall explore include the origin of life as a collective emergent property of
complex systems of chemicals, the development of the fertilized egg into the
adult as an emergent property of complex networks of genes controlling one
another’s activities, and the behavior of coevolving species in ecosystems that
generate small and large avalanches of extinction and speciation. ... [T]he order
that emerges depends on robust and typical properties of the systems, not on
the details of structure and function.
Kauffman aptly sums up this view: “Under a vast range of different
conditions, the order can barely help but express itself.”
What I have reported may sound like prestidigitation, but
Kauffman’s book seeks to show that it is not – I cannot go into that here,
although see the fractal example below. My point is this: If any of this is
independently argued for, or at least plausible, the question of language
design doesn’t have to reduce, in the course of evolution, to “the natural
response of an organism looking for the most efficient way in which to
transmit its thoughts”, as Berlinski jokes (1986: 130). It could be that
linguistic order can barely help but express itself, in whatever sense other
kinds of biological order do. If that is the case, we do expect it to “involve
a bewildering pattern without much by way of obvious purpose,” to
again borrow from Berlinski’s prose, and pace those who find a grand
purpose to syntactic principles.
3. Arguments against the unreasonable view
Chomsky often appeals to the metaphor that Kauffman uses:
crystallization. He seems to think that grammar could have emerged in
roughly the way a crystal does, only at a more complex and arcane level
of physics for, as he puts it, “[w]e have no idea, at present, how physical
laws apply when 10^10 neurons are placed in an object the size of a
basketball, under the special conditions that arose during human
evolution.” This passage (see also Chomsky 1993 and 1994) is cited by
Pinker, who then adds (p. 363) “the possibility that there is an
undiscovered corollary of the laws of physics that causes brains of human
size and shape to develop the circuitry for Universal Grammar seems
unlikely for many reasons.” He gives two. To start with, “what sets of
physical laws could cause a surface molecule guiding an axon. . . to
cooperate with millions of other such molecules to solder together just
the kinds of circuits that would compute. . . grammatical language?”
The presuppositions of that question are curious. On the one hand,
that neurons regulate mind functions (and not something more basic, as
suggested for instance by Penrose (1994), or something else entirely) is
just a hypothesis, even if this isn’t always remembered or even admitted.
On the other hand, if there ever is an answer to the rhetorical question
Pinker poses, it will most likely arise not from a reduction of mind to
whatever the known laws of physics happen to be when the answer is
sought, but rather from a serious unification between the two (or more)
empirical sciences involved. It is perfectly possible that physics will
have to widen or deepen or strengthen (or whatever) its laws as
understood at a given time, precisely in order to accommodate the
phenomenon of mind – just as they had to be modified to accommodate
the phenomenon of chemistry, and to some extent are being stretched
when contemplating the phenomenon of life in an entropic universe. I
realize I’m appealing to caution in the presence of ignorance – but that has
shown better results than letting ignorance dictate.
In relation to Pinker’s question to Chomsky, an intriguing instance
that comes to mind and seems significant is the discovery by Barbara
Shipman that von Frisch’s arcane observations regarding bee-dances
can be best described in terms of mapping objects existing in six-
dimensional flag manifolds to a two-dimensional expression. This
already interesting formal fact becomes fascinating when Shipman, a
physicist and mathematician studying quarks – which also happen to be
aptly described in terms of six-dimensional flag manifolds – speculates
that bees might be sensitive to quantum fields. At some level, that they
are is already known: given their orientation system (which is not object-
driven, like ours, but geodesically grounded), bees apparently can be
“fooled” by placing them in the presence of heavy magnetic fields. But
what Shipman is exploring is in a sense extraordinary: that a creature
may be able to use sensitivity to quantum fields as a system to
communicate information. Now imagine we had posed Pinker’s question
to von Frisch instead of Chomsky, at a time when quantum physics
was either not developed or not yet very well known, in light of the bizarre
behavior of bees. What physical laws could possibly cause molecules in
bee neurons to cooperate with millions of other such molecules to compute
bee dances? Well, who knew; of course, who knows now as well,
but at the very least the Shipman take on these matters puts things in
perspective: perhaps the little creatures are sensing something on the
basis of the laws of physics, crucially as presently understood.
Pinker’s second problem (with Chomsky’s crystallization metaphor,
or more generally Gould’s argument that large brains predate speaking
humans) insists on a commonplace: that large brains are, per se,
maladaptive, and hence they could have emerged only as a result of
some good associated function.
Suppose we grant that initial premise (in terms of metabolic cost,
for instance). Still, the general reasoning misses the point I’m trying to
establish, which Kauffman so poetically expressed: some order can barely
help but express itself, maladaptive or not. Of course, if some expressed
order turns out to be so maladaptive that you won’t transmit your genes,
then your kind dies out. That is tough to prove, though; play in animals,
for instance, is somewhat maladaptive: you waste time and energy, you
get injured, you expose yourself to predators... Does play kill you?
Obviously not, or scores of species would have vanished; does that mean
there is a tremendous alternative, systematic benefit in play, given that so many
species have it? If there is, it isn’t obvious. And incidentally, in the case
of language, if what one is seeking is a function as a way out of the
puzzle of maladaptive brains, just about any function does the job (e.g.
Gould’s circulatory gains). You certainly don’t need the whole “benefit”
of language for that, which simply highlights the general problem with
doing “reverse engineering”: my reason is as good as yours, so nobody
wins.
Those are the “many reasons” why the “circuitry for Universal Grammar”
should be blamed on Darwinian adaptationism rather than on physics. Perhaps, but the force
of the argument is nowhere to be seen. Yet Pinker’s reasoning does
certainly go with the mainstream in biology, which seems ready to
presuppose answers to these fascinating questions on form – on the basis
of the dogma of function. Witness in this respect the critique that Givnish
(1994) gives of the extremely interesting work by Roger Jean (1994),
who attempts an analysis and explanation of Fibonacci patterns in plant
phyllotaxis. After asserting that Jean’s explanation for plants displaying
geometrical patterns is not compelling, Givnish writes (p. 1591):
He raises no adaptive explanation for phyllotactic patterns . . . and fails to cite
relevant papers on the adaptive value of specific leaf arrangements. Worse, the
author espouses Lima-de-Faria’s bizarre concept of autoevolution, arguing that
phyllotaxis is nonadaptive and reflects a pattern of self-assembly based on
prebiotic evolution of chemical and physical matter. . . recapitulating the natural
philosophy of D’Arcy Thompson that led many biologists to abandon
phyllotaxis as a subject of study. . . [N]othing in biology makes sense except in
the light of evolution.
The last sentence is a famous prayer by Dobzhansky, which exhorts
the listener to follow the (here useless) party line. But in this instance
(perhaps even more so than in the case of language) no imaginable
adaptive story could serve to explain the (mathematically) exact same
form that arises all over the natural world (cf. for instance a viral coating
vs. the feather display in a peacock’s tail, both arrangements of the sort
seen in plants).
4. Basic minimalist properties and how unusual they are
I have argued elsewhere (e.g. (1995)) that Fibonacci patterns present
all three of Chomsky’s basic unusual properties among biological
systems: discrete infinity, underdetermination, and economy. To
demonstrate that this is not an isolated instance, I’d like to present another
case, which as it turns out also exhibits the property of self-similarity, or
fractality. Fractals are recursive structures (hence discretely infinite) of
extreme elegance (the economy bit, which in fractals is easily
expressible); as for their underdetermination, it usually shows up in the
system not coding some of its overt properties, such as handedness or
various details about systemic implementation.
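Just to fix ideas, here is a toy computational sketch of my own (purely an illustration, not anything proposed in the works cited): a single short recursive rule generates a self-similar branching pattern; the rule can be applied without bound (discrete infinity), its statement is minimal (the economy bit), and it says nothing about which daughter counts as left or right (underdetermination of handedness).

def branch(depth):
    # one self-similar rule: a stem splits into two smaller copies of itself
    if depth == 0:
        return "|"
    sub = branch(depth - 1)
    # which copy counts as 'left' and which as 'right' is simply not encoded
    return "(" + sub + sub + ")"

print(branch(1))  # (||)
print(branch(3))  # (((||)(||))((||)(||)))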
The example I have in mind comes from a piece by West, Brown,
and Enquist (1997), who analyzed the vertebrate cardiovascular system
as a fractal, space-filling network of branching tubes, under the economy
assumption that the energy dissipated by this transportation system is
minimized, and supposing the size of terminal tubes (reaching sub-tissue
levels) does not significantly vary across species. In so doing, they deduce
scaling laws (among vertebrates) that have been known to exist for quite
some time, but hadn’t been accounted for as of yet.
What was known already is that biological diversity – from
metabolism to population dynamics – correlates with body size (itself
varying over twenty-one orders of magnitude). Allometric scaling laws
typically relate some biological variable to body mass M, by elevating
M to some exponent b, and multiplying that by a constant characteristic
of a given organism. This leads one to think that b should be a multiple
of 1/3, so that the cubic root of an organism’s mass relates to some of its
internal functions in the way that a tank with 1000 cubic feet of water
has multiples of 10 as a natural scale to play tricks with. Instead, what
researchers have found is that b involves not cubic roots, but rather
quarter roots, unexpectedly, at least if one is dealing with standard
geometric constraints on volume. For example, the embryonic growth
of an organism scales as M^(1/4), or the quarter root of its mass (the larger
the mass of the organism, the slower its embryonic growth, but as mass
increases, embryonic growth differences decrease). These quarter-power
scalings are apparently present all throughout the living kingdoms.
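A quick numerical sketch of my own of what such an allometric relation Y = c · M^b amounts to (the reference value and mass ratio are arbitrary; only the exponent matters here):

def scale(value_at_reference_mass, b, mass_ratio):
    # how a quantity Y = c * M^b changes when body mass M is multiplied by mass_ratio
    return value_at_reference_mass * mass_ratio ** b

# a quantity scaling as M^(1/4): multiplying body mass by 16 scales it by 16 ** 0.25 = 2
print(scale(1.0, 1 / 4, 16))            # 2.0
# a cubic-root law (b = 1/3) would instead have predicted 16 ** (1/3), about 2.52
print(round(scale(1.0, 1 / 3, 16), 2))  # 2.52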
I’ll spare the reader most of the geometrical details of why a fractal
network does involve quarter powers as the scaling factor. The gist can
be seen by entertaining the exercise of systematically producing holes
in a cylindrical, solid Manchego cheese. Suppose you isolate an outer
layer from an inner core, with the intention of producing holes, first, in
the outside. Call C the volume of the entire cheese and L the volume of
the outer layer; obviously, the relation between C and L is cubic,
corresponding to the three dimensions of length, width, and height.
But now consider a further dimension: that by which we systematically
produce holes in the outer layer. Call H the volume of L minus the
holes. To express the relation between C and H, we need a more complex
exponential function than the cubic one: we must add the contribution
of the fourth dimension. More generally, if we continue producing layers
inside the cheese, theoretically ad infinitum (if the holes get smaller and
smaller, up to some limit), the basic dimensions won’t have to change.
That will create a fractal structure of holes in the cheese – a Swiss cheese.
The fractal model was cleverly used to describe the inner “guts” of
an organism, where tubules of various sorts play the role of the holes,
and of course the entire organism is the cheese. The model predicts
facts with an incredible degree of accuracy (where the P[redicted] and
O[bserved] numbers express the scaling exponent, obviously a multiple
of 1/4): aorta radius P = 3/8 = .375, O = .36; circulation time P = 1/4 = .25,
O = .25; cardiac frequency P = -1/4 = -.25, O = -.25; metabolic rate
P = 3/4 = .75, O = .75. The list goes on. West, Brown, and Enquist observe
that “the predicted scaling properties do not depend on most details of
system design, including the exact branching pattern, provided it has a
fractal structure” (p. 126).
That last sentence, apart from illustrating systemic
underspecification, resonates directly with Kauffman’s contention that
some kinds of order arise without regard for how underlying gene
networks are put together – which is just as well, considering that we may be
dealing with species that have few genes in common, particularly when
extending these observations to other kingdoms.
In sum, a kind of biology is beginning to gain momentum; it is
focused on systemic properties that arise via principles of reality which
are more elementary than adaptations. Needless to say, the emergence
of one of these core systems may well have been adaptive to an organism,
but crucially not (at least not necessarily) for whatever use it is eventually
put to.
I think this is relevant to Chomsky’s recent (or old) ideas in two
respects. First, it directly shows, at least to my mind, that if Chomsky
has gone supernova, he has done so together with a very exciting branch of
biology. One may have biases against whatever is biological and non-
adaptive or touches on weird physics, but there is no crisis here for
standard linguistics as we know and love it, even if one is as skeptical
about adaptive explanations of language as Gould, Berlinski, and Fodor
all strung together. One doesn’t then have to turn to metaphysics,
mathematics, functionalism, or deny the facts. Or to put it bluntly:
Chomsky isn’t doing now what he hasn’t done before.
Second, if properly understood, the fact that fractals appear so central
to organic nature may give us another argument for the autonomy of
syntax. At first, this doesn’t seem so. After all, am I not saying the
language faculty is, in the relevant (according to Chomsky, “basic”) respects,
like the scaling system? It is, I think, in those basic properties (of discrete
infinitude via recursivity, underspecified or gene-independent plasticity,
and structural economy with no direct functional correlate); but it’s
plainly the case that language has properties that, for example, Fibonacci
patterns do not. For instance, some phyllotactic patterns, underspecified
for handedness, branch rightwards or leftwards depending on whether
the previous branch was in some definable sense heavy or light
(branching opposite to a heavy branch’s direction; see Jean 1994: chapter
3); phrasal linguistic structures are also underspecified for handedness,
and if Kayne is right in his (1994) proposal, they branch in the direction
that codes command, or the history of their merging process, in Epstein’s
(1995) interpretation. As I now proceed to show, the effects of this minor
difference are drastic.
5. From a minor change to a major consequence
Imagine the last sentence in the previous paragraph had branched
according to the Fibonacci display in vegetable trees, instead of Kayne’s.
That is, rather than (1), we would have (2), assuming the “heavy” branch
is the one with more letter symbols and that the first heavy branch goes
right:
(1) [tree diagram of the sentence under Kayne’s linearization; not reproduced here]
(2) [tree diagram of the same sentence under the Fibonacci-style, weight-based linearization; not reproduced here]
This is an adequate way to linearize plants (indeed a more
“balanced” way than the one seen in (1): here the right branches have a
total of 22 symbols, for the 23 of the left branches, whereas in (1) the
right branches summed 17 symbols, against the 28 of the left branches).
But (2) creates a hopeless instability for linguistic objects. Thus, imagine
substituting drastic and utterly incomprehensible for drastic in these
structures. Now that material is heavier than the rest (including 26 new
symbols), and hence would seek its place in the linearized structure to
the right of the effects of this very minor difference.
In other words, depending on how large a predicate is, it may be
pronounced before or after the subject. As a system of communication
of the sort we have, a general procedure of that sort would be insane.
Then again, Kayne’s linearization applied to natural trees would yield
heavily inclined trunks, which is probably also insane for adequate
photosynthesis, the chlorophyllic function, pollination, and what not.
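To see the instability concretely, here is a minimal sketch of my own (a toy, not Kayne’s or Jean’s actual formalism) contrasting a fixed, command-style linearization with a weight-sensitive, plant-style one; the weight measure (letter symbols, as in the text above) and the example strings are just stand-ins:

from dataclasses import dataclass
from typing import Union

@dataclass
class Node:
    left: Union["Node", str]
    right: Union["Node", str]

def weight(t):
    # number of letter symbols dominated by a constituent
    if isinstance(t, str):
        return len(t.replace(" ", ""))
    return weight(t.left) + weight(t.right)

def linearize_fixed(t):
    # Kayne-style in spirit: always spell out left before right, whatever the weights
    if isinstance(t, str):
        return t
    return linearize_fixed(t.left) + " " + linearize_fixed(t.right)

def linearize_by_weight(t):
    # plant-style in spirit: the lighter constituent is spelled out first
    if isinstance(t, str):
        return t
    a, b = sorted([t.left, t.right], key=weight)
    return linearize_by_weight(a) + " " + linearize_by_weight(b)

subject = "the effects of this minor difference"
light = Node(subject, "are drastic")
heavy = Node(subject, "are drastic and utterly incomprehensible")

print(linearize_fixed(light))      # subject first
print(linearize_fixed(heavy))      # subject still first: the order is stable
print(linearize_by_weight(light))  # predicate first (it is lighter than the subject)
print(linearize_by_weight(heavy))  # predicate last: enriching it flips the order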
Differently put, nobody can seriously deny the role of use and other factors
(e.g., learnability in the case of language) in certain structural decisions
– and nobody does, to my knowledge. But to borrow Kauffman’s
expression, here is where selection goes on to mold spontaneous order.
I should clarify that. I’m not saying that communicative reasons
directly yield Kayne’s procedure, as opposed to the plant one. It’s hard
to imagine how at the stage where evolving hominids went from not
having the linearization procedure to finding it (assuming that is actually
what happened) anything other than the crudest proto-language could
have been in place. If the picture Chomsky paints in his (1995) book is
remotely close to accurate, even word-formation a la Hale and Keyser
(1993), or any such variant, is dependent on a transformational process
that has to involve Kayne’s linearization procedure. This is easy to show.
A transformation involves a phrase-marker K and a target symbol
T from inside K that is added to the root of K. As a consequence, a
dependency or chain is formed between two pairs: the moved T and its
(after movement) immediate syntactic context K, and T’s copy, call it
(T), and its immediate syntactic context X: {{T, K}, {(T), X}} (see
Chomsky (1995: 252)). The minute movement takes place, you involve
at least four symbols, all appropriately arrayed into some phrase-marker.
Remember, the phrase-marker per se codes no linear order, but mere
hierarchical arrangements that the linearization procedure lays out in an
appropriate phonetic row. Now, if you just have one symbol, you only
have one linearization possible; with two symbols, you obviously have
two, in principle; with three symbols, six possible linearizations ensue;
more generally, with n symbols you have n! linearizations possible (these
are the possible permutations of those symbols). For example, the
sentence Tarzan loves Jane can be linearized in six ways – without
counting any possible movements. If you add those (at least a couple of
A-movements, a couple of head movements, and perhaps others), the
linearizations jump to over five thousand; even if you took a mere
millisecond to consider each of those orderings, it would take you five
seconds to parse the sentence unequivocally. As far as anybody knows,
that’s just unworkable as a communication system; it is as unstable as a
Calder mobile on a windy day, with the complicating factor that we
actually see a mobile, but we only hear words one at a time...
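The arithmetic behind those numbers is easy to check; a small sketch (the count of four movement copies is the rough assumption made just above):

from math import factorial

# with n symbols there are n! possible linear orders (permutations)
for n in (1, 2, 3):
    print(n, factorial(n))   # 1 -> 1, 2 -> 2, 3 -> 6

# "Tarzan loves Jane": 3 overt words plus, say, four movement copies = 7 symbols
orderings = factorial(7)
print(orderings)             # 5040, i.e. "over five thousand"
print(orderings * 0.001, "seconds at one millisecond per ordering")  # about 5 seconds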
Plainly put, no linearization equals no overt movement – hence
nothing remotely close to human language – and furthermore not even
(at any rate, appropriately complex) words (formed by movement).
Maybe that lets you get by with Me-Tarzan stuff, but you can hardly
speak of any transitive actions, for instance, which presuppose movement
in Chomsky’s system. Nonetheless, Me-Tarzan stuff certainly worked
much better in Saturday afternoon classics than Cheetah’s
‘chimpanzeese’, and it’s not trivial to go from Tarzan to us right here
just by the alleged selective pressure that lack of communication with
Jane-like figures would arguably impose. It is more likely, it seems to
me, that Tarzan’s child just stumbled onto something like Kayne’s
linearization, the way one probably stumbled onto the linearizing device
that translates hierarchical musical structures to a whistling tune. Once
an accident like that took place within the evolving human brain, the
benefits of linearization could all be harvested, perhaps from whistling
to word-formation. God only knows.
Admittedly, that was a just-so story of my own, but Chomsky’s
elegant system invites just this sort of speculation, for anyone who cares
to look at the details. The Minimalist syntax is so subtle and far reaching
that a minor change in one of its components can carry you from
something like language to something like a plant. This might seem like
autonomous syntax by a hair’s length, but isn’t everything concerning
form out there, and by even smaller lengths? The difference between a
“sparsely connected” network and one less or more connected appears
to be one, as Kauffman (1995) shows, between nothing at all, utter
chaos, and complex order; curiously, binarity seems to do the trick among
possible relations: less leads to nothing, more to chaos – two does it. Of
course, we don’t need to go into the arcane issues I’m talking about
here to illustrate this point about subtlety in this universe. Change the
one in one trillion imbalance between protons and antiprotons and the
known universe vanishes, matter annihilating antimatter (thus no familiar
forms: stars, atoms, us or anything).
Why is this relevant to the general point of this exercise? I cannot
put it better than Fodor (1998: 12), in his recent critical review of Pinker’s
new book (1998):
[W]hat matters with regard to . . . whether the mind is an adaptation is not how
complex our behaviour is, but how much change you would have to make in
an ape’s brain to produce the cognitive structure of a human mind. And about
this, exactly nothing is known. That’s because nothing is known about the way
the structure of our minds depends on the structure of our brains. . . Unlike our
minds, our brains are, by any gross measure, very like those of apes. So it
looks as though relatively small alterations of brain structure must have produced
very large behavioural discontinuities in the transition from the ancestral apes
to us. If that’s right, then you don’t have to assume that cognitive complexity is
shaped by the gradual action of Darwinian selection on pre-human behavioural
phenotypes. . . [M]ake an ape’s brain just a little bigger (or denser, or more
folded, or, who knows, greyer) and it’s anybody’s guess what happens to the
creature’s behavioural repertoire.
The subtly dynamic system that the Minimalist Program implies
illustrates Fodor’s point from a different angle. When all is said and
done (?!) about the ultimate physical support of the syntax of natural
language, you may well find something as deeply surprising as the honey-
bees story I reported above.
6. Autonomous syntax redux
Even if autonomous syntax comes from a remote corner of
structuring – albeit one whose consequence is the possibility of forming
words (hence a lexicon, hence anything socially useful about linguistic
structuring) – we should really welcome it. This is not just a matter of
turf. The only reasonable alternative is functionalist, and it reduces to
some variant of Montague’s skepticism noted before. Personally, I’m
willing to wait and see what Montague grammarians have to say about
why there should be local movement, expletive replacement, agreement, and all the rest.
Unfortunately, the answer so far has been nothing much.
In Chomsky’s (1995) view, particularly in chapter 4, the stuff that
transformations manipulate is features, understood as properties of
lexicon units (the equivalent of charge or spin in a sub-atomic particle).
The movement of T to K that I sketched above is broken down into
smaller parts, with a feature F of T being the trigger. It’s as if K tries to
attract F (Chomsky’s actual term) and the item that contains F is forced
to move as a result, much as a box full of nails moves when the nails are
pulled by a magnet. From this perspective, we expect that a feature F’
which is closer to K than F should interfere with K’s relation to F, just as
a magnet cannot “ignore” a paper clip, say, to attract a nail that is further
away. This “dumbness” of the linguistic system is not even surprising if
matters are, in some appropriate sense, the way I have been presenting
them.
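In case it helps, here is a bare-bones sketch of my own of that “attract the closest feature” logic (the item names and features are made up for the illustration):

def attract(probe_feature, items_by_closeness):
    # return the first (closest) item bearing the feature, or None if there is none
    for item in items_by_closeness:
        if probe_feature in item["features"]:
            return item
    return None

# the closer item carries F' and therefore intervenes, like the paper clip
items = [
    {"name": "closer item (F')", "features": {"D"}},
    {"name": "farther item (F)", "features": {"D", "wh"}},
]
print(attract("D", items)["name"])   # closer item (F'): it cannot be "ignored"
print(attract("wh", items)["name"])  # farther item (F): nothing closer bears "wh"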
In an interesting paper, Fukui (1996) extends these technical points
to an equally technical point about physics. He emphasizes an analogy
between Chomsky’s economy of derivations and Maupertuis’s principle
of Least Action. One of Chomsky’s main ideas is that alternative
derivations (in some precise sense that I describe immediately) compete
in grammaticality. That recalls various scenarios in mechanics and optics
where, of several alternative paths that an object or a beam of light may
follow, only the optimal one is chosen.
It is curious to note, as Schoemaker (1991) observes regarding these
matters, that optimality principles in physics raised, virtually from the
time they were proposed, the same sorts of questions that Chomsky’s
idea has raised in the recent critical literature. Perhaps nobody expresses this
so well as Feynman in his lectures, which Schoemaker appropriately
cites (p. 209):
The principle of least time is a completely different philosophical principle
about the way nature works. Instead of saying it is a causal thing, . . . it says
this: we set up the situation, and light decides which is the shortest time, or the
extreme one, and chooses the path. But what does it do, how does it find out?
Does it smell the nearby paths, and check them against each other? The answer
is, yes.
Needless to say, Feynman’s little joke at the end comes from the
fact that he has a quantum-mechanical explanation.
I only wish I had such a quantum-mechanical explanation about
Chomsky’s optimal derivation smelling the nearby alternatives;
unfortunately I don’t. I wouldn’t be surprised, however, if there is one,
even outside the reach of present-day science.
In fact, Chomsky’s derivations are behaving somewhat like beams
of light in an even more obvious way. In his treatment of the impossible
*there seems someone to be in the room, as opposed to there seems to
be someone in the room (see his (1995): 344ff. and 366ff.),
Chomsky wants the derivation leading to (3b’) below to outrank the
derivation leading to (3a’). But how can that be, if the sentences involve
exactly the same words and exactly the same numbers of mergers and
movements?
(3) [to [be [someone [in [the room]]]]] {there, seems}
a. [someone [to [be [(someone) [in [the room]]]]]] {there, seems}
b. [there [to [be [someone [in [the room]]]]]] {seems}
a’. [there [seems [someone [to [be [(someone) [in [the room]]]]]]]]
b’. [there [seems [(there) [to [be [someone [in [the room]]]]]]]]
Topmost is the chunk of structure that both derivations share, with
the remaining words to be used in a lexical array (which Chomsky calls
a numeration). In (3a) we see how someone moves, leaving a
parenthesized copy or trace, while in (3b) there is inserted instead.
Assuming (non-trivially) that movement is more expensive than merging,
then it is clear that (3a) is outranked by (3b). But now consider (3a’), the
continuation of (3a); here, there is merged, while in (3b’), the continuation
of (3b), there moves leaving a trace behind. So now it seems that, after
all, both derivations are equally costly: one takes an extra step early on;
the other takes it later – but both take the extra step...
It doesn’t matter. Chomsky invites us to think of derivations as
unfolding in successive cascades of structural dependency, narrowing
down the “derivational horizon”, as it were, as further decisions are
made. Intuitively, the horizon is completely open when no words are
arranged into a phrase-marker, and it shrinks down as some words are
attached (e.g. as in the top-most structure in (3)). Only derivations with
the same derivational horizon compete, like (3a) and (3b). By the time
we’re asking (3a’) and (3b’) to compete they are already part of two
entirely different derivational histories, like those science fiction characters
that get killed in a parallel universe but still make it in this one.
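A schematic way of putting the bookkeeping, if it helps (my own toy rendering, not Chomsky’s formalism; the costs are stipulated only to encode that merging is cheaper than moving):

MERGE_COST, MOVE_COST = 1, 2   # stipulation: movement is more expensive than merging

def survivors(options):
    # among continuations sharing the same derivational horizon,
    # only the cheapest next step(s) survive; the rest are abandoned for good
    best = min(cost for cost, _ in options)
    return [label for cost, label in options if cost == best]

# the stage shared by (3a) and (3b): [to be someone in the room], numeration {there, seems}
stage = [
    (MOVE_COST, "(3a): move 'someone'"),
    (MERGE_COST, "(3b): merge expletive 'there'"),
]
print(survivors(stage))   # ["(3b): merge expletive 'there'"] -- only (3b) goes on

# (3a') and (3b') never get compared: (3a) was already discarded at this stage,
# so their later "tie" in total cost is irrelevant.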
To see that light behaves in similar ways, consider an illustration of
Feynman’s that is meant to show why often the path of least action is
not the shortest. You’re lying on the beach and suddenly somebody starts
drowning two hundred yards to your left. What do you do, run in a
straight line? Not so, because swimming is harder than running; you
run the shore until a critical point, and then you swim. Light acts
somewhat similarly when going from air to water, maximizing the “easy”
path vis-a-vis the “difficult” one (across a denser material), even if the
combined path is not the shortest. But now imagine a more complicated
scenario. You’re still on the beach, but suddenly you see somebody
trapped inside a building on fire; you could run directly to the building,
or actually take a small detour to the water and then go inside the building.
Here, obviously, you first get wet, and then run to save the person on
fire. Light doesn’t have such a “look ahead”. You can construct scenarios
where it would have to traverse three media, say air, oil, and water, in
such a way that you could optimize the total path by doing this, that, or
the other. But what light does instead is optimize the transition from air
to oil, and as its traveling horizon narrows (that is, whatever the result
of that first transition is), a new optimization takes place for the transition
from oil to water. The trick is as dumb as the one played by the syntactic
derivation because neither is really smelling anything – they are just a
bunch of photons or words going about their business, bumping against
other stuff.
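For the record, here is a small numerical sketch (mine, with made-up distances and speeds) of the lifeguard version of least time; the same local, no-look-ahead optimization would then simply be repeated at each new interface as the horizon narrows:

import math

V_RUN, V_SWIM = 5.0, 1.0        # running is much faster than swimming
START = (0.0, 10.0)             # rescuer stands 10 m up the beach from the waterline
TARGET = (200.0, -30.0)         # swimmer is 200 m along the shore and 30 m out

def total_time(x):
    # time if the rescuer enters the water at shoreline point (x, 0)
    run = math.hypot(x - START[0], START[1]) / V_RUN
    swim = math.hypot(TARGET[0] - x, TARGET[1]) / V_SWIM
    return run + swim

best_x = min((i * 0.1 for i in range(2001)), key=total_time)  # crude grid search
print(round(best_x, 1))           # about 194: far past the straight-line crossing at x = 50
print(round(total_time(best_x), 1))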
I make much of this syntactic dumbness (as opposed to the
interpretive smartness of semantics, say), as an argument not just for the
autonomy of syntax, but in fact for its primacy as well. Many, if not all
of Chomsky’s (1995) core principles could be seen in this light. His
Inclusiveness Condition, that “any structure formed by the computation.
. . is constituted of elements already present in the lexical items selected
for [the numeration]; no new objects are added in the course of
computation” (p. 228), coupled with the Recoverability Condition
ensuring “that no information be lost by [an] operation” (p. 44),
immediately recalls Conservation Laws in physics and chemistry (except
those deal with quantities, and syntax deals with qualities). His Last
Resort condition, “that computational operations must be driven by some
condition on representations, as a ‘last resort’ to overcome failure to
meet such a condition” (p. 28), resembles Haken’s Slaving Principle in
synergetics: stable modes of the old states of a system are dominated by
unstable modes (see Mainzer (1994)). The Condition on Chain
Uniformity, that “the chain C is uniform with respect to P. . . if each αi [a
link of C] has property P” (p. 91) is best treated as a condition on the
stability of an object constructed by the derivation, thus relating to the
stability of wave functions in quantum mechanics; collapsing (and hence
interpreting) a chain, in the sense of Martin’s (1996) developments of
Chomsky’s ideas, could be then akin to collapsing a quantum wave – to
my mind a fascinating prospect, suggesting that interpretation amounts
to observation of a quantum state.
Is all of this metaphoric reminiscence or daydreaming – or folly?
Perhaps. Then again, the alternatives (denying the facts, blaming
adaptations, going metaphysical) don’t seem all that promising. What
does it all mean, though? Well, I don’t know – how could anyone? I do
know, however, what it does not mean. In this respect, I think it is rather
interesting that, if something makes the Minimalist Program different
from the Principles and Parameters model, which it springs from, it is
the new reliance on economy, rather than the modules of the predecessor.
Where one found Theta, Case, Binding, and similar modules, one
now seeks just economy in different guises (or pushing the phenomenon
out of narrow syntax). This is rather crucial.
Again, Fodor puts it well in his 1998 piece:
A module is a more or less autonomous, special-purpose, computational system.
It’s built to solve a very restricted class of problems and the information it can
use to solve them is proprietary. . . If the mind is massively modular, then
maybe the notion of computation that Turing gave us is, after all, the only one
that cognitive science needs. It would be nice to believe that. . . But, really, one
can’t. For, eventually, the mind has to integrate the results of all those modular
computations and I don’t see how there could be a module for doing that.
It is not surprising that Fodor was never too happy with the modular
property of the Principles and Parameters system, since it basically
postulated modules (Theta, Case, Binding modules) within a module,
the Language Faculty. The modules ruled some particular local
interactions, and such notions as government were thought to determine
non-local, interactive relations among modules – a sort of “central
system” in Fodor’s terminology. This is, clearly, not the sort of
architecture that Fodor initially (and plausibly) sought. Quite simply, if
you allow modules within modules, you may then have modules within
modules within modules (e.g. Conditions A and B and C, which were also
modularly defined), and it’s then modules all the way down. Which is a
form of connectionism. The new architecture is much more in consonance
with Fodor’s view: there is a syntax, its own module, which interfaces with
other modules – whatever those are. Period. Now, this has a consequence.
The modular view of mind lends itself nicely to the adaptationist
view of linguistic evolution (putting aside the problem of our ignorance
about the brain support). The more modules you have that connect in
reasonable ways, the more you expect the connection to be adaptive.
Even Fodor would accept that, I think, so long as the module itself is
left untouched. Note, in fact, that his argument immediately above takes
no issue with the Turing interpretation of the module – what he sees as
implausible is a Turing interpretation of the central system. By parity of
reasoning, the Principles and Parameters model could have been
interpreted (not necessarily, but somewhat plausibly) in similar
evolutionary terms: you have Theta and Case modules, for instance,
that evolved for whatever reason (even a crazy reason), but the way they
got connected, through government, let’s say, is not implausibly adaptive.
No Case/Theta connection, no visibility of arguments, no interpretable
structures. But all of that is now gone, and with it goes another possible
adaptationist argument for language in its glorious complexity. You’re
left with structural economy of the sort we’ve seen, and good luck
connecting that to any direct function of the usual sort.
7. By way of a conclusion
In his Fall 1997 class lectures, and again in unpublished work,
Chomsky has pushed some of the ideas discussed above even further,
going into what I like to think of as a more “dynamically derivational”
way – particularly when seriously exploring the possibility of multiple
applications of Spell-Out, or various consequences of accessing the
initial numeration cyclically. All this talk of dynamic systems, of course,
is very much intended in the sense of Goodwin’s “dynamics of emergent
processes”, mentioned above. As far as I’m concerned, the more research
goes in this direction (and there is a long way to go), the closer we are to
speaking in terms that complexity theorists can relate to, thus moving
the syntax project in a new direction.
There, I should say, lie two presently serious problems. One is that
(although there is no “complexity theory”) many of the “complexity”
pioneers come from very different assumptions from the ones linguists
usually make, and in particular from the connectionist arena that is alien
to Chomskyan concerns – particularly if interpreted in the modular ways
that Fodor has naturally advocated. A second problem is that, up to now
at least, these people are usually profoundly ignorant of linguistic facts,
and even when the best among them try in good faith to discuss language,
the result is often gibberish (see e.g. the deep misunderstandings of the
otherwise intriguing book by Cohen and Stewart (1997), particularly
around p. 247). I don’t think either of these is a fundamental problem,
but they should be kept in mind.
At any rate, mine has been a mildly ontological take on the
Minimalist program. I say “mild” because I’ll be the last one to want to
fall into “hard” ontological commitments; my argument hasn’t been
that at all. Rather, the issue is simple: stuff out there, in the natural world of
physics, chemistry, or if one looks, organisms, has the core properties that
Chomsky thinks language exhibits. Whatever that “optimal” form is, it is
far from being a simple consequence of some (unclear) function.
REFERENCES
BERLINSKI, D. (1986) Black Mischief. The Mechanics of Modern Science.
New York: William Morrow and Co.
CHOMSKY, N. (1981) Lectures on Government and Binding. Dordrecht:
Foris.
_____ (1986) Knowledge of Language. New York: Praeger.
_____ (1993) Language and thought. Wakefield: Moyer Bell.
_____ (1994) Language and nature. Mind 104: 1-61.
_____ (1995) The Minimalist Program. Cambridge: MIT Press.
COHEN, J. & I. STEWART (1994) The collapse of chaos. London: Penguin.
DAWKINS, R. (1987) The blind watchmaker. Harlow: Longmans.
EPSTEIN, S. D. (1999) Un-principled syntax: the derivation of syntactic
relations. In: S. D. Epstein & N. Hornstein (eds.) Working Minimalism.
Cambridge, Mass.: MIT Press, 317-345.
FODOR, J. (1998) The trouble with psychological Darwinism. London
Review of Books, January 22: 11-13.
FUKUI, N. (1996) On the Nature of Economy in Language. Cognitive
Studies 3: 51-71.
GIVNISH, T. J. (1994) The golden bough. A review of Jean 1994. Science
266: 1590-1591.
GOODWIN, B. (1994) How the leopard changed its spots. London:
Weidenfeld & Nicolson.
GOULD, S. J. (1991) Exaptation: A crucial tool for evolutionary
psychology. Journal of Social Issues 47: 43-65.
HALE, K. & S. J. KEYSER. (1993) On argument structure and the lexical
expression of syntactic relations. In: Hale and Keyser (eds.) The
View from Building Twenty. Cambridge: MIT Press.
JEAN, R. V. (1994) Phyllotaxis: A systematic study in plant
morphogenesis. Cambridge: Cambridge University Press.
KAUFFMAN, S. (1995) At home in the universe: The search for laws of
self-organization and complexity. New York: Oxford University
Press.
KAYNE, R. (1994) The antisymmetry of syntax. Cambridge, Mass.: MIT
Press.
MAINZER, K. (1994) Thinking in complexity. Berlin: Springer Verlag.
MARTIN, R. (1996) A Minimalist Theory of PRO and control. Doctoral
dissertation, University of Connecticut, Storrs.
MONTAGUE, R. (1974) Formal Philosophy. New Haven: Yale University
Press.
PARTEE, B. (1996) The Development of Formal Semantics in Linguistic
Theory. In: S. Lappin (ed.) Contemporary Semantic Theory. Oxford:
Blackwell.
PENROSE, R. (1994) Shadows of the Mind. Oxford: Oxford University
Press.
PINKER, S. (1994) The language instinct. New York: Morrow.
_____ (1998) How The Mind Works. New York: W.W. Norton Co.
SCHOEMAKER, P. (1991) The Quest for Optimality: A Positive Heuristic
of Science? Behavioral and Brain Sciences 14: 205-245.
THOMPSON, D. W. (1917) On growth and form. Cambridge: Cambridge
University Press.
URIAGEREKA, J. (1998) Rhyme and reason: an introduction to minimalist
syntax. MIT Press, Cambridge, Mass.
WEST, G., J. BROWN & B. ENQUIST (1997) A general model for the origin
of allometric scaling laws in biology. Science 280: 122-125.