John Benjamins Publishing Company
This is a contribution from Belgian Journal of Linguistics 34
© 2020. John Benjamins Publishing Company
This electronic le may not be altered in any way. The author(s) of this article is/are permitted to use
this PDF le to generate printed copies to be used by way of oprints, for their personal use only.
Permission is granted by the publishers to post this le on a closed server which is accessible only to
members (students and faculty) of the author's/s' institute. It is not permitted to post this PDF on the
internet, or to share it on sites such as Mendeley, ResearchGate, Academia.edu.
Please see our rights policy on https://benjamins.com/content/customers/rights
For any other use of this material prior written permission should be obtained from the publishers or
through the Copyright Clearance Center (for USA: www.copyright.com).
Please contact rights@benjamins.nl or consult our website: www.benjamins.com
How to build a constructicon in ve years
The Russian example
Laura A. Janda,
Anna Endresen,
Valentina Zhukova,
Daria Mordashova, and Ekaterina Rakhilina,
UiT The Arctic University of Norway |National Research University
Higher School of Economics |Institute of Linguistics of the Russian
Academy of Sciences |Lomonosov Moscow State University |
Vinogradov Institute for Russian language of the Russian Academy of
Sciences
We provide a practical step-by-step methodology of how to build a full-scale
constructicon resource for a natural language, sharing our experience from
the nearly completed project of the Russian Constructicon, an open-access
searchable database of over 2,200 Russian constructions
(https://site.uit.no/russian-constructicon/). The constructions are organized in
families, clusters, and networks based on their semantic and syntactic properties,
illustrated with corpus examples, and tagged for the CEFR level of language
proficiency. The resource is designed for both researchers and L2 learners of
Russian and offers the largest electronic database of constructions built for
any language. We explain what makes the Russian Constructicon different
from other constructicons, report on the major stages of our work, and
share the methods used to systematically expand the inventory of constructions.
Our objective is to encourage colleagues to build constructicon
resources for additional natural languages, thus taking Construction Grammar
to a new quantitative and qualitative level, facilitating cross-linguistic
comparison.
Keywords: constructicon, construction grammar, Russian, corpus
1. Why build a constructicon?
If you are a linguist working on individual constructions in a language X, you
might wonder why one should bother building a constructicon resource, and even
if you accept this challenge, you might wonder where to start, how to proceed,
and how to organize this endeavor.
https://doi.org/10.1075/bjl.00043.jan
Belgian Journal of Linguistics Volume 34 (2020), pp. 161–173. issn 0774-5141 |eissn 1569-9676
The primary objective of this article is to address linguists working in the
framework of Construction Grammar in order to inspire and motivate them to
build constructicon resources for their languages, by presenting the ideas and tools
we utilized in building a constructicon for Russian.
Constructions are the elements that structure languages (Fillmore, Kay, and
O’Connor ; Croft ; Goldberg ). In essence, each language is a structured
inventory of constructions, and thus it is theoretically possible to model an
entire language as a constructicon. The term constructicon refers to both a
structured inventory of grammatical constructions and a description of this inventory.
Today, constructicon resources are under development for only a handful of
languages, namely English, Swedish, German, Brazilian Portuguese, Japanese, and
Russian (Lyngfelt, Borin, et al. ).
The growth of this emergent sub-discipline of Construction Grammar, termed
‘constructicography’, promises crucial benefits both for linguists and for language
learners. Our understanding of how networks of constructions work largely
depends on the amount of publicly available data on constructions. Moreover,
thoroughly annotated and searchable databases of constructions can serve the needs of
Natural Language Processing (NLP). Recognizing semi-compositional constructions
in running text is crucial for machine translation, information extraction,
and other applications (Dunietz, Levin, and Petruck 2017). It is now high time to
build comparable constructicon resources for additional natural languages.
In what follows, we provide a practical guide for how to build a full-scale
constructicon resource for a natural language, sharing our experience from the
Russian Constructicon project (https://site.uit.no/russian-constructicon/). We report
on a group project carried out over a five-year period (–) that succeeded
in collecting, describing, and illustrating an inventory of over 2,200 multi-word
constructions of Contemporary Standard Russian (Janda et al. ; Endresen et al. 2020).
We start with a brief overview of characteristics of the Russian Constructicon
resource (Section 2), then outline the major stages of our work, focusing on
methods for expanding and structuring the inventory of constructions (Sections 3 and
4). Section 5 presents an illustration of our method. The article concludes with
recommendations based on our experience.
2. Features of the Russian Constructicon resource
The Russian Constructicon resource provides a large-scale model of the system of
Russian constructions for the benefit of linguists, second language learners, and
NLP. The goal of modelling a language as a constructicon and the needs of users
have motivated the design of the project. The scope and organization of the
project are detailed in this section.
2.1 The scope of the project
In the broadest sense, a construction is any recurrent form-meaning pairing in
a language, at any level of complexity, from morpheme through lexeme through
phrase to discourse structure (Goldberg , ). The constructicon of a language
is an open-class inventory that is potentially limitless. Therefore it would be
unrealistic to expect to produce a comprehensive constructicon resource. Furthermore,
many items that a comprehensive constructicon should contain are already
available in existing reference works, such as dictionaries (which contain lexeme-level
constructions), phraseological dictionaries (which contain idioms where all the slots
are fixed), and grammars (which explain basic schematic types of sentences and the use
of function words).
What remains are entrenched multi-word expressions that contain at least one
open (not fixed) slot, and these are the strategic target of the Russian Constructicon
resource. More precisely, we have collected partially schematic phrases that are
repeatedly used in Russian to convey meanings that range along a scale from fully
transparent (compositional) to opaque. A salient feature of such constructions is
the fact that their form, while motivated, is also to some extent arbitrary.
The following examples illustrate the type of constructions targeted in the
Russian Constructicon resource, namely constructions that are neither merely
schematic sentence types nor fully fixed idioms. A typical construction in the
resource includes a fixed part, called the ‘anchor’, and one or more slots that can
be filled with a restricted set of lexemes. This type of construction is partially
schematic because part of it (the anchor) is fixed, while the rest is variable.
Partially schematic constructions are likewise the focus of the Swedish constructicon
resource (Lyngfelt, Bäckström, et al. , ), and are referred to as ‘constructions
of microsyntax’ in the Russian linguistic literature. For example, in the
construction net čtoby VP-Inf, Cl [instead of X-ing, Y] illustrated in (1), the anchor is
net čtoby, literally ‘no in-order’, and the open slots are the infinitive verb (here filled by
podoždat’ ‘wait’) and the following clause.
(1) Net čtoby    podoždat’, on uše-l-ø        bez     nas!
    no  in.order wait-INF   he leave-PST-M.SG without we.GEN
    ‘Instead of having waited for us, he just left!’
This example strongly illustrates non-compositionality since it is not possible to
predict the meaning of this construction based on its components. Linguists,
learners, and NLP specialists face challenges in accounting for such constructions.
In addition to non-compositional constructions like the one above,
high-frequency compositional constructions are targeted in our project, such as
(NP-Dat) Cop možno VP-Inf [possible to X] illustrated in (2), where the adverb možno
‘possible’ is added to an infinitive to mean ‘it is possible to X’.
(2) Do Moskv-y    iz   London-a   možno    dolete-t’ za     četyr-e  čas-a.
    to Moscow-GEN from London-GEN possible fly-INF   behind four-ACC hour-GEN.SG
    ‘It is possible to fly from London to Moscow in four hours.’
Even such a construction is somewhat arbitrary, since it would be theoretically
possible to use a different adverb or a different form of the verb (perhaps a gerund
or a deverbal noun); however, in Russian the usual way to express this meaning is
with precisely this construction.
Further types of compositional but arbitrary constructions targeted in the
Russian Constructicon resource include constructions where the anchor is a verb
with a specific argument structure or where a derivational morpheme serves as part
of the anchor. For example, in NP-Nom načinat’ NP-Ins [X begin as Y] illustrated
in (3), the conventionalized choice of the instrumental case with the verb načinat’
‘begin’ indicates the status of the person as a salient and temporary property.
(3) On načina-l-ø      učitel-em.
    he begin-PST-M.SG  teacher-INS.SG
    ‘He began his career as a teacher.’
An example of a derivational morpheme embedded in a construction is NP-Nom
pere-Verb vse NP-Acc.Pl [re-X all Ys] as in (4), where the prefix pere- specifies
distributive semantics.
(4) Ja pere-my-l-ø    vs-e       tarelk-i    v  dom-e.
    I  -wash-PST-M.SG all-ACC.PL dish-ACC.PL in house-LOC.SG
    ‘I washed all of the dishes in the house.’
While the instrumental case and the prefix pere- are motivated from the perspective
of Russian grammar, their use in these constructions is also an arbitrary
language-specific fact that must be accounted for by linguists and mastered by learners.
In sum, the Russian Constructicon resource targets recurrent linguistic
patterns that ‘fall between the cracks’ of dictionaries and grammars, yet are essential
to full mastery of the language.
Some constructicons are connected to a FrameNet resource, based on
Fillmore’s work on frames. According to Fillmore and Atkins (, ), a frame is a
cognitive structure, the knowledge of which is presupposed for the concepts encoded
by constructional constituents. Though Russian lacks a fully developed FrameNet
resource, there exists a FrameBank (https://github.com/olesar/framebank) that
focuses primarily on verbs and their argument structure. The data of FrameBank
and the Russian Constructicon partially overlap. In the future, we might add
cross-references to frames described in the Russian FrameBank where appropriate.
2.2 The presentation of constructions
The presentation of constructions in the Russian Constructicon resource is
tailored to the needs of the projected users: linguists, second language learners, and
NLP researchers. To this end, we provide both detailed linguistic classification
and user-friendly guidance. Each construction is supplied with:
- a name, which is a schematic description of the construction, such as net čtoby
  VP-Inf, Cl [instead of X-ing, Y]
- a brief illustration, such as (1)
- a definition stated in non-technical language in Russian (with translations
  into English and Norwegian); in this case: “The construction indicates that
  the speaker expresses dissatisfaction with the fact that the interlocutor has not
  taken a given action or is undertaking or has undertaken a different action.”
- a CEFR language proficiency level (from A to C) to help learners target
  appropriate constructions; in this case C
- a series of semantic and syntactic tags
- a list of common fillers for the open slot(s)
- a usage label specifying the type of speech (Neutral, Colloquial, Formal,
  Obsolete)
- a structure in terms of Universal Dependencies (https://universaldependencies.org)
- three to five corpus examples from the Russian National Corpus
  (www.ruscorpora.ru)
In addition, both the definition and the corpus examples are tagged for semantic
roles (Agent, Experiencer, etc.). All of the information about each construction is
searchable. For example, linguists can search for semantic and syntactic
parameters, learners can search for constructions at a given proficiency level, and both
types of users can enter strings (for example, of anchor words) to search for specific
constructions. The Universal Dependency structure, the glossing system, and the
lists of common fillers of the slots serve the purposes of Natural Language
Processing, facilitating automatic recognition of constructions in authentic Russian texts.
The system of semantic tags is based on terminology from typological literature
(cf. the “universal grammatical set of meanings”, Plungian , ). Taken together,
these features make the Russian Constructicon a multi-functional resource,
designed for language pedagogy, language research, and language technology. Among
other constructicon projects, only the Swedish Constructicon (Lyngfelt,
Bäckström, et al. , , ) pursues pedagogical goals and has been created not
only for linguists but also for learners of Swedish.
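The entry structure described in this section can be pictured as a simple record with a search function over it. The Python sketch below is only an illustration under stated assumptions: the class name, field names, and search function are hypothetical and do not reproduce the actual database schema of the Russian Constructicon. The two entries reuse the construction names from Section 2.1; the semantic tag values and the level given for the možno construction are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ConstructionEntry:
    """One constructicon entry; all field names are illustrative assumptions."""
    name: str                 # schematic description of the construction
    anchor: str               # fixed part of the construction
    definition: str           # definition in non-technical language
    cefr_level: str           # proficiency level label
    semantic_tags: set[str] = field(default_factory=set)
    syntactic_tags: set[str] = field(default_factory=set)
    common_fillers: list[str] = field(default_factory=list)
    corpus_examples: list[str] = field(default_factory=list)


def search(entries, *, cefr_level=None, semantic_tag=None, anchor_text=None):
    """Return the entries that satisfy every criterion that was supplied."""
    results = []
    for entry in entries:
        if cefr_level is not None and entry.cefr_level != cefr_level:
            continue
        if semantic_tag is not None and semantic_tag not in entry.semantic_tags:
            continue
        if anchor_text is not None and anchor_text not in entry.anchor:
            continue
        results.append(entry)
    return results


inventory = [
    ConstructionEntry(
        name="net čtoby VP-Inf, Cl",
        anchor="net čtoby",
        definition="The speaker expresses dissatisfaction that an action was not taken.",
        cefr_level="C",
        semantic_tags={"placeholder-tag-1"},   # placeholder, not a project tag
        common_fillers=["podoždat’"],
    ),
    ConstructionEntry(
        name="(NP-Dat) Cop možno VP-Inf",
        anchor="možno",
        definition="It is possible to X.",
        cefr_level="A",                        # placeholder level
        semantic_tags={"placeholder-tag-2"},   # placeholder, not a project tag
        common_fillers=["doletet’"],
    ),
]

# A learner-style query by level and a string query by anchor word.
print([e.name for e in search(inventory, cefr_level="C")])
print([e.name for e in search(inventory, anchor_text="možno")])
```

Keeping the tags as sets makes it cheap to combine criteria, which is the kind of query that both linguists (tags) and learners (levels) are described as making.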
3. Reaching and exceeding a critical mass of constructions
Linguistically, we can classify constructions according to their semantics and their
formal structures. However, the classification becomes reliable only after a
representative sample has been obtained. A critical mass of constructions is needed in
order to establish their classification, which is uncertain prior to that point. In other
words, we had to repeatedly cycle through the tasks of collecting and classifying
constructions in order to arrive at a stable system which could then be exploited for
further expansion of the constructicon with only minor adjustments. Our work
proceeded in three stages, visualized in Figure 1 as the Initial inventory,
Corpus-based expansion, and System-based expansion. Numbers inside the bars reflect the
quantity of constructions added in each stage, and dates indicate the approximate
timing of the stages.
Figure 1. Stages of the Russian Constructicon project
The Initial inventory of  constructions was amassed manually in Stage 1 from
a variety of sources including textbooks for learners of Russian (especially Janda
and Clancy ) and scholarly literature on Russian constructions (especially
Rakhilina ), as well as a crowd-sourced Google spreadsheet. At this stage we
decided what kinds of constructions to focus on in our project (see Section 2.1),
established most of the conventions that would be used in the presentation of
constructions (see Section 2.2), and began to explore the semantic and syntactic
system of the constructicon (see Section 4). This stage involved continuous revisions
in our procedure as we grappled with the dimensions of the project.
The Corpus-based expansion in Stage 2 continued the manual heterogeneous
collection of constructions, at this stage culled from running texts of various kinds,
particularly those that contain dialogues and spoken discourse, as well as an
automatically extracted list of highly frequent collocations attested in the Russian
National Corpus. In this stage, we added  constructions to the Initial inventory.
In addition to adding constructions, we continued the work on classification of
semantic and syntactic types, using the new constructions to verify and refine the
classification. Once we had reached a critical mass of over one thousand
constructions, the classification became stable and robust enough to facilitate the
identification of ‘families’ of constructions (see Section 4). In other words, on the basis of our
semantic and syntactic tags we were able to discover groups of constructions that
were internally relatively homogeneous.
Families of constructions served as the basis for the more rapid and extensive
System-based expansion of the constructicon in Stage 3, which more than doubled
the size of the inventory to over 2,200 items. We examined semantic families of
constructions found in the database and searched for their synonyms, antonyms,
and related constructions containing the same or similar anchor words in order
to fill gaps in each family. Thus the classification system facilitated addition of
constructions in a significantly more efficient manner. This stage yielded not only
quantitative but also qualitative change in the constructicon: semantic
classification of constructions turned what initially was a list of unrelated items into a
structured system of constructions.
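One part of the System-based expansion, looking for related constructions that contain the same anchor words as the members of a family, can be approximated by a simple lookup. The Python sketch below is a hypothetical illustration, not the procedure used in the project, where the search was carried out by linguists and also covered synonyms and antonyms. The function name is an assumption, and the entries named `placeholder-1` and `placeholder-2` are invented stand-ins rather than real constructicon entries.

```python
def expansion_candidates(family, inventory):
    """Return names of constructions outside `family` that share an anchor word with it.

    Both arguments map a construction name to its anchor string.
    """
    family_words = set()
    for anchor in family.values():
        family_words.update(anchor.lower().split())

    candidates = []
    for name, anchor in inventory.items():
        if name in family:
            continue  # already a member of the family
        if family_words & set(anchor.lower().split()):
            candidates.append(name)
    return sorted(candidates)


family = {"net čtoby VP-Inf, Cl": "net čtoby"}
inventory = {
    "net čtoby VP-Inf, Cl": "net čtoby",
    "(NP-Dat) Cop možno VP-Inf": "možno",
    "placeholder-1": "čtoby",   # hypothetical entry sharing an anchor word
    "placeholder-2": "bez",     # hypothetical entry with an unrelated anchor
}

print(expansion_candidates(family, inventory))
```

A lookup of this kind only proposes candidates; deciding whether a candidate really belongs to the family remains a matter of semantic and syntactic analysis.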
4. Identifying families: Theoretical motivation and methodology
4.1 Theoretical motivation
One of the tenets of Construction Grammar is the idea that constructions are
related to each other. Following the example of Goldberg () and her analysis
of the English Subject Auxiliary Inversion family of constructions, we have
developed the means to transform the inventory of constructions into a structured
system. One of the crucial challenges of a constructicon resource is to reveal and
represent this system, that is, the complex relationships (both hierarchical and
lateral) between constructions. One strategy is to focus on the relationship of parent
vs. daughter constructions, i.e. a more abstract schema vs. its specific instantiation.
In addition, we identify meaningful groupings: families that form clusters, and
ultimately networks.
We dene a family of constructions as a relatively homogeneous group of about
two to nine constructions that exhibit family resemblance in that they share some
semantic, syntactic (function in a clause and structure of the xed part), and struc-
tural properties (e.g. reduplication, negation, inversion, etc.). Family resemblance
means that the constructions in a family share various subsets of these properties.
The families within a cluster in turn share properties in a prototypical vs. periph-
eral distribution. We have elaborated a multi-level set of semantic and syntactic
tags that facilitate identication of families and clusters.
4.2 Methodology
Annotation was undertaken by a panel of three native speakers who worked to
achieve consensus on the tagging of each construction. A number of semantic and
syntactic tags were assigned to each construction by the panel. The annotation was
continuously refined and cross-checked by the entire panel, minimizing
subjectivity and guaranteeing consistency. In this process we took into account existing
scholarship relevant to semantic and syntactic classification, from both Russian
and typological scholarly traditions (Plungian ).
In all we employ  general semantic tags, many of which have subtags, yielding
an overall inventory of  subtags. Over  of the constructions bear more
than one semantic tag. Figure 2 displays the distribution of the most frequent
general semantic tags. Figure 3 displays the distribution of constructions across eleven
syntactic tags.
Figure 2. Distribution of constructions across the twenty most frequent general semantic
tags
Figure 3. Distribution of constructions across general syntactic tags
We investigated the intersection of semantic and syntactic classifications to identify
meaningful groupings of constructions. Among the constructions that received
each general semantic tag, we examined syntactic patterns in order to find more
homogeneous groups of constructions. Thus, we arrived at smaller groups of 2–9
constructions that shared more or less the same syntactic structure and more
narrowly specified semantics. These smallest groups we call families. We furthermore
examined how families are related to each other within clusters and how clusters
comprise networks. As a rule, our general semantic tags correspond to networks,
and the subtypes correspond to clusters. An illustration of this approach is
presented in Section 5.
5. Turning a list into a structured inventory
We illustrate the method outlined in Section 4 with the network of Prohibitive
constructions diagrammed in Figure 4, consisting of two clusters and a total of eleven
families.
Figure 4. Network of prohibitive constructions
In Figure 4, boxes represent families indexed as cluster:family, followed by a brief
description and illustrative example. Thick boxes indicate prototypes. Lines with
arrows indicate semantic transitions. Lines without arrows indicate syntactic/formal
similarities. Dotted lines and arrows indicate weaker relationships. Thick
arrows indicate overlap with other networks of constructions.
Whereas constructions in Cluster  ask a hearer to refrain from doing
something, constructions in Cluster  express ‘continuative prohibition’, asking a hearer
to stop doing something. All constructions in Cluster  contain overt markers of
negation; such markers are absent from Cluster . Cluster  is centered around its
prototype, family :, containing negated imperative constructions. Lines represent
the relationships that hold among families and are tagged for semantic transitions
and shared formal properties. A semantic transition to generalized prohibitions
connects : to :, with transitions to the remaining families in Cluster  labeled
in Figure 4. Prohibitions in : can be either generalized or individual, indicated
by a dotted arrow, and : shares the syntactic form of predicative with :. Three
families in Cluster  (:, :, and :) share constructions across other networks
(Request, Intensity, and Warning), indicated by the thick arrows.
Cluster  is connected to Cluster  through three pairs of families. In each pair,
the semantic transition is from standard prohibition in Cluster  to continuative
prohibition in Cluster . In both clusters, families to the left represent
generalization and attenuation, as opposed to more combative prohibitions on the right.
In addition, the two prototypical families (: and :) share the syntactic form of
imperative (also shared by :) and families : and : share the form of
predicative. The po- prefix is a necessary feature of : and :, and optionally found in
:, where there is also some use of imperative forms.
The Prohibitive network demonstrates the complex of semantic and formal
properties that structure the constructicon.
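The conventions of the network diagram (families indexed as cluster:family, directed semantic transitions, undirected formal similarities, prototypes, and links to other networks) map naturally onto a small graph structure. The Python sketch below is a hypothetical encoding, not the representation used in the Russian Constructicon interface. The family indices `1:1`, `1:2`, `1:3`, `2:1`, and `2:3` and the specific links between them are invented for illustration and should not be read as the actual contents of the Prohibitive network.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def cluster_of(family_id: str) -> str:
    """Extract the cluster part of a 'cluster:family' index."""
    return family_id.split(":")[0]


@dataclass
class ConstructionNetwork:
    name: str
    prototypes: set[str] = field(default_factory=set)
    # Directed semantic transitions: (source family, target family, label).
    transitions: list[tuple[str, str, str]] = field(default_factory=list)
    # Undirected formal similarities: (family, family, shared form).
    similarities: list[tuple[str, str, str]] = field(default_factory=list)
    # Overlap with other networks: (family, name of the other network).
    external_links: list[tuple[str, str]] = field(default_factory=list)

    def related(self, family_id: str) -> set[str]:
        """Families linked to `family_id` by a transition or a formal similarity."""
        linked = set()
        for first, second, _label in self.transitions + self.similarities:
            if family_id == first:
                linked.add(second)
            elif family_id == second:
                linked.add(first)
        return linked

    def cross_cluster_transitions(self) -> list[tuple[str, str, str]]:
        """Semantic transitions whose source and target lie in different clusters."""
        return [t for t in self.transitions if cluster_of(t[0]) != cluster_of(t[1])]


# Invented example data; the indices and links are placeholders.
network = ConstructionNetwork(name="Prohibitive", prototypes={"1:1", "2:1"})
network.transitions.append(("1:1", "1:2", "generalized prohibition"))
network.transitions.append(("1:1", "2:1", "continuative prohibition"))
network.similarities.append(("1:1", "2:1", "imperative"))
network.similarities.append(("1:3", "2:3", "predicative"))
network.external_links.append(("1:2", "Request"))

print(sorted(network.related("1:1")))
print(network.cross_cluster_transitions())
```

Keeping transitions and similarities in separate lists preserves the distinction the diagram draws between lines with arrows and lines without arrows.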
6. Conclusion
We hope that this article will encourage the building of constructicons for a wider
variety of languages to serve both language learners and linguists. While the
Russian Constructicon represents just one possible model, we can share lessons
from our experience that may be valuable to similar projects. This is not a project
for an individual; it is essential to build a team of researchers because a
constructicon requires a variety of skills and a long-term commitment. As with any
collaborative project, funding is essential. We found that it was possible to ‘package’
funding for the Russian Constructicon under the umbrella of grant projects
primarily aimed at language pedagogy and international cooperation. A strategic
focus on constructions that are otherwise underrepresented in pedagogical and
reference works helps to keep the project manageable and also makes it easier
to ‘sell’ in grant proposals. A further ‘selling point’ is a user-friendly design that
addresses the needs of multiple audiences: the Russian Constructicon is a resource
both for learners and for linguists. In terms of presentation, we started by
‘piggybacking’ on an existing architecture (the Swedish Constructicon), making it
possible to work through the first two stages of our project without having to start from
scratch with the design of an interface. We are grateful for the big advantage this
gave us, which ultimately made it possible to envision something that would
better represent the Russian Constructicon. Once we began to uncover the
relationships among constructions (illustrated in Section 5), we had something that was no
longer an inventory, but a system, and we needed a new interface that could do
justice to that structure. We look forward to further expanding and refining the
Russian Constructicon in its new design and welcome comments and critique.
Funding
We acknowledge funding from the Norwegian Agency for International Cooperation and
Quality Enhancement in Higher Education: grants NCM-RU-/ and CPRU-/.
References
Cro, William. . Radical Construction Grammar. Syntactic Theory in Typological
Perspective. Oxford: Oxford University Press.
https://doi.org/10.1093/acprof:oso/9780198299554.001.0001
Dunietz, Jesse, Lori Levin, and Miriam R. L. Petruck. . “Construction Detection in a
Conventional NLP Pipeline.” In The papers from the 2017 AAAI Spring Symposium on
Computational Construction Grammar and Natural Language Understanding. Technical
Report SS-17-02, –.
Endresen, Anna, Valentina Zhukova, Daria Mordashova, Ekaterina Rakhilina, and
Olga Lyashevskaya. . “Russkij konstruktikon: Novyj lingvističeskij resurs, ego
ustrojstvo i specika [The Russian Constructicon: A new linguistic resource, its design
and key characteristics].” In Computational Linguistics and Intellectual Technologies.
Papers from the Annual International Conference “Dialogue-2020”, –. Published
on-line.
Fillmore, Charles J., and Beryl T. Atkins. . “Toward a Frame-Based Lexicon: The
Semantics of RISK and Its Neighbors.” In Frames, Fields, and Contrast: New Essays in
Semantics and Lexical Organization, ed. by Adrienne Lehrer, and Eva Kittay, –.
Hillsdale, NJ: Lawrence Erlbaum.
Fillmore, Charles J., Paul Kay, and Mary C. O’Connor. . “Regularity and Idiomaticity in
Grammatical Constructions: The Case of Let Alone.” Language (): –.
https://doi.org/10.2307/414531
Goldberg, Adele. . Constructions at Work. The Nature of Generalization in Language.
Oxford: Oxford University Press.
Janda, Laura A., and Steven J. Clancy. . The Case Book for Russian. Bloomington, IN:
Slavica Publishers.
Janda, Laura A., Olga Lyashevskaya, Tore Nesset, Ekaterina Rakhilina, and Francis M. Tyers. .
“A Constructicon for Russian: Filling in the Gaps.” In Constructicography: Constructicon
Development Across Languages, ed. by Benjamin Lyngfelt, Lars Borin, Kyoko Ohara, and
Tiago T. Torrent, –. Amsterdam: John Benjamins. https://doi.org/10.1075/cal.22.06jan
Lyngfelt, Benjamin, Linnéa Bäckström, Lars Borin, Anna Ehrlemark, and Rudolf Rydstedt.
. “Constructicography at Work: Theory Meets Practice in the Swedish
Constructicon.” In Constructicography: Constructicon Development Across Languages, ed.
by Benjamin Lyngfelt, Lars Borin, Kyoko Ohara, and Tiago T. Torrent, –.
Amsterdam: John Benjamins. https://doi.org/10.1075/cal.22.03lyn
Lyngfelt, Benjamin, Lars Borin, Kyoko Ohara, and Tiago T. Torrent (eds.). .
Constructicography: Constructicon Development Across Languages. Amsterdam: John
Benjamins. https://doi.org/10.1075/cal.22
Plungian, Vladimir A. . Vvedenie v grammatičeskuju semantiku: Grammatičeskie značenija
i grammatičeskie sistemy jazykov mira [An introduction to grammatical semantics:
Grammatical meanings and grammatical systems in the languages of the world]. Moscow:
Russian State University for the Humanities Press.
Rakhilina, Ekaterina V. (ed.) . Lingvistika konstrukcij [Linguistics of constructions].
Moscow: Izdatel’stvo Azbukovnik.
Abbreviations
2 2nd person
ACC accusative
DAT dative
FUT future
GEN genitive
IMP imperative
INF infinitive
INS instrumental
IPFV imperfective
LOC locative
NP noun phrase
M masculine
PL plural
PP personal pronoun
PST past
SG singular
VP verb phrase
() optional element
Authors’ addresses
Laura A. Janda
Department of Language and Culture
UiT The Arctic University of Norway
Hansine Hansens veg 
N- Tromsø
Norway
laura.janda@uit.no
Anna Endresen
Department of Language and Culture
UiT The Arctic University of Norway
Hansine Hansens veg 
N- Tromsø
Norway
anna.endresen@uit.no
Valentina Zhukova
School of Linguistics
National Research University Higher School
of Economics
 Myasnitskaya Street
 Moscow
Russia
valentina.zh@gmail.com
Daria Mordashova
Minority Language Research and
Preservation Lab
Institute of Linguistics, Russian Academy of
Sciences
 bld.  Bolshoy Kislovsky Lane
 Moscow
Russia
mordashova.d@yandex.ru
Ekaterina Rakhilina
School of Linguistics
National Research University Higher School
of Economics
 Myasnitskaya Street
 Moscow
Russia
rakhilina@gmail.com
How to build a constructicon in ve years 173
© 2020. John Benjamins Publishing Company
All rights reserved
... This reflects the potential for constructicons to bridge the gap between grammars and dictionaries (e.g. Janda et al. 2020). • constructions of particular relevance to L2 learners (e.g., ). ...
... Constructions may thus be associated by semantic associations (e.g., 'resultative' or 'polarity sensitive'), particular construction elements (e.g., expletives or reflexives), etc. The Russian constructicon, for example, has an elaborate system for grouping constructions by shared functional features (e.g., Janda et al. 2020). ...
... While a core function of most reference constructicons is that of a descriptive reference work, they may also serve other purposes; and they may or may not be adapted to particular kinds of application. The Russian Constructicon (Janda et al. 2020), for example, is specifically built for language pedagogy, the FrameNet Brasil Constructicon is developed for NLP purposes (Matos et al., 2017; Lorenzi et al., 2023), whereas the Swedish constructicon is designed to serve as a multi-purpose resource. ...
Overview article for the forthcoming Elsevier Encyclopedia of Language and Linguistics, 3rd ed.
... A thorough explanation of the internal structure of semantic networks exceeds the limits of this article. The analysis of two networks of evaluative constructions (Assessment and Attitude) is available in Endresen and Janda (2020), while the network of Prohibitive constructions consisting of two clusters and eleven families is described in Janda et al. (2020). ...
... The annotation of constructions was also an iterative process, where the annotators would revisit the same constructions several times whenever prominent patterns emerged. The process of semantic and syntactic annotation is described in more detail in Janda et al. (2020) and Janda et al. (2023). ...
This collection of papers addresses the issues that arise when Russian Grammar meets new linguistic paradigms and new empirical challenges and discusses the new insights since the latest revision of the Academy Grammar of Russian (Академическая грамматика русского языка) (Shvedova, 1980). The contributions have been written by representatives of different scientific schools from across the academic world. […] The inspiration for this collection of papers came after the publication of the selected proceedings of the 5th International Symposium "Russian Grammar: System – Usus – Variation" in honour of Alan Timberlake (Warditz, 2022). I would like to take the opportunity to express my deep gratitude to the journal's editors for their organisational support and also to the anonymous reviewers for their benevolent criticism and instructive comments. This publication coincides with a grim time for Slavic Studies in general and Russian Studies in particular. However, even when linguistic research is instrumentalized for political goals, we can, as shown by Max Vasmer, still preserve rigorous philological foundations ('die streng philologische Grundlage bewahren') and see our research as a contribution to the victory of humanitarian and humanist values (Bott, 2004). Without this hope and vision, our intellectual efforts remain in vain. I am therefore grateful to the contributors to this issue, especially to Igor’ Melchuk for his irrepressible optimism and humour. Despite Alan Timberlake's inability to take part for personal reasons, I hope he will enjoy reading our papers. Together with the other contributors, I hope that this special issue will contribute to the discussion about old and new findings in the field of Russian Grammar and spur on modern approaches for its further development. Vladislava Warditz (Guest Editor)
While linguistics traditionally keeps lexicon separate from grammar, Construction Grammar takes the grammatical construction as the basic unit of language. A grammatical construction integrates the roles of lexemes with their typical grammatical contexts, suggesting the advantages of a comprehensive approach. Furthermore, according to Construction Grammar, grammatical constructions comprise a system in which constructions mutually reinforce each other. We reveal the complex connections among Russian grammatical constructions that emerge from the Russian Constructicon, a resource with over 2200 annotated constructions. We achieve this by focusing on a single semantic subclass of 110 constructions labeled Sets and elements. Our analysis follows the connections among constructions through two domains: semantics and syntax. We find that all constructions fit into groupings at various levels of semantic schematicity, as well as presenting various syntactic dimensions. A given construction has affinities both to constructions with similar meanings and with similar form, and any given construction may have multiple affinities in either or both of these domains. Through our focus on multiword grammatical constructions, we reach beyond traditional approaches that separate words from grammar, instead viewing words in their grammatical context and grammar in its lexical context.
... Only the smallest groupings at the family level are relevant for this article. We define a family as a relatively small and homogeneous group of constructions (usually 2 to 9) that exhibit family resemblance and share semantic and often also syntactic or other structural properties (including reduplication, inversion, double negation, etc.) (see Janda et al., 2020). Family resemblance means that the constructions in a family share not necessarily all properties but various subsets of these properties (cf. ...
... The constructions that belong to the semantic type Prohibitive, defined as expressing strict prohibition to perform an action in the future, yield 57 entries that constitute a network comprised of two clusters that are formed by 12 families (cf. Janda et al., 2020). On the other end of the scale are some semantic types that are very sparsely populated, partly because their semantics is very specific and narrow. ...
We expanded the database of the Russian Constructicon from 1,087 to over 2,200 constructions in seven months and worked out a methodology that can be used by other linguists working on comparable resources. We explain how we collected an inventory of constructions from various sources and significantly expanded it in a systematic way by modelling the relationships of constructions in terms of families. Each family was analyzed from the perspective of what members might be missing in the resource and could be added to optimize that family’s representation in the overall inventory. We provide five practical strategies for family-based expansion, each illustrated with an empirical case study of the families of Comparative, Addressee-encoding, Evaluative, Mirative, and Minimizing constructions.
... Work on the creation of such tools is currently in progress for a variety of languages and offer an overview in their book Constructicography . To our knowledge, among the most advanced constructicons are the Russian constructicon (Janda et al. 2020) and the Swedish constructicon, with the former having been mostly constituted manually and the latter using NLP tools (cf. for more details on the tools used). ...
Construction Grammar is one of the fastest-growing branches of functional syntax. Bringing together an international team of scholars, this handbook provides a complete overview of the current issues and applications in this approach. Divided into six thematic parts, it covers the fundamental notions of Construction Grammar, its conceptual origins and the basic ideas that unite its various branches, its solid empirical grounding and affinities with corpus linguistics, and the diverse perspectives in constructional scholarship. It highlights advances in discourse-related topics and applications to various domains, including multimodal communication, language learning and teaching and computational linguistics, and each chapter contains numerous illustrative examples and case studies involving a variety of languages. It also includes in-depth, empirically-grounded analyses of diverse theoretical, methodological, and interdisciplinary issues, alongside step-by-step introductions to the theory, making it essential reading for both researchers and students working in functional and cognitive approaches to linguistic analysis and syntactic theory.
... Today, teachers and instructors of RFL can make use of a large-scale collection of over 2200 prominent Russian constructions publicly available in the digital educational resource called the Russian Constructicon (https://constructicon.github.io/russian/; presented in Janda et al., 2020; Janda et al., 2023). Crucially, RusCon targets non-transparent constructions that puzzle learners of Russian. ...
In this chapter, we advocate a constructionist approach to language pedagogy, emphasizing the importance of incorporating multi-word constructions in teaching and assessment of Russian as a foreign language (RFL). Constructions, consisting of fixed elements and variable parts, structure language and can serve as a shortcut for language acquisition by allowing learners to grasp patterns beyond individual words. We analyze two standardized RFL tests (TBL and TORFL-I) to evaluate the representation of constructions. Despite constructions being present in learning and testing materials, a consistent constructionist approach to RFL is missing. We argue that constructions should not only be an integral part of language instruction, but also play a central role in the assessment of language proficiency. Drawing on the Russian Constructicon, a digital repository of over 2200 constructions (https://constructicon.github.io/russian/), we outline practical scenarios for using this resource in designing language assessment tasks for both comprehension and production.
... We argue that the construction-based approach to language learning is highly beneficial for L2 learners because it focuses instruction on the most strategic constructions widely used by native speakers (see also Janda et al., 2020; Nesset et al., this volume). This approach is more efficient than traditional instruction because it provides learners with ready-to-use communicative patterns that can be easily employed for building sentences and texts. ...
This article addresses the usefulness of constructicons, and of the Swedish Constructicon (SweCcn) in particular, in application to (additional) language teaching. We present a number of small case studies on pedagogical application of SweCcn: some of which are classroom studies of construction-based teaching of Swedish as an additional language, and others concerned with identification of pedagogically relevant constructions by text analyses of teaching aids. The combined overall outcome of these case studies is one of mutual benefit: constructicons may, indeed, be a valuable resource for construction-oriented language pedagogy, and the experiences from teaching application, in turn, feed back into further development of the resource. The text analyses also highlight the importance of not only treating individual constructions, but also addressing the interplay between constructions in combination.
We analyze repetition in Russian from the perspective of the Russian Constructicon, which represents over 2200 grammatical constructions described in terms of anchors (fixed elements) and slots (for various filler elements) and fully annotated for their syntactic and semantic characteristics. The Russian Constructicon facilitates the first large-scale investigation of reduplication across a representative sample of an entire language, enabling us to map out a typology invoking these and other factors in the context of Construction Grammar. Our data on repetitions includes 118 constructions tagged in the Russian Constructicon for Reduplication, meaning that repetition occurs within a clause, and 28 entries tagged as Discourse "Echo" Constructions because they require the repetition of a word or phrase from a previous clause (often provided by an interlocutor). Five constructions carry both tags. We propose a theoretical expansion of the definition of reduplication to include the Discourse "Echo" type, arguing that constructions are not limited to a single clause or even to a single speaker. Our typology further explores the distribution of various formal and semantic factors observed in constructions with repetition and compares them with both previous typological research on reduplication and their distribution across the entire Russian Constructicon. Despite the fact that Russian does not use reduplication as a productive grammatical marker, we argue that reduplication is widespread and systematic in Russian.
A constructicon, i.e., a structured inventory of constructions, essentially aims at documenting functions of lexical and grammatical constructions. Among other parameters, so-called constructional collo-profiles, as introduced by Herbst (2018, 2020), are conclusive for determining constructional meanings. They provide information on how relevant individual words are for construction slots, they hint at usage preferences of constructions and serve as a helpful indicator for semantic peculiarities of constructions. However, even though collo-profiles constitute an indispensable component of constructicon entries, they pose major challenges for constructicographers: for a constructicographic enterprise it is not feasible to conduct collostructional analyses for hundreds or even thousands of constructions. In this article, we introduce a procedure based on the large language model BERT that makes it possible to predict collo-profiles without having to extensively annotate instances of constructions in a given corpus. Specifically, by discussing the constructions X macht Y ADJP ('X makes Y ADJ', e.g., he drives him crazy) and N1 PREP N1 (e.g., bumper to bumper, constructions over constructions), we show how the developed automated system generates collo-profiles based on a limited number of annotated instances. Finally, we place collo-profiles alongside other dimensions of constructional meanings included in the German Constructicon.
In constructionist theory, a constructicon is an inventory of constructions making up the full set of linguistic units in a language. In applied practice, it is a set of construction descriptions – a “dictionary of constructions”. The development of constructicons in the latter sense typically means combining principles of both construction grammar and lexicography, and is probably best characterized as a blend between the two traditions. We call this blend constructicography. The present volume is a comprehensive introduction to the emerging field of constructicography. After a general introduction follow six chapters presenting constructicon projects for English, German, Japanese, Brazilian Portuguese, Russian, and Swedish, respectively, often in relation to a framenet of the language. In addition, there is a chapter addressing the interplay between linguistics and language technology in constructicon development, and a final chapter exploring the prospects for interlingual constructicography. This is the first major publication devoted to constructicon development and it should be particularly relevant for those interested in construction grammar, frame semantics, lexicography, the relation between grammar and lexicon, or linguistically informed language technology.
Through the detailed investigation of the syntax, semantics and pragmatics of one grammatical construction, that containing the conjunction let alone, we explore the view that the realm of idiomaticity in a language includes a great deal that is productive, highly structured and worthy of serious grammatical investigation. It is suggested that an explanatory model of grammar will include principles whereby a language can associate semantic and pragmatic interpretation principles with syntactic configurations larger and more complex than those definable by means of single phrase structure rules.
This book investigates the nature of generalizations in language, drawing parallels between our linguistic knowledge and more general conceptual knowledge. The book combines theoretical, corpus, and experimental methodology to provide a constructionist account of how linguistic generalizations are learned, and how cross-linguistic and language-internal generalizations can be explained. Part I argues that broad generalizations involve the surface forms in language, and that much of our knowledge of language consists of a delicate balance of specific items and generalizations over those items. Part II addresses issues surrounding how and why generalizations are learned and how they are constrained. Part III demonstrates how independently needed pragmatic and cognitive processes can account for language-internal and cross-linguistic generalizations, without appeal to stipulations that are specific to language.
This book presents a profound critique of syntactic theory and syntactic argumentation. Recent syntactic theories are essentially formal models for the representation of grammatical knowledge. These theories posit complex syntactic structures in the analysis of sentences, consisting of atomic primitive syntactic categories and relations. The result of this approach to syntax has been an endless cycle of new and revised theories of syntactic representation. The book argues that these types of syntactic theories are incompatible with the grammatical variation found within and across languages. The extent of grammatical variation demonstrates that no scheme of atomic primitive syntactic categories and relations can form the basis of an empirically adequate syntactic theory. This book defends three theses: (i) constructions are the primitive units of syntactic representation, and grammatical categories are derivative; (ii) the only syntactic structures are the relations between a construction and the elements that make it up (that is, there is no need to posit syntactic relations); and (iii) constructions are language-specific. Constructions are complex units pairing form and meaning. Grammatical categories within and across languages are mapped onto a universal conceptual space, following the semantic map model in typology. The structure of conceptual space constrains how meaning is encoded in linguistic form, and reflects the structure of the human mind.
Dunietz, Jesse, Lori Levin, and Miriam R. L. Petruck. 2017. "Construction Detection in a Conventional NLP Pipeline." In The papers from the 2017 AAAI Spring Symposium on Computational Construction Grammar and Natural Language Understanding. Technical Report SS-17-02, 178-184.
Endresen, Anna, Valentina Zhukova, Daria Mordashova, Ekaterina Rakhilina, and Olga Lyashevskaya. 2020. "Russkij konstruktikon: Novyj lingvističeskij resurs, ego ustrojstvo i specifika [The Russian Constructicon: A new linguistic resource, its design and key characteristics]." In Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference "Dialogue-2020", 226-241. Published on-line.
Janda, Laura A., and Steven J. Clancy. 2002. The Case Book for Russian. Bloomington, IN: Slavica Publishers.
Plungian, Vladimir A. 2011. Vvedenie v grammatičeskuju semantiku: Grammatičeskie značenija i grammatičeskie sistemy jazykov mira [An introduction to grammatical semantics: Grammatical meanings and grammatical systems in the languages of the world]. Moscow: Russian State University for the Humanities Press.