
Abstract

This tutorial briefly introduces class-modeling as necessary for meta-modeling. We introduce the basic concepts, including classes and relationships, upon the simple class-modeling language Ecore, which is a dialect of UML class models, and which implements the MOF standard. The tutorial complements the book Domain Specific Languages -- Effective Modeling, Automation, and Reuse.
DSL DESIGN TUTORIAL
Tutorial A: Class Modeling
Andrzej Wąsowski¹ and Thorsten Berger²
Version 1, 2024-03-01
ISBN 978-3-031-23668-6
Domain-Specific Languages
Effective Modeling, Automation, and Reuse
This textbook describes the theory and the pragmatics of using and engineering
high-level software languages – also known as modeling or domain-specific languages
(DSLs) – for creating quality software. This includes methods, design patterns,
guidelines, and testing practices for defining the syntax and the semantics of languages.
While remaining close to technology, the book covers multiple paradigms and solutions,
avoiding a particular technological silo. It unifies the modeling, the object-oriented, and
the functional-programming perspectives on DSLs.
The book has 13 chapters. Chapters 1 and 2 introduce and motivate DSLs. Chapter 3 kicks
off the DSL engineering lifecycle, describing how to systematically develop abstract
syntax by analyzing a domain. Chapter 4 addresses the concrete syntax, including the
systematic engineering of context-free grammars. Chapters 5 and 6 cover the static
semantics – with basic constraints as a starting point and type systems for advanced DSLs.
Chapters 7 (Transformation), 8 (Interpretation), and 9 (Generation) describe different
paradigms for designing and implementing the dynamic semantics, while covering
testing and other kinds of quality assurance. Chapter 10 is devoted to internal DSLs.
Chapters 11 to 13 show the application of DSLs and engage with simpler alternatives to
DSLs in a highly distinguished domain: software variability. These chapters introduce
the underlying notions of software product lines and feature modeling.
The book has been developed based on courses on model-driven software engineering
(MDSE) and DSLs held by the authors. It aims at senior undergraduate and junior
graduate students in computer science or software engineering. Since it includes examples
and lessons from industrial and open-source projects, as well as from industrial
research, practitioners will also find it a useful reference. The numerous examples include
code in Scala 3, ATL, Alloy, C#, F#, Groovy, Java, JavaScript, Kotlin, OCL, Python, QVT,
Ruby, and Xtend. The book contains as many as 277 exercises. The associated code
repository facilitates learning and using the examples in a course.
Domain-Specic Languages
Andrzej W ˛asowski and Thorsten Berger. Domain-specific Languages:
Effective Modeling, Automation, and Reuse. Springer Nature. 2023.
Book website: http://dsl.design
Suppl. material: https://bitbucket.org/dsldesign/dsldesign
¹ IT University of Copenhagen, Denmark
² Ruhr University Bochum, Germany
AA Class Modeling
We use class modeling as one of the two main meta-modeling formalisms
in the book [5]. In this tutorial, we recall the main aspects of class modeling.
The Unified Modeling Language (UML) provides a complete set of notations
for modeling [3]. These can be used for modeling many different aspects
in the development of software systems, including requirements, architecture,
types, structure, processes, and behaviors. In this book, we primarily
use the class diagrams from UML, and only for structural modeling of the
abstract syntax of languages. Importantly, we use a dialect of class diagrams
called Ecore, as implemented in the Eclipse Modeling Framework (EMF),
which includes some non-standard extensions while omitting many
complex UML constructs. This restricted subset is entirely sufficient
for our purposes, and we can enjoy its lightweight and portable
implementation (a few jar files).
AA.1 Classes and Objects
Let us start with a definition of a class:
Definition AA.1. A class is an abstraction that specifies attributes of a set
of concept instances (objects) and their relations to other sets of concept
instances (objects).
Importantly, for us, a class is not a programming language concept that can
be identified in the system source code. It is not a program type, or at least this is
not the primary aspect of interest in the domain analysis perspective. We use
classes to model real-world concepts or—more precisely—domain-level
concepts. These are often much more abstract than what Java, Python, Scala,
or C# classes are. Our classes rarely map one-to-one to implementation
classes. This big leap from implementation to a conceptual application
domain is necessary in order to obtain the productivity gains promised by
domain-specific languages and MDSE.
Both classes and objects are depicted by boxes with three compartments:
a name, attributes, and operations. When visualizing models, attributes and
operations can be omitted if they are not essential. In high-level modeling,
we usually ignore the class operations entirely. Operations are useful when
describing low-level implementation aspects (such as APIs), but they are
rarely used in DSL design—so, you will not see them in this book. At the
same time, names and attributes are essential for us.
Figure AA.1: An instance specification diagram (also called an object diagram)
containing two instances of the class Car. Instance names are underlined, and the class
names appear after a colon. Instance names are optional, but used in this figure (myCar,
yrCar). Attributes have concrete values after an equality sign.
  yrCar: Car
    no: String = "AB 5678"
    color: String = "red"
  myCar: Car
    no: String = "WN 1234"
    color: String = "silver"
Consider the object diagram in Fig. AA.1 (also known as instance
specification diagrams in the more recent versions of UML). The diagram
states that there exist two car objects, each with two attributes. In order to
distinguish object diagrams from class diagrams, the object names (instance
names) are underlined and followed by a colon and the name of the class
that the object instantiates.
In the simplest view, classes are just types for objects. Our objects in
Fig. AA.1 are of type Car:
Figure AA.2: An example class
The block above represents the class Car. The name is not underlined, and
attributes have types, not values. They also have a multiplicity constraint
(here simply “1”, which says that both attributes are mandatory for any
instance of this class). By convention, class names are capitalized, and
instance names are not.
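Although the classes in this tutorial are domain concepts rather than program types, the relation between a class and its instances can be mimicked in code. The following is a minimal Python sketch, not part of the original tutorial, that mirrors the Car class and the two objects of Fig. AA.1. The snake-case variable names are our own adaptation of the instance names in the figure.

```python
from dataclasses import dataclass


@dataclass
class Car:
    # Both fields lack defaults, which plays the role of multiplicity "1":
    # every instance must provide a value for each attribute.
    no: str
    color: str


# The two instances of Fig. AA.1: attributes now carry concrete values.
my_car = Car(no="WN 1234", color="silver")
yr_car = Car(no="AB 5678", color="red")
```

The class fixes the attribute names and types, while each object supplies its own values, which is the distinction between the class box and the object boxes described above.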
AA.2 Generalization
A generalization relation (also known as inheritance) specifies that the instance
set of one class is included in the instance set of another:
Definition AA.2. A class A generalizes class B, if each instance of B is also
an instance of A.
Inheritance vs Generalization
You have probably realized that we prefer to use the term generalization over the, possibly more common,
inheritance. This is not without a reason. The modeling experts prefer the former term, because it
captures more precisely the meaning of the kind-of relationship in concept modeling: it states that
one concept is more general than the other, and the latter is a specialization of the former, a kind-of
the former. Inheritance is a particular implementation mechanism for generalization, used when this
relationship has to be represented at runtime, in an interpreter or in generated code. Programming
languages implement generalization by inheriting (including) attributes and operations of the super class
(the generalized class) in the object. This is the reason why programming language experts, who are
typically also compiler writers, would tend to prefer the term inheritance. You can safely assume that
the two terms are synonyms, if you find this discussion confusing.
Figure AA.3: An example generalization hierarchy of car engine designs.
Classes shown: Engine (HPower : EInt), CombustionEngine (capacity : EDouble = 0.0),
GasEngine, DieselEngine, ElectricMotor, HybridEngine.
We illustrate the generalization hierarchy using a model of the design
space of car engines. In Fig. AA.3, we read that, among others, every
CombustionEngine is an Engine, and so is every ElectricMotor. A HybridEngine
is both a combustion engine and electric motor. We sometimes say that a
generalization relation expresses the kind-of or an is-a relationship: “an
electric motor is a kind of engine.”
The diagram in Fig. AA.3 contains two abstract classes (Engine and
CombustionEngine). An abstract class has no instances of its own. In the
diagrams, we mark abstract classes using a slanted font for the class name.
Finally, Fig. AA.3 also includes a case of multiple inheritance: a hybrid
engine is both a gas engine and an electric motor.¹ Multiple inheritance
often appears in meta-modeling applications, when a concept shares
properties with more than one concept, or when a concept can appear in
more than one role.
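The hierarchy of Fig. AA.3 can be approximated in a programming language that supports multiple inheritance. The sketch below uses Python and is only an illustration under assumptions: we take HybridEngine to specialize GasEngine and ElectricMotor, as the text states, and the `_abstract` flag is a hypothetical device of ours for marking abstract classes, not an EMF or UML feature.

```python
class Engine:
    """Abstract class: it has no instances of its own."""
    _abstract = True

    def __init__(self, h_power: int):
        # Reject direct instantiation of classes that declare themselves abstract.
        if type(self).__dict__.get("_abstract", False):
            raise TypeError(f"{type(self).__name__} is abstract")
        self.h_power = h_power


class CombustionEngine(Engine):
    """Abstract as well; adds the capacity attribute with default 0.0."""
    _abstract = True

    def __init__(self, h_power: int, capacity: float = 0.0):
        super().__init__(h_power)
        self.capacity = capacity


class GasEngine(CombustionEngine):
    pass


class DieselEngine(CombustionEngine):
    pass


class ElectricMotor(Engine):
    pass


class HybridEngine(GasEngine, ElectricMotor):
    """Multiple inheritance: a kind of gas engine and a kind of electric motor."""


hybrid = HybridEngine(h_power=120, capacity=1.6)
```

Every instance of HybridEngine is also an instance of GasEngine, CombustionEngine, ElectricMotor, and Engine, which is the instance-set inclusion of Definition AA.2. Attempting `Engine(100)` raises a `TypeError`, reflecting that an abstract class has no direct instances.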
Class diagrams also allow the modeling of interfaces—this is done by
adding an interface property to a class. Note that Eclipse EMF’s class-
diagram editor puts a little interface icon next to the box label, as shown in
Fig. AA.4.
¹ Technically, one can argue that this modeling is incorrect. In reality, or at least at a lower level
of abstraction, a hybrid engine is not a specialization of a gas engine and an electric motor, but
a composition of both (see below for the composition relation). See Exercise AA.1.
Figure AA.4: EMF’s syntax for
interfaces
When EMF generates code, interfaces are mapped to Java interfaces,
while classes are mapped both to classes and interfaces. The latter is a
simple pattern (a workaround, if you prefer) for Java’s lack of multiple
inheritance.
AA.3 Simple Types
EMF provides simple types (for example the EString used above), which
are mapped to Java types during code generation. An attribute declaration
can be followed by a default value, as in: color : EString = "red".
Enumerations are used to capture a small finite number of discrete simple
values of an attribute, for instance a color, as shown in Fig. AA.5.
Figure AA.5: EMF syntax for
enumerations
Enumerations can be used as types for attributes, but cannot be the ends of
associations (explained shortly), which is reserved for classes. Figure AA.6
shows the usage of the enumeration Color as the type for the attribute color
in the abstract class Vehicle.
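As an analogy outside EMF, an enumeration used as an attribute type with a default value might look as follows in Python. This is a sketch under assumptions: the literals of Color are hypothetical, since the contents of Fig. AA.5 are not reproduced here, and the abstractness of Vehicle is omitted for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Color(Enum):
    # Hypothetical literals; the actual ones are those drawn in Fig. AA.5.
    RED = "red"
    SILVER = "silver"
    BLACK = "black"


@dataclass
class Vehicle:
    # The enumeration serves as the attribute type; RED is the default value,
    # analogous to a declaration such as: color : Color = RED
    color: Color = Color.RED


default_vehicle = Vehicle()
silver_vehicle = Vehicle(color=Color.SILVER)
```

The attribute can only take one of the finitely many enumeration literals, which is what distinguishes it from a free-form string attribute.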
It is also possible to introduce new basic types by providing their Java
implementations. Details about this mechanism can be found in [1] (search
for EDataType in the index).
Figure AA.6: Using an
enumeration as an attribute
type
AA.4 Associations
An association represents a relation between instances of two classes.
Note that association is qualitatively different from generalization. We
use associations to model all kinds of relations between objects other than
kind-of. Associations can be bidirectional, in which case they have no arrows on
the ends, or unidirectional (indicated by an arrow). An example of the latter is
shown in Fig. AA.7.
Figure AA.7: A unidirectional association (also known as a reference)
The navigable name of the association (the name used for navigation in
transformation and constraint code) is written on the “far end.” For example,
the expression myCar.owner gives the object representing the owner of myCar. Note
that the owner label is on the other side of the Vehicle class. The intuition
is that it shows the name that we can use in the context of Vehicle to name
the associated person object.
In the example, the reference is also decorated with a multiplicity constraint
1..*, meaning that a vehicle must have at least one owner. More
than one owner is allowed in the example (for modeling co-ownership).
This also means that, technically, myCar.owner returns a collection and not
a single instance.
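To make the navigation and the multiplicity constraint concrete, here is a small Python sketch, offered as an analogy rather than as EMF code. The class and attribute names follow Fig. AA.7 as described in the text; the method `is_well_formed` and the person names are hypothetical additions of ours.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str


@dataclass
class Vehicle:
    # Unidirectional reference Vehicle -> Person. Because the upper bound
    # exceeds 1, navigating it yields a collection, not a single object.
    owner: list[Person] = field(default_factory=list)

    def is_well_formed(self) -> bool:
        # Lower bound 1: a vehicle must have at least one owner.
        return len(self.owner) >= 1


alice, bob = Person("Alice"), Person("Bob")
my_car = Vehicle(owner=[alice, bob])  # co-ownership is allowed
ownerless = Vehicle()                  # violates the lower bound
```

Here `my_car.owner` evaluates to a list of two persons, and `ownerless.is_well_formed()` is false. A Person object holds no link back to its vehicles, which is what makes the reference unidirectional.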
In EMF, associations are unidirectional binary references. Unidirectional
references can only be navigated in one direction. In UML, associations
can be bidirectional and n-ary. For our purpose of meta-modeling, binary
references are typically sufficient. Higher-arity references can always be
handled by creating an explicit class that reifies the association (similar
to UML association classes). But, as said, we rarely need this in
language design.
Bidirectional references can be simulated using two unidirectional references.
EMF allows linking two unidirectional references using the EOpposite
property of the reference. In such a case, the generated code maintains
links in both directions: whenever you add a link in one direction, the link
in the other direction is updated automatically. The mechanism is a bit
complicated and has shortcomings, so test well when you rely on it. In
particular, a reference cannot be EOpposite to itself,² and special care might
be needed if you use references of multiplicity higher than 1.
² This sounds a bit complicated, but in fact it appears in real domains for symmetric associations
between objects of the same class. For example, consider a class Person and a unidirectional
reference marriedTo. One way to model this in EMF would be to make the reference a
bidirectional association, but this would require that it becomes an EOpposite of itself, which
is not supported.
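The idea of keeping two opposite references in sync can be illustrated in plain Python. This is a hand-written sketch of the principle, not the code that EMF generates; the helper functions `link` and `unlink` and the attribute `vehicles` are hypothetical names introduced for the illustration.

```python
class Person:
    def __init__(self, name: str):
        self.name = name
        self.vehicles = []  # reference Person -> Vehicle


class Vehicle:
    def __init__(self, plate: str):
        self.plate = plate
        self.owner = []     # reference Vehicle -> Person (the opposite)


def link(vehicle: Vehicle, person: Person) -> None:
    """Add the link in both directions, so that the two references agree."""
    if person not in vehicle.owner:
        vehicle.owner.append(person)
    if vehicle not in person.vehicles:
        person.vehicles.append(vehicle)


def unlink(vehicle: Vehicle, person: Person) -> None:
    """Remove the link in both directions."""
    if person in vehicle.owner:
        vehicle.owner.remove(person)
    if vehicle in person.vehicles:
        person.vehicles.remove(vehicle)


alice = Person("Alice")
my_car = Vehicle("WN 1234")
link(my_car, alice)
```

After the call to `link`, both `my_car.owner` and `alice.vehicles` reflect the same fact. Routing all updates through such helpers is what prevents the two directions from drifting apart.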
AA.5 Containment (Part-Of)
Associations can be used to denote a part-of relation (in contrast to the
kind-of relation of generalization). This is denoted using a black diamond
on the owner side, as shown in Fig. AA.8. In this example, we state that
each Vehicle contains four Wheels as its integral part. This means that a
Vehicle instance without four wheels cannot exist (such an instance is not
well-formed). When a Vehicle object is deallocated, the objects representing
wheels are also removed.
Figure AA.8: A containment
(part-of) relation
The black-diamond symbol is the UML syntax (notation) for the part-of
associations. The black-diamond semantics is that every object in such
a relation can only have one owner—so there could not be cars sharing
wheels, and that the owned objects exist only with the owner. Finally,
objects cannot be owners of themselves, so a directed sub-graph of an
object diagram (instance), in which all the links instantiate containment
edges must be a tree (or a forest).
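The single-owner rule can be sketched in Python as follows. This is an analogy under assumptions, not EMF's implementation: the `container` attribute and the methods `add_wheel` and `is_well_formed` are hypothetical names, and deallocation of parts together with their owner is not modeled.

```python
class Wheel:
    def __init__(self):
        self.container = None  # at most one owner at any time


class Vehicle:
    def __init__(self):
        self.wheels = []

    def add_wheel(self, wheel: Wheel) -> None:
        # A part cannot be shared: taking ownership removes the wheel
        # from its previous container first.
        if wheel.container is not None:
            wheel.container.wheels.remove(wheel)
        wheel.container = self
        self.wheels.append(wheel)

    def is_well_formed(self) -> bool:
        # The example in the text requires exactly four wheels.
        return len(self.wheels) == 4


car_a, car_b = Vehicle(), Vehicle()
wheels = [Wheel() for _ in range(4)]
for w in wheels:
    car_a.add_wheel(w)
complete_before_move = car_a.is_well_formed()

# Moving a wheel to another vehicle takes it away from the first one.
car_b.add_wheel(wheels[0])
```

After the move, `car_a` has only three wheels and is no longer well-formed, while the moved wheel has `car_b` as its only container. Because each part records a single container, the containment links between objects can never form a shared subtree.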
The black-diamond associations are interchangeably called “compositions,”
“aggregations,” and “part-of relations.” Out of these names, we find
“part-of” and “containment” most intuitive, and this is why we use them in
our book [5].
AA.6 Views on Class Models
So far it was not evident that we distinguish class models from class
diagrams. However, the difference is important when discussing modeling
in detail. Diagrams are mere views on models; thus, a diagram might
only be showing a fragment of a model, and multiple diagrams can show
overlapping fragments. The model is usually identified as the collection of
all model elements (all classes and relationships). Tools, including Eclipse
EMF, typically show models as abstract syntax trees and allow constructing
diagrams for parts of these trees.
Two views on class models are particularly interesting: a taxonomy and
partonomy. The taxonomy view is consistent with the standard use of this
term in knowledge classification:
Definition AA.3. A taxonomy of a class model is a diagram presenting
specialization-generalization relations (kind-of relations) between the
classes of this model.
Figure AA.3 presents an example of a taxonomy. A generalization view
is always a directed acyclic graph containing only classes and generalization
arrows. This graph may be disconnected if we have unrelated concepts in
the model’s taxonomy.
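The acyclicity of a taxonomy can be checked mechanically. The following Python sketch encodes the generalization arrows of Fig. AA.3 as a dictionary from each class to its direct generalizations and uses the standard-library `graphlib` module to detect cycles. The encoding and the function name `is_acyclic` are our own, and the two superclasses of HybridEngine are assumed from the text.

```python
from graphlib import CycleError, TopologicalSorter

# Each class maps to the set of classes that directly generalize it.
generalizations = {
    "CombustionEngine": {"Engine"},
    "ElectricMotor": {"Engine"},
    "GasEngine": {"CombustionEngine"},
    "DieselEngine": {"CombustionEngine"},
    "HybridEngine": {"GasEngine", "ElectricMotor"},
}


def is_acyclic(edges: dict) -> bool:
    """Return True if the generalization graph contains no cycle."""
    try:
        # prepare() raises CycleError when the graph has a cycle.
        TopologicalSorter(edges).prepare()
        return True
    except CycleError:
        return False


taxonomy_ok = is_acyclic(generalizations)
broken_ok = is_acyclic({"A": {"B"}, "B": {"A"}})
```

The engine taxonomy passes the check, whereas two classes that generalize each other do not. Note that HybridEngine has two generalizations, so the graph is a directed acyclic graph but not a tree.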
In general knowledge classification, a partonomy is a hierarchy that deals
with part–whole relationships (after Wiktionary). The use of this term in
class modeling is consistent with the general definition, relating
parts using the composition associations:
Definition AA.4. A partonomy of a class model is a diagram (a view) presenting
only the part-of relationships between classes, that is, a view presenting
the composition associations and classes.
A partonomy view is always a tree (or, more generally, a forest if we
have classes that are not associated using composition). This is due to
the semantics of containment, which disallows sharing of subtrees of the
partonomy (no class can be a part of two disjoint containers). Most modeling
tools, including Eclipse EMF, require a single, connected partonomy hierarchy;
thus forest partonomies are rarely seen in the practice of class modeling. A
single partonomy is usually achieved by creating a root class containing (via
composition associations) all roots of otherwise disconnected partonomies.
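The forest property described above, and the root-class trick for connecting a forest, can be sketched in Python. The code follows the definition given in the text (each class has at most one container and containment has no cycles). The pair encoding, the function names, and the class names Trailer, Hitch, and Model are hypothetical choices for the illustration.

```python
def is_forest(containments: list) -> bool:
    """containments is a list of (container_class, part_class) pairs."""
    parent = {}
    for container, part in containments:
        if part in parent and parent[part] != container:
            return False  # a part shared by two different containers
        parent[part] = container
    # Follow container links upward; revisiting a class means a cycle.
    for start in parent:
        seen = {start}
        node = start
        while node in parent:
            node = parent[node]
            if node in seen:
                return False
            seen.add(node)
    return True


def roots(containments: list) -> set:
    """Classes that contain something but are not contained themselves."""
    containers = {c for c, _ in containments}
    parts = {p for _, p in containments}
    return containers - parts


two_trees = [("Vehicle", "Wheel"), ("Vehicle", "Engine"), ("Trailer", "Hitch")]
# Connect the forest by adding a root class that contains every former root.
single_tree = two_trees + [("Model", r) for r in sorted(roots(two_trees))]
```

The list `two_trees` is a forest with the roots Vehicle and Trailer; after adding the root class, `single_tree` is still a forest but has Model as its only root. A list such as `[("Vehicle", "Wheel"), ("Trailer", "Wheel")]` fails the check because Wheel would be shared.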
Class diagrams share a lot of commonality with entity-relationship modeling
(E/R) used to specify database schemas. One key difference introduced
by class diagrams was including the taxonomy and partonomy in the model.
Neither of these was part of the original E/R model.
Further Reading
If you have never been exposed to class diagrams, we recommend the
book by Seidl et al. [4] called UML @ Classroom, which is among the best
textbooks on UML. Another good resource is the book by Fowler [2].
An excellent online resource that explains not only UML class
diagrams, but many other types of UML diagrams as well, is
https://www.uml-diagrams.org. Its page on class diagrams³ explains class diagrams and
instance specifications. Remember that the latter are not modeled separately
in an object diagram anymore, which is deprecated, but directly in the class
diagram. We discuss this in Sec. 3.9 in our book [5].
Furthermore, many reference guides (also known as cheat sheets) exist that
provide a brief overview of class diagrams, such as DZone’s Refcardz
at https://dzone.com/refcardz/getting-started-uml. Jordi Cabot provides an
overview of such cheat sheets at
https://modeling-languages.com/best-uml-cheatsheets-and-reference-guides.
Finally, since we use Ecore as the class-modeling language of our choice,
we also recommend directly looking at Tutorial B (Eclipse Modeling
Framework) and the list of further reading in it.
³ https://www.uml-diagrams.org/class-diagrams-overview.html
Exercises
Exercise AA.1. Change the model of Fig. AA.3 to more properly reflect the fact
that a hybrid engine is not a refinement of a combustion engine and an electric
motor, but has both of these as parts combined.
Exercise AA.2. A family consists of persons. Each person may be married to
another person. Each person may have a parent, and each parent may have
multiple children. Each person has exactly one name, exactly one age and exactly
one person number (a unique ID of type String). Each person may be enrolled in
a university. A university must own one or more study programs.
a) Create a simple class model using the tree editor of Eclipse following the
description of the domain given above.
b) Create a valid instance of your diagram representing Bob married to Alice,
with their son Sam enrolled in the “SDT programme” of “IT University”.
c) For pedagogical reasons we recommend using the tree editor and understanding
the relation between the tree editor and the diagram editor in Eclipse (or
any other modeling tool you are using). This will yield useful intuitions when
we start to build abstract syntax trees of models in the book.
d) Explore different views: Create three diagrams for your model, a complete
diagram (that contains all model elements), a diagram only showing the family
relations without the enrollment aspects, and a diagram showing university
enrollment without family aspects.
Exercise AA.3. Consider the following example class diagram.
This diagram is valid in the sense that we can construct its instances. Two
example instances are shown as simple instance specification diagrams to the
right (a–b). Now consider the three unrelated class diagrams that follow. For each
of the diagrams decide whether it is valid (well-formed). If it is valid, draw an
example of a non-empty instance diagram. If invalid, explain why.
a) b) c)
Figure AA.9
Exercise AA.4. In this exercise (a mini-project, in fact) we use class modeling as a
method for system comprehension. Recall from Section 1.2 in the book [5] that
using models in software development fosters knowledge conservation and reuse
by improving the domain understanding as a key strength.
We will use the implementation of JUnit 4 framework as a case study. We
assume that you are familiar with unit testing using JUnit, which will make the
exercises easier.
To avoid excessive use of time, for this and the following three exercises (that
should be solved in order) we bound the time to be used on them. This should give
you an impression of the expected level of detail. Read Exercises AA.4–AA.6
entirely before starting to solve this one.
Build a conceptual model of JUnit as a class model, based on available
user-oriented documentation. Start with reading the user-oriented documentation of
JUnit: http://junit.sourceforge.net/doc/cookbook/cookbook.htm or
https://github.com/junit-team/junit/wiki, but do ignore the javadoc for the time being. The high-level
documentation will give names of the key concepts, which will likely translate
to a handful of class names and relations between them. These concepts do not
necessarily correspond to low level implementation classes precisely.
Identify key concepts, objects, subsystems and record them as classes, associa-
tions, generalizations, and aggregations. For example when you find the concept
of Test, create the corresponding class. Then you encounter a concept of a Suite
that aggregates multiple tests. You can create a Suite class, and make it own one
or more tests using composition.
Record cardinalities precisely in your model. If you, at any point, encounter
constraints, dependencies between concepts, which cannot be expressed using
class diagrams, then note them down in English, either in a separate file, or in an
annotation. They will be input for exercises on constraints.
All modeling should be done using a modeling tool (not on paper, not using
a drawing tool). An indicative model size is circa 12 classes. Estimated time:
approximately 1 hour.
Obviously, this small project can be run on other frameworks than JUnit, be it
other implementations of unit testing frameworks, or any other software projects.
Exercise AA.5. In continuation of the above exercise, perform a cursory analysis of
the developer-orientedᵃ documentation of JUnit to refine your model. Developer
documentation for JUnit is essentially only javadoc, available at:
http://junit.org/javadoc/latest/. Start with concepts that seem to be already connected to elements
in your model. When you study them, refine the model appropriately. We do not
mean converting your model to an implementation-level model by just codifying
classes in JUnit’s source code. Rather, try adding further abstract concepts and
relations to your existing high-level model.
Do not grow your model too much. Focus on understanding whether the
selection of classes, associations and generalizations is correct (so whether the
lower level documentation confirms your initial sketch from the previous point).
Also try to understand and record any constraints (including cardinalities) that
you might have spotted.
The objective is not to create a diagram of implementation classes. So there
does not have to be (and should not be!) a one-to-one mapping between your
model and the classes in the JUnit implementation. We only look at lower level
artifacts to understand details of the system that were too hard to understand from
user documentation. Estimated time: 1–2 hours.
Exercise AA.6. In this exercise, we delve into the JUnit code. Hopefully, after
the first two steps, reading the JUnit code is relatively easier. JUnit is a small
and well implemented framework, done by some of the best programmers out
there. Whenever you get frustrated, remember that it is by orders of magnitude
better experience to read it than reading any code you might need to inspect or
understand in your job.
Obtain the source code from the JUnit GitHub repository (clone
https://github.com/junit-team/junit.git). For pedagogical reasons, it is better to work with a stable release
than with snapshot code that may be buggy. Switch to a stable release branch
after checking out the code.
It is good for the project to be set up, so that you can compile it. Then you can
use IDE searching, navigation support, tool tips, etc, to orientate yourself much
faster in the implementation. You can get a sanity check of the build environment
by running JUnit’s own unit tests (about 200 would likely fail out of 2000+, don’t
worry about that).
While studying the code, record new information as you learn it by enriching
and revising the class model, and your list of constraints. Again it is not our point
to reflect implementation classes one-to-one in your high-level model; rather to
add information and to correct what was misunderstood.
Task size guide: You will end up with ca. 25-30 classes, including those added
in the two prior exercises. Estimated time: about 2–3 hours, assuming that you are
reasonably fluent with the modeling tool, can read Java code and documentation,
and have used Git before.
ᵃ We mean a contributor to the JUnit project, and not developers writing tests using JUnit in other
projects.
Exercise AA.7. After completing the three exercises above, reflect how modeling
has supported the process of understanding the implementation of an unfamiliar
system. Has it made the investigation more systematic? Has it made the compre-
hension easier? Have you ever referred to your models when trying to understand
something in later phases? If you have worked in a group: did the models, and the
ability to draw, support group discussions? Would the created model help, if you
needed to explain what you understood to a colleague?
References
[1] Frank Budinsky, David Steinberg, Ed Merks, Raymond Ellersick, and Timothy J. Grose. Eclipse Modeling Framework. Addison-Wesley, 2004 (cit. p. 4).
[2] Martin Fowler. UML Distilled: A Brief Guide to the Standard Object Modeling Language. Addison-Wesley Professional, 2004 (cit. p. 7).
[3] Object Management Group. Unified Modeling Language Specification 2.5.1. https://www.omg.org/spec/UML. 2017 (cit. p. 1).
[4] Martina Seidl, Marion Scholz, Christian Huemer, and Gerti Kappel. UML @ Classroom: An Introduction to Object-Oriented Modeling. Springer, 2012 (cit. p. 7).
[5] Andrzej Wąsowski and Thorsten Berger. Domain-Specific Languages: Effective Modeling, Automation, and Reuse. Springer, 2023. URL: http://dsl.design (cit. pp. 1, 6, 7, 9).