All content in this area was uploaded by Lyubomir Blagoev on Apr 24, 2020
Semantics representation regarding establishing and maintaining a Semantic Interoperability
in the e-Governance’s environment
Lyubomir Blagoev, Kamen Spassov
Abstract
The paper proposes a terminological basis to help establish Semantic
Interoperability in the e-Governance environment. It gives stricter
definitions of commonly used terms and introduces new terms where they are
needed in that environment. The e-Governance lexis is established by laws
(E-Governance Act, etc.). Therefore, the proposed terminology could be considered
appropriate for realizing elements of Natural Language Processing in the
e-Governance environment.
Introduction
Each research effort that concerns semantics creates its own terminology. This is necessary to deal with the
specific problems of the corresponding development. But it is precisely this specificity that limits the possibility
of introducing and applying such terminology in areas outside that development.
That is why there is not yet a common terminology in this poorly described area of human knowledge. Any such
terminology would at the least look like a “self-description”, because a “self-description” is itself semantics.
The lack of a common terminology in the sphere of semantics complicates the work of establishing and
maintaining Semantic Interoperability in the e-Governance environment. Semantic Interoperability is
one of the four dimensions of Interoperability [1]; the others are technological, organizational and legal. The
participants in the e-Governance environment are very heterogeneous, which leads to different
understandings of the terms “data”, “concept”, “information” and many others. These differences
often show up in administrative slang, which varies not only between
administrations but also between legal texts.
The paper aims to suggest a suitable terminological basis for establishing Semantic Interoperability in the
e-Governance environment.
Semantics/Meaning
In Bulgarian, “semantics” and “meaning” are synonyms, although they have different definitions in the Glossary of
the Bulgarian language of BAS [2]. For the purposes of this paper we treat them as synonyms and use
the definition given for the word “meaning”: internal, logical content, comprehended by our mind;
the meaning of a given word or phrase. We accept this definition not only because the word is used more often
in general speech, but also because it contains an important key element: it is
comprehended by our mind, i.e. semantics is a human category, created by humans for humans in order to
communicate with one another.
That is why any machine presentation of semantics is preceded by some form of
presentation in the minds of humans. The main purpose of the machine form is to transfer the semantics
so created between humans and within groups of people without changing its extent.
An unambiguous presentation of semantics is a prerequisite for machine processing.
This definition of semantics/meaning is used in the paper as the foundation for defining the
means of its presentation.
Atomic semantics
Atomic semantics is indivisible semantics that cannot be decomposed into other components. It
can be viewed as a product of the decomposition of some more complex semantics, which means that it
is terminal in that decomposition.
Atomic semantics can be presented as a word or phrase, and it can be identified whether it is given verbally or
as text. Efficient human communication demands brevity, so
a short form of text presentation is used, which is called a “name”.
Sometimes a name alone cannot present atomic semantics fully. It is therefore
supplemented with an explanation, which has no formal limit on its length.
For example, the atomic semantics of the word “Yes” can be explained with a text such as “a positive answer to a
question”, which is suitable for building a user interface for e-Governance purposes.
On the other hand, atomic semantics can be viewed as a basis on which further semantics is
built. That is why atomic semantics can also be called basic or main semantics.
Single semantics
Single semantics is composed of a main semantics and an additional semantics. For example, the
semantics “person’s height” is completed by the semantics presented with a number. The explanation of
the semantics states that this number is interpreted as a length, expressed in centimeters. The
same number, taking part in a different Single semantics, could be interpreted differently under
another main semantics such as weight, speed, etc.
What is specific to Single semantics is that the context given by the main semantics refines the additional
semantics.
The additional semantics of a Single semantics is called its content. The content
can be presented not only as text but in any other form that humans can perceive directly
or with suitable technical means.
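The way a main semantics refines the same content can be illustrated with a minimal Python sketch. The class name SingleSemantics, its fields, the helper describe and the kilogram example are illustrative assumptions; they are not part of the terminology proposed in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SingleSemantics:
    """Hypothetical container: main semantics (name + explanation) plus content."""
    name: str          # short text form of the main semantics
    explanation: str   # states how the content is to be interpreted
    content: str       # the additional semantics, here a number given as text


def describe(item: SingleSemantics) -> str:
    # The full meaning is the main meaning together with the content.
    return f"{item.name} = {item.content} ({item.explanation})"


# The same content "180" under two different main semantics.
height = SingleSemantics("person's height", "length in centimeters", "180")
weight = SingleSemantics("person's weight", "mass in kilograms", "180")  # assumed unit

print(describe(height))
print(describe(weight))
```

The two objects carry identical content, yet their full meanings differ because the main semantics differs.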
Compound semantics
Compound semantics is composed of a main semantics and content that is itself composed of other
component semantics, each of which can be either single or compound. There must be at least one component
semantics. Other components may be present either mandatorily or conditionally, where the condition is
expressed by another mandatory component of the composition.
If two identical compositions of single and compound semantics are combined with different
main semantics, two different compound semantics are formed. For example, the compound semantics
embedded in “correspondence address” and in “delivery address” differ because the main
semantics of the two addresses differ, although the composition of semantics is the same in both cases. A
characteristic of compound semantics is that its main semantics refines the meaning presented by the
composition of other single and compound semantics. The meaning of the component semantics does not change.
The semantics presented by the composition of other single or compound semantics, which
complements the main semantics of a compound semantics, is called its content. For example, the content of the
compound data for an address is formed by its components: data for country, city, street, etc.
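The address example can be sketched in Python as follows. The classes Single and Compound and the sample values are hypothetical and serve only to show that one composition combined with two main semantics gives two different compound semantics.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Single:
    name: str
    content: str


@dataclass(frozen=True)
class Compound:
    name: str                      # main semantics of the compound
    content: Tuple[object, ...]    # components: Single or Compound items

    def __post_init__(self):
        # The text requires at least one component in the content.
        if not self.content:
            raise ValueError("compound semantics needs at least one component")


# One and the same composition of components ...
parts = (
    Single("country", "Bulgaria"),
    Single("city", "Sofia"),
    Single("street", "Example Street 1"),   # invented sample value
)

# ... combined with two different main semantics.
correspondence = Compound("correspondence address", parts)
delivery = Compound("delivery address", parts)

print(correspondence.content == delivery.content)  # same composition
print(correspondence == delivery)                  # different compound semantics
```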
Data: a summary of the presentation of Single and Compound semantics
The above makes it possible to establish a common construct for presenting single and compound semantics,
consisting of a main meaning and a meaning presented by its content. Such a construct is usually called
“data”. There are two types of data, single and compound, which present single and compound
semantics respectively.
However, the presentation of semantics itself requires a description, which also presents semantics.
We therefore have to define a specific type of data, “meta-data”, i.e. data that describe other data. But these
meta-data need to be described too. That is how we arrive at meta-meta-data, then meta-meta-meta-data, and so on.
Currently there is no reason to claim that this process converges, nor is it known how the
human mind deals with it. For presenting semantics in the e-Governance environment it is enough to
create an applicable solution, even if that solution does not reflect the whole nature of the process.
Model of data, instance of model
A set of data describing other data can be defined as a model. Data created according to a model are called
an “instance” of that model. To denote the connection between a model and its instance in text, it is
convenient to use the notation suggested in [3]: Instance[Model].
All instances of a model have the same main meaning, presented by the name and explanation of the model.
The full meaning of an instance is thus formed by the main meaning embedded in its
model and the meaning presented by the content of the instance.
The connection between model and instance can be expressed by defining a direct and an inverse
problem:
A. Direct problem: to create a set of data according to a given model.
B. Inverse problem: to recognize the model from a given set of data.
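The two problems can be sketched in Python under a strong simplifying assumption: a model is reduced to the set of field names it requires, and recognition compares field names only. The models "Person" and "Address", their fields, and the function names are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical models: each maps a model name to the field names it requires.
MODELS: Dict[str, frozenset] = {
    "Person": frozenset({"given_name", "family_name"}),
    "Address": frozenset({"country", "city", "street"}),
}


def create_instance(model_name: str, **content: str) -> Dict[str, str]:
    """First problem: create a set of data according to a given model."""
    required = MODELS[model_name]
    if set(content) != required:
        raise ValueError(f"content does not match model {model_name}")
    return dict(content)


def recognize_model(data: Dict[str, str]) -> Optional[str]:
    """Second problem: recognize the model by looking at a set of data."""
    for model_name, required in MODELS.items():
        if set(data) == required:
            return model_name
    return None  # no known model fits: the data stay unrecognized


address = create_instance("Address", country="Bulgaria", city="Sofia", street="Main 1")
print(recognize_model(address))
print(recognize_model({"color": "red"}))
```

Real recognition would also have to inspect the content, not only the field names; the sketch only shows that the second problem is a search over the known models.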
“Model of model”
It is possible to define a suitable set of data that is a model and an instance at the same time, i.e.
a construction Model[Model]. At this stage it would be very hard, or even impossible, to prove that there
exists such a construction whose derivatives (say ModelA[Model], ModelB[ModelA] and so on) can
present any type of semantics that humans work with.
However, the task of presenting semantics in the e-Governance environment is not that general and
wide-ranging, and it is solvable, because only two types of semantics need to be distinguished in this environment:
Semantics defined in legal texts
For this semantics the “meta-meta” chain is short and traceable. This makes it possible to create a suitable
construction Model[Model] that is applicable in the e-Governance environment. Such “applicability” does
not need to be proven, because if the construction Model[Model] turns out not to be applicable to a given
semantics, the legal definition of that semantics can be changed.
Semantics connected to the processing of semantics defined in legal texts
The construction Model[Model] is applicable to this type of semantics because
it serves the preceding type of semantics, for which applicability is guaranteed.
Therefore, the structure and composition of the data inside the construction, and the
construction Model[Model] itself, can be defined as “necessary and sufficient” for presenting semantics in the
e-Governance environment through it or its derivatives.
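A minimal Python sketch of a self-describing construction and its derivatives follows. The class InfoObject and the function meta_chain are illustrative assumptions; only the names Model, ModelA and ModelB and the Instance[Model] notation come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class InfoObject:
    name: str
    explanation: str
    model: Optional["InfoObject"] = None  # filled in right after creation

    def notation(self) -> str:
        # Instance[Model] notation used in the text.
        return f"{self.name}[{self.model.name}]"


def meta_chain(obj: InfoObject) -> List[str]:
    """Follow the "meta-meta" chain until an object that is its own model."""
    chain = [obj.name]
    while obj.model is not obj:
        obj = obj.model
        chain.append(obj.name)
    return chain


# Root construction: a model that is an instance of itself.
model = InfoObject("Model", "data describing other data")
model.model = model

# Derivatives, named as in the text.
model_a = InfoObject("ModelA", "a derived model", model)
model_b = InfoObject("ModelB", "a model derived from ModelA", model_a)

print(model.notation(), model_a.notation(), model_b.notation())
print(meta_chain(model_b))
```

Because the root is its own model, the chain of descriptions terminates after a finite number of steps instead of continuing to meta-meta-meta-data indefinitely.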
Information object
It is common to use sets of data without indicating that they are instances of a model, or without explicitly
pointing out that they can serve as a model for other instances. In such cases their formal presentation is not
of concern. What matters is only the existence and meaning of such sets of data and the possibility of
creating or destroying them as a whole. They are identified through their names.
In such cases the term “information object” is used. An information object is a
set of data that can be created, destroyed, or identified through its name as a whole. Often
just “object” is used instead of “information object”.
This allows models and instances of models to be treated as types of (information) objects.
Each object has a model that was used to create it, and it is clear whether the object itself is intended to be
used as a model. The meaning presented by the object is formed by the main meaning of its model and
the meaning presented by its content.
This treatment is suitable when handling data with regard only to their meaning and not to their formal
presentation. That is why “set of data” and “information object” are synonyms, used depending on
the context.
For example, for people who are not familiar with the formal presentation of data it is more appropriate to
use the term “information object”, not only when interacting with each other but also when working with
programmers, who are able to present the objects formally. This ensures semantic
interaction between two groups of people with different technical backgrounds.
Representation of Atomic semantics through information object type “Term”
As mentioned above, Atomic semantics is presented only by a name and an explanation. This is a set of data,
i.e. an information object, with no content. Objects of this type are terminal in the decomposition of
semantics, or in the composition of semantics along a “meta-meta” chain. That is why [4] introduces the name
“term” for this type of object. This makes it possible to use atomic semantics, as
Object[Term], to maintain Semantic Interoperability in e-Governance. To that end Term[Model] is defined in [4],
although implicitly. An Object[Term] presents atomic meaning, which is not refined
by adding content to it.
Unfortunately, when Semantic Interoperability was introduced in Bulgarian e-Governance, the definition
Model[Model] had not been introduced, so Term[Model] could not be defined properly. This created problems
that are described in more detail in [5].
For the purposes of machine processing, each Object[Term] is identified by a model identifier.
The identifier is unique in its space of identifiers/names, which in this case coincides with the e-Governance
environment.
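The role of a unique identifier within one space of names can be sketched in Python as follows. The classes Term and TermRegistry and the identifier format "T-0001" are hypothetical; they do not reproduce any identifier scheme defined in [4].

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Term:
    identifier: str    # hypothetical model identifier, unique in its name space
    name: str
    explanation: str   # a Term has a name and an explanation but no content


class TermRegistry:
    """One space of identifiers/names; identifiers must be unique within it."""

    def __init__(self) -> None:
        self._terms: Dict[str, Term] = {}

    def register(self, term: Term) -> None:
        if term.identifier in self._terms:
            raise KeyError(f"identifier already in use: {term.identifier}")
        self._terms[term.identifier] = term

    def meaning(self, identifier: str) -> str:
        # The meaning is extracted from the name and the explanation.
        term = self._terms[identifier]
        return f"{term.name}: {term.explanation}"


registry = TermRegistry()
registry.register(Term("T-0001", "Yes", "a positive answer to a question"))
print(registry.meaning("T-0001"))
```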
Representing Single semantics with information object type “Value”
As mentioned above, Single semantics, and correspondingly single data, include a main meaning and content.
The semantics presented by the content is processed in accordance with the main meaning and with the context in
which the processing is performed.
The content can present either Atomic or Compound semantics. An example of
Atomic semantics in the content of Single data is the meaning of the text string “Yes”, which was used above to
illustrate Atomic semantics.
The presentation of Compound semantics in the content of Single semantics falls into two categories:
A. Presentation that is subject to machine processing at the current level of ICT.
This kind of presentation allows machines to decompose the compound
semantics, extract its elements, and process the semantics in them. For example, presenting a date
in the machine format “dd,mm,yy” allows the day, month and year to be separated automatically and
processed together or separately, depending on the requirements of the context of use
embedded in the main meaning. Single data whose content presents Compound semantics through
machine-processable elements are called Structured data.
B. Presentation that is subject to processing only by humans at the current level of ICT.
Examples of such presentation are textual, graphical, audio-visual, etc. For example, Atomic
semantics presented by variations of the text “octagonal sign that requires vehicles
moving towards it to stop” cannot yet be reliably mapped by a machine
to the atomic semantics “Road sign STOP”. In such cases the
semantics is considered to be presented with Unstructured data.
Machine processing of the graphical image “Road sign STOP” is already available and can
recognize the atomic semantics of that road sign, but under certain conditions its level of reliability and
safety is unacceptable. If the level is acceptable in the current context, the image can be regarded as Structured data;
otherwise it is Unstructured data.
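The machine format “dd,mm,yy” can be illustrated with a minimal Python sketch. The function name parse_ddmmyy and the range checks are assumptions made for the illustration; the two-digit year is kept as given, because the text does not say how the century is determined.

```python
from typing import Optional, Tuple


def parse_ddmmyy(content: str) -> Optional[Tuple[int, int, int]]:
    """Split content coded as "dd,mm,yy" into (day, month, two-digit year).

    Returns None when the content does not follow the coding, which in the
    terms of the text means it cannot be treated as Structured data here.
    """
    parts = content.split(",")
    if len(parts) != 3:
        return None
    if not all(len(p) == 2 and p.isascii() and p.isdigit() for p in parts):
        return None
    day, month, year = (int(p) for p in parts)
    if not (1 <= day <= 31 and 1 <= month <= 12):
        return None
    return day, month, year


print(parse_ddmmyy("24,04,20"))
print(parse_ddmmyy("the twenty-fourth of April"))
```

The first call yields separately processable elements; the second shows the same semantics given as free text, which this coding cannot decompose.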
With a significant dose of optimism, we can believe that the existence of Unstructured data is temporary, lasting
only until the tools for processing natural language (Natural Language Processing, NLP), graphical, audio-visual
and other presentations of semantics are developed to a level comparable to that of humans.
What has been said so far about Single data and their connection to machine processing naturally
brings to mind the name “variable”, widely used in computer technologies. But different computer platforms
interpret this name in different ways, reflecting the functional features
of each platform. This makes it hard to use “variable” as a unified name for single data.
For the purpose of developing and maintaining Semantic Interoperability in e-Governance, the regulations
of the Electronic Governance Act [4] introduced the name “Value” by defining Value[Model].
Objects[Value] present single semantics whose content may be structured as well as unstructured.
The Objects[Value] created through this model are themselves models of instances that serve as the machine
presentation of single data. For this purpose each Object[Value] is given a description of the coding of
the content and of the features of its visualization, covering both content that complies with the regulated
coding and content that deviates from it.
In addition, so that machine processing can distinguish an instance of an Object[Value] from other
instances, the instance has to be given a tag, together with the space of names in which this tag is unique.
For the purposes of machine processing, the Objects[Model] are identified by a model identifier.
It is unique in the respective space of identifiers/names, which in this case coincides with the
e-Governance environment.
If the name space of the model identifiers coincides with the name space of the tags that
identify instances of the Objects[Value], the two identifiers may also coincide.
Presenting compound semantics through information object type “Segment”
As mentioned above, Compound semantics has content with a regulated structure, i.e. it is presented
with Structured data. This property is not affected by the presence of Unstructured data within the content
of Structured data, because the Unstructured data can be distinguished and separated in it. In this case the
processing of Structured data is reduced to the processing of their structure.
For the purpose of establishing and maintaining Semantic Interoperability in e-Governance, the
regulations of the Electronic Governance Act [4] introduce the name “Segment” for compound data
by defining Segment[Model]. Objects[Segment] therefore present compound semantics whose
content is constructed from other Compound or Single data. The Objects[Segment] are also models of
instances that serve as the machine presentation of Structured data.
Furthermore, so that machine processing can distinguish an instance of an Object[Segment] from other
instances, the instance is given a tag, together with the space of names in which this tag is unique.
For the purposes of machine processing, each Object[Segment] is identified by a
model identifier. It is unique in the corresponding space of identifiers/names, which in this case coincides with
the e-Governance environment.
If the name space of the model identifiers coincides with the name space of the tags that
identify instances of the Objects[Segment], the two identifiers may also coincide.
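How compound data built from single and compound parts might be put into an xml-construct can be sketched with Python's standard xml.etree.ElementTree module. The function build_segment, the nested-dictionary input and the tags "Person", "Name" and "Address" are hypothetical; they are not tags defined by [4].

```python
import xml.etree.ElementTree as ET
from typing import Dict, Union

# Hypothetical content of a compound datum: tag -> text (single data)
# or tag -> nested dict (compound data).
Content = Dict[str, Union[str, "Content"]]


def build_segment(tag: str, content: Content) -> ET.Element:
    """Build an xml-construct for compound data from single and compound parts."""
    element = ET.Element(tag)
    for child_tag, child in content.items():
        if isinstance(child, dict):
            element.append(build_segment(child_tag, child))   # compound part
        else:
            ET.SubElement(element, child_tag).text = child    # single part
    return element


person = build_segment("Person", {
    "Name": "Ivan Ivanov",                                    # invented sample values
    "Address": {"Country": "Bulgaria", "City": "Sofia"},
})
print(ET.tostring(person, encoding="unicode"))
```

Processing such an instance is reduced to walking its structure, which is what makes the data Structured in the sense used here.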
Objects type “Nomenclature”
These are a type of data analogous in structure to data of type “Value”, but their content
points to the main semantics/meaning of other data or to atomic semantics. The content
of such data is therefore an identifier (or name) that addresses, in the respective
space of names, a model of data or a definition of atomic semantics.
The presentation of semantics with data of type “Nomenclature” differs from that of other types of
data. The main meaning of this type of data presents a “closed question”. The possible answers are presented
by the meanings of the possible options for the content. This set of possible options is embedded in
the model of the respective “Nomenclature” data.
For the purpose of establishing and maintaining Semantic Interoperability in e-Governance, the
provisions of the Electronic Governance Act [4] introduce the name “Nomenclature” for this specific type of
single data by defining Nomenclature[Model]. The Objects[Nomenclature] created by this model are themselves
also models of instances that serve as the machine presentation of single data. For this purpose
each Object[Nomenclature] is assigned a tag, together with the space of names in which this
tag is unique.
For the purposes of machine processing, the Objects[Nomenclature] are identified by a model identifier. It
is unique in the respective space of identifiers/names, which in this case coincides with the e-Governance
environment. If the name space of the model identifiers coincides with the name space of the tags
that identify instances of Objects[Nomenclature], the two types of identifiers may also coincide.
A user interface for working with an Object[Nomenclature] provides not only visualization of the name
and explanation of the instance, but also means for choosing among the possible options for the
content. Such a choice can control a data-processing process.
Such a process is presented as a sequence of processing stages, each presented by the
corresponding pair “input data, output data”, which are of type Object[Segment] or, in particular cases, of
type Object[Value].
Branches in such sequences are presented by data of type Object[Nomenclature]. The simplest
branch, the IF/THEN/ELSE construct, is presented by an Object[Nomenclature] with two
possible options for the content: one for the THEN branch and one for the ELSE branch. The main meaning of the
Object[Nomenclature] presents the parameters and the way they are processed in order to choose
one of the options for the content of the object, i.e. one of the branches THEN/ELSE.
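The IF/THEN/ELSE branch can be sketched in Python as a choice among the options of a two-option nomenclature. The class Nomenclature, the function branch, the question "Application complete?" and both processing stages are invented for the illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Nomenclature:
    name: str                  # main meaning: the "closed question"
    explanation: str
    options: Dict[str, str]    # option identifier -> meaning of that option


def branch(nomenclature: Nomenclature, chosen: str,
           stages: Dict[str, Callable[[dict], dict]], data: dict) -> dict:
    """Run the processing stage selected by the chosen option."""
    if chosen not in nomenclature.options:
        raise ValueError(f"{chosen!r} is not an option of {nomenclature.name}")
    return stages[chosen](data)


# Hypothetical two-option nomenclature standing for IF/THEN/ELSE.
is_complete = Nomenclature(
    "Application complete?",
    "whether all required data were supplied",
    {"yes": "continue processing", "no": "return to applicant"},
)
stages = {
    "yes": lambda d: {**d, "status": "accepted"},   # THEN branch
    "no": lambda d: {**d, "status": "returned"},    # ELSE branch
}

print(branch(is_complete, "yes", stages, {"id": 1}))
print(branch(is_complete, "no", stages, {"id": 1}))
```

Each stage takes input data and returns output data, and only values listed in the model of the nomenclature are accepted as a choice.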
Concept as a summary of the means for presenting semantics
The described means for presenting semantics are summarized in Figure 1:
Data
The meaning embedded in data is formed from their main meaning
and the meaning presented with their content.

Term[Model]
    Main meaning: Name, Explanation

Value[Model]
    Main meaning: Name, Explanation
    Content: adds meaning through suitable coding of semantics presentation.
    Instance of the model: xml-construct of single data whose content is
    presented with the corresponding coding.

Nomenclature[Model]
    Main meaning: Name, Explanation
    Content: adds meaning through the main meaning of one of the models in the set.
    Instance of the model: xml-construct of single data containing an identifier
    of a model from a set of models, which forms the content of the data.

Segment[Model]
    Main meaning: Name, Explanation
    Content: adds meaning through the meaning embedded in other data.
    Instance of the model: xml-construct of composite data whose content is
    formed of other single and composite data.

Notes:
    The main meaning embedded in models presents “concepts”.
    The content of data adds meaning to the main meaning embedded in them.
    Instances of models of data are xml-constructs.

Figure 1. Means for semantics presentation
The summary can be reduced to the following theses and explanations:
A. The models of Object[Term] present “concepts without content”. The meaning of such objects is formed
by the main meaning embedded in their name and explanation. A meaning defined in this way
can be referenced by the name or the model identifier of the Object[Term]. Using
this identifier/name, the corresponding object can be found and its meaning extracted from its
name and explanation.
B. The models of data, i.e. of Object[Value], Object[Nomenclature] and Object[Segment], present
“concepts with content”. The meaning of the instances of such models is formed from the main
meaning of the models (embedded in the name and explanation) and the meaning embedded in the
content of the instances.
C. The instances of models of data are presented by means of a formalism suitable for machine
processing, for example as an xml-construct whose tag is specified in the corresponding model of
data.
D. The method of coding/presenting semantics/meaning with the content of data (i.e. with the xml-construct)
is regulated in their model, and this serves as the requirement for creating software that supports this
coding/presenting.
E. The method of visualizing the meaning/semantics embedded in the content of data is defined in
their model, and this serves as the requirement for creating software that supports such visualization.
F. In this way the whole meaning/semantics embedded in data that are machine-presented (for example) with an
xml-construct is put into a form suitable for human perception:
a. The main meaning, presented by the name and explanation in the model of the xml-construct,
is conveyed by visualizing the name and the explanation of the data model.
b. The complementary meaning, embedded in the content of the data, i.e. in the content of the
xml-construct, is conveyed by visualizing that content according to the rules
embedded in the data model.
Therefore, each information object presents a concept through its main meaning, embedded in its name
and explanation. As a further development of this conclusion, each information object has a
conceptual use, irrespective of the rest of the semantics embedded in it.
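Points C to F can be illustrated with a minimal Python sketch in which the tag of an xml-construct selects the data model, and the model supplies both the main meaning and the visualization of the content. The class ValueModel, the tag "Height" and the visualization rule are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ValueModel:
    tag: str                          # tag of the xml-construct, set in the model
    name: str                         # main meaning: name
    explanation: str                  # main meaning: explanation
    visualize: Callable[[str], str]   # how the content is shown to a person


# Hypothetical set of models, addressed by the tag of their instances.
MODELS: Dict[str, ValueModel] = {
    "Height": ValueModel("Height", "Person's height",
                         "length in centimeters", lambda c: f"{int(c)} cm"),
}


def present(xml_text: str) -> str:
    """Combine the model's main meaning with the visualized content."""
    element = ET.fromstring(xml_text)
    model = MODELS[element.tag]
    return f"{model.name} ({model.explanation}): {model.visualize(element.text)}"


print(present("<Height>180</Height>"))
```

The instance itself carries only the tag and the content; everything a person needs in order to understand it comes from the model.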
Unstructured data, information
Earlier it was mentioned that composite data whose structure can be recognized are Structured
data. For convenience, in practice they are simply called “data”. When the structure of the data
cannot be recognized, the name “Unstructured data” is used explicitly.
Usually a more general notion, called “information", is used to represent semantics/meaning.
It does not specify whether machine processing is possible. Moreover, in most cases it is
understood that the meaning embedded in information will only be processed/perceived by humans.
The fact that both unstructured data and information can be processed only by humans makes them equivalent
in this respect, and defines a specific presentation of semantics/meaning for which an applicable
description/model and machine tools for processing are (at this stage) lacking. The difference is that for
unstructured data the possibility that a model exists is not disputed: the model is merely missing, or
inaccessible, or accessible but of such poor quality that it is unsuitable for data processing. Therefore, if an
applicable model of unstructured data is provided, they become structured and can undergo machine processing.
For information, the opposite view usually dominates: a model representing its semantics is assumed to be
lacking from the outset, although the possibility that a relevant, applicable model may emerge over time is not excluded.
The difference between the terms "information" and "unstructured data" is very subtle and sometimes
disputable in practical use.
For example, Regulation 910/2014 [6] introduces a definition of a document which, in the terms of this
paper, can in general be reduced to “a document is any set of data”. Since it is not specified
whether the data are structured or unstructured, the definition covers not only documents containing arbitrary
text, graphics or other kinds of images, etc., but also documents with a formal presentation of data in their
content.
Concluding notes
The means for presenting semantics suggested in this work cover the area of Semantic
Interoperability in the e-Governance environment. Applying them in administrative practice requires
little effort to overcome the various kinds of administrative slang. They are also applicable to definitions that are
not mentioned in this paper.
The terminological system suggested in this work is a suitable foundation for creating models of concepts,
data, and processes for establishing and maintaining Semantic Interoperability in the e-Governance
environment. The very idea of e-Governance includes, as a mandatory element, the
automation of information processing in those administrative activities for which this is technologically possible
and legally permitted.
Important areas for such automation include the processing of natural-language text (NLP), the
presentation of rules and control of their formal application, and others. Such problems will be
solved under the conditions of Semantic Interoperability, as a further development of the terminological
system for semantics presentation suggested in this work.
Bibliography
[1] European Commission, “European Interoperability Framework,” 2005.
[2] Institute for Bulgarian language by BAS, “Glossary of the Bulgarian language,” http://ibl.bas.bg/rbe/.
[3] Lyubo Blagoev, Tihomir Blagoev, “Semantic Network Based Architecture,” ResearchGate, 2019.
[4] National Assembly of the Republic of Bulgaria, “Electronic Governance Act,” State Gazette, 2007.
[5] Blagoev L., Spassov K., Manolov S., Dimitrov G., Antonova A., “Analysis and recommendations for further
development of the National model of data and processes in the administration,” ResearchGate, 2013.
[6] European Commission, “Regulation 910,” 2014.