
Nikolaos Konstantinou
The University of Manchester · School of Computer Science
Ph.D.
About
Publications: 92
Reads: 24,347
Citations: 633
Introduction
I am a Research Fellow at the University of Manchester, School of Computer Science, currently working in the VADA project. I obtained my PhD and B.Eng from the School of ECE, NTUA, Greece. I have extensive work experience in several positions of the software development cycle as well as in research roles in numerous international research projects. My research interests are in data science and knowledge management. I am a member of the Technical Chamber of Greece and a Senior Member of the IEEE.
Additional affiliations
October 2015 - present
May 2013 - September 2015
September 2011 - May 2013
Publications (92)
Format transformation is one of the most labor-intensive tasks of a data wrangling process. Recent advances in programming by example proposed synthesis algorithms that showed promising results on spreadsheet data. However, when employed on repositories consisting of multiple sources and a large number of examples, such algorithms manifest scalabilit...
Data preparation, whether for populating enterprise data warehouses or as a precursor to more exploratory analyses, is recognised as being laborious, and as a result is a barrier to cost-effective data analysis. Several steps that recur within data preparation pipelines are amenable to automation, but it seems important that automated decisions can...
Background
Data scientists spend considerable amounts of time preparing data for analysis. Data preparation is labour-intensive because the data scientist typically takes fine-grained control over each aspect of each step in the process, motivating the development of techniques that seek to reduce this burden.
Results
This paper presents an archit...
Source selection is the problem of identifying a subset of available data sources that best meet a user's needs. In this paper we propose a user-driven approach to source selection that seeks to identify sources that are most fit for purpose. The approach employs a decision support methodology to take account of a user's context, to allow end users...
Data scientists are usually interested in a subset of sources with properties that are most aligned to intended data use. The SOURCERY system supports interactive multi-criteria user-driven source selection. SOURCERY allows a user to identify criteria they consider of importance and indicate their relative importance, and seeks a source selection r...
Data analysis often uses data sets that were collected for different purposes. Indeed, new insights are often obtained by combining data sets that were produced independently of each other, for example by combining data from outside an organization with internal data resources. As a result, there is a need to discover, clean, integrate and restruct...
Machine learning can be applied in applications that take decisions that impact people’s lives. Such techniques have the potential to make decision making more objective, but there is also a risk that the decisions can discriminate against certain groups as a result of bias in the underlying data. Reducing bias, or promoting fairness, has been a fo...
Data analytics stands to benefit from the increasing availability of datasets that are held without their conceptual relationships being explicitly known. When collected, these datasets form a data lake from which, by processes like data wrangling, specific target datasets can be constructed that enable value-adding analytics. Given the potential v...
Behaviour Driven Development (BDD) is an agile testing technique that enables software requirements to be specified as example interactions with the system, using structured natural language. While (in theory) being readable by non-technical stakeholders, the examples can also be executed against the code base to identify behaviours that are not ye...
The process of preparing potentially large and complex data sets for further analysis or manual examination is often called data wrangling. In classical warehousing environments, the steps in such a process are carried out using Extract-Transform-Load platforms, with significant manual involvement in specifying, configuring or tuning many of them....
In Behaviour-Driven Development (BDD), the behaviour of the software to be built is specified as a set of example interactions with the system, expressed using a “Given-When-Then” structure. The examples are written using customer language, and are readable by end-users. They are also executable, and act as tests that determine whether the implemen...
The process of preparing potentially large and complex data sets for further analysis or manual examination is often called data wrangling. In classical warehousing environments, the steps in such a process have been carried out using Extract-Transform-Load platforms, with significant manual involvement in specifying, configuring or tuning many of...
Data wrangling, the multi-faceted process by which the data required by an application is identified, extracted, cleaned and integrated, is often cumbersome and labor intensive. In this paper, we present an architecture that supports a complete data wrangling lifecycle, orchestrates components dynamically, builds on automation wherever possible, is...
Several approaches have been proposed in the literature for offering RDF views over databases. In addition to these, a variety of tools exist that allow exporting database contents into RDF graphs. The approaches in the latter category have often been shown to perform better than the ones in the former. However, when database conten...
This book explains the Linked Data domain by adopting a bottom-up approach: it introduces the fundamental Semantic Web technologies and building blocks, which are then combined into methodologies and end-to-end examples for publishing datasets as Linked Data, and use cases that harness scholarly information and sensor data. It presents how Linked D...
In this chapter, we introduce and discuss the problems that Linked Data solve and the concepts that are related to these problems. We introduce and analyze the basic concepts that are related to the generation of Linked Data and the Semantic Web in general. We provide a brief history of the Semantic Web and the associated evolution of concepts, pro...
This chapter introduces the semantic modeling procedure, detailing its technical characteristics, possibilities and limitations. First, we present the languages that are used for semantic description. We present RDF, RDFS and OWL, describe their expressiveness in terms of describing Web Resources, and the abilities they provide in order to describe...
In this Chapter, we consider relational databases as a data source for the generation of Linked Data, given that they constitute one of the most popular data storage media, containing huge data volumes that feed the vast majority of information systems worldwide. In this context, we review the related literature and reveal the main motivations that...
In this Chapter, we focus on dealing with data originating from sensor data streams, in order to materialize an intelligent, semantically-enabled data layer. First, we introduce the concepts that are covered in this Section: real-time, context-awareness, windowing, information fusion. Next, we mention the difficulties associated with the attempt of...
This chapter provides an overview of the methodologies and technologies that support Linked Data designing and publishing. More specifically, this chapter starts with a presentation of the rationale and a discussion about how data can be opened up (i.e. published under an open license). Basic principles are first introduced regarding the cases in w...
In this Chapter, we summarize and discuss the material presented throughout this book. We recapitulate what is presented and discussed in each Chapter. We discuss the most interesting aspects of the Web of Data landscape, highlighting its main contributions, and then continue with a discussion, mentioning our most important observations, including...
Purpose
– This paper aims to introduce a transformation engine which can be used to convert an existing institutional repository installation into a Linked Open Data repository.
Design/methodology/approach
– The authors describe how the data that exist in a DSpace repository can be semantically annotated to serve as a Semantic Web (meta)data repos...
Purpose
– The purpose of this paper is to study advantages and challenges of electronic academic textbook (e-textbook) for the Hellenic higher education and the publishing community. In the higher education domain, the shift to e-textbook adoption entails numerous benefits. However, reluctance is noted in students as well as in publishers, impeding...
In addition to tools offering RDF views over databases, a variety of tools exist that allow exporting database contents into RDF graphs; in many cases, the latter have been shown to perform better than the former. However, when database contents are exported into RDF, it is not always optimal or even necessary to dump the whole database...
As far as digital repositories are concerned, numerous benefits emerge from making their contents available as Linked Open Data (LOD). This is leading more and more repositories in this direction. However, several factors need to be taken into account in doing so, among which is whether the transition needs to be materialized in real-time or in asyn...
This paper introduces a novel sensor information fusion system enabling security and surveillance in large-scale, sensor-saturated urban environments. The system is built over state-of-the-art sensor networks middleware and provides information fusion at multiple layers. A distinguishing characteristic of the system is that it supports seamless integ...
This paper presents a lightweight approach for providing web-based location aware multimedia content retrieval through Java enabled handheld devices. The main distinguishing characteristic of the proposed approach is that it separates the positioning system from the content access mechanisms, while being generic to the selection of the localization...
In this paper, we present a three-layer flexible architecture which intends to help developers and end users to take advantage of the full potential that modern sensor networks can offer. The proposed architecture deals with issues regarding data aggregation, data enrichment and finally, data management and querying using semantic Web techniques. S...
This paper introduces an approach to mapping relational database contents to ontologies. The current effort is motivated by the need to include in the Semantic Web volumes of web data not satisfied by current search engines. A graphical tool is developed in order to ease the mapping procedure and export enhanced ontologies linked to database e...
This paper proposes a middleware architecture for the automated, real-time, unsupervised annotation of low-level context features and their mapping to high-level semantics. The distinguishing characteristic of this architecture is that both low level components such as sensors, feature extraction algorithms and data sources, and high level componen...
The Linked Open Data (LOD) movement is constantly gaining worldwide acceptance. In this paper we describe how LOD is generated in the case of digital repositories that contain bibliographic information, adopting international standards. The available options and respective choices are presented and justified while we also provide a technical descri...
In cases where the information being collected and processed originates from a multi-sensor system, one has to be aware of the impact of the approaches followed in designing the overall system behavior. Typically, there are many steps involved in processing sensory information, from the time a real-world event takes place until the information that de...
This paper analyzes and proposes Expowave, a distributed algorithm for the scheduling of an RFID reader network. The behavior of the algorithm is presented in detail, and its performance is evaluated through a set of simulation experiments. It is demonstrated that the algorithm constitutes an efficient approach to the reader anti-collision problem,...
As data proliferates at increasing rates, the need for real-time stream processing applications increases as well. In the same way that data stream management systems have emerged from the database community, there is now a similar concern in managing dynamic knowledge among the Semantic Web community. Unfortunately, early relevant approaches are t...
Despite the proliferation of RFID systems and applications, there is still no easy way to develop, integrate and deploy non-trivial RFID solutions. Indeed, the latter comprise various middleware modules (e.g., data collection and filtering, generation of business events, integration with enterprise applications), which must be deployed and configur...
The increasing availability of small-size sensor devices during the last few years and the large amount of data that they generate has led to the necessity for more efficient methods regarding data management. In this chapter, we review the techniques that are being used for data gathering and information management in sensor networks and the advan...
This paper investigates the problem of the real-time integration and processing of multimedia metadata collected by a distributed sensor network. The discussed practical problem is the efficiency of the technologies used in creating a Knowledge Base in real-time. Specifically, an approach is proposed for the real-time, rule-based semantic enrichmen...
After several years of research, the fundamental Semantic Web technologies have reached a high maturity level. Nevertheless, the average Web user has not yet taken advantage of their full potential. In this paper, we introduce the Semantic Web bottleneck, analyse the main problems that perpetuate it and suggest ways to overcome it. In particular, we...
This paper presents an evaluation of using the low cost Bluetooth wireless technology, as a localisation technique, fitted in a lightweight approach for providing web-based location aware content through Java enabled handheld devices. This approach separates the positioning system from the content access mechanisms, while being generic and independ...
The major problem with available information is that the form in which it is published usually lacks both syntactic and semantic homogenisation. There is a huge amount of information, both online and in less interlinked sources. The existence of metadata that would add a higher degree of clarity cannot be taken for granted. The question that arises i...
In this paper, we present a three-layer flexible architecture which intends to help developers and end users to take advantage of the full potential that modern sensor networks can offer. The proposed architecture deals with issues regarding data aggregation, data enrichment and finally, data management and querying using Semantic Web techniques. S...
In this paper we discuss the problem of mapping relational database contents and ontologies. The motivation lies in the fact that, in recent years, the evolution of Web technologies has rendered the addition of intelligence to the information residing on the Web a necessity. We argue that the addition of formal semantics to the databases that st...
In this paper we examine the requirements for deploying advanced Location Based Guidance Services in museum and/or exhibition environments, and we propose an architectural approach that copes with these requirements. The proposed architecture provides automatic and on demand audiovisual content retrieval, both on-site and through the Web, to differ...
This paper introduces an approach to mapping relational database contents to ontologies. The current effort is motivated by the need to include in the Semantic Web volumes of web data not satisfied by current search engines. A graphical tool is developed in order to ease the mapping procedure and export enhanced ontologies linked to database en...
In this paper we present E-Museum, a system for providing advanced audiovisual guidance services in museums (and exhibitions) to different classes of users. These services include automatic and on-demand audiovisual content retrieval, both on-site and through the Web. On-site services are provided through handheld devices, which exploit the user’s...