Simon Scerri

Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS | IAIS · Department of Organized Knowledge

PhD

About

83
Publications
55,301
Reads
1,838
Citations
Additional affiliations
March 2011 - February 2014
University of Galway
Position
  • PostDoc Position

Publications (83)
Conference Paper
Full-text available
In this paper, we consider a dataset compiled from online job adverts for consecutive fixed periods, to identify whether repeated and automated observation of skills requested in the job market can be used to predict the relevance of skillsets and the predominance of skills in the near future. The data, consisting of co-occurring skills observed in...
Chapter
Digital service users are routinely exposed to Privacy Policy consent forms, through which they enter contractual agreements consenting to the specifics of how their personal data is managed and used. Nevertheless, despite renewed importance following legislation such as the European GDPR, a majority of people still ignore policies due to their len...
Conference Paper
Full-text available
Digital service users are routinely exposed to Privacy Policy consent forms, through which they enter contractual agreements consenting to the specifics of how their personal data is managed and used. Nevertheless, despite renewed importance following legislation such as the European GDPR, a majority of people still ignore policies due to their len...
Conference Paper
Full-text available
Understanding the needs of highly dynamic job market sectors is of crucial importance to job seekers, employers, and educational bodies alike. This paper describes efforts to identify skill demand composition and dynamics by constructing and interpreting a time series of skills networks that are routinely identified through an established agglomer...
Conference Paper
Full-text available
Increasing data volumes have extensively increased application possibilities. However, accessing this data in an ad hoc manner remains an unsolved problem due to the diversity of data management approaches, formats and storage frameworks, resulting in the need to effectively access and process distributed heterogeneous data at scale. For years, Sem...
Preprint
Full-text available
Whereas the availability of data has seen a manyfold increase in past years, its value can only be shown if the data variety, one of the prominent Big Data challenges, is effectively tackled. The lack of data interoperability limits the potential of its collective use for novel applications. Achieving interoperability through the full transformati...
Presentation
Full-text available
Advances in Data Management methods have resulted in a wide array of storage solutions having varying query capabilities and supporting different data formats. Traditionally, heterogeneous data was transformed off-line into a unique format and migrated to a unique data management system, before being uniformly queried. However, with the increasing...
Conference Paper
Full-text available
The last two decades witnessed a remarkable evolution in terms of data formats, modalities, and storage capabilities. Instead of having to adapt one's application needs to the previously limited storage options available, today there is a wide array of options to choose from to best meet an application's needs. This has resulted in vast amounts of d...
Conference Paper
Full-text available
Squerall is a tool that allows the querying of heterogeneous, large-scale data sources by leveraging state-of-the-art Big Data processing engines: Spark and Presto. Queries are posed on-demand against a Data Lake, i.e., directly on the original data sources without requiring prior data transformation. We showcase Squerall's ability to query five di...
Conference Paper
Full-text available
Institutions from different domains require the integration of data coming from heterogeneous Web sources. Typical use cases include Knowledge Search, Knowledge Building, and Knowledge Completion. We report on the implementation of the RDF Molecule-Based Integration Framework MINTE+ in three domain-specific applications: Law Enforcement, Job Market...
Conference Paper
Full-text available
Although the use of apps and online services comes with accompanying privacy policies, a majority of end-users ignore them due to their length, complexity and unappealing presentation. In light of the General Data Protection Regulation (GDPR), now enforced EU-wide, we present an automatic technique for mapping privacy policy excerpts to relevant...
Article
Full-text available
The analysis of increasingly large and diverse data for meaningful interpretation and question answering is handicapped by human cognitive limitations. Consequently, semi-automatic abstraction of complex data within structured information spaces becomes increasingly important, if its knowledge content is to support intuitive, exploratory discovery....
Conference Paper
Full-text available
With the omnipresent availability and use of cloud services, software tools, Web portals or services, legal contracts in the form of license agreements or terms and conditions regulating their use are of paramount importance. Often the textual documents describing these regulations comprise many pages and cannot be reasonably assumed to be read an...
Conference Paper
Full-text available
The rapid changes in the job market, including a continuous year-on-year increase in new skills in sectors like information technology, have resulted in new challenges for job seekers and educators alike. The former feel less informed about which skills they should acquire to raise their competitiveness, whereas the latter are inadequately prepared...
Conference Paper
Full-text available
The management and analysis of large-scale datasets – described with the term Big Data – involves the three classic dimensions volume, velocity and variety. While the former two are well supported by a plethora of software components, the variety dimension is still rather neglected. We present the BDE platform – an easy-to-deploy, easy-to-use and a...
Article
Full-text available
In enterprises, Semantic Web technologies have recently received increasing attention from both the research and industrial side. The concept of Linked Enterprise Data (LED) describes a framework to incorporate benefits of Semantic Web technologies into enterprise IT environments. However, LED still remains an abstract idea lacking a point of origi...
Conference Paper
Full-text available
In enterprises, Semantic Web technologies have recently received increasing attention from both the research and industrial side. The concept of Linked Enterprise Data (LED) describes a framework to incorporate benefits of Semantic Web technologies into enterprise IT environments. However, LED still remains an abstract idea lacking a point of origi...
Conference Paper
Full-text available
Ignoring End-User License Agreements (EULAs) for online services due to their length and complexity is a risk undertaken by the majority of online and mobile service users. This paper presents an Ontology-Based Information Extraction (OBIE) method for EULA term and phrase extraction to facilitate a better understanding by humans. An ontology capt...
Article
Full-text available
Semantic technologies in enterprises have recently received increasing attention from both the research and industrial side. The concept of Linked Enterprise Data (LED) describes a framework to incorporate benefits of semantic technologies into enterprise IT environments. However, LED still remains an abstract idea lacking a point of origin, i.e.,...
Conference Paper
Full-text available
The Web of Data is an increasingly rich source of information, which makes it useful for Big Data analysis. However, there is no guarantee that this Web of Data will provide the consumer with truthful and valuable information. Most research has focused on Big Data's Volume, Velocity, and Variety dimensions. Unfortunately, Veracity and Value, often...
Article
Full-text available
We conduct a systematic survey with the aim of assessing open government data initiatives, that is, any attempt, by a government or otherwise, to open data that is produced by a governmental entity. We describe the open government data life-cycle and we focus our discussion on publishing and consuming processes required within open government data...
Conference Paper
Full-text available
Many LOD datasets, such as DBpedia and LinkedGeoData, are voluminous and process large amounts of requests from diverse applications. Many data products and services rely on full or partial local LOD replications to ensure faster querying and processing. While such replicas enhance the flexibility of information sharing and integration infrastructu...
Chapter
Full-text available
Recent trends in ubiquitous computing aim to provide user-controlled servers that offer a single point of access for managing different personal data in different Online Social Networks (OSNs), i.e. profile data and resources from various social interaction services (e.g., LinkedIn, Facebook, etc.). Ideally, personal data should remain independe...
Conference Paper
Full-text available
The di.me userware is a pervasive personal information management system that successfully adopted ontologies to provide various intelligent features. Supported by a suitable user interface, di.me provides ontology-driven support for the (i) integration of personal information from multiple personal sources, (ii) privacy-aware sharing of personal d...
Conference Paper
Full-text available
The di.me userware is a pervasive personal information management system that successfully adopted ontologies to provide various intelligent features. Supported by a suitable user interface, di.me provides ontology-driven support for the i) integration of personal information from multiple personal sources, ii) privacy-aware sharing of personal d...
Conference Paper
Full-text available
People working in an office environment suffer from large volumes of information that they need to manage and access. Frequently, the problem is due to machines not being able to recognise the many implicit relationships between office artefacts, and also due to them not being aware of the context surrounding them. In order to expose these relation...
Article
Full-text available
Although a number of initiatives provide personalized context-aware guidance for niche use cases, a standard framework for context awareness remains lacking. This article explains how semantic technology has been exploited to generate a centralized repository of personal activity context. This data drives advanced features such as personal situatio...
Article
Full-text available
Future generation networks target collecting intelligence from multiple sources based on end-users' data and their social interaction in order to draw useful conclusions on enabling users to exercise their rights to online privacy. These networks form a rising class of service-oriented broker platforms. Designers and providers of such network platfo...
Conference Paper
Instance matching targets the extraction, integration and matching of instances referring to the same real-world entity. In this paper we present a weighted ontology-based user profile resolution technique which targets the discovery of multiple online profiles that refer to the same person identity. The elaborate technique takes into account profi...
Conference Paper
Full-text available
The increase in use of smart devices nowadays provides us with a lot of personal data and context information. In this paper we describe an approach which allows users to define and register rules based on their personal data activities in an event processor, which continuously listens to perceived context data and triggers any satisfied rules. We...
Conference Paper
Full-text available
Today's personal devices provide a stream of information which, if processed adequately, can provide a better insight into their owner's current activities, environment, location, etc. In treating these devices as part of a personal sensor network, we exploit raw and interpreted context information in order to enable the automatic recognition of pe...
Conference Paper
The di.me userware visualizes vast personal information from various sources and allows for sharing it in a decentralized social network. Multiple identities can be used to avoid unintended linkability when communicating with other users or external systems. This paper presents the di.me user interface for this purpose. A user-centered information-...
Conference Paper
Full-text available
The di.me userware is a decentralised personal information sharing system with a difference: extracted information and observed personal activities are exploited to automatically recognise personal situations, provide privacy-related warnings, and recommend and/or automate user actions. To enable reasoning, personal information from multiple device...
Conference Paper
Full-text available
Recommender systems depend on the amount of available and processable information for a given purpose. Trends towards decentralized online social networks (OSNs), promising more user control by means of privacy preserving mechanisms, lead to new challenges for (social) recommender systems. Information that recommender algorithms rely on is no longer...
Conference Paper
The average person today is required to create and separately manage multiple online identities in heterogeneous online accounts. Their integration would enable a single entry point for the management of a person's digital personal information. Thus, we target the extraction, retrieval and integration of these identities, using a comprehensive onto...
Conference Paper
Full-text available
New trends in pervasive computing allow for hosting user-controlled servers for integrating the respective user’s social spheres. One main feature of such servers is the provision of a single point for managing the user’s data and resources from various social interaction services (e.g., LinkedIn, Facebook, etc.). A step forward would be to include the c...
Conference Paper
Trust calculation to inform privacy recommendations based on context information involvement (e.g. location information, nearby people) is an increasing need in pervasive environments. In this paper we present a multidimensional trust metric designed for access control decisions in scenarios of the EU funded digital.me project. Thereby each involve...
Conference Paper
A new generation of distributed social networks is promising to give back users full control over their personal information as shared in private and business life. However, there are many aspects to this control, such as information ownership, access to third parties and limited persistence. This paper compares various existing solutions against a...
Conference Paper
Full-text available
Users are currently required to create and separately manage duplicated personal data in heterogeneous online accounts. Our approach targets the crawling, retrieval and integration of this data, based on a comprehensive ontology framework which serves as a standard format. The motivation for this integration is to enable single point management...
Article
Efforts by the pervasive, context-aware system development community have over the years produced a wide variety of context-aware techniques and frameworks. However, the bulk of this technology tends to be strictly tied to a native system, thus largely limiting its external adoption. In addressing this limitation, we introduce an interoperable contex...
Article
Full-text available
Nowadays, smart devices perceive a large amount of information from device sensors, usage, and other sources which contribute to defining the user's context and situations. The main problem is that although the data is available, it is not processed to help the user deal with this information easily. Our approach is based on the assumption that, gi...
Chapter
Effective support for knowledge work has to center on the activities and needs of the individual knowledge worker. The EU-funded project NEPOMUK realized a comprehensive work environment for improved personal knowledge work: the Social Semantic Desktop. Based on semantic web technology, the NEPOMUK Social Semantic Desktop allows access to informati...
Article
Full-text available
A new trend in pervasive personal server hosting is to enable the integration of a user's social spheres. Ideally, the design of access control to private data should be flexible and independent of the target host. Personal data should also remain independent of environmental constraints, e.g., in order to support easy migration to new deployment la...
Article
As the number of social websites offering tagging facilities increases, tagging has become not only a common basis for user participation, but also an important aspect of social content. Tagging is primarily based on the user's participation and interaction, including the sharing and the exchange of their interests. However, even though users can c...
Conference Paper
Digital means of communications such as email and IM have become a crucial tool for collaboration. Taking advantage of the fact that information exchanged over these media can be made persistent, a lot of research has strived to make sense of the ongoing communication processes in order to support the participants with their management. In this Cha...
Conference Paper
Full-text available
Email can be considered as a virtual working environment in which users are constantly struggling to manage the vast amount of exchanged data. Although most of this data belongs to well-defined workflows, these are implicit and largely unsupported by existing email clients. Semanta provides this support by enabling Semantic Email - email enhanc...
Conference Paper
Taking advantage of the fact that knowledge exchanged within digital working environments can be made persistent, a lot of research has strived to make sense of the ongoing communications in order to support the participants with their shared management. Semantic technology has been applied for the purpose as it ensures a shared understanding of th...
Conference Paper
Semanta provides a simple interface for semantic email. It enables machines to support email users with correctly interpreting, handling and keeping track of action items within email messages, visualizing email workflows, and extracting tasks and appointments from email messages.
Conference Paper
Full-text available
Semanta is a system supporting Semantic Email, implemented as an add-in to two popular Mail User Agents, using existing email transport technology and integrated with the Social Semantic Desktop. It enables machines to support email users with correctly interpreting, handling and keeping track of action items within email messages, visualizing emai...
Conference Paper
Full-text available
In this paper we present Semanta – a fully-implemented system supporting Semantic Email Processes, integrated into the existing technical landscape and using existing email transport technology. By applying Speech Act Theory, knowledge about these processes can be made explicit, enabling machines to support email users with correctly interpreting,...
Conference Paper
Full-text available
In this paper we analyse a social structure of an online community defined through tagging practices, and investigate whether useful knowledge about the evolution of social networks can be mined through tagging practices, and whether a more representative social structure has any influence on the tagging experience itself. The results from tagg...
Conference Paper
The vision of the Social Semantic Desktop defines a user’s personal information environment as a source and end-point of the Semantic Web: Knowledge workers comprehensively express their information and data with respect to their own conceptualizations. Semantic Web languages and protocols are used to formalize these conceptualizations and for coor...
Conference Paper
Semanta provides a simple interface for semantic email. It enables machines to support email users with correctly interpreting, handling and keeping track of action items within email messages, visualizing email workflows, and extracting tasks and appointments from email messages.
Conference Paper
Full-text available
As the number of Web 2.0 sites offering tagging facilities for the users' voluntary content annotation increases, so do the efforts to analyze social phenomena resulting from generated tagging and folksonomies. Most of these efforts provide different views for the understanding of various Web activities. Results from various experimental research s...
Conference Paper
Full-text available
In this paper, we introduce a formal email workflow model based on traditional email, which enables the user to define and execute ad-hoc workflows in an intuitive way. This model paves the way for semantic annotation of implicit, well-defined workflows, thus making them explicit and exposing the missing information in a machine processable way. Gr...
Conference Paper
Full-text available
The lack of structure in the content of email messages makes it very hard for data channelled between the sender and the recipient to be correctly interpreted and acted upon. As a result, the purposes of messages frequently end up not being fulfilled, prompting prolonged communication and stalling the disconnected workflow that is characteristic of...
Conference Paper
The complete lack of structure and semantics in email content is one reason why data channeled between the sender and the recipient is hard to interpret correctly and act upon. This causes information overload and tedious personal information management, and jeopardizes the disconnected workflow that is characteristic of email. Through Semanta,...
Article
Full-text available
There is a growing interest in how we represent and share tagging data in collaborative tagging systems. Conventional tags, meaning freely created tags that are not associated with a structured ontology, are not naturally suited for collaborative processes, due to linguistic and grammatical variations, as well as human typing errors. Additionally...
Article
Full-text available
In this paper we provide a summary of work that has been pursued in the area of Semantic Email, with a particular focus on our work in the area. The aim of this paper is to provide a status quo for this topic, as well as to generate ideas and discussions that could evolve the topic and take it to new heights. We finish off by outlining future direc...