Conference Paper
Personalized Environmental Service Configuration and
Delivery Orchestration: The PESCaDO Demonstrator
Leo Wanner1,2, Marco Rospocher8, Stefanos Vrochidis6, Harald Bosch3, Nadjet
Bouayad-Agha2, Ulrich Bügel4, Gerard Casamayor2, Thomas Ertl3, Desiree
Hilbring4, Ari Karppinen5, Ioannis Kompatsiaris6, Tarja Koskentalo7, Simon Mille2,
Jürgen Moßgraber4, Anastasia Moumtzidou6, Maria Myllynen7, Emanuele Pianta8,
Horacio Saggion2, Luciano Serafini8, Virpi Tarvainen5, Sara Tonelli8
1Catalan Institute for Research and Advanced Studies, 2Dept. of Information and
Communication Technologies, Pompeu Fabra University, 3Visualization Institute, University of
Stuttgart, 4Fraunhofer Institute for Optronics, System Technologies and Image Exploitation,
5Finnish Meteorological Institute, 6Informatics and Telematics Institute, Centre for Research
and Technology Hellas, 7Helsinki Region Environmental Services Authority, 8Fondazione
Bruno Kessler
pescado@upf.edu; http://www.pescado-project.eu
Abstract. Citizens are increasingly aware of the influence of environmental and
meteorological conditions on the quality of their life. This results in an increasing
demand for personalized environmental information, i.e., information that is
tailored to citizens' specific context and background. In this demonstration, we
present an environmental information system that addresses this demand in its
full complexity in the context of the PESCaDO EU project. Specifically, we
show a system that supports the submission of user-generated queries related to
environmental conditions. From the technical point of view, the system is tuned to
discover reliable data on the web and to process these data in order to convert
them into knowledge, which is stored in a dedicated repository. At run time, this
information is transferred into an ontology-based knowledge base, from which
the information relevant to the specific user is deduced and communicated in
the language of their preference.
1 Research background
Citizens are increasingly aware of the influence of environmental and meteorological
conditions on the quality of their life. One of the consequences of this awareness is
the demand for high-quality environmental information that is tailored to one's specific
context and background (e.g. health conditions, travel preferences), i.e., which is
personalized. Personalized environmental information may need to cover a variety of
aspects (such as meteorology, air quality, pollen, and traffic) and take into account a
number of specific personal attributes (health, age, allergies, etc.) of the user, as well as
the intended use of the information. For instance, a pollen-allergic person planning
outdoor activities may want to be notified whether the pollen situation in the area may
trigger symptoms, or whether the temperature is too hot for physical exercise, while a
city administrator has to be informed whether the current air quality situation requires
urgent action.
So far, only a few approaches have been proposed for how this information can be
provided in technical terms. All of these approaches focus on a single environmental
aspect, and only very few of them address the problem of information
personalization [2], [7], [9]. We aim to address the above task in its full complexity.
In this work, carried out in the context of the PESCaDO EU project, we take
advantage of the fact that the World Wide Web already hosts a great range of services
(i.e. websites that provide environmental information) offering data on each of the
above aspects, such that, in principle, the required basic data are available. The
challenge is threefold: first, to discover and orchestrate these services; second, to
process the obtained data in accordance with the needs of the user; and, third, to
communicate the gained information in the user's preferred mode.
The demonstration will aim, in particular, at showing how semantic web technologies
are exploited to address these challenges in PESCaDO.
2 The PESCaDO Platform: Main modules and key semantic
technologies used
The challenges mentioned in Section 1 require the involvement of a large number of
rather heterogeneous applications addressing various complex tasks: discovery of the
environmental service nodes on the web, distillation of the data from webpages,
orchestration of the environmental service nodes, fusion of environmental data,
assessment of the data with respect to the needs of the addressee, selection of
user-relevant content and its delivery to the addressee, and, finally, interaction with the
user. Thus, in PESCaDO we developed a service-based infrastructure to integrate all
these applications.
For a general overview of the running PESCaDO service platform1 and the type of
information produced, see http://www.youtube.com/watch?v=c1Ym7ys3HCg. In
this section, we focus on three tasks we addressed by applying semantic web
technologies.
The backbone of the PESCaDO service platform, exploited in each of these three
tasks, is an ontology-based knowledge base, the PESCaDO Knowledge Base (PKB),
in which all the information relevant to a user request is dynamically instantiated. The
ontology, partially built by exploiting automatic key-phrase extraction techniques [8],
formalizes a variety of aspects related to the application context: environmental data,
environmental nodes2, user requests, user profiles, warnings and recommendations
triggered by environmental conditions, logico-semantic relations (e.g. cause,
implication) between facts, and so on. The current version of the ontology consists of
241 classes, 672 individuals, 151 object properties, and 43 datatype properties.
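To make the role of the PKB concrete, the dynamic instantiation of request-relevant facts can be illustrated with a minimal triple-store sketch. This is a hypothetical illustration only: the class, property, and individual names below are invented, and the real PKB is an OWL ontology with DL semantics, not a flat set of tuples.

```python
# Minimal sketch of ontology-style fact instantiation in a knowledge base.
# All class/property names here are invented for illustration; the actual
# PKB uses an OWL ontology with 241 classes and DL reasoning.

class KB:
    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def query(self, s=None, p=None, o=None):
        # Simple triple-pattern matching: None acts as a wildcard.
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

kb = KB()
# Instantiate a user request, a user profile, and one environmental datum.
kb.add("request1", "rdf:type", "UserRequest")
kb.add("request1", "hasLocation", "Helsinki")
kb.add("request1", "hasUser", "user1")
kb.add("user1", "sensitiveTo", "BirchPollen")
kb.add("datum1", "rdf:type", "PollenConcentration")
kb.add("datum1", "aboutPollen", "BirchPollen")
kb.add("datum1", "hasValue", 35)

# Retrieve all facts instantiated for the request.
print(sorted(kb.query(s="request1")))
```

Even this toy version shows why a shared knowledge base is useful: every downstream service (decision support, text planning) can pose pattern queries against the same dynamically instantiated facts.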
2.1 Discovery of environmental nodes
The first step towards the extraction and indexing of environmental information is the
discovery of environmental nodes, which can be considered a problem of domain-specific
search.
1 A more comprehensive description of the system workflow can be found in [10].
2 An environmental node is a provider of environmental data values, for instance a website,
a web service, or a measuring station.
To this end, we implemented a node discovery framework that builds upon
state-of-the-art domain-specific search techniques, advanced content distillation,
ontologies, and supervised machine learning. The framework consists of three main
parts: a) web search, b) post-processing, and c) indexing and storage. Web search is
realized with
the aid of a general-purpose search engine, which accesses large web indices. In this
implementation, we employ the Yahoo! Search BOSS API. To generate domain-specific
queries, we apply two complementary techniques. First, we use the ontology of
the PKB to extract concepts and instances referring to types of environmental data
(e.g. temperature, birch pollen, PM10) and combine them with city names
automatically retrieved from geographical resources. In addition, the queries are
expanded with keyword spices [6], i.e., domain-specific keywords extracted from
environmental websites with the aid of machine learning techniques.
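The query generation step described above can be sketched as a simple combination of ontology terms, place names, and keyword spices. The term lists below are invented examples, not the actual vocabularies used in PESCaDO:

```python
from itertools import product

# Hypothetical sketch of domain-specific query generation: environmental
# data types (from the ontology) are paired with city names and expanded
# with "keyword spices" [6]. All lists are illustrative examples.
data_types = ["temperature", "birch pollen", "PM10"]
cities = ["Helsinki", "Espoo"]
keyword_spices = ["air quality", "forecast"]

def generate_queries(data_types, cities, spices):
    queries = []
    for dtype, city in product(data_types, cities):
        base = f"{dtype} {city}"
        queries.append(base)
        # Expand each base query with every domain-specific keyword spice.
        queries.extend(f"{base} {spice}" for spice in spices)
    return queries

queries = generate_queries(data_types, cities, keyword_spices)
print(len(queries))  # 3 data types x 2 cities x (1 base + 2 spices) = 18
```

Each resulting string would then be forwarded to the general-purpose search engine; the spice expansion is what biases the otherwise generic web search toward the environmental domain.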
During the post-processing step, we perform supervised classification with Support
Vector Machines to separate relevant from irrelevant nodes, and we crawl each website
to further expand our search in an iterative manner. The determination of the relevance
of the nodes and their categorization is done using a classifier that operates on a
weight-based vector of key phrases and concepts from the content and the structure of
the webpages. Subsequently, we parse the body and the metadata of the relevant
webpages in order to extract the structure and the clues that reveal the information
presented. Finally, the information obtained for each relevant node is indexed in a
Sensor Observation Service (SOS) [5] compliant repository, which the system accesses
when a user request is submitted.
The whole discovery procedure is automatic; however, an administrative user can
intervene through an interactive user interface in order to select geographic regions
of interest for the discovery, optimize the selection of keyword spices, and
parametrize the training of the classifiers.
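The relevance decision over a weight-based key-phrase vector can be illustrated as follows. Note that this is only a sketch: the actual system trains a Support Vector Machine, whereas here a linear scorer with hand-set weights stands in for the learned model, and all key phrases, weights, and the threshold are invented:

```python
# Illustrative stand-in for the SVM node classifier: a linear scorer over
# a weight-based vector of key-phrase counts. Phrases, weights, and the
# threshold are invented; in the real system they are learned from data.

WEIGHTS = {"air quality": 2.0, "pollen": 1.5, "forecast": 1.0,
           "recipe": -2.0, "shopping": -1.5}
THRESHOLD = 1.0

def feature_vector(page_text):
    # Count occurrences of each key phrase in the page content.
    text = page_text.lower()
    return {phrase: text.count(phrase) for phrase in WEIGHTS}

def is_relevant(page_text):
    vec = feature_vector(page_text)
    score = sum(WEIGHTS[k] * v for k, v in vec.items())
    return score >= THRESHOLD

print(is_relevant("Daily air quality forecast and pollen counts"))  # True
print(is_relevant("Best shopping deals and dinner recipes"))        # False
```

In the deployed framework the same idea applies, except that the feature vector also covers page structure and the decision boundary is learned by the SVM rather than set by hand.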
2.2 Processing raw environmental data to obtain content
The user interface of the PESCaDO system guides the user in formulating a request,
which is instantiated in all its details (e.g. type of request, user profile, time period,
geographic location) in the PKB. By exploiting Description Logics (DL) reasoning on
the PKB, the system determines from the request description which types of
environmental data constitute the raw content necessary to fulfil the user's needs. A
specific component of the system is then responsible for selecting from the SOS
repository the actual values (observed, forecasted, historical) for the selected types of
environmental data, and for appropriately instantiating them in the PKB.
At this stage, the raw data retrieved from the environmental nodes are processed to
derive additional personalized content from them, such as data aggregations, qualitative
scaling of numerical data, and user-tailored recommendations and warnings triggered
by the environmental data relevant to the specific user query. Logico-semantic relations
are also instantiated at this stage, for instance to represent that a certain pollen
concentration value causes the triggering of a recommendation to the user, due to the
user's sensitivity to that pollen.
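Qualitative scaling and user-tailored warning triggering, as described above, can be sketched as follows. The thresholds, profile fields, and message texts are invented for illustration; in the system they are derived from the ontology and the decision support rules:

```python
# Sketch of qualitative scaling of numerical data and user-tailored
# warning triggering. Thresholds and profile fields are illustrative
# assumptions, not the actual PESCaDO values.

def scale_pollen(grains_per_m3):
    # Map a numerical pollen concentration onto a qualitative scale.
    if grains_per_m3 < 10:
        return "low"
    if grains_per_m3 < 100:
        return "moderate"
    return "high"

def warnings_for(user_profile, pollen_type, concentration):
    level = scale_pollen(concentration)
    msgs = []
    if pollen_type in user_profile.get("allergies", []) and level != "low":
        # This mirrors the logico-semantic 'cause' relation: the
        # concentration value causes the warning, due to the user's
        # sensitivity to that pollen.
        msgs.append(f"{level} {pollen_type} pollen may trigger your allergy")
    return msgs

profile = {"allergies": ["birch"]}
print(warnings_for(profile, "birch", 75))  # warning for the sensitive user
print(warnings_for(profile, "grass", 75))  # no warning: not in the profile
```

The same raw concentration thus yields different personalized content for different user profiles, which is exactly the personalization step the PKB instantiation supports.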
The computation of this inferred content is performed by the decision support
service of the PESCaDO Platform by combining complementary reasoning strategies,
including DL reasoning and rule-based reasoning. A two-layer reasoning infrastructure
is currently in place. The first layer exploits the HermiT reasoner for the OWL
DL reasoning services. The second layer, stacked on top of the first, uses the Jena
RETE rule engine, which performs the rule-based reasoning computation.
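The rule-based second layer can be illustrated with a naive forward-chaining sketch over facts produced by the DL layer. This is not the Jena RETE algorithm (which incrementally matches rule conditions in a network); it is a minimal fixpoint loop with one invented rule and invented facts, showing only the input/output behaviour of such a layer:

```python
# Naive forward-chaining sketch of the rule-based reasoning layer.
# The real system uses the Jena RETE rule engine; rule and facts here
# are invented for illustration.

facts = {("user1", "sensitiveTo", "BirchPollen"),
         ("datum1", "pollenType", "BirchPollen"),
         ("datum1", "level", "high")}

def rule_trigger_warning(facts):
    # IF a user is sensitive to a pollen AND a datum reports that pollen
    # at a high level THEN derive a warning fact for that user.
    new = set()
    for (u, p1, pollen) in facts:
        if p1 != "sensitiveTo":
            continue
        for (d, p2, pt) in facts:
            if p2 == "pollenType" and pt == pollen \
                    and (d, "level", "high") in facts:
                new.add((u, "receivesWarning", d))
    return new

def forward_chain(facts, rules):
    # Apply all rules repeatedly until no new fact is derived (fixpoint).
    facts = set(facts)
    while True:
        derived = set().union(*(r(facts) for r in rules)) - facts
        if not derived:
            return facts
        facts |= derived

closed = forward_chain(facts, [rule_trigger_warning])
print(("user1", "receivesWarning", "datum1") in closed)  # True
```

Stacking this layer on top of the DL layer keeps the division of labour clean: subsumption and classification stay in OWL/HermiT, while domain-specific trigger rules of this IF/THEN form live in the rule engine.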
2.3 Generating user information from content
As is common in Natural Language Generation, our information generator is divided
into two major modules: the text planning module and the linguistic generation module
(with the latter taking as input the text plan produced by the former).
Text Planning The text planning module is divided into a content selection module and
a discourse structuring module. As is common in report generation, our content
selection is schema- (or template-) based. To this end, the ontology of the PKB
introduced above defines a class Schema with an n-ary schema component object
property whose range can be any individual of the PKB itself.
Similar to [1], we assume the output of the discourse structuring module to be a
well-formed text plan which consists of (i) elementary discourse units (EDUs) that
group together individuals of the PKB, (ii) discourse relations between EDUs and/or
individuals of the PKB, and (iii) precedence relations between EDUs. This structure
translates into two top classes of the ontology of the PKB: EDU, with an n-ary EDU
component relation and a linear precedence property, and Discourse Relation,
with nucleus and satellite relations. A set of SPARQL query rules is defined to
instantiate the various concepts and relations.
Content Selection (CS) operates on the output of the decision support service. It
selects the content to be included in the report and groups it by topic, instantiating a
number of schemas for each topic. The inclusion of a given individual in a schema
can be subject to some restrictions defined in the queries; for example, if the minimum
and maximum air quality index (AQI) values are identical, or if the maximum AQI
value triggers a user recommendation or warning, then only the maximum AQI value is
selected (the minimum AQI rating is omitted).
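The AQI restriction described above can be written out as a small selection function. The function and field names are invented; the rule itself (identical min/max, or a warning-triggering max, suppresses the minimum value) is taken directly from the text:

```python
# Sketch of the content selection restriction on AQI values: if the
# minimum and maximum AQI are identical, or the maximum triggers a user
# recommendation/warning, only the maximum AQI value enters the schema.
# Function and field names are invented for illustration.

def select_aqi_content(aqi_min, aqi_max, max_triggers_warning):
    if aqi_min == aqi_max or max_triggers_warning:
        return [("AQI-max", aqi_max)]
    return [("AQI-min", aqi_min), ("AQI-max", aqi_max)]

print(select_aqi_content(3, 3, False))  # only the (shared) max value
print(select_aqi_content(2, 4, True))   # max suppresses min: it warns
print(select_aqi_content(2, 4, False))  # both values are reported
```

In the actual system such restrictions are encoded in the SPARQL queries that fill the schemas, rather than in procedural code.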
Discourse structuring is carried out by a pipeline of three rule-based submodules: (i)
Elementary Discourse Unit (EDU) Determination, which groups topically related PKB
individuals into propositional units starting from the schemas determined during CS;
(ii) Mapping logico-semantic relations to discourse relations; and (iii) EDU Ordering,
which introduces a precedence relation between EDUs using a number of heuristics
derived from interviews with domain communication experts.
Linguistic generation Our linguistic generation module is based on a multilevel
linguistic model of the Meaning-Text Theory (MTM) [4], such that generation consists
of a series of mappings between structures of adjacent strata (from the conceptual
stratum to the linguistic surface stratum): Conceptual Structure (ConStr) → Semantic
Structure (SemStr) → Deep-Syntactic Structure (DSyntStr) → Surface-Syntactic
Structure (SSyntStr) → Deep-Morphological Structure (DMorphStr) →
Surface-Morphological Structure (SMorphStr) → Text. Starting from the conceptual
stratum, for each pair of adjacent strata Si and Si+1, a transition grammar Gi,i+1 is
defined; see [3].
The ConStr is derived from each text plan produced by the text planning component.
In a sense, a ConStr can thus be considered a projection of selected fragments of the
ontologies onto a linguistically motivated structure. ConStrs are language-independent
and thus an ideal starting point for multilingual generation.
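The stratified architecture amounts to composing transition grammars, one per pair of adjacent strata. The toy sketch below compresses the chain to two mappings and uses invented structures; the real transition grammars of [3] are far richer, but the compositional shape is the same:

```python
# Sketch of MTM-style stratified generation as a pipeline of transition
# mappings between adjacent strata. The two toy grammars below are
# invented; the actual system traverses six strata (ConStr ... Text).

def con_to_sem(constr):
    # ConStr -> SemStr: pick out the predicate and its arguments.
    return {"predicate": constr["event"], "args": constr["participants"]}

def sem_to_text(semstr):
    # Collapses the remaining strata (DSynt, SSynt, morphology) into one
    # linearization step, purely for brevity.
    return f"{semstr['args'][0]} {semstr['predicate']} {semstr['args'][1]}."

def generate(constr, grammars):
    structure = constr
    for g in grammars:          # apply G1, G2, ... in sequence
        structure = g(structure)
    return structure

constr = {"event": "exceeds", "participants": ["PM10", "the threshold"]}
print(generate(constr, [con_to_sem, sem_to_text]))
```

Because only the later grammars are language-specific, swapping them out while keeping the ConStr fixed is what makes the architecture naturally multilingual.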
3 System Demonstration
The system demonstration will show how the PKB is instantiated and exploited by the
different services composing the PESCaDO Platform in the context of two different
application scenarios, one about health safety decision support for end users and one
about administrative decision support. In particular, the demo attendees will have the
chance to see how the raw environmental data are dynamically processed with ontology-
based techniques to obtain reports. Furthermore, we will demonstrate how to use and
set up the tool for environmental node discovery.
Acknowledgments
The work described in this paper has been partially funded by the European Commis-
sion under the contract number FP7-248594, PESCaDO (Personalized Environmental
Service Configuration and Delivery Orchestration) project.
References
1. N. Bouayad-Agha, G. Casamayor, L. Wanner, F. Díez, and S. López Hernández. Footbowl:
Using a generic ontology of football competition for planning match summaries. In
Proceedings of the Eighth Extended Semantic Web Conference (ESWC), pages 230–244, 2011.
2. Kostas D. Karatzas. State-of-the-art in the dissemination of AQ information to the general
public. In Proceedings of EnviroInfo, pages 41–47, 2007.
3. F. Lareau and L. Wanner. Towards a generic multilingual dependency grammar for text
generation. In T. King and E.M. Bender, editors, Proceedings of the GEAF07 Workshop,
pages 203–223, Stanford, CA, 2007. CSLI.
4. I.A. Mel'čuk. Dependency Syntax: Theory and Practice. SUNY Press, Albany, 1988.
5. 52 North. Sensor Observation Service (SOS), 2004.
6. S. Oyama, T. Kokubo, and T. Ishida. Domain-specific web search with keyword spices.
IEEE Transactions on Knowledge and Data Engineering, 16(1):17–27, 2004.
7. G. Peinel, T. Rose, and R. San José. Customized information services for environmental
awareness in urban areas. In Proceedings of the 7th World Congress on Intelligent Transport
Systems, Turin, Italy, 2000.
8. S. Tonelli, M. Rospocher, E. Pianta, and L. Serafini. Boosting collaborative ontology build-
ing with key-concept extraction. In Proceedings of 5th IEEE International Conference on
Semantic Computing (September 18-21, 2011 - Palo Alto, USA), 2011.
9. L. Wanner, B. Bohnet, N. Bouayad-Agha, F. Lareau, and D. Nicklaß. MARQUIS: Generation
of user-tailored multilingual air quality bulletins. Applied Artificial Intelligence, 24(10):914–
952, 2010.
10. L. Wanner, S. Vrochidis, S. Tonelli, J. Moßgraber, H. Bosch, A. Karppinen, M. Myllynen,
M. Rospocher, N. Bouayad-Agha, U. Bügel, G. Casamayor, T. Ertl, I. Kompatsiaris,
T. Koskentalo, S. Mille, A. Moumtzidou, E. Pianta, H. Saggion, L. Serafini, and V. Tarvainen.
Building an environmental information system for personalized content delivery. In
Proceedings of the ISESS 2011, Brno, Czech Republic, pages 169–176. Springer, 2011.