Article

Abstract

Monitoring process performance is an important means for organizations to identify opportunities to improve their operations. The definition of suitable Process Performance Indicators (PPIs) is a crucial task in this regard. Because PPIs need to be in line with strategic business objectives, the formulation of PPIs is a managerial concern. Managers typically start out to provide relevant indicators in the form of natural language PPI descriptions. Therefore, considerable time and effort have to be invested to transform these descriptions into PPI definitions that can actually be monitored. This work presents an approach that automates this task. The presented approach transforms an unstructured natural language PPI description into a structured notation that is aligned with the implementation underlying a business process. To do so, we combine Hidden Markov Models and semantic matching techniques. A quantitative evaluation on the basis of a data collection obtained from practice demonstrates that our approach works accurately. Therefore, it represents a viable automated alternative to an otherwise laborious manual endeavor.
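To make the two-stage idea concrete, the sketch below is a hypothetical illustration (not the authors' implementation) of the second stage only: linking a phrase extracted from a natural language PPI description to the most similar activity label of a process model. Plain string similarity from the Python standard library stands in for the semantic matching techniques of the paper; the activity labels, example phrase, and function name are made up for illustration.

    # Minimal sketch (illustrative only): after a tagger has isolated the measured
    # concept in a PPI description, a matching step can link that phrase to the
    # most similar activity of the process model implementation.
    from difflib import SequenceMatcher

    def best_matching_activity(ppi_phrase, activity_labels):
        """Return the activity label most similar to the extracted PPI phrase."""
        scored = [(label, SequenceMatcher(None, ppi_phrase.lower(), label.lower()).ratio())
                  for label in activity_labels]
        return max(scored, key=lambda pair: pair[1])

    # Hypothetical example: a PPI about "time to approve the purchase order"
    # matched against activities of a procurement process model.
    activities = ["Create purchase order", "Approve purchase order", "Ship goods"]
    print(best_matching_activity("approve the purchase order", activities))
    # ('Approve purchase order', ...)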




... As far as PPI measurability is concerned, it is normally assumed in the literature that PPIs can be measured as long as so-called process event logs [3] containing relevant data are available [4,5]. On this basis, the literature about PPIs mainly concentrates on PPI definition, ...
... with several examples of conceptual models and notations to define PPIs (e.g., [5,6]). More recent approaches also consider how to derive formal PPI definitions from natural language descriptions [7,4]. ...
... Here, the accuracy, consistency, completeness, maturity, and volume dimensions are considered, but the proposed approach is open to be extended with other dimensions. This model is novel in the literature for two reasons: first, a PPI measurability model is currently lacking and PPI measurability is usually taken for granted as long as process event logs are available [4,5]; second, the literature considers the data quality of event logs a crucial issue only in the event log extraction phase, neglecting the impact that it can have on other phases of the business process lifecycle, such as process monitoring; ...
Article
The efficiency and effectiveness of business processes are usually evaluated by Process Performance Indicators (PPIs), which are computed using process event logs. PPIs can be insightful only when they are measurable, i.e., reliable. This paper proposes to define PPI measurability on the basis of the quality of the data in the process logs. Then, based on this definition, a framework for PPI measurability assessment and improvement is presented. For the assessment, we propose novel definitions of PPI accuracy, completeness, consistency, timeliness and volume that contextualise the traditional definitions in the data quality literature to the case of process logs. For the improvement, we define a set of guidelines for improving the measurability of a PPI. These guidelines may concern improving existing event logs, for instance through data imputation, implementation or enhancement of the process monitoring systems, or updating the PPI definitions. A case study in a large-sized institution is discussed to show the feasibility and the practical value of the proposed framework.
... The first type of approach is based on indicators and time patterns. Works that collect and analyze performance related to Key Performance Indicators (KPIs) are crucial to ensure consistent and continuous process optimization (Del-Río-Ortega et al., 2016), (Mendes and Santos, 2016), (Van der Aa et al., 2017), (El Hadj Amor and Ghannouchi, 2017), (Hompes et al., 2018). KPIs can be defined as quantifiable measures that an organisation uses to measure its performance in terms of meeting its strategic and operational objectives. ...
... PPINOT supports two different types of resource-aware PPIs and shows the main elements and the types of measure (Base, Derived and Aggregated) that can be used to define a PPI. In (Van der Aa et al., 2017), the authors translate natural language PPI descriptions into a structured PPI notation. (Mendes and Santos, 2016) identified the business process performance evaluation model that best fits the evaluation of business processes, with a view to a greater alignment between process indicators and strategic objectives. ...
Conference Paper
Full-text available
Measuring the performance of business processes is an essential task that enables an organization to achieve effective and efficient results. It is by measuring processes that data on their performance is provided, thus showing the evolution of the organization in terms of its strategic objectives. To be efficient in such a task, organizations need a set of measures that enables them to support planning, exercise control and diagnose the current situation. Indeed, several researchers have defined specific measures for assessing business process (BP) performance. Our approach proposes new temporal and cost measures to assess the performance of business process models. The aim of this paper is to classify the performance measures proposed so far within a framework defined in terms of characteristics, design and temporal perspectives, and to evaluate the performance of business process models. This framework uses business and social contexts to improve particular measures. It helps designers to select a subset of measures corresponding to each perspective and to calculate and interpret their values in order to improve the performance of their models.
... In other cases it is less straightforward to decide on the consistency, or lack thereof, between the representations. For example, sentence (9) indicates that a certain procedure of the process must be repeated for every part in the order. The textual description does not indicate that any action is associated with this repetition. ...
... Lines 3-14 in Algorithm 2 describe the creation of these combinations. The underlying idea is that all existing text interpretations, starting from an empty list (line 3), are incrementally extended with a single statement interpretation (lines 8-11). This ensures that each possible combination of statement interpretations is included in the list. ...
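A minimal sketch of the combination-building idea quoted above (paraphrasing the cited Algorithm 2; the function name and data are illustrative, not taken from the paper):

    # Starting from a single empty interpretation, every existing text
    # interpretation is extended with each alternative interpretation of the
    # next statement, so all combinations end up in the list.
    def combine_interpretations(statement_interpretations):
        """statement_interpretations: one list of alternative readings per statement."""
        text_interpretations = [[]]                      # start from an empty list
        for alternatives in statement_interpretations:   # one statement at a time
            text_interpretations = [existing + [choice]
                                    for existing in text_interpretations
                                    for choice in alternatives]  # extend each combination
        return text_interpretations

    # Two statements with 2 and 3 alternative readings yield 6 combinations.
    print(len(combine_interpretations([["a1", "a2"], ["b1", "b2", "b3"]])))  # 6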
... These requirements are elicited, analyzed, verified, and documented. In the elicitation stage, a person is responsible for writing the requirement specifications in natural language, complementing these descriptions with various models, e.g., descriptions of organizational processes [1] or goal-oriented models [25], among other modeling languages. From these requirements representations, the structural elements of the software system that is expected to solve the problem under study are then abstracted. ...
Chapter
Full-text available
It is common to describe Software Product Lines and manage their variability with the aid of a feature model (FM). In this light, there are ambiguity issues concerning FMs, which result in redundancy problems, anomalies, inconsistency, and, above all, semantic issues. We propose a study of feature modeling that considers the common aspects and the deficiencies in syntax, semantics, and semiotic clarity detected in the use of these modeling languages and the tools implemented around them. The initial results from this proposal show that correcting such errors in feature modeling languages is feasible.
... HMM has been extensively used in various research disciplines, such as communication (Ghosh et al., 2009;Turin & Nobelen, 1998), Internet traffic modeling (Dainotti et al., 2008), speech and text recognition (Kang et al., 2018;Nwe et al., 2003), bioinformatics (Yin et al., 2018), medicine (Gupta et al., 2020;Wojtowicz et al., 2019) and various fields of information systems research (Elgarrai et al., 2016;Leopold et al., 2019;Sahoo et al., 2012;Singh et al., 2011;Van der Aa et al., 2017;Zhang et al., 2019). A review of HMM applications can be found in Mor et al. (2020). ...
Article
Reducing costly hospital readmissions of patients with Congestive Heart Failure (CHF) is important. We analyzed 4,661 CHF patients (from 2007 to 2017) using Hidden Markov Models in order to profile CHF readmission risk over time. This method proved practical in identifying three patient groups with distinctive characteristics, which might guide physicians in tailoring personalized care to prevent hospital readmission. We thus demonstrate how applying appropriate AI analytics can save costs and improve the quality of care.
... The contexts in which HMMs can be applied are diverse, including bioinformatics [44,45], electrical engineering [46], and natural language processing. In the latter domain, which also encompasses our application context, a variety of use cases are addressed using HMMs, such as speech [47] and handwriting recognition [48], part of speech tagging [49], machine translation [50], and information extraction [22,51]. The strength of HMMs in the context of these applications, as well as in the context of the technique presented in this paper, is that they combine emission and transition probabilities. ...
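The interplay of emission and transition probabilities mentioned here can be made explicit with the standard first-order HMM factorization (a textbook identity, not tied to any particular cited work), where the hidden tags t_i generate the observed words w_i:

    P(w_1,\dots,w_n, t_1,\dots,t_n) \;=\; \prod_{i=1}^{n} \underbrace{P(t_i \mid t_{i-1})}_{\text{transition}} \; \underbrace{P(w_i \mid t_i)}_{\text{emission}},
    \qquad
    \hat{t}_{1:n} \;=\; \arg\max_{t_{1:n}} P(w_{1:n}, t_{1:n}).

Tagging then amounts to finding the tag sequence that maximizes this joint probability, typically with the Viterbi algorithm.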
Article
Full-text available
Many process model analysis techniques rely on the accurate analysis of the natural language contents captured in the models’ activity labels. Since these labels are typically short and diverse in terms of their grammatical style, standard natural language processing tools are not suitable to analyze them. While a dedicated technique for the analysis of process model activity labels was proposed in the past, it suffers from considerable limitations. First of all, its performance varies greatly among data sets with different characteristics and it cannot handle uncommon grammatical styles. What is more, adapting the technique requires in-depth domain knowledge. We use this paper to propose a machine learning-based technique for activity label analysis that overcomes the issues associated with this rule-based state of the art. Our technique conceptualizes activity label analysis as a tagging task based on a Hidden Markov Model. By doing so, the analysis of activity labels no longer requires the manual specification of rules. An evaluation using a collection of 15,000 activity labels demonstrates that our machine learning-based technique outperforms the state of the art in all aspects.
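As an illustration of such HMM-based tagging, the following is a generic Viterbi sketch with a toy tag set and made-up probabilities (it is not the evaluated technique or its actual tag set); it decodes the most likely tag sequence for a short activity label.

    # Illustrative Viterbi decoder for HMM-based label tagging (toy model only).
    def viterbi(words, tags, start_p, trans_p, emit_p):
        V = [{t: (start_p[t] * emit_p[t].get(words[0], 1e-6), [t]) for t in tags}]
        for word in words[1:]:
            layer = {}
            for t in tags:
                prob, path = max(
                    (V[-1][prev][0] * trans_p[prev][t] * emit_p[t].get(word, 1e-6),
                     V[-1][prev][1])
                    for prev in tags)
                layer[t] = (prob, path + [t])
            V.append(layer)
        return max(V[-1].values())[1]   # tag sequence with the highest probability

    # Hypothetical toy model: tag activity-label words as ACTION or OBJECT.
    tags = ["ACTION", "OBJECT"]
    start_p = {"ACTION": 0.8, "OBJECT": 0.2}
    trans_p = {"ACTION": {"ACTION": 0.2, "OBJECT": 0.8},
               "OBJECT": {"ACTION": 0.3, "OBJECT": 0.7}}
    emit_p = {"ACTION": {"approve": 0.6, "create": 0.4},
              "OBJECT": {"order": 0.5, "invoice": 0.5}}
    print(viterbi(["approve", "order"], tags, start_p, trans_p, emit_p))  # ['ACTION', 'OBJECT']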
... A key benefit of the PPINOT approach is that individual PPIs can be traced back to their related business process elements and that a scope can be defined that acts as a filter for the relevant element instances, much like the projection function for artifact instances in this paper. The authors further extend their work along several directions [4,5,7,8,21]. The PPINOT-related techniques provide a clear semantics for the PPI lifecycle, from definition to analysis. ...
Chapter
Full-text available
Many business processes are supported by information systems that record their execution. Process mining techniques extract knowledge and insights from such process execution data typically stored in event logs or streams. Most process mining techniques focus on process discovery (the automated extraction of process models) and conformance checking (aligning observed and modeled behavior). Existing process performance analysis techniques typically rely on ad-hoc definitions of performance. This paper introduces a novel comprehensive approach to process performance analysis from event data. Our generic technique centers around business artifacts, key conceptual entities that behave according to state-based transactional lifecycle models. We present a formalization of these concepts as well as a structural approach to calculate and monitor process performance from event data. The approach has been implemented in the open source process mining tool ProM and its applicability has been evaluated using public real-life event data.
... Aside from techniques that establish alignments between different process models, focus has recently shifted towards the establishment of alignments among a broader range of process-related artifacts. For example, several techniques exist that establish correspondences between event logs and process models [2,40], and a technique for the alignment of process performance indicators and process models [45]. ...
Article
Full-text available
Process model descriptions are a ubiquitous source of information that exists in any organization. To reach different types of stakeholders, distinct descriptions are often kept, so that process understandability is boosted with respect to individual capabilities. While the use of distinct representations allows more stakeholders to interpret process information, it also poses a considerable challenge: to keep different process descriptions aligned. In this paper, a novel technique to align process models and textual descriptions is proposed. The technique is grounded on projecting knowledge extracted from these two representations into a uniform representation that is amenable for comparison. It applies a tailored linguistic analysis of each description, so that the important information is considered when aligning description elements. Compared to existing approaches that address this use case, our technique provides more comprehensive alignments, which encompass process model activities, events, and gateways. Furthermore, the technique, which has been implemented into the platform nlp4bpm.cs.upc.edu, shows promising results based on experiments with real-world data.
... This enables organizations to more efficiently monitor the performance of their business processes and continuously adapt their monitoring activities to changing business needs. Published as: [5]. ...
Preprint
Full-text available
Having access to the right information on business processes is crucial to the proper and efficient execution of all sorts of activities, such as the assessment of mortgage applications, manufacturing of goods, as well as the treatment of patients. A major challenge here is that information related to a single process is often spread out over various models, documents, and systems. This fragmentation can have disastrous consequences for an organization’s operations. It can, for example, lead to delays, wastes of money, and even violations of rules and laws. The work presented in this thesis tackles these problems with algorithms that can automatically compare process information stemming from various sources. These techniques, among others, enable the detection of contradictions between the sources and improve the ability of organizations to monitor their compliance to rules and regulations.
Chapter
With the widespread popularity of smart devices in people's daily lives, people hope to communicate with devices in a more humane, interactive way. A natural language communication simulation computing system can provide technical support for the interactivity of smart devices. The genetic algorithm (GA) is an intelligent algorithm that has attracted much research attention in recent years. Therefore, this article presents the design of a natural language communication simulation calculation (NLCSC) system model based on GA. The paper combines the advantages of the BP neural network and GA and proposes an improved adaptive genetic algorithm. The NLCSC system model is designed in detail, including the architecture of the input layer, convolutional layer, and pooling layer. In an experimental comparison with the BP neural network calculation model, the average calculation error of the BP algorithm is 1.46%, while the average calculation error of the GA-BP calculation model is 0.243%. These data verify that the GA-based NLCSC system model has high calculation accuracy. Keywords: Natural language, BP neural network, Genetic algorithm, System model
Chapter
Nowadays, enterprises need to handle a continually growing amount of text data generated internally by their employees and externally by current or potential customers. Accordingly, the attention of managers shifts to an efficient usage of this data to address related business challenges. However, it is usually hard to extract the meaning out of unstructured text data in an automatic way. There are multiple discussions and no general opinion in the research and practitioners’ community on the design of text classification tasks, specifically the choice of text representation techniques and classification algorithms. One essential point in this discussion is about building solutions that are both accurate and understandable for humans. Being able to evaluate the classification decision is a critical success factor of a text classification task in an enterprise setting, be it legal documents, medical records, or IT tickets. Hence, our study aims to investigate the core design elements of a typical text classification pipeline and their contribution to the overall performance of the system. In particular, we consider text representation techniques and classification algorithms, in the context of their explainability, providing ultimate insights from our IT ticket complexity prediction case study. We compare the performance of a highly explainable text representation technique based on the case study tailored linguistic features with a common TF-IDF approach. We apply interpretable machine learning algorithms such as kNN, its enhanced versions, decision trees, naïve Bayes, logistic regression, as well as semi-supervised techniques to predict the ticket class label of low, medium, or high complexity. As our study shows, simple, explainable algorithms, such as decision trees and naïve Bayes, demonstrate remarkable performance results when applied with our linguistic features-based text representation. Furthermore, we note that text classification is inherently related to Granular Computing.
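A hedged sketch of one such pipeline variant, a TF-IDF representation combined with the interpretable classifiers mentioned above, is shown below; the ticket texts, labels, and parameters are placeholders, not the case-study data or its tailored linguistic features.

    # Sketch of a text classification pipeline: TF-IDF features fed into
    # interpretable classifiers (naive Bayes, decision tree). Data is hypothetical.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.pipeline import Pipeline

    tickets = ["reset password for user", "database cluster failover misconfigured",
               "printer not working", "migrate legacy ERP interfaces to new API"]
    complexity = ["low", "high", "low", "high"]     # placeholder labels

    for clf in (MultinomialNB(), DecisionTreeClassifier(max_depth=3)):
        model = Pipeline([("tfidf", TfidfVectorizer()), ("clf", clf)])
        model.fit(tickets, complexity)
        print(type(clf).__name__, model.predict(["user cannot log in"]))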
Article
Full-text available
Decision Model and Notation (DMN) has become a relevant topic for organizations since it allows users to control their processes and organizational decisions. The increasing use of DMN decision tables to capture critical business knowledge raises the need for supporting analysis tasks such as the extraction of inputs, outputs and their relations from natural language descriptions. In this paper, we create a stepping stone towards implementing a Natural Language Processing framework to model decisions based on the DMN standard. Our proposal contributes to the generation of decision rules and tables from a single sentence analysis. This framework comprises three phases: (1) discourse and semantic analysis, (2) syntactic analysis and (3) decision table construction. To the best of our knowledge, this is the first attempt devoted to automatically discovering decision rules according to the DMN terminology from natural language descriptions. Aiming at assessing the quality of the resultant decision tables, we have conducted a survey involving 16 DMN experts. The results have shown that our framework is able to generate semantically correct tables. It is convenient to mention that our proposal does not aim to replace analysts but to support them in creating better models with less effort.
Chapter
Many part-of-speech (POS) taggers for the Malayalam language have been implemented using Support Vector Machines (SVM), Memory-Based Language Processing (MBLP), Hidden Markov Models (HMM) and other similar techniques. The objective was to find an improved POS tagger for the Malayalam language. This work proposes a comparison of Malayalam POS taggers using SVM and the Hidden Markov Model (HMM). The tagset used was the popular Bureau of Indian Standard (BIS) tag set. A manually created data set of around 52,000 words was taken from various Malayalam news sites. The preprocessing steps performed on the news text are also described. POS tagging was then done using SVM and HMM. As POS tagging requires the extraction of multiple class labels, a multi-class SVM is used. It also performs feature extraction, feature selection, and classification. Word sense disambiguation and misclassification of words are the two major issues identified with SVM. The Hidden Markov Model predicts the hidden sequence based on maximum observation likelihood, which reduces ambiguity and the misclassification rate.
Conference Paper
Full-text available
To determine whether strategic goals are met, organizations must monitor how their business processes perform. Process Performance Indicators (PPIs) are used to specify relevant performance requirements. The formulation of PPIs is typically a managerial concern. Therefore, considerable effort has to be invested to relate PPIs, described by management, to the exact operational and technical characteristics of business processes. This work presents an approach to support this task, which would otherwise be a laborious and time-consuming endeavor. The presented approach can automatically establish links between PPIs, as formulated in natural language, and operational details, as described in process models. To do so, we employ machine learning and natural language processing techniques. A quantitative evaluation on the basis of a collection of 173 real-world PPIs demonstrates that the proposed approach works well.
Book
Full-text available
A world that is changing faster and faster forces companies into continuous performance monitoring. Indicators give the impression of being the real engine of organizations or even the economy at large. But performance indicators are not simple observation tools. They can have a deep "normative" effect, which can modify organizational behaviour and influence key decisions. Companies are what they measure! The selection of good performance indicators is not an easy process. This monograph focuses on the design of a Performance Measurement System (PMS), knowing that "magic rules" to identify indicators do not exist. Some indicators seem right and easy to measure, but have subtle, counter-productive consequences. Other indicators are more difficult to measure, but focus the enterprise on those decisions and actions that are critical to success. This book suggests how to identify indicators that achieve a balance in these effects and enhance long-term profitability.
Conference Paper
Full-text available
An organization's knowledge of its business processes represents valuable corporate knowledge because it can be used to enhance the performance of these processes. In many organizations, documentation of process knowledge is scattered around various process information sources. Such information fragmentation poses considerable problems if, for example, stakeholders wish to develop a comprehensive understanding of their operations. The existence of efficient techniques to combine and integrate process information from different sources can therefore provide much value to an organization. In this work, we identify the general challenges that must be overcome to develop such techniques. This paper illustrates how these challenges should be and, to some extent, are being met in research. Based on these insights, we present three main frontiers that must be further expanded to successfully counter the fragmentation of process information in organizations.
Conference Paper
Full-text available
Process model matching refers to the creation of correspondences between activities of process models. Applications of process model matching are manifold, reaching from model validation over harmonization of process variants to effective management of process model collections. Recently, this demand led to the development of different techniques for process model matching. Yet, these techniques are heuristics and, thus, their results are inherently uncertain and need to be evaluated on a common basis. Currently, however, the BPM community lacks established data sets and frameworks for evaluation. The Process Model Matching Contest 2013 aimed at addressing the need for effective evaluation by defining process model matching problems over published data sets. This paper summarizes the setup and the results of the contest. Besides a description of the contest matching problems, the paper comprises short descriptions of all matching techniques that have been submitted for participation. In addition, we present and discuss the evaluation results and outline directions for future work in this field of research.
Conference Paper
Full-text available
Text-based and model-based process descriptions have their own particular strengths and, as such, appeal to different stakeholders. For this reason, it is not unusual to find within an organization descriptions of the same business processes in both modes. When considering that hundreds of such descriptions may be in use in a particular organization by dozens of people, using a variety of editors, there is a clear risk that such models become misaligned. To reduce the time and effort needed to repair such situations, this paper presents the first approach to automatically identify inconsistencies between a process model and a corresponding textual description. Our approach leverages natural language processing techniques to identify cases where the two process representations describe activities in different orders, as well as model activities that are missing from the textual description. A quantitative evaluation with 46 real-life model-text pairs demonstrates that our approach allows users to quickly and effectively identify those descriptions in a process repository that are inconsistent.
Conference Paper
Full-text available
We describe the design and use of the Stanford CoreNLP toolkit, an extensible pipeline that provides core natural language analysis. This toolkit is quite widely used, both in the research NLP community and also among commercial and government users of open source NLP technology. We suggest that this follows from a simple, approachable design, straight-forward interfaces, the inclusion of robust and good quality analysis components, and not requiring use of a large amount of associated baggage.
Conference Paper
Full-text available
Many proposals to model service level agreements (SLAs) have been elaborated in order to automate different stages of the service lifecycle such as monitoring, implementation or deployment. All of them have been designed for computational services and are not well-suited for other types of services such as business process outsourcing (BPO) services. However, BPO services supported by process-aware information systems could also benefit from modelling SLAs in tasks such as performance monitoring, human resource assignment or process configuration. In this paper, we identify the requirements for modelling such SLAs and detail how they can be faced by combining techniques used to model computational SLAs, business processes, and process performance indicators. Furthermore, our approach has been validated through the modelling of ...
Article
Full-text available
The importance of the normal distribution is undeniable since it is an underlying assumption of many statistical procedures such as t-tests, linear regression analysis, discriminant analysis and Analysis of Variance (ANOVA). When the normality assumption is violated, interpretation and inferences may not be reliable or valid. The three common procedures for assessing whether a random sample of independent observations of size n comes from a population with a normal distribution are: graphical methods (histograms, boxplots, Q-Q-plots), numerical methods (skewness and kurtosis indices) and formal normality tests. This paper compares the power of four formal tests of normality: the Shapiro-Wilk (SW) test, Kolmogorov-Smirnov (KS) test, Lilliefors (LF) test and Anderson-Darling (AD) test. Power comparisons of these four tests were obtained via Monte Carlo simulation of sample data generated from alternative distributions that follow symmetric and asymmetric distributions. Ten thousand samples of various sample sizes were generated from each of the given alternative symmetric and asymmetric distributions. The power of each test was then obtained by comparing the test of normality statistics with the respective critical values. Results show that the Shapiro-Wilk test is the most powerful normality test, followed by the Anderson-Darling test, Lilliefors test and Kolmogorov-Smirnov test. However, the power of all four tests is still low for small sample sizes.
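For illustration, the four compared tests can be run in Python roughly as follows (scipy/statsmodels implementations; the simulated sample and decision threshold are illustrative, and repeating the procedure over many simulated samples yields an empirical power estimate in the spirit of the paper):

    # Sketch: applying the four normality tests to one simulated non-normal sample.
    import numpy as np
    from scipy import stats
    from statsmodels.stats.diagnostic import lilliefors

    rng = np.random.default_rng(0)
    sample = rng.exponential(scale=1.0, size=100)   # asymmetric alternative distribution

    sw_stat, sw_p = stats.shapiro(sample)                        # Shapiro-Wilk
    ks_stat, ks_p = stats.kstest(sample, "norm",                 # Kolmogorov-Smirnov
                                 args=(sample.mean(), sample.std(ddof=1)))
    lf_stat, lf_p = lilliefors(sample, dist="norm")              # Lilliefors
    ad = stats.anderson(sample, dist="norm")                     # Anderson-Darling

    print(f"SW p={sw_p:.4f}, KS p={ks_p:.4f}, LF p={lf_p:.4f}, AD stat={ad.statistic:.3f}")
    # Counting rejections (e.g. p < 0.05) over many such samples gives the empirical power.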
Conference Paper
Full-text available
We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment, over half a million runs of C4.5 and a Naive-Bayes algorithm, to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation, we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
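A minimal illustration of the recommended ten-fold stratified cross-validation for model selection, assuming scikit-learn; the dataset and classifiers are placeholders (GaussianNB and a CART decision tree standing in for the Naive-Bayes and C4.5 learners used in the study):

    # Sketch: estimate accuracy of two candidate classifiers with 10-fold stratified CV.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

    for clf in (GaussianNB(), DecisionTreeClassifier(random_state=42)):
        scores = cross_val_score(clf, X, y, cv=cv)    # one accuracy estimate per fold
        print(type(clf).__name__, f"mean accuracy = {scores.mean():.3f} (+/- {scores.std():.3f})")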
Article
Full-text available
Process performance management (PPM) aims at measuring, monitoring and analysing the performance of business processes (BPs), in order to check the achievement of strategic and operational goals and to support decision-making for their optimisation. PPM is based on process performance indicators (PPIs), so having an appropriate definition of them is crucial. One of the main problems of PPIs definition is to express them in an unambiguous, complete, understandable, traceable and verifiable manner. In practice, PPIs are defined informally – usually in ad hoc, natural language, with its well-known problems – or they are defined from an implementation perspective, hardly understandable to non-technical people. In order to solve this problem, in this article we propose a novel approach to improve the definition of PPIs using templates and linguistic patterns. This approach promotes reuse, reduces both ambiguities and missing information, is understandable to all stakeholders and maintains traceability with the process model. Furthermore, it enables the automated processing of PPI definitions by its straightforward translation into the PPINOT metamodel, allowing the gathering of the required information for their computation as well as the analysis of the relationships between them and with BP elements.
Conference Paper
Full-text available
A key aspect in any process-oriented organisation is the measurement of process performance for the achievement of its strategic and operational goals. Process Performance Indicators (PPIs) are a key asset to carry out this evaluation, and, therefore, the management of these PPIs throughout the whole BP lifecycle is crucial. In this demo we present PPINOT Tool Suite, a set of tools aimed at facilitating and automating the PPI management. The support includes their definition using either a graphical or a template-based textual notation, their automated analysis at design-time, and their automated computation based on the instrumentation of a Business Process Management System.
Article
Full-text available
Purpose This second part of the paper summarizes typical pitfalls as they can be observed in larger process modeling projects. Design/methodology/approach The identified pitfalls have been derived from a series of focus groups and semi-structured interviews with business process analysts and managers of process management and modeling projects. Findings The article continues the discussion of the first part. It covers issues related to tools and related requirements (7-10), the practice of modeling (11-16), the way we design to-be models (17-19), and how we deal with the success of modeling and maintenance issues (19-21). Potential pitfalls related to strategy and governance (1-3) and the involved stakeholders (4-6) were discussed in the first part of this paper. Research limitations/implications This paper is a personal viewpoint, and does not report on the outcomes of a structured qualitative research project. Practical implications The provided list of in total 22 pitfalls increases the awareness of the main challenges related to process modeling and helps to identify common mistakes. Originality/value This paper is one of the very few contributions in the area of challenges related to process modeling.
Conference Paper
Full-text available
We describe and evaluate hidden understanding models, a statistical learning approach to natural language understanding. Given a string of words, hidden understanding models determine the most likely meaning for the string. We discuss 1) the problem of representing meaning in this framework, 2) the structure of the statistical model, 3) the process of training the model, and 4) the process of understanding using the model. Finally, we give experimental results, including results on an ARPA evaluation.
Article
Full-text available
We describe a series of five statistical models of the translation process and give algorithms for estimating the parameters of these models given a set of pairs of sentences that are translations of one another. We define a concept of word-by-word alignment between such pairs of sentences. For any given pair of such sentences each of our models assigns a probability to each of the possible word-by-word alignments. We give an algorithm for seeking the most probable of these alignments. Although the algorithm is suboptimal, the alignment thus obtained accounts well for the word-by-word relationships in the pair of sentences. We have a great deal of data in French and English from the proceedings of the Canadian Parliament. Accordingly, we have restricted our work to these two languages; but we feel that because our algorithms have minimal linguistic content they would work well on other pairs of languages. We also feel, again because of the minimal linguistic content of our algorithms, that it is reasonable to argue that word-by-word alignments are inherent in any sufficiently large bilingual corpus.
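The simplest model of this family, commonly referred to as IBM Model 1, is usually stated as

    P(f \mid e) \;=\; \frac{\epsilon}{(l+1)^{m}} \prod_{j=1}^{m} \sum_{i=0}^{l} t(f_j \mid e_i),

where e = e_1 ... e_l is the source sentence (augmented with an empty word e_0), f = f_1 ... f_m is the target sentence, and t(f_j | e_i) are the word-by-word translation probabilities; the sum over source positions for each target word corresponds to summing over all possible word-by-word alignments.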
Article
Full-text available
This paper presents a new corpus-based method for calculating the semantic similarity of two target words. Our method, called Second Order Co-occurrence PMI (SOC-PMI), uses Pointwise Mutual Information to sort lists of important neighbor words of the two target words. Then we consider the words which are common in both lists and aggregate their PMI values (from the opposite list) to calculate the relative semantic similarity. Our method was empirically evaluated using Miller and Charles' (1991) 30 noun pair subset, Rubenstein and Goodenough's (1965) 65 noun pairs, 80 synonym test questions from the Test of English as a Foreign Language (TOEFL), and 50 synonym test questions from a collection of English as a Second Language (ESL) tests. Evaluation results show that our method outperforms several competing corpus-based methods.
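The building block of this method, pointwise mutual information between two word types estimated from corpus counts, is commonly defined as

    \mathrm{PMI}(w_1, w_2) \;=\; \log_2 \frac{P(w_1, w_2)}{P(w_1)\,P(w_2)} \;\approx\; \log_2 \frac{f(w_1, w_2)\cdot N}{f(w_1)\, f(w_2)},

with f(.) the co-occurrence and unigram frequencies and N the corpus size; SOC-PMI then aggregates such PMI scores over the important neighbor words shared by the two target words, rather than comparing the targets directly.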
Chapter
Full-text available
A formal business process model serves as a common understanding of how business tasks are carried out to achieve end goals. The business process life cycle is managed using Business Process Management tools and methodologies. Business Activity Monitoring provides (near) real-time visibility into process execution notifying relevant personnel of process exceptions. Business process modelling captures business and execution semantics, but lacks any foundation for process analysis. This chapter will outline a model for process performance management for use in the monitoring phase of the process life cycle and how this model is leveraged within the iWISE architecture. iWISE provides a single view of business processes spanning disparate systems and departments.
Conference Paper
Full-text available
One of the main data resources used in many studies over the past two decades for spoken language understanding (SLU) research in spoken dialog systems is the airline travel information system (ATIS) corpus. Two primary tasks in SLU are intent determination (ID) and slot filling (SF). Recent studies reported error rates below 5% for both of these tasks employing discriminative machine learning techniques with the ATIS test set. While these low error rates may suggest that this task is close to being solved, further analysis reveals the continued utility of ATIS as a research corpus. In this paper, our goal is not experimenting with domain specific techniques or features which can help with the remaining SLU errors, but instead exploring methods to realize this utility via extensive error analysis. We conclude that even with such low error rates, ATIS test set still includes many unseen example categories and sequences, hence requires more data. Better yet, new annotated larger data sets from more complex tasks with realistic utterances can avoid over-tuning in terms of modeling and feature design. We believe that advancements in SLU can be achieved by having more naturally spoken data sets and employing more linguistically motivated features while preserving robustness due to speech recognition noise and variance due to natural language.
Article
Full-text available
Current schema matching approaches still have to improve for large and complex schemas. The large search space increases the likelihood of false matches as well as execution times. Further difficulties for schema matching are posed by the high expressive power and versatility of modern schema languages, in particular user-defined types and classes, component reuse capabilities, and support for distributed schemas and namespaces. To better assist the user in matching complex schemas, we have developed a new generic schema matching tool, COMA++, providing a library of individual matchers and a flexible infrastructure to combine the matchers and refine their results. Different match strategies can be applied, including a new scalable approach to identify context-dependent correspondences between schemas with shared elements and a fragment-based match approach which decomposes a large match task into smaller tasks. We conducted a comprehensive evaluation of the match strategies using large e-Business standard schemas. Besides providing helpful insights for future match implementations, the evaluation demonstrated the practicability of our system for matching large schemas.
Article
Full-text available
Performance measurement and analysis is crucial for steering the organization to realize its strategic and operational goals. Relevant performance indicators and their relationships to goals and activities need to be determined and analyzed. Current organization modeling approaches do not reflect this in an adequate way. This paper attempts to fill the gap by presenting a framework for modeling performance indicators within a general organization modeling framework.
Conference Paper
Full-text available
With the increasing influence of Business Process Management, large process model repositories emerged in enterprises and public administrations. Their effective utilization requires meaningful and efficient capabilities to search for models that go beyond text based search or folder navigation, e.g., by similarity. Existing measures for process model similarity are often not applicable for efficient similarity search, as they lack metric features. In this paper, we introduce a proper metric to quantify process similarity based on behavioral profiles. It is grounded in the Jaccard coefficient and leverages behavioral relations between pairs of process model activities. The metric is successfully evaluated towards its approximation of human similarity assessment.
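The underlying Jaccard construction, stated here in its generic form with R_1 and R_2 denoting the sets of behavioral relations derived from the two models' profiles (the exact relation sets used in the paper may be richer), is

    sim(P_1, P_2) \;=\; \frac{|R_1 \cap R_2|}{|R_1 \cup R_2|},
    \qquad
    d(P_1, P_2) \;=\; 1 - sim(P_1, P_2),

where the distance d satisfies the metric properties (identity, symmetry, triangle inequality) that enable efficient similarity search over model repositories.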
Conference Paper
Full-text available
Semantic Business Process Management (SBPM) has been proposed as an extension of BPM with Semantic Web and Semantic Web Services (SWS) technologies in order to increase and enhance the level of automation that can be achieved within the BPM life-cycle. In a nutshell, SBPM is based on the extensive and exhaustive conceptualization of the BPM domain so as to support reasoning during business processes modelling, composition, execution, and analysis, leading to important enhancements throughout the life-cycle of business processes. An important step of the BPM life-cycle is the analysis of the processes deployed in companies. This analysis provides feedback about how these processes are actually being executed (like common control-flow paths, performance measures, detection of bottlenecks, alert to approaching deadlines, auditing, etc). The use of semantic information can lead to dramatic enhancements in the state-of-the-art in analysis techniques. In this paper we present an outlook on the opportunities and challenges on semantic business process mining and monitoring, thus paving the way for the implementation of the next generation of BPM analysis tools.
Conference Paper
Full-text available
Output of a planning process is a set of individual tasks assigned to resources at a certain point in time. Initially a manual job, in the past decades information systems have largely taken over this role, especially in industries such as (road) logistics. This paper focuses on the performance parameters and objectives that play a role in the planning process, in order to gain insight into the factors that matter when designing new software systems for Logistical Service Providers (LSPs). To this end, we study the area of Key Performance Indicators (KPIs). Typically, KPIs are used in an ex-post context: to evaluate a company's past performance. We reason that KPIs should be utilized in the planning phase as well, thus ex-ante. The paper describes the extended literature survey that we performed, and introduces a novel framework that captures the dynamics of competing KPIs by positioning them in the practical context of an LSP. This framework could be valuable input in the design of a future generation of information systems, capable of incorporating the business dynamics of today's LSPs.
Article
Full-text available
The proliferation of ontologies and taxonomies in many domains increasingly demands the integration of multiple such ontologies. The goal of ontology integration is to merge two or more given ontologies in order to provide a unified view on the input ontologies while maintaining all information coming from them. We propose a new taxonomy merging algorithm that, given as input two taxonomies and an equivalence matching between them, can generate an integrated taxonomy in a fully automatic manner. The approach is target-driven, i.e. we merge a source taxonomy into the target taxonomy and preserve the structure of the target ontology as much as possible. We also discuss how to extend the merge algorithm providing auxiliary information, like additional relationships between source and target concepts, in order to semantically improve the final result. The algorithm was implemented in a working prototype and evaluated using synthetic and real-world scenarios.
Article
Many organizations maintain textual process descriptions alongside graphical process models. The purpose is to make process information accessible to various stakeholders, including those who are not familiar with reading and interpreting the complex execution logic of process models. Despite this merit, there is a clear risk that model and text become misaligned when changes are not applied to both descriptions consistently. For organizations with hundreds of different processes, the effort required to identify and clear up such conflicts is considerable. To support organizations in keeping their process descriptions consistent, we present an approach to automatically identify inconsistencies between a process model and a corresponding textual description. Our approach detects cases where the two process representations describe activities in different orders and detects process model activities not contained in the textual description. A quantitative evaluation with 53 real-life model-text pairs demonstrates that our approach accurately identifies inconsistencies between model and text.
Article
Executives know that a company's measurement systems strongly affect employee behaviors. But the traditional financial performance measures that worked for the industrial era are out of sync with the skills organizations are trying to master. Frustrated by these inadequacies, some managers have abandoned financial measures like return on equity and earnings per share. "Make operational improvements, and the numbers will follow," the argument goes. But managers want a balanced presentation of measures that will allow them to view the company from several perspectives at once. In this classic article from 1992, authors Robert Kaplan and David Norton propose an innovative solution. During a yearlong research project with 12 companies at the leading edge of performance management, the authors developed a "balanced scorecard," a new performance measurement system that gives top managers a fast but comprehensive view of their business. The balanced scorecard includes financial measures that tell the results of actions already taken. And it complements those financial measures with three sets of operational measures related to customer satisfaction, internal processes, and the organization's ability to learn and improve, the activities that drive future financial performance. The balanced scorecard helps managers look at their businesses from four essential perspectives and answer some important questions. First, how do customers see us? Second, what must we excel at? Third, can we continue to improve and create value? And fourth, how do we appear to shareholders? By looking at all of these parameters, managers can determine whether improvements in one area have come at the expense of another. Armed with that knowledge, the authors say, executives can glean a complete picture of where the company stands and where it's headed.
Conference Paper
Business operations are often documented by business process models. Use cases such as system validation and process harmonization require the identification of correspondences between activities, which is supported by matching techniques that cope with textual heterogeneity and differences in model granularity. In this paper, we present a matching technique that is tailored towards models featuring textual descriptions of activities. We exploit these descriptions using ideas from language modelling. Experiments with real-world process models reveal that our technique increases recall by up to a factor of five, largely without compromising precision, compared to existing approaches.
Article
This paper addresses the problem of transforming business specifications written in natural language into formal models suitable for use in information systems development. It proposes a method for transforming controlled natural language specifications based on the Semantics of Business Vocabulary and Business Rules standard. This approach is unique in combining techniques from Model-Driven Engineering (MDE), Cognitive Linguistics, and Knowledge-based Configuration, which allows the reliable semantic processing of specifications and integration with existing MDE tools to improve productivity, quality, and time-to-market in software development. The method first learns the vocabulary of the specification from glossary-like definitions then parses the rules of the specification and outputs the resulting formal SBVR model. Both aspects of the method are tested separately, with the system correctly learning 98% of the vocabulary and correctly interpreting 98% of the rules of an SBVR SE based example. Finally, the proposed method is compared to state-of-the-art approaches for creating formal models from natural language specifications, arguing that it meets the criteria necessary to fulfil the three goals of (1) shifting control of specification to non-technical business experts, (2) reducing the manual effort involved in formalising specifications, and (3) supporting business experts in creating well-formed sets of business vocabularies and rules.
Article
While the maturity of process mining algorithms increases and more process mining tools enter the market, process mining projects still face the problem of different levels of abstraction when comparing events with modeled business activities. Current approaches for event log abstraction try to abstract from the events in an automated way that does not capture the required domain knowledge to fit business activities. This can lead to misinterpretation of discovered process models. We developed an approach that aims to abstract an event log to the same abstraction level that is needed by the business. We use domain knowledge extracted from existing process documentation to semi-automatically match events and activities. Our abstraction approach is able to deal with n:m relations between events and activities and also supports concurrency. We evaluated our approach in two case studies with a German IT outsourcing company.
Article
There is a wide variety of drivers for business process modelling initiatives, reaching from organisational redesign to the development of information systems. Consequently, a common business process is often captured in multiple models that overlap in content due to serving different purposes. Business process management aims at flexible adaptation to changing business needs. Hence, changes of business processes occur frequently and have to be incorporated in the respective process models. Once a process model is changed, related process models have to be updated accordingly, despite the fact that those process models may only be loosely coupled. In this article, we introduce an approach that supports change propagation between related process models. Given a change in one process model, we leverage the behavioural abstraction of behavioural profiles for corresponding activities in order to determine a change region in another model. Our approach is able to cope with changes in pairs of models that are not related by hierarchical refinement and show behavioural inconsistencies. We evaluate the applicability of our approach with two real-world process model collections. To this end, we either deduce change operations from different model revisions or rely on synthetic change operations.
Article
A key aspect in any process-oriented organisation is the evaluation of process performance for the achievement of its strategic and operational goals. Process Performance Indicators (PPIs) are a key asset to carry out this evaluation, and, therefore, having an appropriate definition of these PPIs is crucial. After a careful review of the related literature and a study of the current picture in different real organisations, we conclude that no existing proposal allows PPIs to be defined in a way that is unambiguous and highly expressive, understandable by technical and non-technical users, and traceable with the Business Process (BP). In addition, like other activities carried out during the BP lifecycle, the management of PPIs is considered time-consuming and error-prone. Therefore, providing automated support for them is very appealing from a practical point of view. In this paper, we propose the PPINOT metamodel, which allows such an advanced definition of PPIs and is independent of the language used to model the business process. Furthermore, we provide an automatic semantic mapping from the metamodel to Description Logics (DL) that allows the implementation of design-time analysis operations in such a way that DL reasoners' facilities can be leveraged. These operations provide information that can assist process analysts in the definition and instrumentation of PPIs. Finally, to validate the usefulness of our proposal, we have used the PPINOT metamodel at the core of a software tool called the PPINOT Tool Suite and we have applied it in several real scenarios.
Article
It is common for large and complex organizations to maintain repositories of business process models in order to document and to continuously improve their operations. Given such a repository, this paper deals with the problem of retrieving those process models in the repository that most closely resemble a given process model or fragment thereof. The paper presents three similarity metrics that can be used to answer such queries: (i) label matching similarity that compares the labels attached to process model elements; (ii) structural similarity that compares element labels as well as the topology of process models; and (iii) behavioral similarity that compares element labels as well as causal relations captured in the process model. These similarity metrics are experimentally evaluated in terms of precision and recall, and in terms of correlation of the metrics with respect to human judgement. The experimental results show that all three metrics yield comparable results, with structural similarity slightly outperforming the other two metrics. Also, all three metrics outperform traditional search engines when it comes to searching through a repository for similar business process models.
Article
The supply chain is an important element in logistics development for all industries. It can improve the efficiency and effectiveness of not only product transfer, but also information sharing between the complex hierarchy of all the tiers. There is no systematic grouping of the different performance measures in the existing literature. This paper presents the formulisation of both quantitative and qualitative performance measurements for easy representation and understanding. Apart from the common criteria such as cost and quality, five other performance measurements are defined: resource utilisation; flexibility; visibility; trust; and innovativeness. In particular, new definitions are developed for visibility, trust, and innovativeness. Details of choices of these performance measurements are listed and suggested solutions are given, with the hope that a full picture of supply chain performance measurements is developed. In addition, a multi-attribute decision-making technique, the analytic hierarchy process (AHP), is used to make decisions based on the priority of performance measures. This paper outlines the application and particularly the pairwise comparison, which helps to easily identify the importance of different performance measurements. An example from the electronic industry is used to demonstrate the AHP technique.
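A hedged numeric sketch of the AHP pairwise-comparison step described above: priorities are taken from the principal eigenvector of the comparison matrix and a consistency ratio checks the coherence of the judgements. The comparison values and measure names are illustrative, not those of the electronic-industry example.

    # Sketch: derive priority weights for three performance measures from a
    # pairwise comparison matrix (Saaty-style AHP), plus a consistency check.
    import numpy as np

    # Pairwise comparisons of cost, quality, flexibility (illustrative judgements).
    A = np.array([[1.0, 3.0, 5.0],
                  [1/3, 1.0, 2.0],
                  [1/5, 1/2, 1.0]])

    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)
    weights = np.abs(eigvecs[:, k].real)
    weights /= weights.sum()                      # normalised priority vector

    n = A.shape[0]
    lambda_max = eigvals.real[k]
    ci = (lambda_max - n) / (n - 1)               # consistency index
    cr = ci / 0.58                                # Saaty random index for n = 3

    print("priorities:", np.round(weights, 3), "consistency ratio:", round(cr, 3))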
Chapter
Evaluating processes with the aid of key performance indicators continues to gain acceptance as an integral element of corporate controlling as well as process management. This article will first describe the tasks associated with process controlling, which will then be used as a basis for defining the objectives of a process-oriented key performance indicator management system. In particular, we will explain how the description of key performance indicators can be integrated in general process modelling. When KPI management has been assigned its proper place in the process management loop, roles that are needed to implement it in real terms will be defined.
Article
Concept mapping systems used in education and knowledge management emphasize flexibility of representation to enhance learning and facilitate knowledge capture. Collections of concept maps exhibit terminology variance, informality, and organizational variation. These factors make it difficult to match elements between maps in comparison, retrieval, and merging processes. In this work, we add an element anchoring mechanism to a similarity flooding (SF) algorithm to match nodes and substructures between pairs of simulated maps and student-drawn concept maps. Experimental results show significant improvement over simple string matching with combined recall accuracy of 91% for conceptual nodes and concept → link → concept propositions in student-drawn maps.
Article
A system for part-of-speech tagging is described. It is based on a hidden Markov model which can be trained using a corpus of untagged text. Several techniques are introduced to achieve robustness while maintaining high performance. Word equivalence classes are used to reduce the overall number of parameters in the model, alleviating the problem of obtaining reliable estimates for individual words. The context for category prediction is extended selectively via predefined networks, rather than using a uniformly higher-order conditioning which requires exponentially more parameters with increasing context. The networks are embedded in a first-order model, and the network structure is developed both by error analysis and via linguistic considerations. To compensate for incomplete dictionary coverage, the categories of unknown words are predicted using both local context and suffix information to aid in disambiguation. An evaluation was performed using the Brown corpus and different dictionary arrangements were investigated. The techniques result in a model that correctly tags approximately 96% of the text. The flexibility of the methods is illustrated by their use in a tagging program for French.
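The decoding step of such a tagger can be illustrated with a tiny Viterbi sketch in Python; the toy transition and emission probabilities below are invented and do not reflect the paper's trained model, equivalence classes, or network extensions.

# Viterbi decoding for a toy HMM part-of-speech tagger.
# Probabilities are illustrative; a real tagger estimates them from a corpus.
states = ["DET", "NOUN", "VERB"]
start = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans = {
    "DET":  {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1,  "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5,  "NOUN": 0.4, "VERB": 0.1},
}
emit = {
    "DET":  {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.5, "walk": 0.2, "park": 0.3},
    "VERB": {"walk": 0.6, "walks": 0.4},
}

def viterbi(words):
    V = [{s: start[s] * emit[s].get(words[0], 1e-6) for s in states}]
    back = []
    for w in words[1:]:
        col, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: V[-1][p] * trans[p][s])
            col[s] = V[-1][best_prev] * trans[best_prev][s] * emit[s].get(w, 1e-6)
            ptr[s] = best_prev
        V.append(col); back.append(ptr)
    last = max(states, key=lambda s: V[-1][s])
    tags = [last]
    for ptr in reversed(back):
        tags.append(ptr[tags[-1]])
    return list(reversed(tags))

print(viterbi(["the", "dog", "walks"]))  # expected: ['DET', 'NOUN', 'VERB']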
Article
We are interested in providing automated services via natural spoken dialog systems. By natural, we mean that the machine understands and acts upon what people actually say, in contrast to what one would like them to say. Many issues arise when such systems are targeted at large populations of non-expert users. In this paper, we focus on the task of automatically routing telephone calls based on a user's fluently spoken response to the open-ended prompt "How may I help you?". We first describe a database generated from 10,000 spoken transactions between customers and human agents. We then describe methods for automatically acquiring language models for both recognition and understanding from such data. Experimental results evaluating call classification from speech are reported for that database. These methods have been embedded within a spoken dialog system, with subsequent processing for information retrieval and form filling.
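A rough Python sketch of the routing idea, assuming a class-conditional unigram language model is trained per call type and an utterance is routed to the most likely class; the call types, training utterances, and add-one smoothing are invented and much simpler than the paper's approach.

from collections import Counter
from math import log

training = {
    "billing": ["i have a question about my bill",
                "there is a charge on my bill i do not recognize"],
    "collect": ["i would like to make a collect call",
                "please place a collect call for me"],
}

models, vocab = {}, set()
for label, utterances in training.items():
    counts = Counter(w for u in utterances for w in u.split())
    models[label] = counts
    vocab |= set(counts)

def route(utterance):
    # score the utterance under each class-conditional unigram model
    scores = {}
    for label, counts in models.items():
        total = sum(counts.values())
        scores[label] = sum(
            log((counts.get(w, 0) + 1) / (total + len(vocab)))
            for w in utterance.split()
        )
    return max(scores, key=scores.get)

print(route("there is a strange charge on my bill"))  # -> "billing"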
Conference Paper
This paper presents a method for measuring the semantic similarity of texts, using corpus-based and knowledge-based measures of similarity. Previous work on this problem has focused mainly on either large documents (e.g., text classification, information retrieval) or individual words (e.g., synonymy tests). Given that a large fraction of the information available today, on the Web and elsewhere, consists of short text snippets (e.g., abstracts of scientific documents, image captions, product descriptions), in this paper we focus on measuring the semantic similarity of short texts. Through experiments performed on a paraphrase data set, we show that the semantic similarity method outperforms methods based on simple lexical matching, resulting in up to a 13% error rate reduction with respect to the traditional vector-based similarity metric.
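The structure of such a text-to-text measure can be sketched in Python: for each word in one text, take its best match in the other text, average, and symmetrise. Here a character-based string ratio stands in for the corpus- and knowledge-based word similarities, and word weights (idf in the paper) are uniform, so this mirrors only the shape of the measure, not its components.

from difflib import SequenceMatcher

def word_sim(a, b):
    # stand-in for a corpus- or knowledge-based word similarity
    return SequenceMatcher(None, a, b).ratio()

def directional(text_a, text_b):
    # average best-match similarity of words in text_a against text_b
    words_a, words_b = text_a.split(), text_b.split()
    return sum(max(word_sim(w, v) for v in words_b) for w in words_a) / len(words_a)

def text_similarity(t1, t2):
    return 0.5 * (directional(t1, t2) + directional(t2, t1))

print(text_similarity("the total cycle time of an order",
                      "overall throughput time per order"))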
Conference Paper
SGStudio is a grammar authoring tool that eases semantic grammar development. It is capable of integrating different information sources and learning from annotated examples to induce CFG rules. In this paper, we investigate a modification to its underlying model by replacing CFG rules with n-gram statistical models. The new model is a composite of HMM and CFG. Its advantages include built-in robustness and scalability to an n-gram classifier when understanding does not involve slot filling. We devised a decoder for the model. Preliminary results show that the new model achieved a 32% error reduction in high-resolution understanding.
Conference Paper
Business Process Management approaches incorporate an analysis phase as an essential activity to improve business processes. Although business processes are defined at a high level of abstraction, the actual analysis concerns are specified at the workflow implementation level, resulting in a technology-dependent solution and increasing the complexity of evolving them. In this paper we present a language for high-level monitoring, measurement data collection, and control of business processes, along with an approach to translate these specifications into executable implementations. The approach we present offers process analysts the opportunity to evolve analysis concerns independently of the process implementation.
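As a purely hypothetical illustration of keeping analysis concerns separate from the workflow implementation, a measurement definition could be held as declarative data and translated into engine-specific monitoring code in a later step. The Python field names below are invented for illustration and do not correspond to the language proposed in the paper.

from dataclasses import dataclass

@dataclass
class MeasureSpec:
    # hypothetical, technology-independent measurement definition that a
    # translator could later map onto a concrete workflow engine
    name: str
    process: str
    measure_type: str   # e.g. "duration", "count"
    from_event: str     # high-level process event, not an engine-level event
    to_event: str
    aggregation: str    # e.g. "average", "max"

spec = MeasureSpec(
    name="average handling time",
    process="order-to-cash",
    measure_type="duration",
    from_event="order received",
    to_event="order shipped",
    aggregation="average",
)
print(spec)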
Conference Paper
The Event-Driven Process Chain (EPC) and the Business Process Modeling Notation (BPMN) are designed for modelling business processes, but do not yet include any means for modelling process goals and their measures, nor do they have a published metamodel. We derive a metamodel for both languages and extend the EPC and the BPMN with process goals and performance measures to make them conceptually visible. The extensions are based on these metamodels and are tested with example business processes.
Article
Schema matching is the problem of finding relationships among concepts across data sources that are heterogeneous in format and in structure. Starting from the “hidden meaning” associated with schema labels (i.e. class/attribute names), it is possible to discover lexical relationships among the elements of different schemata. In this work, we propose an automatic method aimed at discovering probabilistic lexical relationships in the environment of data integration “on the fly”. Our method is based on a probabilistic lexical annotation technique, which automatically associates one or more meanings with schema elements w.r.t. a thesaurus/lexical resource. However, the accuracy of automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns and abbreviations. We address this problem by including a method to perform schema label normalization which increases the number of comparable labels. From the annotated schemata, we derive the probabilistic lexical relationships to be collected in the Probabilistic Common Thesaurus. The method is applied within the MOMIS data integration system but can easily be generalized to other data integration systems.
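The label normalisation step can be illustrated roughly in Python: split compound labels written in CamelCase or with underscores, then expand abbreviations from a dictionary so that more tokens can be looked up in a thesaurus. The abbreviation table is illustrative and not taken from the paper.

import re

ABBREVIATIONS = {"qty": "quantity", "cust": "customer", "amt": "amount"}

def normalize_label(label: str):
    # split on underscores, then on lowercase/uppercase boundaries, then expand
    parts = re.split(r"_+", label)
    tokens = []
    for part in parts:
        tokens += re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", part)
    return [ABBREVIATIONS.get(t.lower(), t.lower()) for t in tokens if t]

print(normalize_label("CustOrderQty"))   # ['customer', 'order', 'quantity']
print(normalize_label("shipping_amt"))   # ['shipping', 'amount']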
Article
A set of criteria for choosing the most suitable third-party logistics (3PL) providers is discussed. The evaluation criteria framework can help information technology (IT) management evaluate outsourced logistics services. The conceptual framework, with IT as its focus, examines the core functionalities of 3PL providers such as inventory management, logistics, transportation, warehousing and customer services. A careful consideration of this framework and of the use of IT in logistics and supply chain management can provide insights to logistics managers, procurement managers, IT managers and academics.
Article
Using automatic tools for the quality analysis of Natural Language (NL) requirements is recognized as a key factor for achieving software quality. Unfortunately, few tools and techniques for NL requirements analysis are currently available. This paper presents a methodology and a tool (called QuARS - Quality Analyzer for Requirement Specifications) for analyzing NL requirements in a systematic and automatic way. QuARS allows requirements engineers to perform an initial parsing of the requirements in order to automatically detect potential linguistic defects that could cause interpretation problems at subsequent stages of software development. The tool is also able to partially support consistency and completeness analysis by clustering the requirements according to specific topics.
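A tiny Python sketch of the kind of lexical check such a tool performs: flag requirement sentences containing vagueness or weakness indicators. The indicator lists are illustrative and not QuARS's actual dictionaries.

import re

INDICATORS = {
    "vague":    ["adequate", "as appropriate", "user-friendly", "fast", "easily"],
    "weak":     ["may", "could", "possibly"],
    "implicit": ["it", "they", "this"],
}

def scan(requirements):
    # report, per requirement, which defect indicators occur in it
    findings = []
    for i, req in enumerate(requirements, start=1):
        for defect, words in INDICATORS.items():
            hits = [w for w in words if re.search(rf"\b{re.escape(w)}\b", req.lower())]
            if hits:
                findings.append((i, defect, hits))
    return findings

reqs = [
    "The system shall respond to user queries within 2 seconds.",
    "The interface should be user-friendly and may support exports.",
]
for line, defect, hits in scan(reqs):
    print(f"requirement {line}: potential {defect} wording {hits}")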
Article
Checking the consistency of such corresponding models is a major challenge for process modeling theory and practice. In this paper, we take the inappropriateness of existing strict notions of behavioral equivalence as a starting point. Our contribution is a concept called behavioral profile that captures the essential behavioral constraints of a process model. We show that these profiles can be computed efficiently, i.e., in cubic time for sound free-choice Petri nets w.r.t. their number of places and transitions. We use behavioral profiles for the definition of a formal notion of consistency which is less sensitive to model projections than common criteria of behavioral equivalence and allows for quantifying deviation in a metric way. The derivation of behavioral profiles and the calculation of a degree of consistency have been implemented to demonstrate the applicability of our approach. We also report the findings from checking consistency between partially overlapping models of the SAP reference model. Index Terms: process model analysis, process model alignment, behavioral abstraction, consistency checking, consistency measures.
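The relations underlying a behavioral profile can be illustrated by approximating them from example traces in Python: derive the weak order between activities and classify each activity pair as strict order, reversed strict order, exclusive, or interleaving. This trace-based approximation is only for illustration; the paper computes profiles directly from the net structure.

from itertools import combinations

def weak_order(traces):
    # (a, b) is in the weak order if a occurs before b in some trace
    wo = set()
    for trace in traces:
        for i, a in enumerate(trace):
            for b in trace[i + 1:]:
                wo.add((a, b))
    return wo

def behavioral_profile(traces):
    activities = sorted({a for t in traces for a in t})
    wo = weak_order(traces)
    profile = {}
    for a, b in combinations(activities, 2):
        ab, ba = (a, b) in wo, (b, a) in wo
        if ab and ba:
            profile[(a, b)] = "interleaving"
        elif ab:
            profile[(a, b)] = "strict order"
        elif ba:
            profile[(a, b)] = "reversed strict order"
        else:
            profile[(a, b)] = "exclusive"
    return profile

traces = [["register", "check", "approve"], ["register", "check", "reject"]]
for pair, relation in behavioral_profile(traces).items():
    print(pair, relation)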