Boualem Benatallah
UNSW Sydney | UNSW · School of Computer Science and Engineering
PhD
About
312
Publications
62,171
Reads
14,650
Citations
Introduction
Prof. Boualem Benatallah is a Scientia professor and research group leader at UNSW Australia. His main research interests are developing fundamental concepts and techniques in Web service engineering and business process management. Boualem has been general and PC chair of a number of international conferences. He is a member of the steering committees of the BPM and ICSOC conferences and a member of the Executive Committee of IEEE CSTC on Business Informatics and Systems.
Publications (312)
Application Programming Interfaces (APIs) have become one of the key assets within modern businesses, facilitating the linking and integration of intra- and inter-organizational data and systems in the context of complex and heterogeneous technology ecosystems. APIs allow organizations to monetize data, build profitable partnerships and foster inno...
Application Programming Interface (API) is a core technology that facilitates developers’ productivity by enabling the reuse of software components. Understanding APIs and gaining knowledge about their usage are therefore fundamental needs for developers that impact a wide range of software development activities. This paper presents an approach to...
Collaboration tools are important for workplace communication. The amount of conversation data produced in workplaces is increasing rapidly, placing a burden on workers. Large amounts of data need to be analyzed automatically to extract actionable information. Multiple studies were conducted on action extraction to identify actions suc...
There have been ever-increasing amounts of security vulnerabilities discovered and reported in recent years. Much of the information related to these vulnerabilities is currently available to the public, in the form of rich, textual data (e.g. vulnerability reports). Many of the state-of-the-art techniques used today to process such textual data re...
With the dynamic nature of cloud applications and rapid change of their resource requirements, elasticity over cloud resources has to be effectively supported. It represents the ability to dynamically adjust cloud resources that applications use in order to adapt to their varying workloads while maintaining the desired quality of service. However,...
In this paper and demo we present a crowd and crowd+AI based system, called CrowdRev, supporting the screening phase of literature reviews and achieving the same quality as author classification at a fraction of the cost, and near-instantly. CrowdRev makes it easy for authors to leverage the crowd, and ensures that no money is wasted even in the fa...
Data veracity is a grand challenge for various tasks on the Web. Since the web data sources are inherently unreliable and may provide conflicting information about the same real-world entities, truth discovery is emerging as a countermeasure of resolving the conflicts by discovering the truth, which conforms to the reality, from the multi-source da...
Truth discovery is the problem of detecting true values from the conflicting data provided by multiple sources on the same data items. Since sources' reliability is unknown a priori, a truth discovery method usually estimates sources' reliability along with the truth discovery process. A major limitation of existing truth discovery methods is that...
Open Cloud Computing Interface (OCCI) follows a set of guidelines (i.e. best practices) to create interoperable APIs over Cloud resources. In this paper, we identify a set of patterns that must be followed and anti-patterns that should be avoided to comply with the OCCI guidelines. To automatically detect (anti)patterns, we propose a Semantic-based...
The Internet of Things (IoT) embodies the evolution from systems that link digital documents to systems that relate digital information with real-world physical items. It provides the infrastructure to transparently and seamlessly glue heterogeneous resources and services together by accessing sensors and actuators over the Internet. By connecting...
With the vast proliferation of cloud computing technologies, DevOps are inevitably faced with managing large amounts of complex cloud resource configurations. This involves being able to proficiently understand and analyze cloud resource attributes and relationships, and make decisions on demand. However, a majority of cloud tools encode resource d...
The increasing use of social and human-enabled systems in people's daily lives, on the one hand, and the fast growth of mobile and smartphone technologies, on the other, have generated tremendous amounts of data, also referred to as big data, and a need to analyze these data, i.e., big data analytics. Recently a trend has e...
The literature on the challenges of and potential solutions to architecting cloud-based systems is rapidly growing, but is scattered. It is important to systematically analyze and synthesize the existing research on architecting cloud-based software systems in order to build a cohesive body of knowledge of the reported challenges and solutions. We...
Despite the proliferation of cloud resource orchestration frameworks (CROFs), DevOps managers and application developers still have no systematic tool for evaluating their features against desired criteria. The authors present generic technical dimensions for analyzing CROF capabilities and understanding prominent research to refine them.
With recent advances in radio-frequency identification (RFID), wireless sensor networks, and Web services, physical things are becoming an integral part of the emerging ubiquitous Web. Finding correlations of ubiquitous things is a crucial prerequisite for many important applications such as things search, discovery, classification, recommendation,...
The emerging Web of Things (WoT) will comprise billions of Web-enabled objects (or "things") where such objects can sense, communicate, compute and potentially actuate. WoT is essentially the embodiment of the evolution from systems linking digital documents to systems relating digital information to real-world physical items. It is widely understo...
Web services are a consolidated reality of the modern Web with tremendous, increasing impact on everyday computing tasks. They turned the Web into the largest, most accepted, and most vivid distributed computing platform ever. Yet, the use and integration of Web services into composite services or applications, which is a highly sensible and concep...
This paper introduces a scalable process event analysis approach, including parallel algorithms, to support efficient event correlation for big process data. It proposes a two-stage approach for finding potential event relationships and verifying them over big event datasets using the MapReduce framework. We report on the experimental results, wh...
Cloud computing provides on-demand access to affordable hardware (such as multicore CPUs, GPUs, disk drives, and networking equipment) and software (databases, application servers, load-balancers, data processors, and frameworks). The pervasiveness and power of cloud computing alleviates some of the problems that application administrators face in...
In today’s knowledge-, service-, and cloud-based economy, businesses accumulate massive amounts of data from a variety of sources. In order to understand businesses one may need to perform considerable analytics over large hybrid collections of heterogeneous and partially unstructured data that is captured related to the process execution. This dat...
This paper studies the problem of checking the simulation preorder for data-centric services. It focuses more specifically on the underlying decidability and complexity issues in the framework of the Colombo model [1]. We show that the simulation test is EXPTIME-complete for Colombo services without any access to the database (denoted Colombo^{DB=∅})...
Traditional structured process-support systems increasingly prove too rigid amidst today’s fast-paced and knowledge-intensive environments. The processes in such environments, commonly described as “unstructured” or “semi-structured”, cannot be pre-planned and are likely to depend upon the interpretation of human workers during process execution. On the other hand, t...
Workers in online crowdsourcing systems have different levels of expertise, trustworthiness, incentives and motivations. Therefore, recruiting a sufficient number of well-suited workers is always a challenge. Existing methods usually recruit workers through open calls, friendship relations, matching their profiles with task requirements or recruiti...
A method for use with a spreadsheet includes storing a cell object, where the cell object includes a location in the spreadsheet of a cell to which the cell object relates and a process associated with the cell, and performing the process on a complex object to produce a result, where the complex object includes a construct comprised of data and co...
Recent advances in technologies have created a need for solving security problems in a systematic way. With this in mind, network security technologies have been produced in order to ensure the security of software and communication functionalities at basic, enhanced, and architectural levels.
Network Security Technologies: Design and Applications...
The 9th edition of the ICSOC PhD Symposium was held on December 2, 2013, in Berlin, as a satellite event of the 11th International Conference on Service Oriented Computing (ICSOC 2013). The aim of the PhD Symposium series is to bring together Ph.D. students and established researchers in the field of service oriented computing, to give students the...
We are surrounded by data, a vast amount of data that has brought about an increasing need for combining and analyzing it in order to extract information and generate knowledge. This need is not exclusive to big software companies with expert programmers; from scientists to bloggers, many end-user programmers currently demand data management tools to gen...
The consumption of APIs, such as Enterprise Services (ESs) in an enterprise Service-Oriented Architecture (eSOA), has largely been a task for experienced developers. With the rapidly growing number of such (Web)APIs, users with little or no experience in a given API face the problem of finding relevant API operations – e.g., mashups developers. How...
In many cases, it is not cost effective to automate business processes which affect a small number of people and/or change frequently. We present a novel approach for enabling domain experts to model and deploy such processes from their respective domain as Web service compositions. The approach builds on user-editable service, naming and represent...
Information Extraction (IE) is the task of automatically extracting structured information from unstructured/semi-structured machine-readable documents. Among various IE tasks, extracting actionable intelligence from ever-increasing amounts of data depends critically upon Cross-Document Coreference Resolution (CDCR) - the task of identifying entity...
The Internet has truly transformed into a global deployment and development platform. For example, Web 2.0 inspires large-scale collaboration, social computing empowers increased awareness, and cloud computing enables virtualization of resources. As a result, developers have been presented with ubiquitous access to countless web services. H...
Data transformation is a key task in mashup development (e.g., access to heterogeneous services, data flow). It is considered as a labour-intensive and error-prone process. The possibility of reusing previously specified mappings promises a significant reduction in manual and time-consuming transformation tasks, nevertheless its potential has not b...
The rapid growth of online Web services has led to the proliferation of functionality-wise equivalent services with differences in their descriptions and behaviors, and therefore has given rise to the need for service adaptation. In this chapter, we discuss key challenges for Web service interoperability and adaptation. We present a consolidated fr...
Adapting user interfaces (UIs) to various contexts, such as for the exploding number of different devices, has become a major challenge for UI developers. The support offered by current development environments for UI adaptation is limited, as is the support for the efficient creation of UIs in Web service-based applications. In this paper, we desc...
Processes in case management applications are flexible, knowledge-intensive and people-driven, and often used as guides for workers in processing of artifacts. An important fact is the evolution of process artifacts over time as they are touched by different people in the context of a knowledge-intensive process. This highlights the need for tracki...
In recent times we have witnessed several advances in modern web technology that have transformed the Internet into a global deployment and development platform. Such advances include Web 2.0 for large-scale collaboration; social computing for increased awareness; as well as cloud computing, which has helped virtualize resources over the Internet....
As a new distributed computing model, crowdsourcing lets people leverage the crowd's intelligence and wisdom toward solving problems. This article proposes a framework for characterizing various dimensions of quality control in crowdsourcing systems, a critical issue. The authors briefly review existing quality-control approaches, identify open iss...
Social rating systems are widely used to harvest user feedback and to support making decisions by users on the Web. Web users may try to exploit such systems by posting unfair or false evaluations for fame or profit reasons. Detecting the real rating scores of products as well as the trustworthiness of reviewers is an important and a very challengi...
Social rating systems are subject to unfair evaluations. Users may try to individually or collaboratively promote or demote a product. Detecting unfair evaluations, mainly massive collusive attacks as well as honest looking intelligent attacks, is still a real challenge for collusion detection systems. In this paper, we study the impact of unfair e...
Online rating systems are subject to unfair evaluations. Users may try to individually or collaboratively promote or demote a product. Collaborative unfair rating, i.e., collusion, is more damaging than individual unfair rating. Detecting massive collusive attacks as well as honest looking intelligent attacks is still a real challenge for collusion...
We consider the problem of analyzing specifications of data-centric services. Specifications of such services incorporate data in business protocols. We focus our study on the decidability of the problem of checking the simulation preorder in the framework of the Colombo model. Colombo is a data-centric service that appears, at a first glance, to h...
The number of Web services publicly accessible through APIs has grown rapidly in recent years. Although these services are key in providing access to data as well as a variety of functionality, their full potential often has yet to be exploited. Due to the different standards used to implement and expose Web services, it is usuall...
Existing approaches to similarity analysis are little concerned with the right choice of similarity functions. We present an approach for suggesting which similarity functions (e.g., edit distance) are most appropriate for a given similarity search task. We identify data features (e.g., misspellings) that should be considered when choosing similarity fu...
Graphs are essential modeling and analytical objects for representing information networks. Existing approaches, in on-line analytical processing on graphs, took the first step by supporting multi-level and multi-dimensional queries on graphs, but they do not provide a semantic-driven framework and a language to support n-dimensional computations,...
Provenance refers to the documentation of an object's lifecycle. This documentation (often represented as a graph) should include all the information necessary to reproduce a certain piece of data or the process that led to it. In a dynamic world, as data changes, it is important to be able to get a piece of data as it was, and its provenance graph...
The ability to efficiently find relevant subgraphs and paths in a large graph to a given query is important in many applications including scientific data analysis, social networks, and business intelligence. Currently, there is little support and no efficient approaches for expressing and executing such queries. This paper proposes a data model an...
Worker selection is a significant and challenging issue in crowdsourcing systems. Such selection is usually based on an assessment of the reputation of the individual workers participating in such systems. However, assessing the credibility and adequacy of such calculated reputation is a real challenge. In this paper, we propose an analytic model w...
Automatically constructing or completing knowledge bases of SOA design knowledge puts traditional clustering approaches beyond their limits. We propose an approach to amend incomplete knowledge bases of Enterprise Service (ES) design knowledge, based on a set of ES signatures. The approach employs clustering, complemented with various filtering and...
Online rating systems are subject to malicious behaviors, mainly the posting of unfair rating scores. Users may try to individually or collaboratively promote or demote a product. Collaborative unfair rating, i.e., collusion, is more damaging than individual unfair rating. Although collusion detection in general has been widely studied, identifying collusion...
The volume of data related to business process execution is increasing significantly in the enterprise. Many data sources include events related to the execution of the same processes in various systems or applications. Event correlation is the task of analyzing a repository of event logs in order to find out the set of events that belong to the...
The emergence of cloud computing over the past five years is potentially one of the breakthrough advances in the history of computing. It delivers hardware and software resources as virtualization-enabled services in which administrators are freed from the burden of worrying about low-level implementation or system administration details. Al...
A number of excellent technical contributions that significantly advance the state-of-the-art of software architectures and application development environments for cloud computing are compiled. Yu et al. in the paper titled 'A Novel Watermarking Method for Software Protection in the Cloud', identify an insider threat to access control, which is no...
Cloud computing paradigm [1] has shifted the computing from physical hardware- and locally managed software-enabled platforms to virtualized cloud-hosted services. Cloud computing assembles large networks of virtualized services: hardware resources (CPU, storage, and network) and software resources (e.g., databases, load-balancers, monitoring syste...
Worker selection is a significant and challenging issue in crowdsourcing systems. Such selection is usually based on an assessment of the reputation of the individual workers participating in such systems. However, assessing the credibility and adequacy of such calculated reputation is a real challenge. In this paper, we propose a reputation manage...
Similar entity search is the task of identifying entities that most closely resemble a given entity (e.g., a person, a document, or an image). Although many techniques for estimating similarity have been proposed in the past, little work has been done on the question of which of the presented techniques are most suitable for a given similarity anal...
In many cases, it is not cost effective to automate business processes which affect a small number of people and/or change frequently. We present a novel approach for enabling domain experts to model and deploy such processes from their respective domain as Web service compositions. The approach is based on user-editable service naming, a graphical...
In Grid computing environments, the availability, performance, and state of resources, applications, services, and data undergo continuous changes during the life cycle of an application. Uncertainty is a fact in Grid environments, which is triggered by multiple factors, including: (1) failures, (2) dynamism, (3) incomplete global knowledge, and (4...
Spreadsheets are used by millions of users as a routine all-purpose data management tool. It is now increasingly necessary for external applications and services to consume spreadsheet data. In this paper, we investigate the problem of transforming spreadsheet data to structured formats required by these applications and services. Unlike prior meth...
In this talk, we will review the state of the art in service composition and reuse. We discuss the main issues related to simplifying composition and increasing reuse. Current APIs and composition techniques including mashups, however, aim toward developers with programming expertise; they are not directly usable by a wider class of users who do not have pr...
In large-scale SOA development projects, organizations utilize Enterprise Services to implement new composite applications. Such Enterprise Services are commonly developed based on service design methodologies of a SOA Governance process to feasibly deal with a large set of Enterprise Services. However, this usually reduces their understandability...
Discovering the behavior of services and their interactions in an enterprise requires the ability to correlate service interaction messages into process instances. The service interaction logic (or process model) is then discovered from the set of process instances that are the result of a given way of correlating messages. However, sometimes, the...
Understanding, analyzing, and ultimately improving business processes is a goal of enterprises today. These tasks are challenging as business processes in modern enterprises are implemented over several applications and Web services, and the information about process execution is scattered across several data sources. Understanding modern business...
The paper mentions that business processes in today's enterprises are dynamic, flexible, and sometimes ad hoc. The enterprise often uses not only process management systems but also various tools, resources, and services to support process execution. Process spaces can provide a new abstraction for process management in such scenarios. Process-spac...
The World Wide Web is evolving towards a very large distributed platform allowing ubiquitous access to a wide range of Web applications with minimal delay and no installation required. Such Web applications range from having users undertake simple tasks, such as filling a form, to more complex tasks including collaborative work, project management,...
The execution of a business process (BP) in today's enterprises may involve a workflow and multiple IT systems and services. Often no complete, up-to-date documentation of the model or correlation information of process events exists. Understanding the execution of a BP in terms of its scope and details is challenging, especially as it is subjective:...
Efforts and tools aiming to automate business processes promise the highest potential gains on business processes with a well-defined structure and high degree of repetition [1]. Despite successes in this area, the reality is that today many processes are in fact not automated. This is because, among other reasons, Business Process Management Suite...
Liquid Course Artifacts Software Platform aims to improve social productivity and enhance the interactive experience for teaching and collaborating by using supplementary materials such as slides, exercises, audios, videos and books.
With the growing presence of BPM and SOA in the IT industry, their impact on IT education will be profound. Many institutions are becoming aware of the acute need to develop learning and teaching resource frameworks for BPM and SOA. In this paper, we present part of such an effort from a team at the University of New South Wales, current...
Business process management, service-oriented architectures and software back-engineering rely heavily on mining processes and web service business protocols from log files. Model extraction and mining aim at the (re)discovery of the behavior of a running model implementation using solely its interaction and activity...
With the rapid growth in the number of online Web services, the problem of service adaptation has received significant attention. In matching and adaptation, the functional description of services including interface and data as well as behavioral descriptions are important. Existing work on matching and adaptation focuses only on one aspect. In th...
Web services are increasingly gaining acceptance as a framework for facilitating application-to-application interactions within and across enterprises. It is commonly accepted that a service description should include not only the interface, but also the business protocol supported by the service. The present work focuses on the formalization of an...
In this paper we present FormSys, a Web-based system that service-enables form documents. It offers two main services: filling in forms based on Web services' incoming SOAP messages, and invoking Web services based on filled-in forms. This can be applied to benefit individuals to reduce the number of often repetitive form fields they have to comple...
Spreadsheets, a popular productivity tool, have gained attention as a potential mashup development environment targeted towards end-users. In this paper, we present a general architecture of mashup tools for spreadsheets. We also present an analysis of the state of the art in spreadsheet-based mashup tools. The analysis result is used to guide our r...