FormSys: Form-processing Web Services
Ingo M. Weber
ingo.weber@cse.unsw.edu.au
Hye-young Paik
hpaik@cse.unsw.edu.au
Boualem Benatallah
boualem@cse.unsw.edu.au
Zifei Gong
zgon235@cse.unsw.edu.au
Liangliang Zheng
lzhe544@cse.unsw.edu.au
Corren Vorwerk
correnv@cse.unsw.edu.au
School of Computer Science and Engineering, K17
University of New South Wales
Sydney, NSW, Australia, 2052
ABSTRACT
In this paper we present FormSys, a Web-based system that
service-enables form documents. It offers two main services:
filling in forms based on incoming SOAP messages from Web
services, and invoking Web services based on filled-in forms.
For individuals, this can reduce the number of often repetitive
form fields they have to complete manually. For organisations,
it can help remove the need for manual data entry by
automatically triggering business process implementations
based on incoming case data from filled-in forms. While the
concept applies to forms in any type of document, our
implementation uses Adobe AcroForms due to their universal
applicability, the availability of a usable API, and their
end-user appeal. In the demo, we will show the two core
functions, namely soap2pdf and pdf2soap, along with use case
applications of the services developed from real-world
scenarios. Essentially, this work demonstrates how PDFs can
be used as a channel for interacting with Web services.
Categories and Subject Descriptors
H.4.0 [Information Systems]: Information Systems Ap-
plications; D.2.13 [Software]: Reusable Software
General Terms
Algorithms, Design
Keywords
Documents, Processes, Service-enabled Forms, Web Services
1. INTRODUCTION
Modern business organisations employ a diverse range of
business process implementation solutions, including
Enterprise Resource Planning (ERP), Customer Relationship
Management (CRM), workflows, and the like [8]. However,
paper-based forms are still prevalent in many of the
interactions between organisations and their customers or
business partners. For example, one needs to fill in a series
of forms with personal details and a varying degree of
additional information to open a bank account or request a
driver’s license. Receiving these forms at the organisation’s
end may trigger a number of business process instances. More
often than not, extracting the data from forms into software
(called media break, e.g. by [4]) is a highly manual task.
The author is enrolled with Univ. of Applied Sciences,
Karlsruhe, Germany. This work was done when visiting UNSW.
Copyright is held by the International World Wide Web Conference
Committee (IW3C2). Distribution of these papers is limited to
classroom use, and personal use by others.
WWW 2010, April 26–30, 2010, Raleigh, North Carolina, USA.
ACM 978-1-60558-799-8/10/04.
Automating such manual activities involves a team of ex-
perienced programmers implementing electronic versions of
the forms, e.g., HTML forms embedded in a Web-based
system. This solution is ad-hoc, costly and the time-to-
production is long. In fact, to make a form and its associ-
ated business process available instantly, it is still easier for
administrative office workers to create a form using a word
processor, save it in a widely-accepted format (e.g., PDF)
and distribute it e.g., via email. Of course the processing of
such forms still remains manual.
We believe it is desirable to have a solution which utilises
tools that the office workers are already familiar with, but at
the same time enhances the ability to create forms and easily
automate the processing procedures involved. As a step
towards the envisioned solution, in this paper, we describe
and demonstrate FormSys, a tool that provides Web services
for effective form processing. FormSys is designed to be used
by people without deep technical knowledge and deal with
any existing form as-is. In particular, we use a sub-standard
to PDF called AcroForms [3]. A discussion on design choices
can be found in Section 3.1.
In FormSys, office workers can easily create “form-processing
Web services” from PDFs, meaning the data the forms hold
can be received and sent via SOAP messages. This makes
it possible, for instance, to fill the common parts of
multiple forms simultaneously, or to electronically send a
completed form’s content to a workflow system and trigger an
event. More specifically, when a form is uploaded by the
user, FormSys can dynamically generate two services:
• soap2pdf accepts data from an application, fills an
  AcroForm with it, and sends it back via email or
  provides a URL under which the filled form is available.
• pdf2soap extracts the data from a filled AcroForm,
  assembles and sends a SOAP message to an application.
The demonstration of FormSys comprises soap2pdf, pdf2soap,
and use case applications developed from real scenarios.
2. RELATED WORK
To the best of our knowledge, there is no other work that
offers the functionality discussed in this paper with form
documents. There are a number of commercial tools, such
as Adobe LiveCycle Designer [1], Crystal Reports¹, and
FormMax [2], that are aimed at designing form templates,
dynamically generating forms or assisting form filling activity.
¹ http://www.crystalreports.com
WWW 2010 • Demo
April 26-30 • Raleigh • NC • USA
We also found numerous software products which utilise
OCR or barcode reading² to capture data from a document.
These systems mostly focus on building a database
of scanned forms for indexing or search purposes. Other
works focus on automatically extracting a structure out of
flat PDF documents. For example, [5] proposes an algorithm
that automatically detects different parts of a PDF document
(e.g., header/footer) and produces an XML representation.
Although the purposes of the conversion may vary, generally
speaking, these automatic structure extraction techniques can
be applied in our system, for example, to eliminate the
field/group name mapping step, which is currently done manually.
We should note that our system’s focus is not on designing,
creating or automatically scanning forms, but on providing
a small, effective and self-contained service that brings
some degree of automation to the manual, form-based user
interactions in a business process by offering form data
processing as services.
3. SYSTEM OVERVIEW
After discussing design decisions, we will present the ar-
chitecture and implementation of the tool.
3.1 Design Choices
While the concept of passing information from forms to
application program interfaces (APIs) and vice versa is in-
dependent of specific API or document formats, we had to
decide on specific technologies for the implementation. We
here discuss these choices.
Form document formats. The de-facto Web standard
for exchanging read-only fixed-layout documents today is
Adobe PDF³, which we decided to use. There is a
sub-standard to PDF called AcroForm which features editable
fields [3]. These fields can be of various types (text,
checkbox, etc.), and they reside on a visual layer above the
regular document. The fields are named, and their size and
position are determined by the form’s designer. As alternatives,
we considered GoogleDocs⁴ and other common office software
packages such as OpenOffice⁵ or Microsoft Office⁶. We
found that none of their APIs are as rich and as portable as
what PDF offers.
Data exchange. Another design decision was to build on
a structured communication standard. We here decided to
use Web service technology. As alternatives we also consid-
ered REST and proprietary Remote Procedure Calls (RPC).
However, REST by itself is not well structured, and thus less
suitable for our scenarios. RPC technology imposes require-
ments on using the same programming paradigm (e.g., Java
or .NET) in the communication partner as used for imple-
menting the service provider. Therefore we decided on Web
services, SOAP over HTTP, as well as regular email with
MIME attachments.
² http://www.scantopdf.co.uk/
³ A Google search for “filetype:pdf” finds 562,000,000 documents,
vs. around 84,057,000 “doc” and “docx” (2/2/2010).
⁴ http://docs.google.com (15/10/09)
⁵ http://www.openoffice.org (15/10/09)
⁶ http://office.microsoft.com (15/10/09)
3.2 Architecture and Tool Overview
The architecture of our system, cf. Fig. 1, has the fol-
lowing components: a front-end for forms administrators to
upload and administer form templates; a format converter
container, where converters from arbitrary formats to PDF
can be included as plug-ins; a central pdf2soap component
handling uploaded filled forms; an OCR pre-processor for
using pdf2soap functionality on filled forms not in AcroForm
format; a set of self-contained soap2pdf services for filling
forms; a database for form templates and field mappings;
and a core component controlling the whole system includ-
ing pdf2soap functionality and soap2pdf service creation.
These components and their usage are described below.
Figure 1: FormSys Overview
[Diagram: the forms administrator, the end user / customer, and a client application interact with FormSys, which comprises the Web front-end, the core component, the form format converter container, the database for templates and field mapping, the central pdf2soap component with its OCR pre-processor, and the services soap2pdf 1 to n; pdf2soap invokes the organization’s Web service implementation or adapter.]
Through the main Web interface of FormSys, forms ad-
ministrators can control the handling of form templates in
the core component: they can upload forms, specify the ser-
vice name and namespace, edit the field naming, group the
fields, and activate, maintain, or discontinue soap2pdf and
pdf2soap functionality. If the form is initially not available
in the required format (AcroForm), it needs to be converted
to PDF and the fields need to be specified. For the former
step, available converters to PDF can be included in the
system through the converter container. To partly automate
the latter step, the structure extraction approach of [5] may
be employed. Depending on the provided form template, it may
be necessary to add fields, change their position and type,
and change the names of fields or groups in the form. This
ranges from specifying all fields for new forms to merely
renaming fields to make their names more descriptive. The
forms administrator can
edit fields and grouping, cf. Figure 2, using an intuitive
user interface: an image of the uploaded form itself is pre-
sented, and existing fields can be selected by clicking on the
respective areas of the image. The form template and field
naming are stored in the database; they influence the data
structures of the services created from the form.
Figure 2: Field name and group editing
Figure 3: Options for a running pdf2soap service
An active soap2pdf service is a full-fledged Web service
offering a WSDL interface with a single operation: fillForm.
The input to this operation is the following: the input data
for the form document; an expiry date for specifying how
long a filled form should be kept available; an email address
to which the filled form is mailed back; and a boolean switch
for determining whether the filled form will be editable (i.e.,
remain as AcroForm) or rendered to a flat PDF. If no email
address is given, or an email address and an expiry date
are specified, the filled form will be made available under a
URL, which is sent back in the synchronous SOAP response
to an invocation. This response also contains a status report
and a list of faults, if any. soap2pdf services can be flexibly
used in (Web) applications and orchestrations, as discussed
in the use case section.
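The delivery rule of fillForm described above can be sketched as follows. This is a minimal Python illustration, not the generated Java service: the names FillFormRequest, FillFormResponse, and fill_form, the status strings, and the example.org URL scheme are hypothetical, and the fault check on unknown field names is an assumed example of what a fault list might contain.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class FillFormRequest:
    """Hypothetical stand-in for the four inputs of fillForm."""
    form_data: Dict[str, str]            # values keyed by form field name
    expiry_date: Optional[date] = None   # how long the filled form is kept
    email: Optional[str] = None          # address the filled form is mailed to
    editable: bool = True                # True: stay AcroForm, False: flat PDF


@dataclass
class FillFormResponse:
    """Hypothetical synchronous response: status, optional URL, faults."""
    status: str
    url: Optional[str] = None
    faults: List[str] = field(default_factory=list)


def publish_under_url(req: FillFormRequest) -> bool:
    # A URL is provided if no email address is given,
    # or if an email address and an expiry date are both given.
    return req.email is None or req.expiry_date is not None


def fill_form(req: FillFormRequest, known_fields: set, form_id: str) -> FillFormResponse:
    # Assumed fault rule: report values for fields the template lacks.
    faults = [f"unknown field: {name}"
              for name in sorted(req.form_data) if name not in known_fields]
    url = None
    if publish_under_url(req):
        # Assumed URL scheme, for illustration only.
        url = f"https://formsys.example.org/filled/{form_id}.pdf"
    status = "OK" if not faults else "FILLED_WITH_FAULTS"
    return FillFormResponse(status=status, url=url, faults=faults)
```

A request with only an email address yields a response without a URL, while a request without an email address, or with both an email address and an expiry date, yields a response carrying the URL.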
pdf2soap works as follows: when a filled form is uploaded
by an end user through the pdf2soap front-end component,
the data is extracted and packaged in a SOAP message
according to the mapping in the database, and sent to the
associated consuming service. For forms administrators the
set-up is less straightforward: the challenges are how to find
the service, and how to map the data from the form to the
schema of the message to be sent. We circumvent these
difficulties by enforcing the following procedure. For an
uploaded form template, the forms administrator can generate
a pdf2soap WSDL interface specification without endpoint
information. This interface needs to be implemented by the
consuming organization. Once that is done and the service
is deployed, the forms administrator provides FormSys with
the endpoint under which the service is reachable. With this
information a pdf2soap service can be started. The data
extracted from filled forms is then forwarded to the given
endpoint. Fig. 3 shows the forms administrator’s interface in the
pdf2soap view, where the part for controlling the pdf2soap
service is a drop-down menu shown for “sal11.pdf”. The
database is thus the only point where the pdf2soap runtime
depends on the form design time. Consequently, the central
pdf2soap component can be physically separated from the
core component, provided the database is accessible.
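To illustrate the packaging step of pdf2soap, the following Python sketch assembles a SOAP 1.1 envelope from already extracted field values and a stored mapping. It is an illustration under assumptions rather than the FormSys code: the function name build_soap_message, the operation name submitForm, the shape of the mapping (AcroForm field name to group and element name), and the example namespace are all hypothetical. Extraction from the PDF and the HTTP call to the endpoint are left out.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# Namespace of the SOAP 1.1 envelope.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_message(extracted: Dict[str, str],
                       mapping: Dict[str, Tuple[str, str]],
                       namespace: str,
                       operation: str = "submitForm") -> str:
    """Package extracted form values into a SOAP envelope.

    mapping: AcroForm field name -> (group name, element name),
    as it might be stored in the template database (assumed shape).
    """
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{namespace}}}{operation}")
    groups: Dict[str, ET.Element] = {}
    for acro_name, value in extracted.items():
        if acro_name not in mapping:
            continue  # fields without a mapping are not sent
        group, element = mapping[acro_name]
        parent = groups.get(group)
        if parent is None:
            # Create one element per field group on first use.
            parent = ET.SubElement(op, f"{{{namespace}}}{group}")
            groups[group] = parent
        ET.SubElement(parent, f"{{{namespace}}}{element}").text = value
    return ET.tostring(envelope, encoding="unicode")
```

The resulting string would then be posted to the endpoint that the forms administrator registered for the form.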
Due to the prescribed procedure, the mapping from the
form fields to the schema is known to our system. The
procedure applies naturally to situations where the organization
has no service implementation available at this point, as it
frees them from designing the interface manually. However,
if a Web service is already implemented, it is unlikely that
the generated WSDL will match it. In this situation, an
adapter has to be developed, e.g., based on [6] or on commercial
products such as Oracle Fusion Middleware (OFM)⁷ or
SAP NetWeaver Process Integration (SAP NetWeaver PI)⁸.
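What such an adapter does can be shown with a deliberately small Python sketch. It is neither the technique of [6] nor the behaviour of the commercial products named above; it only illustrates translating a payload that follows the generated schema into the field names an existing service expects. All field names and the RULES table are hypothetical.

```python
from typing import Callable, Dict

Message = Dict[str, str]

# Each rule computes one field of the existing service's input
# from the payload that follows the generated pdf2soap schema.
RULES: Dict[str, Callable[[Message], str]] = {
    "customerName": lambda m: f"{m['firstName']} {m['lastName']}",  # merge
    "postcode": lambda m: m["zip"],                                  # rename
}


def adapt(message: Message,
          rules: Dict[str, Callable[[Message], str]]) -> Message:
    """Translate a generated-schema payload into the target schema."""
    return {target: rule(message) for target, rule in rules.items()}
```

In this reading, the adapter implements the generated WSDL interface towards FormSys and calls the existing service with the translated message.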
If a filled form is not an AcroForm, the architecture foresees a
pre-processing component for Optical Character Recognition (OCR)
in pdf2soap. With it, the data from fields could be extracted even
out of image formats, such that paper copies can be scanned and
processed by our system. We plan to use existing OCR
technology² to extract the data from the fields; however,
this is not implemented as yet.
3.3 Implementation
Although the designed architecture is generic, our current
implementation is specific to supporting AcroForms. However,
we believe the system is mature enough to demonstrate
the concept we propose. The system is entirely written in
Java and makes use of the following libraries:
• Apache CXF⁹ for all aspects related to Web services.
• ImageMagick¹⁰ for creating images from PDFs.
• iText¹¹ for accessing and manipulating AcroForms.
• jQuery¹² for interactive graphical Web user interfaces.
The soap2pdf code generation starts from the mapping
of user-given field names to each of the AcroForm fields; if
unchanged, the mapping is simply the identity function. The
map is used in Java code generation from templates. The
generated code has a class for each group containing the
respective fields and their mapping to AcroForm fields. These
classes are combined in an input bean together with the
other parameters of the operation. A Web service implementation
is then generated using CXF, which creates the filled AcroForm
when invoked with the input bean, and handles
the response and emailing accordingly. This code is compiled,
packaged, and deployed using standard Java APIs. For pdf2soap the
field mapping is handled centrally by the upload site.
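The role of the field mapping can be illustrated with a short Python sketch. It does not reproduce the Java code generation itself; it only shows the data the generated classes would encode, assuming a mapping with an identity default and a grouping of fields. The function names and the group name "Default" for ungrouped fields are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional


def resolve_mapping(acro_fields: Iterable[str],
                    renames: Optional[Dict[str, str]] = None,
                    groups: Optional[Dict[str, str]] = None
                    ) -> Dict[str, Dict[str, str]]:
    """Return {group: {user-given name: AcroForm field name}}."""
    renames = renames or {}
    groups = groups or {}
    grouped: Dict[str, Dict[str, str]] = defaultdict(dict)
    for acro_name in acro_fields:
        # Identity mapping unless the administrator renamed the field.
        user_name = renames.get(acro_name, acro_name)
        grouped[groups.get(acro_name, "Default")][user_name] = acro_name
    return dict(grouped)


def to_acroform_values(input_bean: Dict[str, Dict[str, str]],
                       mapping: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Flatten grouped input into AcroForm field name -> value."""
    values = {}
    for group, fields in input_bean.items():
        for user_name, value in fields.items():
            values[mapping[group][user_name]] = value
    return values
```

The flattened dictionary corresponds to what a PDF library would need in order to set the AcroForm field values.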
4. USE CASES
In order to demonstrate the applicability and value of
FormSys, we identified and implemented two use cases, both
taken from real-world scenarios. Use case 1 is implemented as a
Web application, and makes use of both pdf2soap and soap2pdf.
Use case 2 is implemented as a Web service orchestration
in BPEL [7], and uses only soap2pdf. All PDFs used in the
use cases are publicly available on the respective Web sites.
4.1 Use case 1: Suncorp investment forms
Personalised Form Download. Often, banks require
a paper-based form to be filled in by their customers to use
additional banking services. As an example, the investment
fund management forms of Suncorp, an Australian bank, can
be found at
http://www.suncorp.com.au/suncorp/personal/Investing/forms.aspx.
From this page, Suncorp customers
download blank PDF forms (not AcroForms), print, fill in,
and return the forms via postal mail or fax to consume the
respective service.
⁷ http://www.oracle.com/us/products/middleware/
⁸ http://www.sdn.sap.com/irj/sdn/nw-pi71 (13/10/09)
⁹ http://cxf.apache.org
¹⁰ http://www.imagemagick.org
¹¹ http://www.lowagie.com/iText
¹² http://jquery.com
However, for existing customers, banks already have many
details such as name, title, and residential address. With our
solution, the data that is already present in the bank’s
databases can be pre-filled into personalized forms. This is
particularly useful for data the customer is unlikely to use
every day, but which is required in the form nonetheless, such
as investment fund numbers or insurance policy identifiers.
To showcase this, we developed a Web application resembling
the above-mentioned Suncorp website. This application
is available online (cf. Section 5). Logged-in users can
generate personalized versions of Suncorp forms. In the
background, the Web application retrieves user data from
a database, assembles a SOAP message, and invokes the
FormSys Web service for the respective form with this
message. The service then returns a URL under which the form
can be downloaded. The fields in the generated form are still
editable by the user.
Form Triggering a Business Process. When a form
is filled in and the customer submits it to their bank, a set of
business processes at the bank is triggered by the customer’s
request. If these processes are implemented in software, the
data from paper-based forms has to be provided to the
software, and more often than not this is a manual process.
In contrast, if the data is available in AcroForm fields,
then our pdf2soap solution can be used. Once the filled-in
form has been uploaded to FormSys, and provided the respective
pdf2soap service is active, FormSys extracts the data from
the AcroForm, assembles a SOAP message, and invokes the
associated endpoint with this message.
Usually this endpoint would refer to an actual implementation
of the business process that is triggered upon arrival
of a respective form. For demonstration purposes, we
developed a simple Web service as part of the Web application:
it accepts pdf2soap messages and writes the contained data
into a file. When accessing the output page, the information
from the file is displayed in a table.
4.2 Use case 2: Queensland Government
Driver Licence Request. When renewing or requesting
a new driver licence in Queensland, an individual has to fill
in up to six different forms. Most of these forms request some
standard personal data, such as name, title, address, date
of birth and the like. Also, an individual has to understand
which forms are needed when.
We analysed the requirements and downloaded the respective
forms from the Queensland Government, Transport
Department website, starting from
http://www.transport.qld.gov.au/Home/Licensing/Driver_licence/.
On this basis we implemented an executable process which accepts
the data needed in more than one form and forwards it to the
respective FormSys soap2pdf services. The process has been
deployed to an Intalio|Server¹³, where users have access to a
simple workflow: when starting process instances by providing
input data, the required subset of the six forms is filled
and returned to the user.
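The fan-out performed by this process can be approximated in a few lines of Python. The sketch below is not the BPEL orchestration; it only shows how data entered once might be split into one payload per required form. The form names, field names, and the function plan_invocations are hypothetical, and the decision about which forms are required is taken as given.

```python
from typing import Dict, Iterable, List


def plan_invocations(shared_data: Dict[str, str],
                     forms: Dict[str, List[str]],
                     required: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Return {form name: payload} for the required forms only.

    forms maps each form name to the field names it asks for;
    every payload is a projection of the data entered once.
    """
    plan = {}
    for name in required:
        wanted = forms[name]
        missing = [f for f in wanted if f not in shared_data]
        if missing:
            raise ValueError(f"{name} is missing input: {missing}")
        plan[name] = {f: shared_data[f] for f in wanted}
    return plan
```

Each payload in the returned plan would then be sent to the soap2pdf service of the corresponding form.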
5. DEMO SCENARIO
The demonstration shows the personalized form generation
from Use case 1 (cf. previous section) as a motivation,
and then explains the underlying system, FormSys. We then
demonstrate how to upload an AcroForm, edit field names,
start a soap2pdf service, and view the dynamically generated
WSDL file. We also explain and demonstrate pdf2soap
in general, and how it is used in Use case 1, followed by Use
case 2 where we show the executable process model and the
different ways to interact with the user.
The different parts of the demonstration can be viewed
individually as screencasts from a Web site:
http://www.cse.unsw.edu.au/~FormSys/FormSys.html. This Web site
also links to the running implementation, which we encourage
readers to test and to comment on.
¹³ http://www.intalio.com
6. CONCLUSION AND FUTURE WORK
In this paper, we showcase FormSys, a Web-based system
which Web service-enables form documents. The main
functionality of FormSys is twofold: soap2pdf provides filled-in
forms based on data in SOAP messages, while pdf2soap
extracts data from filled-in forms and invokes a given Web
service endpoint with it. We also discussed two real-world
use cases, which make use of soap2pdf and pdf2soap. The
running system is accessible online (cf. Section 5).
We conclude from this work that PDFs can be used as a
channel for interacting with Web services: individuals can
provide filled forms as input to Web services, and Web ser-
vices can output filled form documents. The philosophy was
to enable non-technical users to interact with our system.
In future work, we consider mechanisms to support some
tasks for forms administrators, like field grouping and naming,
through partial automation. One major open point is
how to consume soap2pdf services. In order to allow end
users to implement their personal processes themselves, we
plan on extending a mashup tool with FormSys functionality.
Another open point is dealing with form types other
than PDF: while the architecture contains a flexible mechanism
to plug in software to handle different document types,
the concepts have yet to be implemented.
Acknowledgments
This work has been supported by a grant from the Smart
Services CRC under the Service Delivery Framework project.
7. REFERENCES
[1] Adobe LiveCycle Designer ES2.
www.adobe.com/products/livecycle/designer/.
[2] Acro Software. FormMax 3.5. www.acrosoftware.com/.
[3] Adobe Systems Incorporated. Acrobat Forms API
Reference. Technical Note No. 5181, 2003.
[4] J. Becker, L. Algermissen, and B. Niehaves. A
Procedure Model for Process Oriented e-Government
Projects. Business Process Management Journal,
12(1):61 – 75, 2006.
[5] H. Déjean and J.-L. Meunier. A system for converting
PDF documents into structured XML format. In
Document Analysis Systems VII, LNCS 3872, 2006.
[6] H. Motahari, B. Benatallah, A. Martens, F. Curbera,
and F. Casati. Semi-Automated Adaptation of Service
Interactions. In WWW’07, 2007.
[7] OASIS. Web Services Business Process Execution
Language Version 2.0, Apr. 2007.
[8] D. Woods. Enterprise Services Architecture. O’Reilly,
2003.
WWW 2010 • Demo
April 26-30 • Raleigh • NC • USA
1316
... The starting point is a set of artefacts like forms, templates and emails. By using some of our earlier work [10], we can automatically create Web services from these artefacts. These Web services then can be used as activities in our personal process management approach. ...
... In earlier work [10], we investigated how to make PDF forms programmatically accessible. By PDF forms we refer to Adobe PDF's sub-standard AcroForm, which features editable fields [11]. ...
... In order to demonstrate the feasibility of our approach, we specified an architecture and implemented a proof-of-concept tool. The prototype is a significant extension to our earlier tool [10], hence called FormSys Process Designer. A screencast video is available 4 . ...
Article
Full-text available
In many cases, it is not cost effective to automate given busi-ness processes. Those business processes often affect a small number of people and/or change frequently. In this paper, we present a novel ap-proach for enabling end-users to model and deploy processes they en-counter in their daily work. The processes are modelled exclusively from the viewpoint of a single user, and hence avoid many complicated con-structs. Therefore, the modelling can rely on a simple textual process representation, which can be as easily understood as a cooking recipe. The simplicity is achieved by allowing only few activity types in the pro-cess: filling forms, sending pre-formatted emails, and filling HTML tem-plates. The process models can be translated to an executable format and be deployed, including an automatically generated Web interface for user interaction.
... In order to fill-in the forms electronically, we convert a form into a Web service. This is done by our previous work, FormSys [2], through which PDF forms 3 are uploaded to the central FormSys repository. Formally, we define F := {F 1 , F 2 , . . ...
Conference Paper
Full-text available
Despite all efforts to support processes through IT, processes based on paper forms are still prevalent. In this paper, we propose a cost-effective and non-technical approach to automate form-based processes. The approach builds on several types of annotations: to help collect and distribute information for form fields; to choose appropriate process execution paths; and to support email distribution or approval for filled forms. We implemented the approach in a prototype, called EzyForms. We conducted a user study with 15 participants, showing that people with little technical background were able to automate the existing form-based processes efficiently.
... A screencast video of the tool in action is available (see Footnote 3). In contrast, the Service Repository is implemented as an extensive extension of FormSys Forms Manager (previously called FormSys, [Weber et al. 2010a]) and coded in Java, JSP and JavaScript. The remainder of the paper, including the evaluation, focuses on FormSys Process Designer. ...
Article
Full-text available
In many cases, it is not cost effective to automate business processes which affect a small number of people and/or change frequently. We present a novel approach for enabling domain experts to model and deploy such processes from their respective domain as Web service compositions. The approach builds on user-editable service, naming and representing Web services as forms. On this basis, the approach provides a visual composition language with a targeted restriction of control-flow expressivity, process simulation, automated process verification mechanisms, and code generation for executing orchestrations. A Web-based service composition prototype implements this approach, including a WS-BPEL code generator. A small lab user study with 14 participants showed promising results for the usability of the system, even for nontechnical domain experts.
... The tool and approach are significant extensions of earlier work for processes made up of PDF forms [17]. The PDF form-filling services are, in turn, the result of a separate work [16], with which FormSys Process Designer shares some database tables. The idea to relate Web services and their messages to forms stems from the earlier work, but representing arbitrary WSDL Web services through forms is a novel contribution of this paper. ...
Conference Paper
Full-text available
In many cases, it is not cost effective to automate business processes which affect a small number of people and/or change frequently. We present a novel approach for enabling domain experts to model and deploy such processes from their respective domain as Web service compositions. The approach is based on user-editable service naming, a graphical composition language where Web services are represented as forms, a targeted restriction of control flow expressivity, automated process verification mechanisms, and code generation for executing orchestrations. A Web-based service composition prototype implements this approach, including a WS-BPEL code generator.
Article
Tables in documents are a widely-available and rich source of information, but not yet well-utilised computationally because of the difficulty in automatically extracting their structure and data content. There has been a plethora of systems proposed to solve the problem, but current methods present low usability and accuracy and lack precision in detecting data from diverse layouts. We propose a component-based design and implementation of table processing concepts which can offer flexibility and re-usability as well as high performance on a wide range of table types. In this paper, we describe a system named TEXUS which is a fully automated table processing system that takes a PDF document and detects tables in a layout independent manner. We introduce TEXUS's own table processing specific document model and the two-phased processing pipeline design. Through an extensive evaluation on a dataset comprised of complex financial tables, we show the performance of the system on different table types.
Conference Paper
We are surrounded by data, a vast amount of data that has brought about an increasing need for combining and analyzing it in order to extract information and generate knowledge. A need not exclusive of big software companies with expert programmers; from scientists to bloggers, many end-user programmers currently demand data management tools to generate information according to their discretion. However, data is usually distributed among multiple sources, hence, it requires to be integrated, and unfortunately, this process is still available just for professional developers. In this paper we propose DataSheets, a novel approach to make the data-flow specification accessible and its representation comprehensible to end-user programmers. This approach consists of a spreadsheet-based data-flow language that has been tested and evaluated in a service-centric composition framework.
Chapter
Simple form of annotations, such as tagging, are proven to be helpful to end users in organising and managing large amount of resources (e.g., photos, documents). In this paper, we take a first step in applying annotation to forms to explore potential benefits of helping people with little or no technical background to automate the form-based processes. An analysis of real-world forms was conducted to design algorithms for tag recommendations and our initial evaluation suggests that useful tag recommendation can be generated based on the contents and the metadata of the forms. We also briefly present EzyForms, a framework for supporting form-based processes. The architecture supports an end-to-end lifecycle of forms, starting from its creation, annotation, and ultimately to its execution in a process.
Article
Simple form of annotations, such as tagging, are proven to be helpful to end-users in organising and managing large amount of re-sources (e.g., photos, documents). In this paper, we take a first step in applying annotation to forms, one of the main artefacts that make up the long-tail of the processes, to explore potential benefits of helping people with little or no technical background to automate the long-tail of the processes. An analysis of real-world forms was conducted to design al-gorithms for tag recommendations. Our initial evaluation suggests that useful tag recommendation can be generated based on the contents and the metadata of the forms. We also briefly present FormSys + , a framework for supporting form-based processes. The architecture supports an end-to-end lifecycle of forms, starting from its creation, annotation, and ultimately to its exe-cution in a process.
Conference Paper
Many enterprises and factories use standard Web-based procedures, known as a “Web-based digital dashboard”, as the centralized interface for remote control and manufacturing process data input. With the Web interface, people need to fill in data manually and then submit all of it to a remote system to run the control procedure. In this paper, we have used several creativity methods to develop a multi-agent system, namely EMTAN, which can collect all kinds of data from the on-line database, electronic files, and the controller’s interface. The system can also automatically retrieve all HTML fields, transfer data into the Web-based digital dashboard, and deliver the results via email or mobile instant messages. Our multi-agent system is, in effect, an integrated model that helps process Web-based data precisely and can save considerable time on data entry.
Conference Paper
Efforts and tools aiming to automate business processes promise the highest potential gains for business processes with a well-defined structure and a high degree of repetition [1]. Despite successes in this area, the reality is that many processes today are in fact not automated. This is because, among other reasons, Business Process Management Suites (BPMSs) are not well suited for ad-hoc and human-centric processes [2], and automating processes demands high cost and skills. This primarily affects the “long tail of processes” [3], i.e. processes that are less structured, that do not affect many people uniformly, or that are not considered critical to an organization: those are rarely automated. One consequence is that organisations still rely on templates and paper-based forms to manage long-tail processes.
Article
The WS-BPEL 2.0 specification [WS-BPEL 2.0] provides a language for formally
Conference Paper
In today's Web, many functionally similar Web services are offered through heterogeneous interfaces (operation definitions) and business protocols (ordering constraints defined on legal operation invocation sequences). The typical approach to enabling interoperation in such a heterogeneous setting is to develop adapters. There have been approaches for classifying possible mismatches between service interfaces and business protocols to facilitate adapter development. However, the hard part is identifying, given two service specifications, the actual mismatches between their interfaces and business protocols. In this paper we present novel techniques and a tool that provide semi-automated support for identifying and resolving mismatches between service interfaces and protocols, and for generating adapter specifications. We make the following main contributions: (i) we identify mismatches between service interfaces, which leads to finding mismatches of the signature, merge/split, and extra/missing message types; (ii) we identify all ordering mismatches between service protocols and generate a tree, called a mismatch tree, for mismatches that require developers' input for their resolution. In addition, we provide semi-automated support for analyzing the mismatch tree to help in resolving such mismatches. We have implemented the approach in a tool inside IBM WID (WebSphere Integration Developer). Our experiments with some real-world case studies show the viability of the proposed approach. The methods and tool are significant in that they considerably simplify the problem of adapting services so that interoperation is possible.
Conference Paper
In this paper we present a system for converting legacy PDF documents into a structured XML format. The conversion system first extracts the different streams contained in PDF files (text, bitmap and vector images) and then applies different components in order to express the logically structured documents in XML. Some of these components are traditional in Document Analysis, others more specific to PDF. We also present a graphical user interface for checking, correcting and validating the analysis of the components. Finally, we report on two real user cases where this system was applied.
Article
Purpose – To provide guidelines in the form of a procedural model for e-government-indicated business process reengineering (BPR) projects in public administrations. Design/methodology/approach – A range of recently published works, which aim to provide practical advice for process-oriented e-government projects, were analysed. Additionally, experiences from several practical e-government projects were taken into account. The procedural model developed was then tested and evaluated. Findings – There is a lack of process orientation in public administrations. Additionally, existing processes are regularly not applicable to e-government. Therefore, e-government projects in practice are not always able to fully implement transactional processes. Part of the value potentially added by e-government is hence not exploited. One of the main reasons for the lack of process orientation is that there are few BPR methodologies applied and verified in public administrations. Research limitations/implications – The procedural model has not been tested for all different political and administrative systems. Certain national characteristics might lead to additional adaptations of the model which have been suggested. Practical implications – The procedural model is very useful and has been validated in several practical projects. Originality/value – This paper fulfils an identified need for BPR methodologies in public administrations, especially in the move towards e-government.
Adobe Systems Incorporated. Acrobat Forms API Reference. Technical Note No. 5181, 2003.
H. Déjean and J.-L. Meunier. A system for converting PDF documents into structured XML format. In Document Analysis Systems VII, LNCS 3872, 2006.