Intelligent Document Routing as a First Step towards Workflow Automation: A Case Study Implemented in SQL
ABSTRACT In large and complex organizations, the development of workflow automation projects is hard. In some cases, a first important
step in that direction is the automation of the routing of incoming documents. In this paper, we describe a project to develop
a system for the first routing of incoming letters to the right department within a large, public Portuguese institution.
We followed a data mining approach, where data representing previous routings were analyzed to obtain a model that can be
used to route future documents. The approach followed was strongly influenced by some of the limitations imposed by the customer:
the available budget was small, and the solution had to be developed in SQL to facilitate integration with the existing system.
The system developed was able to obtain satisfactory results. However, as in any data mining project, most of the effort was
dedicated to activities other than modelling (e.g., data preparation), which means that there is still plenty of room for
- Available from: Nitesh V Chawla
ABSTRACT: Rare objects are often of great interest and great value. Until recently, however, rarity has not received much attention in the context of data mining. Now, as increasingly complex real-world problems are addressed, rarity, and the related problem of ... ACM SIGKDD Explorations Newsletter 06/2004; 6(1):1-6. DOI:10.1145/1007730.1007733
Conference Paper: Exploiting XML technologies for intelligent document routing.
ABSTRACT: Today, XML is increasingly becoming a standard for the representation of semi-structured information such as documents that combine content and metadata. Typical document management applications include document representation, authoring, validation, and document routing in support of a business process. We propose a framework for intelligent document routing that exploits and extends XML technologies to automate dynamic document routing and real-time update of business routing logic. The document-routing logic is stored in a secure repository and executed by a business rules engine. During rule execution, the input parameters of each business rule are bound with the data from each inbound XML document. This document routing framework is validated in a real-world implementation with reduced development cost, an accelerated rule update cycle, and simplified administration efforts. Proceedings of the 2005 ACM Symposium on Document Engineering, Bristol, UK, November 2-4, 2005; 01/2005
ABSTRACT: The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community, the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (which consists of the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert labor, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation. ACM Computing Surveys 04/2001; 34(1):1-47. DOI:10.1145/505282.505283 · 4.04 Impact Factor