Information Systems

Published by Elsevier BV

Print ISSN: 0306-4379


A Dimensionality Reduction Technique for Efficient Time Series Similarity Analysis

April 2008


108 Reads

Qiang Wang


We propose a dimensionality reduction technique for time series analysis that significantly improves the efficiency and accuracy of similarity searches. In contrast to piecewise constant approximation (PCA) techniques, which approximate each time series with constant-value segments, the proposed method, Piecewise Vector Quantized Approximation, represents each segment by the closest codeword (under a distance measure) from a codebook of key-sequences. The new representation is symbolic, and it allows text-based retrieval techniques to be applied to time series similarity analysis. Experiments on real and simulated datasets show that the proposed technique generally outperforms PCA techniques in clustering and similarity searches.
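The encoding step described in this abstract can be illustrated with a minimal sketch. It assumes equal-length segments, Euclidean distance, and a tiny hand-made codebook; none of these choices, nor the function names, come from the paper, which does not state how its codebook is built.

```python
import math


def nearest_codeword(segment, codebook):
    """Return the index of the codeword closest to `segment` (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(range(len(codebook)), key=lambda i: dist(segment, codebook[i]))


def encode(series, codebook, seg_len):
    """Split `series` into equal-length segments and map each to a codeword index."""
    if len(series) % seg_len != 0:
        raise ValueError("series length must be a multiple of seg_len")
    segments = [series[i:i + seg_len] for i in range(0, len(series), seg_len)]
    return [nearest_codeword(s, codebook) for s in segments]


# Hypothetical codebook of three key-sequences: flat, rising, falling.
codebook = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
symbols = encode([0.1, 0.9, 1.0, 0.1, 0.0, 0.1], codebook, seg_len=2)
print(symbols)
```

The resulting list of codeword indices is the symbolic representation; it can be treated like a sequence of words, which is what makes text-style retrieval applicable.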

Figure 1: Expanding into E-commerce means moving both business processes and the technology to implement them into a new domain.
Figure 3: A standard two-tier business framework.
Figure 4: Proposed extension to the three-tiered business framework composed of sub-frameworks and a services pool of frameworks. Hooks are represented by lines between classes.
Application framework issues when evolving business applications for electronic commerce
Conference Paper

February 1999


134 Reads

When an organization embarks on e-commerce, it rarely has a chance to re-engineer its existing business applications. However, if these business applications were built using an application framework, then one might hope to reuse many of the existing legacy applications in the new e-commerce context. This paper examines the general issues created by migrating applications to e-commerce, and proposes an architecture for application frameworks that must support e-commerce.

Evaluating Ontologies: Towards a Cognitive Measure of Quality

November 2007


39 Reads

Business process models are an important tool in understanding and improving the efficiency of a business and in the design of information systems. Recent work has evaluated business process modelling languages against upper-level ontologies on the assumption that these ontologies are adequate representations of the general process domain. In this paper, we present a method to test this assumption. Our method is based on principles of cognitive psychology and demonstrated using the BWW and SUMO upper-level ontologies.

Conceptual Maps as the First Step in an Ontology Construction Method

November 2010


246 Reads

In this paper a method is proposed to be used as the first step in the ontology construction process. This method, specially tailored to ontology construction for knowledge management applications, is based on the use of concept maps as a means of expression for the expert, followed by an application that assists in capturing the expert's intention with the goal of further formalizing the map. This application analyses the concept map, taking into account the map topology and the key words used by the expert. From this analysis a series of questions is presented to the expert that, when answered, reduce the ambiguity of the map and identify some common patterns in ontological representations, such as generalizations and mere logic relations. This information could then be used by the knowledge engineer during further knowledge acquisition sessions or to direct the expert to a further formalization or improvement of the map. The method was tested with a group of volunteers, all of them engineers working in the aerospace sector, and the results suggest that both the use of conceptual mapping and the intention capture step are acceptable from the point of view of the end user, supporting the claim that this method is viable as an option to reduce some of the difficulties in large-scale ontology construction.

A system for supporting organizations in knowledge-based document preparation

February 1993


12 Reads

An overview of a knowledge-based document preparation system, REGENT (report generation tool), is presented. REGENT is a software environment which generates documents from reusable document pieces by planning, executing, and monitoring the document preparation process in an organizational setting. The organizational aspects of the document generation process are incorporated in the system architecture. The documents are constructed from stored document pieces using artificial intelligence methods. The report preparation process is detailed as to the knowledge representation structure and the problem solving strategy.

Visualizing graphical and textual formalisms

February 2001


37 Reads

The purpose of this work is to combine the advantages of using visual formalisms for the specification of reactive systems with those of using formal verification and program transformation tools developed for textual formalisms. We have developed a tool suite called ViSta that automatically produces statechart layouts based on information extracted from an informal specification. In this paper, we discuss how ViSta is augmented with a tool that automatically translates statecharts to Z specifications. The informal, statechart and Z specifications are inter-related. This ensures consistency between the different representations, and therefore facilitates the verification and validation effort.

A Geometric probabilistic framework for data fusion in information retrieval

August 2007


70 Reads

Data fusion in information retrieval has been investigated by many researchers and quite a few data fusion methods have been proposed, but why data fusion can improve effectiveness is still not very clear. In this paper, we use a geometric probabilistic framework to formally describe data fusion, in which each component result returned from an information retrieval system for a given query is represented as a point in a multi-dimensional space. All the component results and data fusion results can then be explained using geometrical principles. In such a framework, it becomes clear why data fusion quite often improves effectiveness and, accordingly, what the favourable conditions are for data fusion algorithms to achieve better results. The framework can be used as a guideline for applying data fusion techniques more effectively.
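One way to read the geometric picture is sketched below. The sketch assumes that each result is a vector of document scores, that fusion takes the centroid of the component results, and that quality is the Euclidean distance to an ideal point. The abstract does not specify these choices, and the data are invented for illustration.

```python
import math


def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


# Hypothetical data: one coordinate per document, values are relevance scores.
ideal = (1.0, 0.0, 1.0)
results = [(0.9, 0.4, 0.5), (0.6, 0.1, 0.9), (0.8, 0.3, 0.7)]

fused = centroid(results)
avg_dist = sum(distance(r, ideal) for r in results) / len(results)
fused_dist = distance(fused, ideal)
print(fused_dist, avg_dist)
```

Under these assumptions the fused point is never farther from the ideal point than the component results are on average, because the Euclidean norm is convex. That is one geometric reason a fused result can beat a typical component.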

Importance in knowledge systems

December 1989


34 Reads

In knowledge systems, pieces of information (evidence, hypotheses, attributes, terms, documents, rules) are usually assumed to carry equal importance and to be independent of each other, although this might not actually be the case. Issues for a logic of weighted queries, with the possibility of also weighting documents and logical connectors (in terms of intelligent retrieval, for example), are presented here, using “min” or t-norms, and soft operators involving p-norms. This logic cannot be a conventional one because, when relative importance between concepts is introduced, definitions are different for ANDed and ORed weighted queries. A concept of “nought”, a limit case of no-importance queries, and its behaviour with fuzzy set operations is developed; in particular, the notion of an extended membership is introduced. Finally it is shown, with a biomedical example, how to combine importance with soft matching in rule-based systems.

Orlowska, M.E.: Analyzing process models using graph reduction techniques. Information Systems 25(2), 117-134

April 2000


381 Reads

The foundation of a process model lies in its structural specifications. Using a generic process modeling language for workflows, we show how a structural specification may contain deadlock and lack of synchronization conflicts that could compromise the correct execution of workflows. In general, identification of such conflicts is a computationally complex problem and requires development of effective algorithms specific for the target modeling language. We present a visual verification approach and algorithm that employs a set of graph reduction rules to identify structural conflicts in process models for the given workflow modeling language. We also provide insights into the correctness and complexity of the reduction process. Finally, we show how the reduction algorithm may be used to count possible instance subgraphs of a correct process model. The main contribution of the paper is a new technique for satisfying well-defined correctness criteria in process models.

Loucopoulos, P.: Goal-driven Business Process Analysis Application in Electricity Deregulation. Information Systems 24(3), 187-207

May 1999


160 Reads

Current business challenges such as deregulation, mergers, globalisation and increased competition have given rise to a new process-centric philosophy of business management. The key issue in this paradigm is the concept of business process. From a methodological perspective, this movement has resulted in a considerable number of approaches that encourage the modelling of business processes as a key component of any improvement or re-engineering endeavour. However, there is considerable controversy amongst all these competing approaches about the most appropriate way of identifying the types and number of relevant processes. Existing business process modelling approaches describe an enterprise in terms of activities and tasks without offering sufficient guidance towards a process-centred description of the organisation. In this paper we advocate the use of a goal-driven approach to business process modelling. A systematic approach to developing and documenting business processes on the basis of the explicit or implicit business objectives is put forward. We argue that such an approach should lead to a closer alignment between the intentional and operational aspects of an organisation. Our approach is exemplified through the use of parts of a large industrial application that is currently making use of goal-driven business process modelling.

SQL/NF: A query language for ¬1NF relational databases

December 1987


46 Reads

There is growing interest in abandoning the first-normal-form assumption on which the relational database model is based. This interest has developed from a desire to extend the applicability of the relational model beyond traditional data-processing applications. In this paper, we extend one of the most widely used relational query languages, SQL, to operate on non-first-normal-form relations. In this framework, we allow attributes to be relation-valued as well as atomic-valued (e.g. integer or character). A relation which occurs as the value of an attribute in a tuple of another relation is said to be nested. Our extended language, called SQL/NF, includes all of the power of standard SQL as well as the ability to define nested relations in the data definition language and query these relations directly in the extended data manipulation language. A variety of improvements are made to SQL; the syntax is simplified and useless constructs and arbitrary restrictions are removed.
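The notion of a nested relation can be made concrete with a small sketch. It models relations as Python lists of dictionaries rather than using SQL/NF syntax, which the abstract does not show. The `unnest` helper and the sample data are hypothetical.

```python
def unnest(relation, attr):
    """Flatten the relation-valued attribute `attr` into first-normal-form tuples."""
    flat = []
    for tup in relation:
        # Atomic attributes of the outer tuple are repeated for every inner tuple.
        outer = {k: v for k, v in tup.items() if k != attr}
        for inner in tup[attr]:
            flat.append({**outer, **inner})
    return flat


# A nested relation: `employees` is itself a relation inside each department tuple.
departments = [
    {"dname": "R&D", "employees": [{"ename": "Ann", "salary": 50},
                                   {"ename": "Bob", "salary": 40}]},
    {"dname": "Sales", "employees": [{"ename": "Cy", "salary": 30}]},
]

flat = unnest(departments, "employees")
print(flat)
```

The flattened form repeats `dname` once per employee, which is the redundancy that a first-normal-form schema forces and a nested relation avoids.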

Information Systems-Theoretical Foundations. Information Systems, 6, pp. 205-218

January 1981


69 Reads

Some basic concepts concerning information systems are defined and investigated. With every information system a query language is associated, and its syntax and semantics are formally defined. Some elementary properties of the query language are stated. The presented approach leads to a new information systems organization. The presented idea was implemented, and the implementation shows many advantages compared with other methods.

A software engineering approach to ontology building. Information Systems, 34(2), 258-275

April 2009


209 Reads

Ontologies are the backbone of the Semantic Web, a semantic-aware version of the World Wide Web. The availability of large-scale high quality domain ontologies depends on effective and usable methodologies aimed at supporting the crucial process of ontology building. Ontology building exhibits a structural and logical complexity that is comparable to the production of software artefacts. This paper proposes an ontology building methodology that capitalizes on the large experience drawn from a widely used standard in software engineering: the Unified Software Development Process or Unified Process (UP). In particular, we propose UP for ONtology (UPON) building, a methodology for ontology building derived from the UP. UPON is presented with the support of a practical example in the eBusiness domain. A comparative evaluation with other methodologies and the results of its adoption in the context of the Athena EU Integrated Project are also discussed.

Business process mining: an industrial application. Inf. Syst. 32(5), 713-732

July 2007


1,531 Reads

H.A. Reijers

H.M.W.(E.) Verbeek

Contemporary information systems (e.g., WfM, ERP, CRM, SCM, and B2B systems) record business events in so-called event logs. Business process mining takes these logs to discover process, control, data, organizational, and social structures. Although many researchers are developing new and more powerful process mining techniques and software vendors are incorporating these in their software, few of the more advanced process mining techniques have been tested on real-life processes. This paper describes the application of process mining in one of the provincial offices of the Dutch National Public Works Department, responsible for the construction and maintenance of the road and water infrastructure. Using a variety of process mining techniques, we analyzed the processing of invoices sent by the various subcontractors and suppliers from three different perspectives: (1) the process perspective, (2) the organizational perspective, and (3) the case perspective. For this purpose, we used some of the tools developed in the context of the ProM framework. The goal of this paper is to demonstrate the applicability of process mining in general and our algorithms and tools in particular.
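A common first step when working with event logs is to count which activities directly follow each other within a case. The sketch below illustrates that idea only. It is not the ProM API or the paper's algorithms, and the log format, activity names and function name are assumptions.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count how often activity a is directly followed by activity b within a case.

    `event_log` is a list of (case_id, activity) pairs, assumed to be in
    timestamp order.
    """
    traces = defaultdict(list)
    for case_id, activity in event_log:
        traces[case_id].append(activity)
    counts = Counter()
    for trace in traces.values():
        # Pair each activity with its immediate successor in the same case.
        counts.update(zip(trace, trace[1:]))
    return counts


# Hypothetical invoice-handling log with two interleaved cases.
log = [
    ("inv1", "receive"), ("inv2", "receive"), ("inv1", "check"),
    ("inv1", "pay"), ("inv2", "check"), ("inv2", "reject"),
]
df = directly_follows(log)
print(df)
```

Counts like these are one possible input for the process perspective. The organizational and case perspectives would need additional attributes, such as the resource and case data for each event.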

Rosemann, M.: Integrated process modelling: an ontological analysis. Inf. Syst. 25(2), 73-87

April 2000


68 Reads

Process modeling has gained prominence in the information systems modeling area due to its focus on business processes and its usefulness in such business improvement methodologies as Total Quality Management, Business Process Reengineering, and Workflow Management. However, process modeling techniques are not without their criticisms [13]. This paper proposes and uses the Bunge-Wand-Weber (BWW) representation model to analyze the five views — process, data, function, organization and output — provided in the Architecture of Integrated Information Systems (ARIS) popularized by Scheer [39, 40, 41]. The BWW representation model attempts to provide a theoretical base on which to evaluate and thus contribute to the improvement of information systems modeling techniques. The analysis conducted in this paper prompts some propositions. It confirms that the process view alone is not sufficient to model all the real-world constructs required. Some other symbols or views are needed to overcome these deficiencies. However, even when considering all five views in combination, problems may arise in representing all potentially required business rules, specifying the scope and boundaries of the system under consideration, and employing a “top-down” approach to analysis and design. Further work from this study will involve the operationalization of these propositions and their empirical testing in the field.

Tractable database design and datalog abduction through bounded treewidth

May 2010


23 Reads

Given that most elementary problems in database design are NP-hard, the currently used database design algorithms produce suboptimal results. For example, the current 3NF decomposition algorithms may continue further decomposing a relation even though it is already in 3NF. In this paper we study database design problems whose sets of functional dependencies have bounded treewidth. For such sets, we develop polynomial-time and highly parallelizable algorithms for a number of central database design problems such as:
• primality of an attribute;
• 3NF-test for a relational schema or subschema;
• BCNF-test for a subschema.

In order to define the treewidth of a relational schema, we shall associate a hypergraph with it. Note that there are two main possibilities of defining the treewidth of a hypergraph H: one is via the primal graph of H and one is via the incidence graph of H. Our algorithms apply to the case where the primal graph is considered. However, we also show that the tractability results still hold when the incidence graph is considered instead.

It turns out that our results have interesting applications to logic-based abduction. By the well-known relationship between the primality problem in database design and the relevance problem in propositional abduction, our new algorithms and tractability results can be easily carried over from the former field to the latter. Moreover, we show how these tractability results can be further extended from propositional abduction to abductive diagnosis based on non-ground datalog.
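To make the primality problem concrete, here is a brute-force sketch of the textbook definition: an attribute is prime if it belongs to some candidate key. This is not the bounded-treewidth algorithm of the paper; it enumerates attribute subsets and is exponential in the worst case. All function names and the example schema are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under functional dependencies `fds` (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """Enumerate minimal keys by increasing size (exponential in the worst case)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            cand = set(combo)
            if any(k <= cand for k in keys):
                continue  # a superset of a known key is not minimal
            if closure(cand, fds) >= set(schema):
                keys.append(cand)
    return keys


def is_prime(attr, schema, fds):
    """An attribute is prime if it occurs in at least one candidate key."""
    return any(attr in k for k in candidate_keys(schema, fds))


# Example schema R(A, B, C) with A -> B and B -> C; the only key is {A}.
schema = {"A", "B", "C"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(candidate_keys(schema, fds), is_prime("A", schema, fds))
```

The subset enumeration is what makes the general problem expensive, which is the cost the abstract's bounded-treewidth restriction is meant to avoid.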

Query Performance Prediction

November 2006


83 Reads

The prediction of query performance is an interesting and important issue in Information Retrieval (IR). Current predictors involve the use of relevance scores, which are time-consuming to compute. Therefore, current predictors are not very suitable for practical applications. In this paper, we study six predictors of query performance, which can be generated prior to the retrieval process without the use of relevance scores. As a consequence, the cost of computing these predictors is marginal. The linear and non-parametric correlations of the proposed predictors with query performance are thoroughly assessed on the Text REtrieval Conference (TREC) disk4 and disk5 (minus CR) collection with the 249 TREC topics that were used in the recent TREC2004 Robust Track. According to the results, some of the proposed predictors have significant correlation with query performance, showing that these predictors can be useful to infer query performance in practical applications.

Labyrinth, an abstract model for hypermedia applications. Description of its static components

December 1997


25 Reads

In information technology, models are abstract devices to represent the components and functions of software applications. When a model is general and consistent, it represents a useful design tool to unambiguously describe the application. Traditional models are not suitable for the design of hypermedia systems and, therefore, specific design models and methodologies are needed. In the present article, the requirements for such models are analysed, an overview of the characteristics of the existing models for hypermedia applications is made and an abstract model fulfilling the analysed requirements is presented. The model, called Labyrinth, allows 1) the design of platform-independent hypermedia applications; 2) the categorisation, generalisation and abstraction of sparse unstructured heterogeneous information in multiple and interconnected levels; 3) the creation of personalisations (personal views) in multiuser hyperdocuments for both groups and individual users and 4) the design of advanced security mechanisms for hypermedia applications.

Contextualization as an independent abstraction mechanism for conceptual modeling

March 2007


56 Reads

The notion of context appears in computer science, as well as in several other disciplines, in various forms. In this paper, we present a general framework for representing the notion of context in information modeling. First, we define a context as a set of objects, within which each object has a set of names and possibly a reference: the reference of the object is another context which “hides” detailed information about the object. Then, we introduce the possibility of structuring the contents of a context through the traditional abstraction mechanisms, i.e., classification, generalization, and attribution. We show that, depending on the application, our notion of context can be used as an independent abstraction mechanism, either in an alternative or a complementary capacity with respect to the traditional abstraction mechanisms. We also study the interactions between contextualization and the traditional abstraction mechanisms, as well as the constraints that govern such interactions. Finally, we present a theory for contextualized information bases. The theory includes a set of validity constraints, a model theory, as well as a set of sound and complete inference rules. We show that our core theory can be easily extended to support embedding of particular information models in our contextualization framework.

Abstractions in temporal information

December 1983


17 Reads

Three prevalent abstractions in temporal information are examined by using the machinery of first order logic. The abstraction of time allows one to concentrate on temporal objects only as they relate to other temporal objects in time. It is represented by a functional relationship between temporal objects and time intervals. The abstraction of identity allows one to concentrate on how an observed phenomenon relates to other phenomena in terms of their being manifestations of the same object. It is represented by a functional relationship between temporal phenomena and “completed” temporal objects. The abstraction of circumstance embodies a focus of attention on particular configurations or states of groups of temporal phenomena. It is represented by functional relationships between these groups and other objects called “events” or “states”. A novel concept, called absolute/relative abstraction, is used to formalize the abstractions of time and identity. The abstraction of circumstance, on the other hand, is an example of aggregation. The significance and use of these abstractions in the representation and processing of historical information is discussed.

Choice of the optimal number of blocks for data access by an index

December 1986


9 Reads

A computational algorithm for reducing access costs in database and file system applications is presented. The idea is to determine the optimal number of contiguous data blocks, that is, the multiblocking factor, to be transferred to memory in a single access. This choice is not made once per file or relation; instead, a multiblocking factor is selected independently for each index of a relation or file. Different values are determined in order to increase the percentage of useful information transferred during each access and, therefore, to decrease the total number of I/O operations. The effectiveness of the method is shown by experimental results obtained using an actual database. The selection criterion for the multiblocking factor associated with an index is based on measuring the average clustering of the key value occurrences in the stored records.

Access support relations: An indexing method for object bases

March 1992


69 Reads

In this work access support relations are introduced as a means of optimizing query processing in object-oriented database systems. The general idea is to maintain separate structures (dissociated from the object representation) to redundantly store those object references that are frequently traversed in database queries. The proposed access support relation technique is no longer restricted to relating an object (tuple) to an atomic value (attribute value) as in conventional indexing. Rather, access support relations relate objects with each other and can span reference chains which may contain collection-valued components in order to support queries involving path expressions. We present several alternative extensions and decompositions of access support relations for a given path expression, the best of which has to be determined according to the application-specific database usage profile. An analytical performance analysis of access support relations is developed. This analytical cost model is, in particular, used to determine the best access support relation extension and decomposition with respect to specific database configuration and usage characteristics.

The Snapshot Index: An I/O-Optimal access method for timeslice queries

May 1995


14 Reads

We present an access method for timeslice queries that reconstructs a past state s(t) of a time-evolving collection of objects in O(log_b n + |s(t)|/b) I/Os, where |s(t)| denotes the size of the collection at time t, n is the total number of changes in the collection's evolution, and b is the size of an I/O transfer. Changes include the addition, deletion or attribute modification of objects; they are assumed to occur in increasing time order and always affect the most current state of the collection (thus our index supports transaction time). The space used is O(n/b), while the update processing is constant per change, i.e., independent of n. This is the first I/O-optimal access method for this problem using O(n/b) space and O(1) updating (in the expected amortized sense, due to the use of hashing). This performance is also achieved for interval intersection temporal queries. An advantage of our approach is that its performance can be tuned to match particular application needs (trading space for query time and vice versa). In addition, the Snapshot Index can naturally migrate data to a write-once optical medium while maintaining the same performance bounds.

An integrated model of record segmentation and access path selection for databases

December 1988


10 Reads

An analytic model is developed to integrate two closely related subproblems of physical database design: record segmentation and access path selection. Several restrictive assumptions of the past research on record segmentation, e.g. a single access method and the dominance of one subfile over the other, are relaxed in this model. A generic design process for this integrated performance model is suggested and applied to a relational database. A heuristic procedure and an optimal algorithm are developed for solving the model. Extensive computational results are reported to show the effectiveness of these solution techniques.

MOF-tree: A spatial access method to manipulate multiple overlapping features

December 1997


22 Reads

In this paper we investigate the manipulation of large sets of 2-dimensional data representing multiple overlapping features (e.g. semantically distinct overlays of a given region), and we present a new access method, the MOF-tree. We perform an analysis with respect to the storage requirements and a time analysis with respect to window query operations involving multiple features (e.g. to verify if a constraint defined on multiple overlays holds or not inside a certain region). We examine both the pointer-based as well as the pointerless MOF-tree representations, using as space complexity measure the number of bits used in main memory and the number of disk pages in secondary storage respectively. In particular, we show that the new structure is space competitive in the average case, both in the pointer version and in the linear version, with respect to multiple instances of a region quadtree and a linear quadtree respectively, where each instance represents a single feature. Concerning the time performance of the new structure, we analyze the class of window (range) queries, posed on the secondary memory implementation. We show that the I/O worst-case time complexity for processing a number of window queries in the given image space, is competitive with respect to multiple instances of a linear quadtree, as confirmed by experimental results. Finally, we show that the MOF-tree can efficiently support spatial join processing in a spatial DBMS.
