John L. Pfaltz

  • Ph.D. Mathematics, Univ. of Md. 1969
  • Professor at University of Virginia

About

134
Publications
14,433
Reads
4,453
Citations
Current institution
University of Virginia
Current position
  • Professor

Publications (134)
Chapter
Most memory research has assumed that our long-term memories are somehow retained in our brain, usually by modified synaptic connections. This paper proposes a very different scenario, in which the basic substrate of these memories are molecules which flow within a newly discovered circulatory system similar to our lymph system. Moreover, the infor...
Conference Paper
Any concept of a long-term, relatively permanent, memory must include a mechanism for encoding episodic experiences into some appropriate format and later retrieving and recreating a reasonable approximation of the original "episode". Psychologists commonly call these processes "consolidation" and "recall" [8]. In this paper, we present two computa...
Chapter
In this paper, a category of undirected graphs is introduced where the morphisms are chosen in the style of mathematical graph theory rather than as algebraic structures as is more usual in the area of graph transformation.
Conference Paper
This paper presents two computable functions, \(\omega \) and \(\varepsilon \), that map networks into networks. If all cognition occurs as an active neural network, then it is thought that \(\omega \) models long-term memory consolidation and \(\varepsilon \) models memory recall. A derived, intermediate network form, consisting of chordless cycle...
Article
We study network (i.e., undirected simple graph) structures by investigating associated closure operators and the corresponding closed sets. To describe the dynamic behavior of networks, we employ continuous transformations and neighborhood homomorphisms between them. These transformations and homomorphisms are then studied. In particular, th...
Conference Paper
We introduce closed sets, which we will call knowledge units, to represent tight collections of experience, facts, or skills, etc. Associated with each knowledge unit is the notion of its generators consisting of those attributes which characterize it. Using these closure concepts, we then provide a rigorous mathematical model of learning in terms...
Article
Full-text available
An expansive, monotone operator is dominating; if it is also idempotent it is a closure operator. Although they have distinct properties, these two kinds of discrete operators are also intertwined. Every closure operator is dominating; every dominating operator embodies a closure. Both can be the basis of continuous set transformations. Dominating...
Conference Paper
This paper explores a novel way of implementing set-valued operators that are used in analysis and retrieval in large social networks. The software we describe has been implemented and thoroughly tested in several demanding applications.
Article
Full-text available
A mathematical model for dynamic networks is developed that is based on closed, rather than open, sets. For a social network, it seems appropriate to use a neighborhood concept to establish these sets. We then define a rigorous concept of continuous change, and show that it shares some of the properties associated with the continuity of the calculu...
Conference Paper
Full-text available
Using closure concepts, we show that within every undirected network, or graph, there is a unique irreducible subgraph which we call its "spine". The chordless cycles which comprise this irreducible core effectively characterize the connectivity structure of the network as a whole. In particular, it is shown that the center of the network, whether...
Conference Paper
Using closure concepts, we show that within every undirected network, or graph, there is a unique irreducible subgraph. The chordless cycles which comprise this irreducible core effectively characterize the connectivity structure of the network as a whole. By counting the number of cycles of length 3 ≤ k ≤ max_length, we can also create a kind of s...
Article
Full-text available
This paper provides a mathematical explanation for the phenomenon of “triadic closure” so often seen in social networks. It appears to be a natural consequence when network change is constrained to be continuous. The concept of chordless cycles in the network’s “irreducible spine” is used in the analysis of the network’s dynamic behavior. A surpris...
Article
Full-text available
We introduce the concepts of closed sets and closure operators as mathematical tools for the study of social networks. Dynamic networks are represented by transformations. It is shown that under continuous change/transformation, all networks tend to "break down" and become less complex. It is a kind of entropy. The product of this theoretical decom...
Conference Paper
Full-text available
There exist a variety of procedures for identifying clusters in large networks. This paper focuses on finding the connections between such clusters. We employ the concept of closed sets to reduce a network down to its fundamental cycles. These cycles begin to capture the global structure of the network by eliminating a great deal of the fine detail...
Article
We review a method of generating logical rules, or axioms, from empirical data. This method, using closed set properties of formal concept analysis, has been previously described and tested on rather large sets of deterministic data. In spite of the fact that formal concept techniques have been used to prune frequent set data mining results, freque...
Conference Paper
Full-text available
A rigorous concept of continuity for dynamic networks is developed. It is based on closed, rather than open, sets. It is local in nature, in that if the network change is discontinuous it will be so at a single point and the discontinuity will be apparent in that point's immediate neighborhood. Necessary and sufficient criteria for continuity are p...
Conference Paper
Full-text available
We review a method of generating logical rules, or axioms, from empirical data. This method, using closed set properties of formal concept analysis, has been previously described and tested on rather large sets of deterministic data. The contribution of this paper is a completely new extension of this method to create implications involving numeric...
Conference Paper
We propose that a scientific database should be inherently different from, say, a business database. The difference is based on the nature of science itself, in which hypotheses, or logical implications, form an essential part of the discipline. Empirical observations give rise to tentative hypotheses. Individual hypotheses are then tested, refuted...
Conference Paper
Full-text available
Formal Concept Analysis is based on the occurrence of symbolic attributes in individual objects, or observations. But, when the attribute is numeric, treatment has been awkward. In this paper, we show how one can derive logical implications in which the atoms can be not only boolean symbolic attributes, but also ordinal inequalities, such as x ≤ 9....
Article
Full-text available
Discrete systems such as sets, monoids, groups are familiar categories. The internal structure of the latter two is defined by an algebraic operator. In this paper we concentrate on discrete systems that are characterized by unary operators; these include choice operators σ, encountered in economics and social theory, and closure operators φ, encou...
Conference Paper
Full-text available
This paper describes issues encountered in the design and implementation of a parallel object-oriented database system. In particular, we find that the design of a client/server interface (that is, whether to use a page server or query server architecture) depends greatly on the expected application environment. We believe that the query server mod...
Conference Paper
Full-text available
Suppose that whenever event x occurs, a second event y must subsequently occur. We say that x “causes” y, or y is causally dependent on x. Deterministic causality abounds in software where execution of one routine can necessarily force execution of a subsequent sub-routine. Discovery of such causal dependencies can be an important step to understan...
Article
Discrete systems such as sets, monoids, groups are familiar categories. The internal structure of the latter two is defined by an algebraic operator. In this paper we describe the internal structure of the base set by a closure operator. We illustrate the role of such closure in convex geometries and partially ordered sets and thus suggest the wide...
Article
Full-text available
The most common definition of the Poset category is incorrect. Here we present a valid categorical concept using morphisms that are "closed" and "complete".
Conference Paper
Full-text available
In this paper we develop a discrete, \(T_0\) topology in which (1) closed sets play a more prominent role than open sets, (2) atoms comprising the space have discrete dimension, which (3) is used to define boundary elements, and (4) configurations within the topology can have connectivity (or separation) of different degrees. To justify this discrete,...
Article
Full-text available
Discrete systems such as sets, monoids, groups are familiar categories. The internal structure of the latter two is defined by an algebraic operator. In this paper we describe the internal structure of the base set by a closure operator. We illustrate the role of such closure in convex geometries and partially ordered sets and thus suggest the wide...
Article
In this report, we establish essential closures in class hierarchies of database systems that support set operations, such as union and intersection, in their query language. In particular, we rigorously demonstrate that multiple inheritance is an implementation requirement, as is the formal treatment of the class hierarchies as lattices with defin...
Article
ADAMS is based on five primitive concepts: Attribute, co-domain, element, map, and set. Each instance of these primitives is a named entity to which user references resolve. Thus, the concept of naming and name resolution is crucial to the effective management of data in ADAMS.
Article
Full-text available
systems is to save database states on a separate secure device so that the database can be recovered when errors and failures occur. This paper presents a non-interfering checkpointing mechanism being developed for ADAMS. Instead of waiting for a consistent state to occur, our checkpointing approach constructs a state that would result by completin...
Article
The items in a spatial database have location, extent, and shape with respect to a spatial coordinate system. Simple approximations to these attributes, say by bounding rectangles, are storage efficient and easy to manipulate. But effective spatial retrieval (on either location, extent, or shape) requires a more precise representation of these attri...
Article
Full-text available
ADAMS provides a mechanism for applications programs, written in many languages, to define and access common persistent databases. The basic constructs are element, class, set, map, attribute, and codomain. From these the user may define new data structures and new data classes belonging to a semantic hierarchy that supports multiple inheritance. T...
Article
In this paper we introduce the concept of an entity database model. This model is not intended to be a complete working database model in the sense that various semantic database models, (e.g. DAPLEX [Shi81], SDM, Semantic Data Model [HaM81], Galileo [ACO85], IFO [AbH87]) or object oriented database models, (e.g. Smalltalk [CoM84, MSe86], Orient84/K [...
Article
Full-text available
This report describes the ADAMS language as it would be used by an applications programmer. It does not describe how ADAMS is implemented.
Article
ADAMS provides a mechanism for applications programs, written in many languages, to define and access common persistent databases. The basic constructs are element, class, set, map, attribute, and codomain. From these the user may define new data structures and new data classes belonging to a semantic hierarchy that supports multiple inheritance.
Conference Paper
We show that a formal concept lattice \( \mathcal{L} \), with explicit generators, constitutes a viable medium for discrete, deterministic, data mining. All implications of interest can be found and extracted from \( \mathcal{L} \) independent of the frequency of their occurrence. More importantly, we show that these lattices can be grown from a bin...
Article
Full-text available
We present a closed set data mining paradigm which is particularly effective for uncovering the kind of deterministic, causal dependencies that characterize much of basic science. While closed sets have been used before in frequent set data mining, we believe this is the first algorithm to incrementally combine closed sets one at a time to actually...
Article
Full-text available
This report is a compendium of results uncovered in CS 851, spring semester 2002.
Conference Paper
Full-text available
While the persistent data of many advanced database applications, such as OLAP and scientific studies, are characterized by very high dimensionality, typical queries posed on these data appeal to a small number of relevant dimensions. Unfortunately, the multidimensional access methods designed for high-dimensional data perform rather poorly for the...
Article
Full-text available
The objective of this work is to examine the feasibility of, as well as to learn about, a process of developing software architecture that prevents the possibility of mismatch between homogeneous components implemented according to the architectural specification. This paper shows how the architecture can be organized, which restrictions it can use...
Article
Closure is a fundamental property of many discrete systems. Transitive closure in relations has been well studied, as has geometric convexity closure and closure in various kinds of graphs. The closed sets of these uniquely generated, antimatroid operators illustrate a well-behaved internal structure. This paper shows that much of this structure is...
Article
Because antimatroid closure spaces satisfy the anti-exchange axiom, it is easy to show that they are uniquely generated. That is, the minimal set of elements determining a closed set is unique. A prime example is a discrete geometry in Euclidean space where closed sets are uniquely generated by their extreme points. But, many of the geometries aris...
Article
We report our experience of tightly coupling a global change simulation of the terrestrial carbon cycle with an object-oriented database. By tightly coupled, we mean an interleaving of computational and data access statements within the same module so that intermediate model values are easily represented in persistent storage. This implementation...
Article
Full-text available
In this paper we take a more limited view and simply regard R as an observation of a set of attributes A associated with a set of objects O. In our formulation, objects are denoted by numbered rows and attributes denoted by lettered columns. "Formal Concept Analysis" [5] has been developed by Rudolf Wille [15], Bernard Ganter and their colleagues at D...
Conference Paper
Landscape and ecosystem models have typically been viewed as “black boxes”, that given a set of inputs yield a set of outputs. This view does not easily lend itself to investigations as to why a model behaves as it does, especially when multiple models are coupled together to create a larger model. We hypothesize that, for this type of investigatio...
Conference Paper
A persistent myth is that database systems must be data storage systems. Most are. The relational model of data [9],[21], with its emphasis on tuples and tables, perpetuates this myth. So does the popular concept of data warehousing [11],[20].
Article
Full-text available
On March 12-13, 1990, the National Science Foundation sponsored a two day workshop, hosted by the University of Virginia, at which representatives from the earth, life, and space sciences gathered together with computer scientists to discuss the problems facing the scientific community in the area of database management. A summary of the discussion...
Conference Paper
Full-text available
Phrase structure grammars, in which non-terminal symbols on the left side of a production can be rewritten by the string on the right side, together with their Chomsky hierarchy classification, are familiar to computer scientists. But, these grammars are most effective only to generate, and parse, strings. In this paper, we introduce a new kind of g...
Article
Full-text available
When closure operators are defined over figures in the plane, they are normally defined with respect to convex closure in the Euclidean plane. This report concentrates on discrete closure operators defined over the discrete, rectilinear plane. Basic to geometric convexity is the concept of a geodesic, or shortest path. Such geodesics can be regarde...
Conference Paper
Full-text available
Scientific simulations evolve constantly. Both the logical organization of the underlying database and the scientist's view of data may change rapidly. The underlying DBMS must provide appropriate support for the evolution of scientific simulations, their rapidly increasing computational intensity, as well as the growing volumes and dimensionality...
Article
Full-text available
Often the structure of discrete sets can be described in terms of a closure operator. When each closed set has a unique minimal generating set (as in convex geometries in which the extreme points of a convex set generate the closed set), we have an antimatroid closure space. In this paper, we show there exist antimatroid closure spaces of any size,...
Article
Full-text available
The concept of distance in matroids and geometric lattices is a familiar one. This paper examines the problem of defining a metric over anti-matroids and their corresponding lower semi-modular lattices and some ensuing consequences. One of the most intriguing results is that distance need not be "local" in an anti-matroid. The distance between two...
Article
Full-text available
Investigation of the transformations of vector spaces, whose most abstract formulations are called matroids, is basic in mathematics; but transformations of discrete spaces have received relatively little attention. This paper develops the concept of transformations of discrete spaces in the context of antimatroid closure spaces. The nature of thes...
Article
Full-text available
A major concern of environmental scientists, and others with long term data requirements, has been the establishment of metadata standards so that data recorded today will be accessible 50 to 100 years hence. We contend that more important than the standards themselves will be a context in which they can be represented and can evolve as new requi...
Conference Paper
Large scientific applications which rely on highly parallel computational analysis require highly parallel data access. We describe an object-oriented scientific database system that achieves nearly linear scale-up over large, million object data sets. Of primary importance are those features which seem central to the development of this, or any ot...
Article
Full-text available
This dissertation examines the geometric Steiner tree problem: given a set of terminals in the plane, find a minimum-length interconnection of those terminals according to some geometric distance metric. In the process, however, it addresses a much more general and widely applicable problem, that of finding a minimum-weight spanning tree in a hyper...
Article
Full-text available
This dissertation introduces Methodically Organized Argument Trees, a new approach to the development and presentation of assurance arguments about security properties. The MOAT approach was developed to assist users of the Legion Security Model, one of many new approaches to building novel distributed security architectures. Users of these new app...
Article
Full-text available
This dissertation presents a new parallel object-oriented database system implementation and architecture. The system, parallel ADAMS, we have implemented as appropriate to large-scale scientific database applications, where the retrieval of complex data from very large collections is a primary operation.
Conference Paper
Full-text available
The issue of quality control has become increasingly important as more online databases are integrated into digital libraries. This can have a dramatic effect on the search effectiveness of an online system. Authority work, the need to discover and reconcile variant forms of strings in bibliographic entries, will become more difficult. Spelling var...
Article
Full-text available
The k-nearest neighbor (kNN) classifier is a popular and effective method for associating a feature vector with a unique element in a known, finite set of classes. A common choice for the distance metric used in kNN classification is the quadratic distance \(Q(x, A, y) = (x - y)' A (x - y)\), where \(x\) and \(y\) are \(n\)-vectors of features, \(A\) is a symmet...
Article
Closure spaces have been previously investigated by Paul Edelman and Robert Jamison as ‘convex geometries’. Consequently, a number of the results given here duplicate theirs. However, we employ a slightly different, but equivalent, defining axiom which gives a new flavor to our presentation. The major contribution is the definition of a partial ord...
Article
Full-text available
For many World Wide Web applications there is a need to provide session semantics so that users have the impression of a continuous interaction. This is true, for example, when one searches a database interactively. Because WWW servers are stateless some extra mechanism is necessary to give the impression of session semantics. This report discusses...
Conference Paper
Full-text available
We develop the concept of a "closure space" which appears with different names in many aspects of graph theory. We show that acyclic graphs can be almost characterized by the partition coefficients of their associated closure spaces. The resulting nearly total ordering of all acyclic graphs (or partial orders) provides an effective isomorphism fi...
Article
Full-text available
We present a linear algorithm to count the number of binary partitions of \(2^n\). It is also shown how such binary partitions are related to closure spaces on \(n\) elements, thereby giving a lower bound on their enumeration as well. 1 Background A binary partition of the integer \(N\) is a sequence of non-negative integers \(\langle a_n, \cdots, a_0 \rangle\),...
Article
Full-text available
There are many ways that \(2^n\) can be expressed as the sum of lower powers of 2, that is \(\sum_{k=0}^{n} a_k \cdot 2^k = 2^n\), where \(a_k\) is a non-negative integer. Each collection of coefficients \(\langle \cdots a_k \cdots \rangle\) is a partition of \(2^n\). This paper presents a way of counting the number, \(p_n\), of such partitions, which is super expone...
Article
The purpose of this second three year grant was to investigate opportunities for medium grained parallelism in data intensive applications; or more succinctly, can one create effective parallel database systems? It should be noted that the last three years have not been kind to parallel object-oriented database systems -- to their knowledge none of...
Article
Many scientific investigations centre around the study of the change of state associated with some phenomenon. Traditional database design is concerned with identifying and representing those attributes that characterize the system state itself, not its change. This chapter examines some of the issues raised when we consider the explicit representa...
Article
Full-text available
A functional approach to data representation in scientific databases is discussed. A syntax is developed to present ideas and concepts of a functional data representation. (AIP)
Article
Full-text available
This report describes the ADAMS language as it would be used by an applications programmer. It does not describe how ADAMS is implemented. The first three sections assume no knowledge of ADAMS whatever, and are quite tutorial in nature. Only basic, introductory concepts are covered. The remaining sections, although still tutorial, presume some fa...
Article
Full-text available
ADAMS is based on five primitive concepts: Attribute, co-domain, element, map, and set. Each instance of these primitives is a named entity to which user references resolve. Thus, the concept of naming and name resolution is crucial to the effective management of data in ADAMS. This paper describes the hierarchical name space proposed for the ada...
Article
Full-text available
In this report, we establish essential closures in class hierarchies of database systems that support set operations, such as union and intersection, in their query language. In particular, we rigorously demonstrate that multiple inheritance is an implementation requirement, as is the formal treatment of the class hierarchies as lattices with defin...
Article
Full-text available
The purpose of this report is to describe semantic properties of languages that support direct access to items in persistent data spaces. These languages differ in non-trivial ways from more familiar languages which describe processes operating over data represented in a transient memory. These semantics are presented with respect to a formal Tur...
Article
Full-text available
In this paper we introduce the concept of an entity database model. This model is not intended to be a complete working database model in the sense that various semantic database models, (e.g. DAPLEX [Shi81], SDM, Semantic Data Model [HaM81], Galileo [ACO85], IFO [AbH87]) or object oriented database models, (e.g. Smalltalk [CoM84, MSe86], Orient84/K [...
Article
Full-text available
ADAMS provides a mechanism for applications programs, written in many languages, to define and access common persistent databases. The basic constructs are element, class, set, map, attribute, and codomain. From these the user may define new data structures and new data classes belonging to a semantic hierarchy that supports multiple inheritance....
Article
ADAMS is an ambitious effort to provide new database access paradigms for the kinds of scientific applications that require massively parallel access to very large data sets in order to be effective. Many of the Grand Challenge Problems fall into this category, as well as those kinds of scientific research which depend on widely distributed shared...
Article
Full-text available
The items in a spatial database have location, extent, and shape with respect to a spatial coordinate system. Simple approximations to these attributes, say by bounding rectangles, are storage efficient and easy to manipulate. But effective spatial retrieval (on either location, extent, or shape) requires a more precise representation of these attri...
Article
This three year grant was an ambitious project to implement a "hypercube database system"; it was somewhat more ambitious than we had realized. For this final report we begin with an overview that reviews the original proposal; discusses its magnitude; explains our research approach; and outlines our results in terms of what we consider to be its m...
Article
The National Science Foundation sponsored a two day workshop hosted by the University of Virginia on March 12-13, 1990 at which representatives from the earth, life, and space sciences met with computer scientists to discuss the issues facing the scientific community in the area of database management. The workshop participants concluded that init...
Conference Paper
Full-text available
An approach to exploiting coarse-grained parallelism in database applications is presented. This approach combines the database facilities of ADAMS with the dependency detection and parallel execution facilities of Mentat. The approach to providing mode-two parallelism is to make changes behind the ADAMS language interface, thus insulating users fr...
Article
In this report, a novel approach to ordered retrieval in very large files is developed. The method employs a B-tree like search algorithm that is independent of key type or key length because all keys in index blocks are encoded by a 1 byte surrogate. The replacement of actual key sequences by the 1 byte surrogate ensures a maximal possible fan out...
Conference Paper
Full-text available
Article
This report describes the implementation of an ADAMS prototype called the ADAMS Preprocessor (AP). The goals of the project are discussed and then the specific subset of the ADAMS interface language which is supported by the AP is defined. After presenting the system's user interface and an overview of the language's basic concepts, a BNF notation...
Conference Paper
Full-text available
In this paper, a novel approach to ordered retrieval in very large files is developed. The method employs a B-tree like search algorithm that is independent of key type or key length because all keys in index blocks are encoded by a log2(M) bit surrogate, where M is the maximal key length. For example, keys of length less than 32 bytes can be represented...
