About
40
Publications
15,719
Reads
1,315
Citations
Introduction
I am a researcher and author in Web science, knowledge representation, semantic technologies, and knowledge-based artificial intelligence (KBAI). KBAI is largely focused on understanding natural language and machine learning using knowledge bases.
I am also the lead author of the open-source KBpedia knowledge structure. KBpedia integrates and maps to seven large-scale public knowledge bases (esp. Wikidata, Wikipedia, GeoNames, and schema.org) using the KBpedia Knowledge Ontology (KKO), based on the teachings and principles of the great 19th century American scientist, philosopher and logician, Charles Sanders Peirce.
Publications (40)
Hierarchies abound to help us organize our world. A hierarchy places items into a general order, where more ‘general’ is also more ‘abstract’. The etymology of hierarchy is grounded in notions of religious and social rank. This article, after a historical review, focuses on knowledge systems, an interloper of the term hierarchy since at least the 1...
Knowledge representation (KR) is a field of artificial intelligence to convey information about the world to a computer to solve complex tasks. This book is a fresh viewpoint on KR and ontology engineering, informed by a variety of projects over the past dozen years, and guided by the ideas of Charles Sanders Peirce (1839-1914), an American logicia...
This chapter overviews a dozen knowledge representation (KR) possibilities in breadth. Four potential near-term applications are word sense disambiguation, relation extraction, reciprocal mapping, and extreme knowledge supervision. Word sense disambiguation applied to new domains needs to overcome what is known as the knowledge acquisition bottlene...
Access to information—and impediments to it—is a significant determinant of wealth and economic growth. Knowledge representation is a primary driver for using computers as a means to improve the economic well-being of all peoples. Solow, a student of Schumpeter, had the insight in two papers in the 1950s, for which he won a Nobel Prize, that techno...
Knowledge representation involves a trade-off in expressivity and practicality. Knowledge graphs and knowledge bases need to be comprehensive for their applicable domains of use, populated with ‘vivid’ knowledge. Specifying all knowledge interactions is neither feasible nor computationally tractable. We use formalisms and logic to infer many relati...
For Peirce, the triadic nature of the sign—and its relation between the sign, its object, and its interpretant—was the speculative grammar breakthrough that then allowed him to better describe the process of sign-making and its role in the logic of inquiry and truth-testing (semiosis). We begin our analysis of a speculative grammar suitable to know...
Gregory Bateson defined information as the “difference that makes a difference.” Claude Shannon, the founder of information theory, emphasized the engineering aspect of information, defining it as a message or sequence of messages communicated over a channel; he specifically excluded meaning. C.S. Peirce emphasized meaning and related it to the tri...
The path to knowledge-based artificial intelligence (KBAI) directly coincides with a framework to aid data interoperability and responsive knowledge management (KM). KBAI, data interoperability, and KM are the three main opportunities covered in this book. A gateway to these opportunities is to address the sources of semantic heterogeneities of inf...
The ideas behind Peircean pragmatism are how to think about signs and representations (semiosis); logically reason and handle new knowledge (abduction) and probabilities (induction); make economic research choices (pragmatic maxim); categorize; and let the scientific method inform our inquiry. The connections of Peirce’s sign theory, his three-fold...
Critical work tasks of any new domain installation are the creation of the domain knowledge graph and its population with relevant instance data. It is easier to implement and test an incremental approach. Most of the implementation effort is to conceptualize (in a knowledge graph) the structure of the new domain and to populate it with instances (...
When we process information to identify relations or extract entities, to type or classify them, or to fill out their attributes, we need to gauge how well our algorithms work. KM poses a couple of differences from traditional scientific hypothesis testing. The problems we are dealing with in information retrieval (IR), natural language understandi...
Truth, though fallible, exists. Knowledge thus should express a coherent reality, to reflect a logical consistency and structure that comport with our observations about the world. How we represent that reality has syntactic variation and ambiguities of a semantic nature that can only be resolved by context. To deal in the realm of knowledge and be...
The idea of a SuperType is exactly equivalent to the root node of a typology, wherein multiple entity types with similar essences and characteristics are related to one another via a natural classification. In this chapter, we discuss the use of types as our general classification structure, and then typologies as modular ways to further organize t...
Openness is a recent and profound force, both creative and destructive. The mindset of ‘openness’ is not a discrete thing, but a concept with separate strands. Open logics and the open-world assumption enable us to add information to existing systems without the need to re-architect the underlying schema. Open content works to promote derivative an...
The three areas covered in depth in this chapter are workflows and business process management (BPM), semantic parsing, and robotics. The production and consumption of knowledge should warrant as much attention as do the actions or processes on the factory floor. Workflows are a visible gap in most knowledge management. A reason for the gap is that...
We see physical and informational networks, connections, relationships, and links grow all around us. This chapter contemplates the universal graph structure at the core of these developments. Relations between nodes, different than those of a hierarchical or subsumptive nature, provide still different structural connections across the knowledge gr...
This major work on knowledge representation is based on the writings of Charles S. Peirce, a logician, scientist, and philosopher of the first rank at the beginning of the 20th century. This book follows Peirce's practical guidelines and universal categories in a structured approach to knowledge representation that captures differences in events, e...
Peirce posited a “third grade of clearness of apprehension” to better understand a topic at hand, a part of his pragmatic maxim. This book has attempted to adhere to this ‘practionary’ form, the first attempt to so apply Peirce to a single concept. As first stated, knowledge representation is a field of artificial intelligence dedicated to represen...
One can create a proper enterprise knowledge management environment at acceptable cost using available open-source components and solid architectural design. Component services provide ontology and knowledge management functions in piecemeal functionality that we can integrate directly into existing workflows. Our purposes in data interoperability...
Since this book is about guidance from Charles Peirce and not about him, I have segregated his fascinating personal story and views to this appendix. Most Peircean scholars acknowledge changes in Peirce’s views over time, from his early writings in the 1860s to those at the turn of the century and up until his death in 1914. Peirce’s architectonic...
The two most labor-intensive steps in machine learning for natural language are 1) feature engineering, and 2) labeling of training sets. A systematic view of machine learning relating knowledge and human language features — coupled with large-scale knowledge bases such as Wikipedia and Wikidata — can lead to faster and cheaper learners across a co...
KBpedia is a computable knowledge structure resulting from the combined mapping of six, large-scale, public knowledge bases — Wikipedia, Wikidata, OpenCyc, GeoNames, DBpedia, and UMBEL. KBpedia is structured to enable useful splits across a myriad of dimensions from entities to relations to types that can all be selected to create positive and nega...
The chapter introduces the process of design of two upper-level ontologies, PROTON and UMBEL, into reference ontologies and their integration in the so-called Reference Knowledge Stack (RKS). It is argued that RKS is an important step in the efforts of the Linked Open Data (LOD) project to transform the Web into a global data space with diverse real...
Many in silico investigations in bioinformatics require access to multiple, distributed data sources and analytic tools. The requisite data sources may include large public data repositories, community databases, and project databases for use in domain-specific research. Different data sources frequently utilize distinct query languages and return...
Is RDF a framework, data model or vocabulary? Actually, it is all three. RDF -- the Resource Description Framework -- is, as its name implies, an abstract, conceptual framework for defining and using metadata and metadata vocabularies. RDF is a data model that is expressed as simple subject–predicate–object 'triples'. The referenced 'resources' in...
This first-ever demonstration of the new zLinks plug-in shows how any existing Web document link can be automatically transformed into a portal to relevant Linked Data. Each existing link disambiguates to its contextual and relevant subject concept (SC) or named entity (NE). The SCs are grounded in the OpenCyc knowledge base, supplemented by aliase...
Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed service designed to provide researchers with the web ab...
The Semantic Web is used to extract structure and meaningful information according to understandable schemes from mostly uncharacterized content. This structured content can be stored as Resource Description Framework (RDF) triples that can be further managed and manipulated. Powerful and flexible middleware operations, such as those from OpenLink,...
Searching on the Internet today can be compared to dragging a net across the surface of the ocean. While a great deal may be caught in the net, there is still a wealth of information that is deep, and therefore, missed. The reason is simple. Most of the Web's information is buried far down on dynamically generated sites, and standard search engine...
TEAM-UP is a partnership program of the US electric utility industry and the US Department of Energy to help develop utility PV markets. TEAM-UP is a utility-directed program to significantly increase utility PV experience by promoting installations of utility PV power systems. Two primary program areas are proposed for TEAM-UP: (1) grid-independen...
Fuel cells and other advanced electric-generation technologies have not experienced a record of successful commercialization efforts. Lowering costs for these technologies requires substantial production volumes and a significant investment in manufacturing facilities, all dependent on developer confidence in the ultimate market. Yet, market a...
Increased commercial use of PV technology has been hindered by its higher cost compared to fossil fuels. In fact, the market for PVs has suffered from the “chicken-or-egg” syndrome: Prices won't come down until more PV cells are purchased, and utilities can't afford to buy them until prices do come down. The Utility PhotoVo...
The paper identifies several key areas for the potential integration of municipal services: (1) load management and conservation; (2) joint utility-customer energy services; (3) generation and new energy systems; (4) integrated energy systems; (5) administration and operations; (6) financing and planning; and (7) communications. T...
The report gives a comprehensive evaluation of factors (environmental, technical, economic, and institutional) influencing solid coal use in industrial boilers. Trends in coal use, recent legislative warrants, and technical and logistic problems in coal use at industrial plants are reviewed. Demographic aspects of the existing industrial boiler pop...
If present scientific information is reasonable, the world is likely to experience noticeable global warming by the beginning of the next century if high annual growth rates of fossil fuel energy use continue. Only with optimistic assumptions and low growth rates will carbon-dioxide-induced temperature increases be held below 2°C or so over the nex...
Biotechnology is not new to industry. From baking, brewing, cheese-making, ore leaching, waste treatment, drug production and other commercial applications, organisms from single-celled bacteria to higher plants have proven to be tiny and efficient factories. But a new era is upon us. Genetic engineering is just beginning to tap a vast reservoir of...
Fuel cells have many promising features for utilities desiring urban power generation. However, fuel cell power plants suitable for this application are not commercially available. This report describes a nearly two-year effort by representatives from several municipal utilities, in collaboration with the American Public Power Association and EPRI,...
Electric utilities actively engaged in promoting conservation and tighter building envelopes find the issue of indoor air pollution cuts two ways. On the one hand, some conservation programs may increase indoor pollutant levels. On the other hand, electric devices and appliances offer the cleanest alternative for heating and cooking in homes and bu...