Conference Paper

Exploring the Feature Space to Aid Learning in Design Space Exploration

... In this paper, we highlight that bringing a human designer into design space exploration (DSE) can potentially better balance a designer's learning and design performance compared to a "black-box"-only optimization [14,15]. Existing approaches to human-in-the-loop design space exploration take designer inputs on design and feature selection [16,17,18] and present feedback on performance metrics and the diversity of generated designs [3,19]. The interaction between a human designer and the computer includes visual graphical user interfaces [16], natural language processing interfaces for question answering and textual explanations [20], and tangible physical interfaces [14]. ...
... The input type pertains to how user feedback is incorporated. A user might steer design exploration by choosing a desired set of designs or design variable values [34], select design parameters or features [35,17], or set a range for desired objective values [15,19]. Using high-level rather than low-level features can reduce the number of input commands. ...
... Any desired DSE outcomes must be represented as a particular loss function. Existing generative design methods can optimize predefined performance metrics like structural compliance [36,5,4], maximize the diversity of generated designs [3,7,8], or learn driving features behind selected designs [17,35]. Finally, the mode of interaction between a user and the underlying tool can be a graphical user interface, natural language interface, or tangible interface. ...
Article
Full-text available
Deep generative models have shown significant promise in improving performance in design space exploration. But there is limited understanding of their interpretability, a necessity when model explanations are desired and problems are ill-defined. Interpretability involves learning design features behind design performance, called designer learning. This study explores human-machine collaboration's effects on designer learning and design performance. We conduct an experiment (N=42) designing mechanical metamaterials using a conditional variational auto-encoder. The independent variables are: (i) the level of automation of design synthesis, i.e., manual (where the user manually manipulates design variables), manual feature-based (where the user manipulates the weights of the features learned by the encoder), and semi-automated feature-based (where the agent generates a local design based on a start design and user-selected step size); and (ii) feature semanticity, i.e., meaningful versus abstract features. We assess feature-specific learning using Item Response Theory and design performance using utopia distance and hypervolume improvement. The results suggest design performance depends on the subjects' feature-specific knowledge, emphasizing the precursory role of learning. The semi-automated synthesis locally improves the utopia distance. Still, it does not result in higher global hypervolume improvement compared to manual design synthesis, and it reduces designer learning compared to manual feature-based synthesis. The subjects learn semantic features better than abstract features only when design performance is sensitive to them. Potential cognitive constructs influencing learning in human-AI collaborative settings are discussed, such as cognitive load and recognition heuristics.
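The two performance metrics this abstract names are straightforward to compute in the two-objective case. Below is a minimal Python sketch (not the study's code) of distance to the utopia point and hypervolume improvement, assuming two minimization objectives normalized to [0, 1], a utopia point at the origin, and a reference point at (1, 1).

```python
# Minimal sketch of utopia distance and hypervolume improvement for two
# minimized objectives normalized to [0, 1]. Illustrative only.
import numpy as np

def utopia_distance(point, utopia=(0.0, 0.0)):
    """Euclidean distance from a design's objective vector to the utopia point."""
    return float(np.linalg.norm(np.asarray(point) - np.asarray(utopia)))

def pareto_front(points):
    """Non-dominated subset for two minimized objectives."""
    pts = sorted(map(tuple, points))           # sort by f1, then f2
    front, best_f2 = [], float("inf")
    for f1, f2 in pts:
        if f2 < best_f2:                       # strictly better on f2
            front.append((f1, f2))
            best_f2 = f2
    return front

def hypervolume_2d(points, ref=(1.0, 1.0)):
    """Area dominated by the Pareto front, bounded by a reference point."""
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pareto_front(points):
        hv += (ref[0] - f1) * (prev_f2 - f2)   # stack of rectangles
        prev_f2 = f2
    return hv

# Hypervolume improvement of a candidate design over an existing archive.
archive = [(0.2, 0.8), (0.5, 0.4), (0.9, 0.1)]
candidate = (0.3, 0.3)
hvi = hypervolume_2d(archive + [candidate]) - hypervolume_2d(archive)
print(utopia_distance(candidate), hvi)
```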
Conference Paper
Full-text available
Deep generative models have shown significant promise to improve performance in design space exploration (DSE), but they lack interpretability. A component of interpretability in DSE is helping designers learn how input design decisions influence multi-objective performance. This experimental study explores how human-machine collaboration influences both designer learning and design performance in deep learning-based DSE. A within-subject experiment is implemented with 42 subjects involving mechanical metamaterial design using a conditional variational auto-encoder. The independent variables in the experiment are two interactivity factors: (i) simulatability, i.e., manual design generation (high simulatability), manual feature-based design synthesis, and semi-automated feature-based synthesis (low simulatability); and (ii) semanticity of features, i.e., meaningful versus abstract latent features. We assess designer learning using item response theory and design performance using metrics such as distance to the utopia point and hypervolume improvement. The findings highlight a highly intertwined relationship between designer learning and design performance. Compared to manual design generation, the semi-automated synthesis generates designs closer to the utopia point. Still, it does not result in greater hypervolume improvement. Further, the subjects learn the effects of semantic features better than abstract features, but only when the design performance is sensitive to those semantic features. Potential cognitive constructs, such as cognitive load and the recognition heuristic, that may influence the interpretability of deep generative models are discussed.
Article
Full-text available
This article describes Daphne, a virtual assistant for designing Earth observation distributed spacecraft missions. It is, to the best of our knowledge, the first virtual assistant for such an application. The article provides a thorough description of Daphne, including its question answering system and the main features we have implemented to help system engineers design distributed spacecraft missions. In addition, the article describes a study performed at NASA's Jet Propulsion Laboratory (JPL) to assess the usefulness of Daphne in this use case. The study was conducted with N = 9 subjects from JPL, who were asked to work on a mission design task with two versions of Daphne, one that was fully featured implementing the cognitive assistance functions, and one that only had the features one would find in a traditional design space exploration tool. After the task, they filled out a standard user experience survey, completed a test to assess how much they learned about the task, and were asked a number of questions in a semi-structured exit interview. Results of the study suggest that Daphne can help improve performance during system design tasks compared to traditional tools, while keeping the system usable. However, the study also raises some concerns with respect to a potential reduction in human learning due to the use of the cognitive assistant. The article ends with a list of suggestions for future development of virtual assistants for space mission design.
Conference Paper
Full-text available
This paper shows how to measure the complexity and reduce the dimensionality of a geometric design space. It assumes that high-dimensional design parameters actually lie in a much lower-dimensional space that represents semantic attributes. Past work has shown how to embed designs using techniques like autoencoders; in contrast, this paper quantifies when and how various embeddings are better than others. It captures the intrinsic dimensionality of a design space, the performance of recreating new designs for an embedding, and the preservation of topology of the original design space. We demonstrate this with both synthetic superformula shapes of varying non-linearity and real glassware designs. We evaluate multiple embeddings by measuring shape reconstruction error, topology preservation, and required semantic space dimensionality. Our work generates fundamental knowledge about the inherent complexity of a design space and how designs differ from one another. This deepens our understanding of design complexity in general.
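The evaluation this abstract describes, comparing embeddings by reconstruction error and topology preservation across candidate latent dimensionalities, can be illustrated with a small sketch. Here PCA stands in for the paper's autoencoder embeddings, and a k-nearest-neighbor overlap score is one common proxy for topology preservation; both choices are assumptions for illustration, not the authors' implementation.

```python
# Sketch: compare embeddings by reconstruction error and neighborhood
# preservation, sweeping the latent dimensionality. PCA is a stand-in
# for any embedding (e.g., an autoencoder).
import numpy as np

def pca_embed(X, k):
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T                        # k-dimensional embedding
    X_hat = Z @ Vt[:k] + X.mean(axis=0)      # reconstruction
    return Z, X_hat

def reconstruction_error(X, X_hat):
    return float(np.mean(np.sum((X - X_hat) ** 2, axis=1)))

def knn_sets(X, k):
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    np.fill_diagonal(D, np.inf)              # exclude self
    return [set(np.argsort(row)[:k]) for row in D]

def neighborhood_preservation(X, Z, k=5):
    """Mean overlap of k-nearest-neighbor sets before and after embedding."""
    a, b = knn_sets(X, k), knn_sets(Z, k)
    return float(np.mean([len(s & t) / k for s, t in zip(a, b)]))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
for k in (2, 3, 5):                          # sweep candidate latent dimensions
    Z, X_hat = pca_embed(X, k)
    print(k, reconstruction_error(X, X_hat), neighborhood_preservation(X, Z))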
Article
Full-text available
Design of complex systems requires collaborative teams to overcome limitations of individuals; however, teamwork contributes new sources of complexity related to information exchange among members. This paper formulates a human subjects experiment to quantify the relative contribution of technical and social sources of complexity to design effort using a surrogate task based on a parameter design problem. Ten groups of 3 subjects each perform 42 design tasks with variable problem size and coupling (technical complexity) and team size (social complexity) to measure completion time (design effort). Results of a two-level regression model replicate past work to show completion time grows geometrically with problem size for highly coupled tasks. New findings show the effect of team size is independent from problem size for both coupled and uncoupled tasks considered in this study. Collaboration contributes a large fraction of total effort, and it increases with team size: about 50–60% of time and 70–80% of cost for pairs and 60–80% of time and 90% of cost for triads. Conclusions identify a role for improved design methods and tools to anticipate and overcome the high cost of collaboration.
Article
Full-text available
A visualization methodology is presented in which a Pareto frontier can be visualized in an intuitive and straightforward manner for an n-dimensional performance space. An approach for preference incorporation is presented that enables a designer to quickly identify 'good' points and regions of the performance space for a multi-objective optimization application, regardless of space complexity, number of objectives, or number of Pareto points. Visualizing Pareto solutions for more than three objectives has long been a significant challenge to the multi-objective optimization community. The Hyper-space Diagonal Counting (HSDC) method described here enables lossless visualization of a hyperspace Pareto frontier. In this paper, we demonstrate the use of the hyperspace Pareto frontier as a visualization tool for design concept selection in a multiobjective optimization environment.
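The diagonal counting underlying HSDC can be illustrated with the Cantor pairing function, which losslessly folds two non-negative bin indices into one by counting along anti-diagonals; nesting it collapses n binned objectives onto a single axis. The sketch below shows the counting idea only, not the authors' implementation.

```python
# Sketch of the diagonal-counting idea: Cantor pairing is invertible, so
# nested pairing collapses n objective-bin indices to one index losslessly.
def cantor_pair(x, y):
    s = x + y
    return s * (s + 1) // 2 + y          # count along anti-diagonals

def cantor_unpair(z):
    s = 0
    while (s + 1) * (s + 2) // 2 <= z:   # find the diagonal containing z
        s += 1
    y = z - s * (s + 1) // 2
    return s - y, y

def collapse(indices):
    """Fold n bin indices into one by nested pairing."""
    z = indices[0]
    for i in indices[1:]:
        z = cantor_pair(z, i)
    return z

bins = (3, 1, 4, 2)                      # binned values of four objectives
z = collapse(bins)
print(z, cantor_unpair(z))               # lossless: the last pairing inverts
```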
Conference Paper
Full-text available
We have developed a data visualization interface that facilitates a design by shopping paradigm, allowing a decision-maker to form a preference by viewing a rich set of good designs and use this preference to choose an optimal design. Design automation has allowed us to implement this paradigm, since a large number of designs can be synthesized in a short period of time. The interface allows users to visualize complex design spaces by using multi-dimensional visualization techniques that include customizable glyph plots, parallel coordinates, linked views, brushing, and histograms. As is common with data mining tools, the user can specify upper and lower bounds on the design space variables, assign variables to glyph axes and parallel coordinate plots, and dynamically brush variables. Additionally, preference shading for visualizing a user's preference structure and algorithms for visualizing the Pareto frontier have been incorporated into the interface to help shape a decision-maker's preference. Use of the interface is demonstrated using a satellite design example by highlighting different preference structures and resulting Pareto frontiers. The capabilities of the design by shopping interface were driven by real industrial customer needs, and the interface was demonstrated during a spacecraft design exercise conducted by a team of Mars spacecraft design experts at Lockheed Martin.
Conference Paper
Full-text available
As our ability to generate more and more data for increasingly large engineering models improves, the need for methods for managing that data becomes greater. Information management from a decision-making perspective involves being able to capture and represent significant information to a designer so that they can make effective and efficient decisions. However, most visualization techniques used in engineering, such as graphs and charts, are limited to two-dimensional representations and at most three-dimensional representations. In this paper, we present a new visualization technique to capture and represent engineering information in a multidimensional context. The new technique, Cloud Visualization, is based upon representing sets of points as clouds in both the design and performance spaces. The technique is applicable to both single and multiobjective optimization problems and the relevant issues with each type of problem are discussed. A multiobjective case study is presented to demonstrate the application and usefulness of the Cloud Visualization techniques.
Article
Full-text available
There is a need for automated methods to learn general features of the interactions of a ligand class with its diverse set of protein receptors. An appropriate machine learning approach is Inductive Logic Programming (ILP), which automatically generates comprehensible rules in addition to prediction. The development of ILP systems which can learn rules of the complexity required for studies on protein structure remains a challenge. In this work we use a new ILP system, ProGolem, and demonstrate its performance on learning features of hexose-protein interactions. The rules induced by ProGolem detect interactions mediated by aromatics and by planar-polar residues, in addition to less common features such as the aromatic sandwich. The rules also reveal a previously unreported dependency for residues cys and leu. They also specify interactions involving aromatic and hydrogen bonding residues. This paper shows that Inductive Logic Programming implemented in ProGolem can derive rules giving structural features of protein/ligand interactions. Several of these rules are consistent with descriptions in the literature. In addition to confirming literature results, ProGolem's model has a 10-fold cross-validated predictive accuracy that is superior, at the 95% confidence level, to another ILP system previously used to study protein/hexose interactions and is comparable with state-of-the-art statistical learners.
Article
Full-text available
This paper presents a systematic analysis of twenty-four performance measures used in the complete spectrum of Machine Learning classification tasks, i.e., binary, multi-class, multi-labelled, and hierarchical. For each classification task, the study relates a set of changes in a confusion matrix to specific characteristics of data. Then the analysis concentrates on the type of changes to a confusion matrix that do not change a measure, therefore, preserve a classifier's evaluation (measure invariance). The result is the measure invariance taxonomy with respect to all relevant label distribution changes in a classification problem. This formal analysis is supported by examples of applications where invariance properties of measures lead to a more reliable evaluation of classifiers. Text classification supplements the discussion with several case studies.
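The invariance idea is easy to demonstrate: compute several measures from a confusion matrix, perturb the matrix in a controlled way, and see which measures register the change. The sketch below is illustrative, not from the paper; it scales the true-negative count, a change that precision, recall, and F1 ignore but accuracy does not.

```python
# Sketch: measure invariance under a controlled confusion-matrix change
# (here, scaling true negatives in a binary task).
def measures(tp, fp, fn, tn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return dict(precision=precision, recall=recall, accuracy=accuracy, f1=f1)

base = measures(tp=40, fp=10, fn=20, tn=30)
shifted = measures(tp=40, fp=10, fn=20, tn=300)   # only TN changed
for name in base:
    invariant = abs(base[name] - shifted[name]) < 1e-12
    print(f"{name}: {'invariant' if invariant else 'changes'} under TN scaling")
```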
Article
Full-text available
Objective measures such as support, confidence, interest factor, correlation, and entropy are often used to evaluate the interestingness of association patterns. However, in many situations, these measures may provide conflicting information about the interestingness of a pattern. Data mining practitioners also tend to apply an objective measure without realizing that there may be better alternatives available for their application. In this paper, we describe several key properties one should examine in order to select the right measure for a given application. A comparative study of these properties is made using twenty-one measures that were originally developed in diverse fields such as statistics, social science, machine learning, and data mining. We show that depending on its properties, each measure is useful for some application, but not for others. We also demonstrate two scenarios in which many existing measures become consistent with each other, namely, when support-based pruning and a technique known as table standardization are applied. Finally, we present an algorithm for selecting a small set of patterns such that domain experts can find a measure that best fits their requirements by ranking this small set of patterns.
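Several of the measures under comparison can be computed directly from the 2×2 contingency table of an association A → B, as in this illustrative sketch (variable names are ours, not the paper's). The two example patterns are ranked differently by support and lift, which is exactly the kind of conflict the paper analyzes.

```python
# Sketch: objective interestingness measures from the 2x2 contingency table
# of a pattern A -> B: n11 = count(A and B), n10 = count(A, not B), etc.
import math

def interestingness(n11, n10, n01, n00):
    n = n11 + n10 + n01 + n00
    p_ab = n11 / n                       # P(A and B)
    p_a = (n11 + n10) / n                # P(A)
    p_b = (n11 + n01) / n                # P(B)
    support = p_ab
    confidence = p_ab / p_a
    lift = p_ab / (p_a * p_b)            # the "interest factor"
    # phi coefficient: correlation between the two binary variables
    phi = (p_ab - p_a * p_b) / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return dict(support=support, confidence=confidence, lift=lift, phi=phi)

print(interestingness(n11=80, n10=20, n01=50, n00=50))   # high support
print(interestingness(n11=20, n10=5, n01=5, n00=170))    # high lift
```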
Article
Full-text available
Classification rule mining aims to discover a small set of rules in the database that forms an accurate classifier. Association rule mining finds all the rules existing in the database that satisfy some minimum support and minimum confidence constraints. For association rule mining, the target of discovery is not pre-determined, while for classification rule mining there is one and only one predetermined target. In this paper, we propose to integrate these two mining techniques. The integration is done by focusing on mining a special subset of association rules, called class association rules (CARs). An efficient algorithm is also given for building a classifier based on the set of discovered CARs. Experimental results show that the classifier built this way is, in general, more accurate than that produced by the state-of-the-art classification system C4.5. In addition, this integration helps to solve a number of problems that exist in the current classification systems.
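A minimal sketch of the CAR idea follows: association rules whose consequent is restricted to a class label, filtered by minimum support and confidence. The antecedents here are short item combinations for brevity, and the support definition (antecedent coverage) is a simplifying assumption; the paper's CBA algorithm mines arbitrary antecedents and then builds a classifier from the rule set.

```python
# Sketch: mining class association rules (CARs) with support/confidence
# thresholds. Illustrative simplification of the CBA rule-generation step.
from collections import Counter
from itertools import combinations

def mine_cars(transactions, labels, min_sup=0.2, min_conf=0.7, max_len=2):
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    rules = []
    for k in range(1, max_len + 1):
        for antecedent in combinations(items, k):
            covered = [l for t, l in zip(transactions, labels)
                       if set(antecedent) <= t]
            if len(covered) / n < min_sup:
                continue
            cls, hits = Counter(covered).most_common(1)[0]
            conf = hits / len(covered)
            if conf >= min_conf:                 # rule: antecedent -> cls
                rules.append((antecedent, cls, len(covered) / n, conf))
    return rules

transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b", "c"}, {"a"}]
labels = ["pos", "pos", "pos", "neg", "pos"]
for r in mine_cars(transactions, labels):
    print(r)
```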
Conference Paper
One of the major challenges faced by the decision maker in the design of complex engineering systems is information overload. When the size and dimensionality of the data exceed a certain level, a designer may become overwhelmed and no longer be able to perceive and analyze the underlying dynamics of the design problem at hand, which can result in premature or poor design selection. There exist various knowledge discovery and visual analytic tools designed to relieve the information overload, such as BrickViz, Cloud Visualization, ATSV, and LIVE, to name a few. However, most of them do not explicitly support the discovery of key knowledge about the mapping between the design space and the objective space, such as the set of high-level design features that drive most of the trade-offs between objectives. In this paper, we introduce a new interactive method, called iFEED, that supports the designer in the process of high-level knowledge discovery in a large, multiobjective design space. The primary goal of the method is to iteratively mine the design space dataset for driving features, i.e., combinations of design variables that appear to consistently drive designs towards specific target regions in the tradespace set by the user. This is implemented using a data mining algorithm that mines interesting patterns in the form of association rules. The extracted patterns are then used to build a surrogate classification model based on a decision tree that predicts whether a design is likely to be located in the target region of the tradespace or not. Higher-level features will generate more compact classification trees while improving classification accuracy. If the mined features are not satisfactory, the user can go back to the first step and extract higher-level features. Such an iterative process helps the user gain insights and build a mental model of how design variables are mapped into objective values. A controlled experiment with human subjects is designed to test the effectiveness of the proposed method. A preliminary result from the pilot experiment is presented.
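The mining step of this loop can be sketched as follows: given binary design features and a user-defined target region, score feature combinations by coverage and confidence and keep those that consistently drive designs into the target. The feature names and thresholds below are illustrative, and the decision-tree surrogate the method fits on top of the mined features is omitted.

```python
# Sketch of an iFEED-style driving-feature search: find feature combinations
# whose satisfying designs fall in the user's target region with high
# confidence. Feature names are hypothetical.
from itertools import combinations

def driving_features(features, in_target, min_sup=0.1, min_conf=0.6, max_len=2):
    n = len(in_target)
    names = sorted(features)
    found = []
    for k in range(1, max_len + 1):
        for combo in combinations(names, k):
            match = [all(features[f][i] for f in combo) for i in range(n)]
            cover = sum(match)                      # designs with the feature
            hits = sum(m and t for m, t in zip(match, in_target))
            if cover and cover / n >= min_sup and hits / cover >= min_conf:
                found.append((combo, cover / n, hits / cover))
    return sorted(found, key=lambda r: -r[2])       # best confidence first

features = {                       # feature -> per-design truth values
    "num_sats>4":  [1, 1, 0, 0, 1, 1],
    "orbit=SSO":   [1, 0, 1, 0, 1, 1],
    "instr=radar": [0, 1, 1, 1, 1, 0],
}
in_target = [1, 0, 0, 0, 1, 1]     # designs inside the user's target region
for rule in driving_features(features, in_target):
    print(rule)
```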
Article
Real-world optimization problems typically involve multiple objectives to be optimized simultaneously under multiple constraints and with respect to several variables. While multi-objective optimization itself can be a challenging task, equally difficult is the ability to make sense of the obtained solutions. In this two-part paper, we deal with data mining methods that can be applied to extract knowledge about multi-objective optimization problems from the solutions generated during optimization. This knowledge is expected to provide deeper insights about the problem to the decision maker, in addition to assisting the optimization process in future design iterations through an expert system. The current paper surveys several existing data mining methods and classifies them by methodology and type of knowledge discovered. Most of these methods come from the domain of exploratory data analysis and can be applied to any multivariate data. We specifically look at methods that can generate explicit knowledge in a machine-usable form. A framework for knowledge-driven optimization is proposed, which involves both online and offline elements of knowledge discovery. One of the conclusions of this survey is that while there are a number of data mining methods that can deal with data involving continuous variables, only a few ad hoc methods exist that can provide explicit knowledge when the variables involved are of a discrete nature. Part B of this paper proposes new techniques that can be used with such datasets and applies them to discrete variable multi-objective problems related to production systems.
Article
In the early-phase design of complex systems, a model of design performance is coupled with visualizations of competing designs and used to aid human decision-makers in finding and understanding an optimal design. This consists of understanding the tradeoffs among multiple criteria of a "good" design and the features of good designs. Current visualization techniques are limited when visualizing many performance criteria and/or do not explicitly relate the mapping between the design space and the objective space. We present a new technique called Cityplot, which can visualize a sample of an arbitrary (continuous or combinatorial) design space and the corresponding single or multidimensional objective space simultaneously. Essentially a superposition of a dimensionally reduced representation of the design decisions and bar plots representing the multiple criteria of the objective space, Cityplot can provide explicit information on the relationships between the design decisions and the design criteria. Cityplot can present decision settings in different parts of the space and reveal information on the decision → criteria mapping, such as sensitivity, smoothness, and key decisions that result in particular criteria values. By focusing the Cityplot on the Pareto frontier from the criteria, Cityplot can reveal tradeoffs and Pareto optimal design families without prior assumptions on the structure of either. The method is demonstrated on two toy problems and two real engineered systems, namely, the NASA earth observing system (EOS) and a guidance, navigation and control (GNC) system.
Article
Software tools that enable interactive data visualization are now commonly available for engineering design. These tools allow engineers to inspect, filter, and select promising alternatives from large multivariate design spaces based upon an examination of the trade-offs between multiple objectives. There are two general approaches for visually representing data: (1) discretely, by plotting a sample of designs as distinct points; and (2) continuously, by plotting the functional relationships between design variables and design metrics as curves or surfaces. In this paper, we examine these two approaches through a human subjects experiment. Participants were asked to complete two design tasks with an interactive visualization tool: one by using a sample of discrete designs and one by using a continuous representation of the design space. Metrics describing the optimality of the design outcomes, the usage of different graphics, and the task workload were quantified by mouse tracking, user process descriptions, and analysis of the selected designs. The results indicate that users had more difficulty in selecting multiobjective optimal designs with common continuous graphics than with discrete graphics. The findings suggest that innovative features and additional usability studies are required in order for continuous trade space visualization tools to achieve their full potential.
Conference Paper
A visualization methodology is developed in which a Hyperspace Pareto Frontier (HPF) can be visualized in an intuitive way for an n-dimensional performance space. The new approach is termed the Hyper-Radial Visualization (HRV) method. Three approaches for preference incorporation are developed that enable designers to quickly identify better regions in the performance space for multi-objective optimization problems. Visualizing Pareto solutions that have more than three objective functions is a significant challenge in the multi-objective optimization community. The HRV method described here provides a lossless visualization tool to represent an HPF and is applied using a Direct Sorting Method (DSM) to design concept selection in multi-objective optimization problems in this paper. The viability and desirability of using the HRV for visualizing an HPF are explored here. First, this paper presents an approach to generate a lossless and intuitive representation of an HPF by using the HRV method. Second, three different color-coded preference schemes are developed and presented to enable easy sorting of results in the HRV. Finally, the HRV-based visualization method is validated using three multi-objective optimization problems.
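The core hyper-radial mapping can be sketched directly: normalize the objectives, split them into two groups, and collapse each group into one radial coordinate, so an n-dimensional front plots in 2-D with the utopia point at the origin. The grouping and the absence of preference weights below are simplifying assumptions; the paper develops three color-coded preference schemes beyond this.

```python
# Sketch of the hyper-radial coordinate computation: each objective group
# collapses to one radial coordinate; points nearer the origin are nearer
# the utopia point. Grouping/weighting here are illustrative.
import numpy as np

def hyper_radial(F, groups):
    """F: (n_points, n_obj) minimized objectives; groups: two index lists."""
    Fn = (F - F.min(axis=0)) / (F.max(axis=0) - F.min(axis=0))  # -> [0, 1]
    coords = [np.sqrt((Fn[:, g] ** 2).sum(axis=1) / len(g)) for g in groups]
    return np.column_stack(coords)       # (n_points, 2), utopia at (0, 0)

F = np.array([[1.0, 8.0, 3.0, 0.5],
              [2.0, 5.0, 2.0, 1.5],
              [4.0, 2.0, 1.0, 3.0]])
print(hyper_radial(F, groups=[[0, 1], [2, 3]]))
```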
Conference Paper
Information overload has become a notable problem in many multidisciplinary design optimization analyses. When dealing with large design datasets, engineering designers can quickly become overwhelmed by the data, solutions, and their relationships. This paper presents a work-centered visual analytics framework to address such challenges. The proposed framework integrates user-centered interactive visualization and data-oriented computational algorithms into two analytical loops to help designers perform in-depth analysis of a trade space. An application system prototype, LIVE, has been developed to support multidisciplinary design optimization. The proposed system allows designers to analyze data, discover patterns, and formalize preferences in a uniform and integrated software platform by combining visualization and data mining. This approach is expected to help designers efficiently make sense of complicated multi-dimensional design datasets.
Conference Paper
This paper introduces a novel methodology for the optimization, analysis and decision support in production systems engineering. The methodology is based on the innovization procedure, originally introduced to unveil new and innovative design principles in engineering design problems. The innovization procedure stretches beyond an optimization task and attempts to discover new design/operational rules/principles relating to decision variables and objectives, so that a deeper understanding of the underlying problem can be obtained. By integrating the concept of innovization with simulation and data mining techniques, a new set of powerful tools can be developed for general systems analysis. The uniqueness of the approach introduced in this paper lies in that decision rules extracted from the multi-objective optimization using data mining are used to modify the original optimization. Hence, faster convergence to the desired solution of the decision-maker can be achieved. In other words, faster convergence and deeper knowledge of the relationships between the key decision variables and objectives can be obtained by interleaving the multi-objective optimization and data mining process. In this paper, such an interleaved approach is illustrated through a set of experiments carried out on a simulation model developed for a real-world production system analysis problem.
Conference Paper
Genetic Algorithms (GAs) are effective methods for optimization problems. Recently, the Multi-Objective Genetic Algorithm (MOGA), the application of GAs to multi-objective optimization problems, has attracted attention in the engineering design field. In this field, the analysis of design variables in the acquired Pareto solutions, which gives designers useful knowledge about the applied problem, is as important as the acquisition of advanced solutions. This paper proposes a visualization method based on the idea of Isomap, which visualizes manifolds embedded in high-dimensional spaces and was originally proposed in the field of multiple classification analysis. The proposed method visualizes the geometric distance of solutions in the design variable space while considering their distance in the objective space. This enables a user to analyze the design variables of the acquired solutions in light of their relationships in the objective space. We apply the proposed method to the conceptual design optimization of a hybrid rocket engine and study its effectiveness. We found interesting structure in the distribution of Pareto solutions, and the visualized result gives some knowledge of the features relating design variables and fitness values in the acquired Pareto solutions.
Article
This paper presents the results of an experiment that measured human abilities to solve parameter design problems specific to the building design domain. The subjects of the experiment were university students who solved a series of parameter design problems that varied in terms of scale (number of design variables) and coupling (interactions between variables). Results show an exponential decrease in solution quality as the scale of the problem increases. Coupling has a comparable impact on solution quality for problems involving two to three variables, but becomes less significant as the scale of the problem increases. We discuss these findings in the context of information processing models for human cognition and explore the implications for current design theories and methodologies.
Article
How do we learn concepts and categories from examples? Part of the answer might be that we induce the simplest category consistent with a given set of example objects. This seemingly obvious idea, akin to simplicity principles in many fields, plays surprisingly little role in contemporary theories of concept learning, which are mostly based on the storage of exemplars, and avoid summarization or overt abstraction of any kind. This article reviews some evidence that complexity minimization does indeed play a central role in human concept learning. The chief finding is that subjects' ability to learn concepts depends heavily on the concepts' intrinsic complexity; more complex concepts are more difficult to learn. This pervasive effect suggests, contrary to exemplar theories, that concept learning critically involves the extraction of a simplified or abstracted generalization from examples.
Article
This paper presents experimental results regarding human abilities to solve parameter design tasks. In particular, the investigations focus on how the solution of parameter design problems is affected by coupling among the design variables and the scale of the problem (number of design variables). An experiment was conducted with human subjects who solved parameter design tasks through a simple graphical user interface. It is established that, for parameter design tasks with only two inputs and two outputs, coupling had only a moderate effect on the subjects' solution of the problem. As the number of variables increases, the effect of coupling among variables has a drastic effect on the solution procedures and the completion time. The time for a human to solve a coupled parameter design problem rises geometrically as problem size rises from 2×2 to 5×5. These results are discussed in the context of information processing models of human cognition. The implications for current design theories and methodologies are explored.
Book
Knowledge representation is at the very core of a radical idea for understanding intelligence. Instead of trying to understand or build brains from the bottom up, its goal is to understand and build intelligent behavior from the top down, putting the focus on what an agent needs to know in order to behave intelligently, how this knowledge can be represented symbolically, and how automated reasoning procedures can make this knowledge available as needed. This landmark text takes the central concepts of knowledge representation developed over the last 50 years and illustrates them in a lucid and compelling way. Each of the various styles of representation is presented in a simple and intuitive form, and the basics of reasoning with that representation are explained in detail. This approach gives readers a solid foundation for understanding the more advanced work found in the research literature. The presentation is clear enough to be accessible to a broad audience, including researchers and practitioners in database management, information retrieval, and object-oriented systems as well as artificial intelligence. This book provides the foundation in knowledge representation and reasoning that every AI practitioner needs.
Article
As many-objective optimisation algorithms mature the problem owner is faced with visualising and understanding a set of mutually non-dominating solutions in a high dimensional space. We review existing methods and present new techniques to address this problem. We address a common problem with the well known heatmap visualisation, that the often arbitrary ordering of rows and columns renders the heatmap unclear, by using spectral seriation to rearrange the solutions and objectives and thus enhance the clarity of the heatmap. A multi-objective evolutionary optimiser is used to further enhance the simultaneous visualisation of solutions in objective and parameter space. Two methods for visualising multi-objective solutions in the plane are introduced. First, we use RadViz and exploit interpretations of barycentric coordinates for convex polygons and simplices to map a mutually non-dominating set to the interior of a regular convex polygon in the plane, providing an intuitive representation of the solutions and objectives. Second, we introduce a new measure of the similarity of solutions, the dominance distance, which captures the order relations between solutions. This metric provides an embedding in Euclidean space, which is shown to yield coherent visualisations in two dimensions. The methods are illustrated on standard test problems and data from a benchmark many-objective problem.
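The RadViz mapping the paper exploits places each objective as an anchor on a regular polygon and positions each solution at the barycentre of the anchors, weighted by its normalized objective values. A minimal sketch of that mapping (not the paper's code, and without the seriation or dominance-distance steps):

```python
# Sketch of the RadViz barycentric mapping: objectives are anchors on a
# regular polygon; each solution sits at the weighted barycentre.
import numpy as np

def radviz(F):
    """F: (n_points, n_obj), columns normalized to [0, 1]."""
    n_obj = F.shape[1]
    angles = 2 * np.pi * np.arange(n_obj) / n_obj
    anchors = np.column_stack([np.cos(angles), np.sin(angles)])
    w = F / F.sum(axis=1, keepdims=True)   # barycentric weights per point
    return w @ anchors                     # (n_points, 2) positions

F = np.array([[0.9, 0.1, 0.2],
              [0.3, 0.8, 0.4],
              [0.2, 0.3, 0.9]])
print(radviz(F))   # each row is pulled toward the objectives it scores high on
```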
Article
Knowledge discovery in multi-dimensional data is a challenging problem in engineering design. For example, in trade space exploration of large design data sets, designers need to select a subset of data of interest and examine data from different data dimensions and within data clusters at different granularities. This exploration is a process that demands both humans, who can heuristically decide what data to explore and how best to explore it, and computers, which can quickly extract features that may be of interest in the data. Thus, to support this process of knowledge discovery, we need tools that can go beyond traditional computer-oriented optimisation approaches and support advanced designer-centred trade space exploration and data interaction. This paper is an effort to address this need. In particular, we propose the interactive multiscale-nested clustering and aggregation framework to support trade space exploration of multi-dimensional data common to design optimisation. A system prototype of this framework is implemented to allow users to visually examine large design data sets through interactive data clustering, aggregation, and visualisation. The paper also presents an evaluation study involving morphing wing design using this prototype system.
Article
The algorithm quasi-optimal (AQ) is a powerful machine learning methodology aimed at learning symbolic decision rules from a set of examples and counterexamples. It was first proposed in the late 1960s to solve the Boolean function satisfiability problem and further refined over the following decade to solve the general covering problem. In its newest implementations, it is a powerful but yet little explored methodology for symbolic machine learning classification. It has been applied to solve several problems from different domains, including the generation of individuals within an evolutionary computation framework. The current article introduces the main concepts of the AQ methodology and describes AQ for source detection (AQ4SD), a tailored implementation of the AQ methodology to solve the problem of finding the sources of atmospheric releases using distributed sensor measurements. The AQ4SD program is tested to find the sources of all the releases of the prairie grass field experiment.
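The covering loop at the heart of AQ can be sketched greedily: grow a rule from a positive "seed" example by dropping attribute tests as long as no counterexample becomes covered, add the rule, remove the positives it covers, and repeat. Real AQ performs beam search over generalizations (the "star"); this toy version only illustrates the separate-and-conquer structure.

```python
# Greedy sketch of an AQ-style covering learner over attribute-value data.
# Illustrative simplification, not the AQ4SD implementation.
def covers(rule, example):
    return all(example[a] == v for a, v in rule.items())

def learn_rules(positives, negatives):
    rules, remaining = [], list(positives)
    while remaining:
        seed = remaining[0]
        rule = dict(seed)                       # most specific rule: the seed
        for attr in list(rule):
            trial = {a: v for a, v in rule.items() if a != attr}
            if not any(covers(trial, n) for n in negatives):
                rule = trial                    # drop the test, still consistent
        rules.append(rule)
        remaining = [p for p in remaining if not covers(rule, p)]
    return rules

positives = [{"shape": "round", "size": "big"},
             {"shape": "round", "size": "small"}]
negatives = [{"shape": "square", "size": "big"}]
print(learn_rules(positives, negatives))        # e.g. [{'shape': 'round'}]
```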
Conference Paper
Visual clutter denotes a disordered collection of graphical entities in information visualization. Clutter can obscure the structure present in the data. Even in a small dataset, clutter can make it hard for the viewer to find patterns, relationships, and structure. In this paper, we define visual clutter as any aspect of the visualization that interferes with the viewer's understanding of the data, and present the concept of clutter-based dimension reordering. Dimension order is an attribute that can significantly affect a visualization's expressiveness. By varying the dimension order in a display, it is possible to reduce clutter without reducing information content or modifying the data in any way. Clutter reduction is a display-dependent task. In this paper, we follow a three-step procedure for four different visualization techniques. For each display technique, first, we determine what constitutes clutter in terms of display properties; then we design a metric to measure visual clutter in this display; finally, we search for an order that minimizes the clutter in a display.
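The three-step procedure can be sketched for one display type, parallel coordinates: score an axis ordering by the number of line crossings between adjacent axes (one plausible clutter proxy; the paper defines its own display-specific metrics) and search the orderings for the minimum. Exhaustive search is feasible for the handful of dimensions a display can hold.

```python
# Sketch of clutter-based dimension reordering for parallel coordinates,
# using line crossings between adjacent axes as the clutter metric.
from itertools import combinations, permutations

def crossings(col_a, col_b):
    """Polyline crossings between two adjacent parallel-coordinate axes."""
    return sum((col_a[i] - col_a[j]) * (col_b[i] - col_b[j]) < 0
               for i, j in combinations(range(len(col_a)), 2))

def clutter(data, order):
    return sum(crossings([row[a] for row in data], [row[b] for row in data])
               for a, b in zip(order, order[1:]))

def best_order(data, n_dims):
    return min(permutations(range(n_dims)), key=lambda o: clutter(data, o))

data = [(0.1, 0.9, 0.2, 0.8),
        (0.4, 0.6, 0.5, 0.5),
        (0.9, 0.1, 0.8, 0.2)]
order = best_order(data, 4)
print(order, clutter(data, order))   # correlated dimensions end up adjacent
```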
Article
Empirical studies of retrieval performance have shown a tendency for Precision to decline as Recall increases. This article examines the nature of the relationship between Precision and Recall. The relationships between Recall and the number of documents retrieved, between Precision and the number of documents retrieved, and between Precision and Recall are described in the context of different assumptions about retrieval performance. It is demonstrated that a tradeoff between Recall and Precision is unavoidable whenever retrieval performance is consistently better than retrieval at random. More generally, for the Precision–Recall trade-off to be avoided as the total number of documents retrieved increases, retrieval performance must be equal to or better than overall retrieval performance up to that point. Examination of the mathematical relationship between Precision and Recall shows that a quadratic Recall curve can resemble empirical Recall–Precision behavior if transformed into a tangent parabola. With very large databases and/or systems with limited retrieval capabilities there can be advantages to retrieval in two stages: Initial retrieval emphasizing high Recall, followed by more detailed searching of the initially retrieved set, can be used to improve both Recall and Precision simultaneously. Even so, a tradeoff between Precision and Recall remains. © 1994 John Wiley & Sons, Inc.
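The trade-off the article analyzes can be reproduced in a few lines: walk down a ranked retrieval list and compute precision and recall after each retrieved document. For a better-than-random ranking, precision tends to fall as recall rises; the example ranking below is illustrative.

```python
# Sketch of the precision-recall trade-off over a ranked retrieval list.
def pr_curve(relevance):                 # ranked list: 1 = relevant, 0 = not
    total_relevant = sum(relevance)
    hits, curve = 0, []
    for k, rel in enumerate(relevance, start=1):
        hits += rel
        curve.append((hits / total_relevant, hits / k))   # (recall, precision)
    return curve

ranked = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]  # a better-than-random ranking
for recall, precision in pr_curve(ranked):
    print(f"recall={recall:.2f}  precision={precision:.2f}")
```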
Conference Paper
Previous studies propose that associative classification has high classification accuracy and strong flexibility at handling unstructured data. However, it still suffers from the huge set of mined rules and sometimes biased classification or overfitting since the classification is based on only a single high-confidence rule. The authors propose a new associative classification method, CMAR, i.e., Classification based on Multiple Association Rules. The method extends an efficient frequent pattern mining method, FP-growth, constructs a class distribution-associated FP-tree, and mines large databases efficiently. Moreover, it applies a CR-tree structure to store and retrieve mined association rules efficiently, and prunes rules effectively based on confidence, correlation, and database coverage. The classification is performed based on a weighted χ² analysis using multiple strong association rules. Our extensive experiments on 26 databases from the UCI machine learning database repository show that CMAR is consistent, highly effective at classification of various kinds of databases, and has better average classification accuracy in comparison with CBA and C4.5. Moreover, our performance study shows that the method is highly efficient and scalable in comparison with other reported associative classification methods.
Article
We consider the problem of discovering association rules between items in a large database of sales transactions. We present two new algorithms for solving this problem that are fundamentally different from the known algorithms. Experiments with synthetic as well as real-life data show that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems. We also show how the best features of the two proposed algorithms can be combined into a hybrid algorithm, called AprioriHybrid. Scale-up experiments show that AprioriHybrid scales linearly with the number of transactions. AprioriHybrid also has excellent scale-up properties with respect to the transaction size and the number of items in the database.
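The level-wise candidate generation these algorithms build on can be sketched compactly. This is plain Apriori with the downward-closure prune (every subset of a frequent itemset must itself be frequent), not the paper's optimized Apriori/AprioriTid/AprioriHybrid implementations.

```python
# Compact Apriori sketch: level-wise candidate generation with the
# downward-closure prune. Illustrative, unoptimized.
from itertools import combinations

def apriori(transactions, min_sup=0.4):
    n = len(transactions)
    support = lambda s: sum(s <= t for t in transactions) / n
    items = sorted({i for t in transactions for i in t})
    frequent, level = {}, [frozenset([i]) for i in items]
    while level:
        level = [c for c in level if support(c) >= min_sup]
        frequent.update({c: support(c) for c in level})
        # join step: candidates one item larger, then prune by closure
        k = len(level[0]) + 1 if level else 0
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = [c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k - 1))]
    return frequent

T = [frozenset("abc"), frozenset("abd"), frozenset("acd"),
     frozenset("bcd"), frozenset("abcd")]
for itemset, sup in sorted(apriori(T).items(), key=lambda kv: -kv[1]):
    print(set(itemset), sup)
```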
R. Balling, "Design by shopping: A new paradigm?," in Proceedings of the Third World Congress of Structural and Multidisciplinary Optimization (WCSMO-3), 1999, vol. 1, pp. 295–297.

S. Obayashi and D. Sasaki, "Visualization and Data Mining of Pareto Solutions Using Self-Organizing Map," in Evolutionary Multi-Criterion Optimization, LNCS 2632, 2003, pp. 796–809.

S. Watanabe, Y. Chiba, and M. Kanazaki, "A proposal on analysis support system based on association rule analysis for non-dominated solutions," in Proceedings of the 2014 IEEE Congress on Evolutionary Computation (CEC 2014), 2014, pp. 880–887.

R. Jahr, H. Calborean, L. Vintan, and T. Ungerer, "Boosting design space explorations with existing or automatically learned knowledge," in Proceedings of the 16th International GI/ITG Conference on Measurement, Modelling and Evaluation of Computing Systems and Dependability and Fault Tolerance, 2012, pp. 221–235.

R. S. Michalski, J. Wojtusiak, and K. A. Kaufman, "Intelligent optimization via learnable evolution model," in Proceedings of the International Conference on Tools with Artificial Intelligence (ICTAI), no. 3, 2006, pp. 332–335.

K. Sugimura, S. Jeong, S. Obayashi, and T. Kimura, "Kriging-model-based multi-objective robust optimization and trade-off-rule mining using association rule with aspiration vector," in 2009 IEEE Congress on Evolutionary Computation (CEC 2009), 2009, pp. 522–529.

D. Selva, "Knowledge-intensive global optimization of Earth observing system architectures: a climate-centric case study," in SPIE Remote Sensing, vol. 9241, 2014, p. 92411S.

G. Agrawal, C. L. Bloebaum, and K. E. Lewis, "Intuitive Design Selection Using Visualized n-Dimensional Pareto Frontier," in 46th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference, AIAA 2005-1813, 2005, pp. 1–14.