Preprint

A CNN-based Patent Image Retrieval Method for Design Ideation


Abstract

The patent database is often used in searches for inspirational stimuli that open innovative design opportunities because of its large size, extensive variety, and the rich design information in patent documents. However, most patent mining research focuses only on textual information and ignores visual information. Herein, we propose a convolutional neural network (CNN)-based patent image retrieval method. The core of this approach is a novel neural network architecture named Dual-VGG, which is trained on two tasks: visual material type prediction and International Patent Classification (IPC) class label prediction. The trained network then provides deep features, in the form of image embedding vectors, that can be utilized for patent image retrieval. The accuracy of both training tasks and the quality of the patent image embedding space are evaluated to show the performance of our model. The approach is also illustrated in a case study of robot arm design retrieval. Compared with traditional keyword-based searching and Google image searching, the proposed method discovers more useful visual information for engineering design.
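The retrieval step described above amounts to a nearest-neighbor search in the embedding space. A minimal sketch, assuming embeddings have already been extracted by the trained network (the `retrieve` helper and the toy vectors are hypothetical illustrations, not from the paper):

```python
import numpy as np

def retrieve(query_vec, db_vecs, k=3):
    """Rank database images by cosine similarity to a query embedding.

    query_vec: (d,) deep-feature vector of the query image.
    db_vecs:   (n, d) matrix of database image embeddings.
    Returns indices of the top-k most similar images.
    """
    q = query_vec / np.linalg.norm(query_vec)
    db = db_vecs / np.linalg.norm(db_vecs, axis=1, keepdims=True)
    sims = db @ q                  # cosine similarity per database image
    return np.argsort(-sims)[:k]   # highest similarity first

# Toy example: three 4-d "embeddings"; the first is nearly parallel to the query.
db = np.array([[1.0, 0.9, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0],
               [0.5, 0.5, 0.5, 0.5]])
query = np.array([1.0, 1.0, 0.0, 0.0])
print(retrieve(query, db, k=2))  # index 0 ranks first
```

In practice the database embeddings would be precomputed once for the whole patent image collection, so each query costs only one matrix-vector product.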


Article
The goal of this research is to develop a computer-aided visual analogy support (CAVAS) framework to augment designers' visual analogical thinking by providing them with relevant visual cues from a variety of categories. Two steps are taken to reach this goal: developing a flexible computational framework to explore various visual cues, i.e., shapes or sketches, based on the relevant datasets, and conducting human-based behavioral studies to validate such visual cue exploration tools. This paper presents the results and insights obtained from the first step by addressing two research questions: how can the computational framework CAVAS provide designers in sketching with visual cues that stimulate their visual thinking process, and how can a computational tool learn a latent space that captures the shape patterns of sketches? A visual cue exploration framework and a deep clustering model, Cavas-DL, are proposed to learn a latent space of sketches that reveals shape patterns for multiple sketch categories and simultaneously clusters the sketches to preserve and provide category information as part of the visual cues. Distance- and overlap-based similarities are introduced and analyzed to identify long- and short-distance analogies. Performance evaluations of the proposed methods are carried out with different configurations, and visual presentations of the potential analogical cues are explored. The results demonstrate the applicability of the Cavas-DL model as the basis for the human-based validation studies in the next step.
Conference Paper
Full-text available
During a design process, designers iteratively go back and forth between different design stages to explore the design space and search for the best design solution that satisfies all design constraints. For complex design problems, humans have shown a surprising capability to effectively reduce the dimensionality of the design space and quickly converge it to a reasonable range for algorithms to step in and continue the search process. Therefore, modeling how human designers make decisions in such a sequential design process can help discover beneficial design patterns, strategies, and heuristics, which are important to the development of new algorithms embedded with human intelligence to augment computational design. In this paper, we develop a deep learning based approach to model and predict designers' sequential decisions in a system design context. The core of this approach is an integration of the function-behavior-structure model for design process characterization and the long short-term memory unit model for deep learning. This approach is demonstrated in a solar energy system design case study, and its prediction accuracy is evaluated against several commonly used models for sequential design decisions, such as the Markov chain model, the hidden Markov chain model, and a random sequence generation model. The results indicate that the proposed approach outperforms the other traditional models. This implies that during a system design task, designers are very likely to rely on both short-term and long-term memory of past design decisions to guide their decision making in the future design process. Our approach is general and can be applied in many other design contexts as long as sequential design action data are available.
Article
Full-text available
Traditionally, the ideation of design opportunities and new concepts relies on human expertise or intuition and faces high uncertainty. Inexperienced or specialized designers often fail to explore ideas broadly and become fixated on specific ideas early in the design process. Recent data-driven design methods provide external design stimuli beyond one's own knowledge, but their uses in rapid ideation are still limited. Intuitive and directed ideation techniques, such as brainstorming, mind mapping, Design-by-Analogy, SCAMPER, TRIZ and Design Heuristics, may empower designers in rapid ideation but are limited to the designer's own knowledge base. Herein, we harness data-driven design and rapid ideation techniques to introduce a data-driven computer-aided rapid ideation process using the cloud-based InnoGPS system. InnoGPS integrates an empirical network map of all technology domains, based on the international patent classification and connected according to knowledge distance derived from patent data, with a few map-based functions to position technologies, explore neighborhoods, and retrieve knowledge, concepts and solutions in the near or far fields for design analogies and syntheses. The functions of InnoGPS fuse design science, network science, data science and interactive visualization, and make the design ideation process data-driven, theoretically grounded, visually inspiring, and rapid. We demonstrate the procedure of using InnoGPS as a data-driven rapid ideation tool to generate new rolling toy design concepts.
Article
Full-text available
The objective of this paper is to analyze the impact of the factors that stimulate inspiration in the design process. An empirical study is proposed in this paper. Three factors were summarized, including knowledge, knowledge relations and innovative strategies. Representations of these three factors that were extracted from the international patent classification (IPC) were analyzed. An experimental scheme was designed, where 40 undergraduate students were selected as subjects and divided into five groups. One group was used as the control group with no materials provided, and the other four groups were used as the experimental subjects. The three factors and their combinations were provided to each of the four experimental groups. Subjects in each group were asked to propose preliminary designs of a coin-sorting device. The submitted designs were evaluated based on the novelty and practicality of the innovation. By comparing the groups, the impact of each factor on inspiration could be analyzed. The results show that the verb knowledge in the IPC has a significant impact on stimulating inspiration. For untrained students lacking experience, innovative strategies have a negative impact on the novelty of the inspiration. The knowledge relations can overcome the negative impact on the novelty induced by the innovative strategies.
Article
Full-text available
The use of industrial robots is increasing in areas such as food, consumer goods, wood, plastics and electronics, but is still mostly concentrated in the automotive industry. The aim of this project has been to develop a concept for a lightweight robot using lightweight materials such as aluminum and carbon fiber together with a newly developed stepper motor prototype. The wrist also needs to be constructed so that cabling can run through its inside. Cables are expensive to change, so designing to reduce cable friction is crucial to increasing the time between maintenance. Concept generation was performed based on the function analysis and the specifications of requirements that had been established. From the concept generation, twenty-four sustainable concepts divided into four groups (each representing an individual part of the whole concept) were evaluated.
Article
Full-text available
Data-driven engineering designers often search for design precedents in patent databases to learn about relevant prior art, seek design inspiration, or assess the novelty of their own new inventions. However, patent retrieval relevant to the design of a specific product or technology is often unstructured and unguided, and the resultant patents do not sufficiently or accurately capture the prior design knowledge base. This paper proposes an iterative and heuristic methodology to comprehensively search for patents as precedents of the design of a specific technology or product for data-driven design. The patent retrieval methodology integrates the mining of patent texts, citation relationships, and inventor information to identify relevant patents; in particular, the search keyword set, citation network, and inventor set are expanded through the designer's heuristic learning from the patents identified in prior iterations. The method relaxes the requirement for initial search keywords while improving patent retrieval completeness and accuracy. We apply the method to identify self-propelled spherical rolling robot (SPSRR) patents. Furthermore, we present two approaches to further integrate, systemize, visualize, and make sense of the design information in the retrieved patent data for exploring new design opportunities. Our research contributes to patent data-driven design.
Article
Full-text available
Local features and descriptors that perform well on photographic images are often unable to capture the content of binary technical drawings because of their different characteristics. Motivated by this, a new local feature representation, the contextual local primitives, is proposed in this paper. It is based on the detection of junction and end points, the classification of local primitives into local primitive words, and the establishment of geodesic connections between the local primitives. We exploit the granulometric information of the binary patent images to set all the necessary parameters of the involved mathematical morphology operators and the window size for local primitive extraction, which makes the whole framework parameter-free. The contextual local primitives, with their spatial areas as a histogram weighting factor, are evaluated through binary patent image retrieval experiments. The proposed contextual local primitives are found to perform better than the local primitives alone, the SIFT description of the contextual Hessian points, the SIFT description of local primitives, and state-of-the-art local content capturing methods. Moreover, an analysis of the approach from the perspective of a general patent image retrieval system shows that it is efficient in multiple respects.
Conference Paper
Full-text available
Convolutional Neural Networks (CNNs) achieve state-of-the-art performance in many computer vision tasks. However, this achievement is preceded by extreme manual annotation in order to perform either training from scratch or fine-tuning for the target task. In this work, we propose to fine-tune CNN for image retrieval from a large collection of unordered images in a fully automated manner. We employ state-of-the-art retrieval and Structure-from-Motion (SfM) methods to obtain 3D models, which are used to guide the selection of the training data for CNN fine-tuning. We show that both hard positive and hard negative examples enhance the final performance in particular object retrieval with compact codes.
Conference Paper
Full-text available
Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and models are available at https://github.com/liuzhuang13/DenseNet.
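The L(L+1)/2 connection count and the concatenation-based connectivity described above can be illustrated with a toy sketch (random linear maps with ReLU stand in for the real BN-ReLU-Conv layers; the sizes and helper names are assumptions, not the DenseNet implementation):

```python
import numpy as np

def num_connections(L):
    """Direct connections in a dense block of L layers: L(L+1)/2."""
    return L * (L + 1) // 2

def dense_block(x, num_layers, growth_rate, rng):
    """Toy dense connectivity: each layer consumes the concatenation of all
    preceding feature vectors and emits `growth_rate` new features."""
    features = [x]
    for _ in range(num_layers):
        inp = np.concatenate(features)            # all preceding feature-maps
        w = rng.standard_normal((growth_rate, inp.size))
        features.append(np.maximum(w @ inp, 0.0)) # linear map + ReLU
    return np.concatenate(features)

rng = np.random.default_rng(0)
out = dense_block(np.ones(8), num_layers=4, growth_rate=4, rng=rng)
print(out.size)            # 8 + 4 * 4 = 24 features reach the block output
print(num_connections(4))  # 10 direct connections
```

The parameter saving comes from each layer adding only `growth_rate` features while still seeing everything produced before it.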
Article
Full-text available
Functional technical performance usually follows an exponential dependence on time, but the rate of change (the exponent) varies greatly among technological domains. This paper presents a simple model that provides an explanatory foundation for these phenomena based upon the inventive design process. The model assumes that invention, i.e., novel and useful design, arises through probabilistic analogical transfers that combine existing individual operational ideas to arrive at new individual operational ideas. The continuing production of individual operational ideas relies upon the injection of new basic individual operational ideas that occurs through the coupling of science and technology simulations. The individual operational ideas that result from this process are then modeled as being assimilated in components of artifacts characteristic of a technological domain. According to the model, two effects (differences in interactions among components for different domains and differences in scaling laws for different domains) account for the differences found in improvement rates among domains, whereas the analogical transfer process is the source of the exponential behavior. The model is supported by a number of known empirical facts; further empirical research is suggested to independently assess additional predictions made by the model.
Article
Full-text available
Practitioners of biomimetic design express one consistent need—access to relevant biological information. This information is most useful when the transferable elements have been abstracted from the biological principle, are organized by design and engineering function, and are supported by contextual search functions. However, the fine details of how the information is organized and accessed are critical. In this chapter, we reflect upon AskNature.org, a biomimetic database created to address these issues, and its key elements, including a Biomimicry Taxonomy, biological strategy pages organized by function, search and tri-browse features, and bio-inspired product pages as examples of bio-inspired design successes.
Article
Full-text available
Design-by-analogy is a growing field of study and practice, due to its power to augment and extend traditional concept generation methods by expanding the set of generated ideas using similarity relationships from solutions to analogous problems. This paper presents the results of experimentally testing a new method for extracting functional analogies from general data sources, such as patent databases, to assist designers in systematically seeking and identifying analogies. In summary, the approach produces significantly improved results on the novelty of solutions generated and no significant change in the total quantity of solutions generated. Computationally, this design-by-analogy facilitation methodology uses a novel functional vector space representation to quantify the functional similarity between represented design problems and, in this case, patent descriptions of products. The mapping of the patents into the functional analogous words enables the generation of functionally relevant novel ideas that can be customized in various ways. Overall, this approach provides functionally relevant novel sources of design-by-analogy inspiration to designers and design teams.
Article
Full-text available
Design-by-analogy is a powerful approach to augment traditional concept generation methods by expanding the set of generated ideas using similarity relationships from solutions to analogous problems. While the concept of design-by-analogy has been known for some time, few actual methods and tools exist to assist designers in systematically seeking and identifying analogies from general data sources, databases, or repositories, such as patent databases. A new method for extracting functional analogies from data sources has been developed to provide this capability, here based on a functional basis rather than form or conflict descriptions. Building on past research, we utilize a functional vector space model (VSM) to quantify the analogous similarity of an idea's functionality. We quantitatively evaluate the functional similarity between represented design problems and, in this case, patent descriptions of products. We also develop document parsing algorithms to reduce text descriptions of the data sources down to the key functions for use in the functional similarity analysis and functional vector space modeling. To do this, we apply Zipf's law on word count order reduction to reduce the words within the documents down to the applicable functionally critical terms, thus providing a mapping process for function-based search. The reduction of a document into functional analogous words enables matching to novel ideas that are functionally similar, which can be customized in various ways. This approach thereby provides relevant sources of design-by-analogy inspiration. As a verification of the approach, two original design problem case studies illustrate the distance range of analogical solutions that can be extracted. This range extends from very near-field, literal solutions to far-field cross-domain analogies.
Article
Full-text available
This work lends insight into the meaning and impact of "near" and "far" analogies. A cognitive engineering design study is presented that examines the effect of the distance of analogical design stimuli on design solution generation, and places those findings in context of results from the literature. The work ultimately sheds new light on the impact of analogies in the design process and the significance of their distance from a design problem. In this work, the design repository from which analogical stimuli are chosen is the U.S. patent database, a natural choice, as it is one of the largest and easily accessed catalogued databases of inventions. The "near" and "far" analogical stimuli for this study were chosen based on a structure of patents, created using a combination of latent semantic analysis and a Bayesian based algorithm for discovering structural form, resulting in clusters of patents connected by their relative similarity. The findings of this engineering design study are juxtaposed with the findings of a previous study by the authors in design by analogy, which appear to be contradictory when viewed independently. However, by mapping the analogical stimuli used in the earlier work into similar structures along with the patents used in the current study, a relationship between all of the stimuli and their relative distance from the design problem is discovered. The results confirm that "near" and "far" are relative terms, and depend on the characteristics of the potential stimuli. Further, although the literature has shown that "far" analogical stimuli are more likely to lead to the generation of innovative solutions with novel characteristics, there is such a thing as too far. That is, if the stimuli are too distant, they then can become harmful to the design process. Importantly, as well, the data mapping approach to identify analogies works, and is able to impact the effectiveness of the design process. 
This work has implications not only in the area of finding inspirational designs to use for design by analogy processes in practice, but also for synthesis, or perhaps even unification, of future studies in the field of design by analogy. [DOI: 10.1115/1.4023158]
Article
Full-text available
Many areas of academic and industrial work make use of the notion of a ‘technology’. This paper attempts to reduce the ambiguity around the definition of what constitutes a ‘technology’ by extension of a method described previously that finds highly relevant patent sets for specified technological fields. The method relies on a less ambiguous definition that includes both a functional component and a component consisting of the underlying knowledge in a technological field to form a two-component definition. These two components form a useful definition of a technology that allows for objective, repeatable and thus comparable analysis of specific technologies. 28 technological domains are investigated: the extension of an earlier technique is shown to be capable of finding highly relevant and complete patent sets for each of the technologies. Overall, about 500,000 patents from 1976 to 2012 are classified into these 28 domains. The patents in each of these sets are not only highly relevant to the domain of interest but there are relatively low numbers of patents classified into any two of these domains (total patents classified in two domains are 2.9 % of the total patents and the great majority of patent class pairs have zero overlap with a few of the 378 patent class pairs containing the bulk of the doubly listed patents). On the other hand, the patents within a given domain cite patents in other domains about 90 % of the time. These results suggest that technology can be usefully decomposed to distinct units but that the inventions in these relatively tightly contained units depend upon widely spread additional knowledge.
Conference Paper
Full-text available
The aim of this document is to describe the methods we used in the Patent Image Classification and Image-based Patent Retrieval tasks of the Clef-IP 2011 track. The patent image classification task consisted of categorizing patent images into pre-defined categories such as abstract drawing, graph, flowchart, table, etc. Our main aim in participating in this sub-task was to test how our image categorizer performs on this type of categorization problem. Therefore, we used SIFT-like local orientation histograms as low-level features, and on top of them we built visual vocabularies specific to patent images using a Gaussian mixture model (GMM). This allowed us to represent images with Fisher Vectors and to train one-versus-all linear classifiers. As the results show, we obtain very good classification performance. Concerning the Image-based Patent Retrieval task, we kept the same image representation as for the Image Classification task and used the dot product as the similarity measure. Nevertheless, in the case of patents the aim was to rank patents based on patent similarities, which for pure image-based retrieval implies being able to compare one set of images against another set of images. Therefore, we investigated different strategies, such as averaging the Fisher Vector representations of an image set or considering the maximum similarity between pairs of images. Finally, we also built runs where the predicted image classes were considered in the retrieval process.
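The two set-to-set strategies mentioned (averaging the representations of an image set versus taking the maximum pairwise similarity) can be sketched with plain dot products, assuming precomputed per-image vectors; the Fisher Vector extraction itself is omitted and the helper names are hypothetical:

```python
import numpy as np

def avg_similarity(set_a, set_b):
    """Dot product of the averaged image representations of two patents."""
    return float(np.mean(set_a, axis=0) @ np.mean(set_b, axis=0))

def max_pair_similarity(set_a, set_b):
    """Maximum dot product over all image pairs across the two patents."""
    return float((set_a @ set_b.T).max())

a = np.array([[1.0, 0.0], [0.0, 1.0]])  # two image vectors of patent A
b = np.array([[1.0, 0.0]])              # one image vector of patent B
print(avg_similarity(a, b))             # mean of A is (0.5, 0.5) -> 0.5
print(max_pair_similarity(a, b))        # best matching pair -> 1.0
```

The two strategies can rank patent pairs differently: averaging rewards overall agreement across a patent's figures, while the max-pair score rewards a single strongly matching drawing.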
Article
Full-text available
This paper presents a relatively simple, objective and repeatable method for selecting sets of patents that are representative of a specific technological domain. The methodology consists of using search terms to locate the most representative international and US patent classes and determines the overlap of those classes to arrive at the final set of patents. Five different technological fields (computed tomography, solar photovoltaics, wind turbines, electric capacitors, electrochemical batteries) are used to test and demonstrate the proposed method. Comparison against traditional keyword searches and individual patent class searches shows that the method presented in this paper can find a set of patents with more relevance and completeness and no more effort than the other two methods. Follow on procedures to potentially improve the relevancy and completeness for specific domains are also defined and demonstrated. The method is compared to an expertly selected set of patents for an economic domain, and is shown to not be a suitable replacement for that particular use case. The paper also considers potential uses for this methodology and the underlying techniques as well as limitations of the methodology.
Article
Full-text available
TRIZ, the Soviet-initiated Theory of Inventive Problem Solving, is gaining acknowledgement both as a systematic methodology for innovation and as a powerful tool for technology forecasting. Nevertheless, the analysis of patents necessary for gathering the data to be used for the forecasting activity is very cumbersome and sometimes unrewarding due to the intrinsically low reliability of forecasting tasks. From this perspective, it is necessary to speed up the identification of the technical/physical conflict(s) overcome by an invention, according to its textual description. Although text-mining tools have reached relevant capabilities for extracting useful information from huge sets of documents, no specific means are available to support the analysis of patents with the aim of identifying the contradiction underlying a given technical system. This paper proposes a computer-aided approach for accomplishing such a task: the algorithm is described and validated by means of practical examples.
Article
Full-text available
The paper introduces infused design: an approach for establishing effective collaboration between designers from different engineering fields. In infused design, the design problem representation is brought up to a mathematical meta-level that is common to all engineering disciplines. The reasoning about the problem is then done using mathematical terminology and tools that, due to their generality, are the same for all engineers, regardless of their background. This gives engineers an opportunity to infuse their work with knowledge, methods, and solutions shared by specialists from other engineering fields. When such knowledge, methods, and solutions cross disciplinary boundaries, they are provably relevant to any problem in another domain to which the representation can be transformed. The suggested meta-level consists of general discrete mathematical models, called combinatorial representations (CR). The specific mathematical basis chosen for the combinatorial representations in this paper is graph theory, although other representations are possible. We explain the theory of infused design and carefully contrast it with other approaches. This comparison clearly demonstrates the advantages of infused design and its potential. We conclude with several practical issues related to the introduction of infused design into practice and briefly discuss the role of information systems in infused design. A companion paper includes several examples that demonstrate the details of infused design.
Conference Paper
Full-text available
This paper proposes a novel binary image descriptor, the Adaptive Hierarchical Density Histogram, that can be utilized for complex binary image retrieval. The descriptor exploits the distribution of the image points over a two-dimensional area. To reflect this distribution effectively, we propose an adaptive pyramidal decomposition of the image into non-overlapping rectangular regions and the extraction of the density histogram of each region. This hierarchical decomposition algorithm is based on the recursive calculation of geometric centroids. The presented technique is experimentally shown to combine efficient performance, low computational cost and scalability. Comparison with other prevailing approaches demonstrates its high potential.
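A rough sketch of the recursive centroid-based decomposition, assuming the binary image is given as a set of foreground point coordinates (this simplified version skips empty regions rather than zero-padding them, so it is not the full fixed-length descriptor):

```python
import numpy as np

def density_histogram(points, depth):
    """Recursively split the point set at its geometric centroid into four
    quadrants and record the fraction of points in each quadrant."""
    if depth == 0 or len(points) == 0:
        return []
    cx, cy = points.mean(axis=0)                 # geometric centroid
    quads = [points[(points[:, 0] <  cx) & (points[:, 1] <  cy)],
             points[(points[:, 0] >= cx) & (points[:, 1] <  cy)],
             points[(points[:, 0] <  cx) & (points[:, 1] >= cy)],
             points[(points[:, 0] >= cx) & (points[:, 1] >= cy)]]
    hist = [len(q) / len(points) for q in quads]
    for q in quads:                              # recurse into each sub-region
        hist += density_histogram(q, depth - 1)
    return hist

# Four foreground pixels at the corners of a unit square: the centroid is
# (0.5, 0.5), so one point falls in each quadrant.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(density_histogram(pts, depth=1))  # [0.25, 0.25, 0.25, 0.25]
```

Because the split point adapts to where the points actually lie, the resulting histogram is sensitive to shape rather than to absolute position, which is what makes such a decomposition useful for binary drawings.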
Conference Paper
Full-text available
This paper presents the evaluation of a number of algorithm alternatives for content-based retrieval from a database of technical drawings representing patents. The objective is to help patent evaluators in their quest for a possible patent bearing too much similarity with the one under investigation. To achieve this, we have devised a system where images (drawings) are represented using attributed graphs based on the extracted line-patterns, or histograms of attributes computed from the graphs. Retrieval is performed either using histogram comparison (see Huet, B. and Hancock, E.R., PAMI, vol.21, no.12, p.1363-70, 1999) or via a graph similarity measure (see Huet and Hancock, CVPR, p.138-43, 1998). Promising results are presented along with possible extensions of this work.
Conference Paper
Full-text available
A patent always contains some images along with the text. Many text-based systems have been developed to search the patent database. In this paper, we describe PATSEEK, an image-based search system for the US patent database. The objective is to let the user check the similarity of a query image with the images that exist in US patents. The user can specify a set of keywords that must exist in the text of the patents whose images will be searched for similarity. PATSEEK automatically grabs images from the US patent database at the user's request and represents them through an edge orientation autocorrelogram. L1 and L2 distance measures are used to compute the distance between the images. A recall rate of 100% for 61% of the query images, and an average 32% recall rate for the rest of the images, has been observed.
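The histogram comparison step can be sketched directly, assuming the edge orientation autocorrelograms have already been computed as fixed-length vectors (the histogram values and helper names below are toy illustrations):

```python
import numpy as np

def l1(h1, h2):
    """L1 (city-block) distance between two histograms."""
    return float(np.abs(h1 - h2).sum())

def l2(h1, h2):
    """L2 (Euclidean) distance between two histograms."""
    return float(np.sqrt(((h1 - h2) ** 2).sum()))

def rank(query_hist, db_hists, dist=l1):
    """Rank database images by ascending distance to the query histogram."""
    d = [dist(query_hist, h) for h in db_hists]
    return np.argsort(d)

query = np.array([0.5, 0.3, 0.2])
db = [np.array([0.5, 0.3, 0.2]),   # identical -> distance 0
      np.array([0.2, 0.3, 0.5]),
      np.array([0.4, 0.4, 0.2])]
print(list(rank(query, db)))  # [0, 2, 1]
```

The keyword filter described above would simply restrict `db` to images from patents whose text matches the user's terms before the distance ranking is applied.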
Article
Technology innovation in electric vehicles is of significant interest to researchers, companies and policy-makers of many countries. Electric vehicles integrate various kinds of distinct technologies and decomposing the overall electric vehicle field into several key domains allows determination of more detailed, valuable information. To provide both broader and more detailed information about technology development in the EV field, unlike most previous studies on electric vehicle innovation which analyzed this field as a whole, this research decomposed the electric vehicle field into domains, which are power electronics, battery, electric motor as well as charging and discharging subdomains and then further extracted the subdomains. Furthermore, In addition, the improvement rates, technology trajectories and major patent assignees in these domains and key subdomains are determined using patents extracted for each domain from the US patent system. The main findings are: (1) The estimated rates of performance improvement per year are 18.3% for power electronics, 7.7% for electric motors, 23.8% for charging and discharging and 11.7% for batteries. The relatively lower improvement rate for electric motors and batteries suggests their potential to hinder the popularization of electric vehicles. Besides, as for the subdomains, the relatively higher technology improvement rate of lithium-ion battery or permanent magnet motor in its domain supports the current trend of battery or motor type quantitively from a patent analysis view. A possible implication for the policy makers encouraging EV development is to issue more incentive plans for innovations in the battery and electric motor domains, especially for lithium-ion battery and permanent magnet motor. 
(2) The technology trajectories depict the development of four critical subdomains over time, which quantitatively demonstrates the focuses and emerging topics of the subdomains and thereby provides guidance for research topic selection. For example, the silicon negative electrode is a promising topic in the lithium-ion battery subdomain. (3) The key players in the four critical subdomains appear to be Toyota and Honda in hybrid power electronics, E-One Moli Energy Corp in lithium-ion batteries, Panasonic in permanent magnet motors and Toyota in discharging. The key players found by the main path method from the innovation view are also important players in the EV market. Other market participants should pay closer attention to adjustments in these companies' business strategies to monitor the market, and make efforts to invent important EV-related technologies.
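To make the reported improvement rates tangible: under compound growth at annual rate r, performance doubles in ln 2 / ln(1 + r) years. The rates below are those quoted in the abstract above; the doubling-time framing is our own illustration, not the paper's analysis.

```python
import math

def years_to_double(annual_rate):
    """Years for performance to double at a compound annual improvement rate."""
    return math.log(2) / math.log(1 + annual_rate)

# Improvement rates reported in the abstract above
rates = {
    "power electronics": 0.183,
    "electric motors": 0.077,
    "charging/discharging": 0.238,
    "batteries": 0.117,
}

for domain, r in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{domain}: doubles roughly every {years_to_double(r):.1f} years")
```

The slow 7.7% rate for electric motors implies a doubling time of roughly nine years, versus about three for charging and discharging, which is why the abstract flags motors and batteries as potential bottlenecks.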
Article
The growing developments in general semantic networks, knowledge graphs and ontology databases have motivated us to build a large-scale comprehensive semantic network of technology-related data for engineering knowledge discovery, technology search and retrieval, and artificial intelligence for engineering design and innovation. Specifically, we constructed a technology semantic network (TechNet) that covers the elemental concepts in all domains of technology and their semantic associations by mining the complete U.S. patent database from 1976. To derive the TechNet, natural language processing techniques were utilized to extract terms from massive patent texts and recent word embedding algorithms were employed to vectorize such terms and establish their semantic relationships. We report and evaluate the TechNet for retrieving terms and their pairwise relevance that is meaningful from a technology and engineering design perspective. The TechNet may serve as an infrastructure to support a wide range of applications, e.g., technical text summaries, search query predictions, relational knowledge discovery, and design ideation support, in the context of engineering and technology, and complement or enrich existing semantic databases. To enable such applications, we made the TechNet public via an online interface and APIs for public users to retrieve technology related terms and their relevancies.
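Pairwise term relevance from word embeddings is typically scored with cosine similarity. A minimal sketch follows; the 4-dimensional vectors are made-up toy values, not TechNet's actual embeddings.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy 4-dimensional embeddings (illustrative values only)
embeddings = {
    "battery": [0.8, 0.1, 0.3, 0.0],
    "cathode": [0.7, 0.2, 0.4, 0.1],
    "gearbox": [0.0, 0.9, 0.1, 0.6],
}

print(cosine_similarity(embeddings["battery"], embeddings["cathode"]))
print(cosine_similarity(embeddings["battery"], embeddings["gearbox"]))
```

Semantically related terms ("battery" and "cathode") score higher than unrelated ones, which is the property a retrieval API built on such a network exploits.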
Article
Modern Machine Learning (ML) techniques are transforming many disciplines ranging from transportation to healthcare by uncovering patterns in data, developing autonomous systems that mimic human abilities, and supporting human decision-making. Modern ML techniques, such as deep neural networks, are fueling the rapid developments in artificial intelligence. Engineering design researchers have increasingly used and developed ML techniques to support a wide range of activities from preference modeling to uncertainty quantification in high-dimensional design optimization problems. This special issue brings together fundamental scientific contributions across these areas.
Article
Inspirational stimuli, such as analogies, are a prominent mechanism used to support designers. However, generating relevant inspirational stimuli remains challenging. This work explores the potential of using an untrained crowd workforce to generate stimuli for trained designers. Crowd workers developed solutions for twelve open-ended design problems from the literature. Solutions were text-mined to extract words along a frequency domain, which, along with computationally derived semantic distances, partitioned stimuli into closer or further distance categories for each problem. The utility of these stimuli was tested in a human subjects experiment (N = 96). Results indicate crowdsourcing holds potential to gather impactful inspirational stimuli for open-ended design problems. Near stimuli improved the feasibility and usefulness of design solutions, while distant stimuli improved their uniqueness.
Article
The biological domain has the potential to offer a rich source of analogies to solve engineering design problems. However, due to the complexity embedded in biological systems, adding to the lack of structured, detailed, and searchable knowledge bases, engineering designers find it hard to access the knowledge in the biological domain, which therefore poses challenges in understanding the biological concepts in order to apply these concepts to engineering design problems. In order to assist the engineering designers in problem-solving, we report, in this paper, a web-based tool called Idea-Inspire 4.0 that supports analogical design using two broad features. First, the tool provides access to a number of biological systems using a searchable knowledge base. Second, it explains each one of these biological systems using a multi-modal representation: that is, using function decomposition model, text, function model, image, video, and audio. In this paper, we report two experiments that test how well the multi-modal representation in Idea-Inspire 4.0 supports understanding and application of biological concepts in engineering design problems. In one experiment, we use Bloom's method to test "analysis" and "synthesis" levels of understanding of a biological system. In the next experiment, we provide an engineering design problem along with a biological-analogous system and examine the novelty and requirement-satisfaction (two major indicators of creativity) of resulting design solutions. In both the experiments, the biological system (analogue) was provided using Idea-Inspire 4.0 as well as using a conventional text-image representation so that the efficacy of Idea-Inspire 4.0 is tested using a benchmark.
Article
Despite the fact that inspirational stimuli (e.g., analogies) have been shown to be an effective means to assist designers, little is known about the neurological processes supporting inspired design ideation. To explore the impact of inspirational stimuli on design ideation, an fMRI concept generation task was developed (N = 21). Results demonstrate that inspirational stimuli of any kind (near or far from the problem space) improve the fluency of idea generation. Furthermore, neuroimaging data help to uncover two distinct brain activation networks based upon reasoning with and without inspirational stimuli. We term these inspired internal search and unsuccessful external search. These brain activation networks give insight into differences between ideating with and without inspirational stimuli, and between inspirational stimuli of varying distances.
Article
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that's pretty swell. It's a little bigger than last time but more accurate. It's still fast though, don't worry. At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 mAP@50 in 51 ms on a Titan X, compared to 57.5 mAP@50 in 198 ms by RetinaNet, similar performance but 3.8x faster. As always, all the code is online at https://pjreddie.com/yolo/
Article
Three-dimensional (3D) printing (3DP) has received significant attention for its promise of reinventing the way products are manufactured. At times, there have been differences in the expectations of the public of what the technology would be able to do as compared with what the technology was capable of at a certain point in time. Although experts often feel that future improvements will overcome this, there is usually not a sense of how rapidly these future improvements will occur. The semiconductor industry and others have effectively addressed this problem by tracking how quickly the technology improves over time (i.e., Moore's law). This article looks at industrial stereolithography (SLA) 3DP and measures the technological improvement rate empirically at 37.6% per year and compares it with an estimate based on analysis of the patents that have been issued by the United States Patent and Trademark Office in SLA 3DP. This approach has been shown to give reliable estimates and in this SLA 3D case gives a very close estimate of 46.8% per year. We then find representative sets of patents and use the estimating technique to calculate the expected technological improvement rate for four other main types of 3DP: Inkjet-powder-based 3DP (35.0%), metal selective laser sintering (SLS) (33.3%), fused filament fabrication (FFF) (16.7%), nonmetal SLS (21.1%), and for the 3DP domain as a whole (29.4%). Further analysis on the patent sets provides top assignees and the most central patents for each of the 3DP domains. Overall, the improvement rates found support the idea that 3DP is rapidly improving and, therefore, potentially capable of fulfilling its promise but also that some 3DP approaches (particularly FFF and nonmetal SLS) are likely improving at a lower rate.
Technical Report
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively.
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
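The dropout regularizer mentioned above is simple enough to sketch directly. This is the "inverted" variant in common use today, which rescales surviving units during training so no scaling is needed at test time; the original paper instead scaled weights at test time. The activation values are arbitrary toy numbers.

```python
import random

def dropout(activations, p_drop, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p_drop during
    training and rescale survivors so the expected activation is unchanged.
    At test time the layer is the identity."""
    rng = rng or random.Random()
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() >= p_drop else 0.0 for a in activations]

layer_output = [0.5, 1.2, -0.3, 0.8, 2.0, 0.1]
print(dropout(layer_output, p_drop=0.5))                  # some units zeroed, rest scaled
print(dropout(layer_output, p_drop=0.5, training=False))  # identity at test time
```

Randomly silencing units forces the fully-connected layers to learn redundant representations, which is why the abstract credits dropout with reducing overfitting.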
Article
Engineers and technology firms must continually explore new design opportunities and directions to sustain or thrive in technology competition. However, the related decisions are normally based on personal gut feeling or experience. Although the analysis of user preferences and market trends may shed light on some design opportunities from a demand perspective, design opportunities are always conditioned or enabled by the technological capabilities of designers. Herein, we present a data-driven methodology for designers to analyze and identify what technologies they can design next, based on the principle that what a designer can currently design conditions or enables what it can design next. The methodology is centered on an empirically built network map of all known technologies, whose distances are quantified using more than 5 million patent records, and various network analytics to position a designer according to the technologies that they can design, navigate technologies in the neighborhood, and identify feasible paths to far fields for novel opportunities. Furthermore, we have integrated the technology space map and various map-based functions for designer positioning, neighborhood search, path finding, and knowledge discovery and learning into a data-driven visual analytic system named InnoGPS. InnoGPS is a global positioning system (GPS) for finding innovation positions and directions in the technology space, conceived by analogy to the GPS that we use for positioning, neighborhood search, and direction finding in physical space.
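Path finding over a technology-proximity network of this kind can be sketched with a plain breadth-first search. The graph below is a made-up toy example, not InnoGPS data; edges stand for pairs of fields with closely related design knowledge.

```python
from collections import deque

# Toy technology-proximity network (hypothetical edges, not InnoGPS data)
tech_graph = {
    "combustion engines": ["transmissions", "electric motors"],
    "transmissions": ["combustion engines", "robotics"],
    "electric motors": ["combustion engines", "batteries"],
    "batteries": ["electric motors", "fuel cells"],
    "fuel cells": ["batteries"],
    "robotics": ["transmissions"],
}

def feasible_path(graph, start, goal):
    """Shortest hop-path from a designer's current field to a target field."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no path in this network

print(feasible_path(tech_graph, "combustion engines", "fuel cells"))
```

Each hop in the returned path is a neighboring field the designer could plausibly move into next, which is the "feasible paths to far fields" idea in miniature.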
Conference Paper
There is a continuous demand for novel and innovative products in the market. In order to develop novel ideas, natural systems are considered to be a superior source of inspiration. In order to assist designers in ideation, an analogical design tool called Idea Inspire 3.0 is developed; it is a revised version of Idea-Inspire developed in 2005. The latest version is web-based, and supports retrieval, visualization and addition of systems. It uses a novel, dynamic representation with a multi-system, multi-instance SAPPhIRE model as basis, and a multi-modal explanation for enhanced understanding that should lead to better ideation. In this paper, these latest features of Idea-Inspire along with their potential benefits are discussed.
Conference Paper
We propose a novel approach for instance-level image retrieval. It produces a global and compact fixed-length representation for each image by aggregating many region-wise descriptors. In contrast to previous works employing pre-trained deep networks as a black box to produce features, our method leverages a deep architecture trained for the specific task of image retrieval. Our contribution is twofold: (i) we leverage a ranking framework to learn convolution and projection weights that are used to build the region features; and (ii) we employ a region proposal network to learn which regions should be pooled to form the final global descriptor. We show that using clean training data is key to the success of our approach. To that aim, we use a large scale but noisy landmark dataset and develop an automatic cleaning approach. The proposed architecture produces a global image representation in a single forward pass. Our approach significantly outperforms previous approaches based on global descriptors on standard datasets. It even surpasses most prior works based on costly local descriptor indexing and spatial verification. Additional material is available at www.xrce.xerox.com/Deep-Image-Retrieval.
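The aggregation step, pooling many region descriptors into one fixed-length global descriptor, can be sketched in its broadest form: sum the per-region vectors and L2-normalize the result. This is only the generic R-MAC-style recipe; the paper's learned pooling and region proposal network are omitted, and the region vectors are toy values.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (no-op on the zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def aggregate_regions(region_descriptors):
    """Sum per-region descriptors into one fixed-length global descriptor,
    then L2-normalize it."""
    dim = len(region_descriptors[0])
    pooled = [sum(r[d] for r in region_descriptors) for d in range(dim)]
    return l2_normalize(pooled)

regions = [
    l2_normalize([0.3, 0.9, 0.1]),  # descriptor of region 1 (toy values)
    l2_normalize([0.8, 0.2, 0.0]),  # region 2
    l2_normalize([0.1, 0.1, 0.9]),  # region 3
]
global_desc = aggregate_regions(regions)
print(global_desc)
```

Because the output has fixed length regardless of how many regions are pooled, whole-database search reduces to vector comparisons, which is what makes the single-forward-pass representation practical.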
Conference Paper
We propose a simple and straightforward way of creating powerful image representations via cross-dimensional weighting and aggregation of deep convolutional neural network layer outputs. We first present a generalized framework that encompasses a broad family of approaches and includes cross-dimensional pooling and weighting steps. We then propose specific non-parametric schemes for both spatial- and channel-wise weighting that boost the effect of highly active spatial responses and at the same time regulate burstiness effects. We experiment on different public datasets for image search and show that our approach outperforms the current state-of-the-art for approaches based on pre-trained networks. We also provide an easy-to-use, open source implementation that reproduces our results.
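The cross-dimensional weighting idea can be sketched with a pure-Python pass over a small feature map. This is a simplified take on the CroW recipe: real CroW uses log-based channel weights and extra normalization, which are omitted here, and the feature map values are made up.

```python
def crow_descriptor(feature_map):
    """Cross-dimensional weighted sum-pooling of a conv feature map.
    feature_map[c][y][x] is the activation of channel c at location (y, x).
    Simplified sketch: CroW's actual log-based channel weighting is replaced
    by a plain sparsity score."""
    C = len(feature_map)
    H, W = len(feature_map[0]), len(feature_map[0][0])

    # Spatial weights: locations where many channels fire strongly count more.
    spatial = [[sum(feature_map[c][y][x] for c in range(C))
                for x in range(W)] for y in range(H)]
    total = sum(sum(row) for row in spatial) or 1.0
    spatial = [[v / total for v in row] for row in spatial]

    # Channel weights: sparse channels get boosted, damping "bursty"
    # channels that fire everywhere.
    channel = []
    for c in range(C):
        nonzero = sum(1 for y in range(H) for x in range(W)
                      if feature_map[c][y][x] > 0)
        channel.append(1.0 - nonzero / (H * W))

    # Weighted sum-pool each channel into one number.
    return [channel[c] * sum(feature_map[c][y][x] * spatial[y][x]
                             for y in range(H) for x in range(W))
            for c in range(C)]

# Toy 2-channel, 2x2 feature map
fmap = [[[1.0, 0.0], [0.0, 0.0]],   # sparse channel
        [[1.0, 1.0], [1.0, 1.0]]]   # bursty channel
print(crow_descriptor(fmap))
```

Note that this crude sparsity weight zeroes out a channel that fires at every location, an extreme the log-based weighting in the paper avoids; the sketch only illustrates the two weighting directions, spatial and channel-wise.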
Article
The Bag-of-Words (BoW) model has been predominantly viewed as the state of the art in Content-Based Image Retrieval (CBIR) systems since 2003. The past 13 years have seen its advance based on the SIFT descriptor due to its advantages in dealing with image transformations. In recent years, image representation based on the Convolutional Neural Network (CNN) has attracted more attention in image retrieval, and demonstrates impressive performance. Given this time of rapid evolution, this article provides a comprehensive survey of image retrieval methods over the past decade. In particular, according to the feature extraction and quantization schemes, we classify current methods into three types, i.e., SIFT-based, one-pass CNN-based, and multi-pass CNN-based. This survey reviews milestones in BoW image retrieval, compares previous works that fall into different BoW steps, and shows that SIFT and CNN share common characteristics that can be incorporated in the BoW model. After presenting and analyzing the retrieval accuracy on several benchmark datasets, we highlight promising directions in image retrieval that demonstrate how the CNN-based BoW model can learn from the SIFT feature.
Article
Recently, image representation built upon Convolutional Neural Network (CNN) has been shown to provide effective descriptors for image search, outperforming pre-CNN features as short-vector representations. Yet such models are not compatible with geometry-aware re-ranking methods and are still outperformed, on some particular object retrieval benchmarks, by traditional image search systems relying on precise descriptor matching, geometric re-ranking, or query expansion. This work revisits both retrieval stages, namely initial search and re-ranking, by employing the same primitive information derived from the CNN. We build compact feature vectors that encode several image regions without the need to feed multiple inputs to the network. Furthermore, we extend integral images to handle max-pooling on convolutional layer activations, allowing us to efficiently localize matching objects. The resulting bounding box is finally used for image re-ranking. As a result, this paper significantly improves the existing CNN-based recognition pipeline: we report for the first time results competing with traditional methods on the challenging Oxford5k and Paris6k datasets.
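The integral-image trick the work builds on is worth sketching in its classic sum-pooling form (the paper's extension approximates max-pooling, which is not shown here): after one precomputation pass, the sum over any rectangular region costs four lookups.

```python
def integral_image(img):
    """ii[y][x] = sum of img over the rectangle [0, y) x [0, x)."""
    H, W = len(img), len(img[0])
    ii = [[0] * (W + 1) for _ in range(H + 1)]
    for y in range(H):
        for x in range(W):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1][x0:x1] in O(1) using four lookups."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(box_sum(ii, 1, 1, 3, 3))  # 5 + 6 + 8 + 9 = 28
```

Constant-time region pooling is what makes it cheap to score many candidate bounding boxes on a convolutional activation map, enabling the object localization and re-ranking step the abstract describes.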
Article
This paper aims to accelerate the test-time computation of convolutional neural networks (CNNs), especially very deep CNNs that have substantially impacted the computer vision community. Unlike existing methods that are designed for approximating linear filters or linear responses, our method takes the nonlinear units into account. We develop an effective solution to the resulting nonlinear optimization problem without the need of stochastic gradient descent (SGD). More importantly, while current methods mainly focus on optimizing one or two layers, our nonlinear method enables an asymmetric reconstruction that reduces the rapidly accumulated error when multiple (e.g., >=10) layers are approximated. For the widely used very deep VGG-16 model, our method achieves a whole-model speedup of 4x with merely a 0.3% increase of top-5 error in ImageNet classification. Our 4x accelerated VGG-16 model also shows a graceful accuracy degradation for object detection when plugged into the latest Fast R-CNN detector.
Article
Parallel to the concept of the human genome and its impact on biology and other disciplines, we revealed a similar concept in engineering sciences, termed the “Interdisciplinary Engineering Knowledge Genome”, which is an organized collection of system and method “genes” that encode instructions for generating new systems and methods in diverse engineering disciplines. Resting on the firm mathematical foundation of combinatorial representations, the Interdisciplinary Engineering Knowledge Genome unifies many engineering disciplines, providing a basis for transforming knowledge between them, supporting new educational practices, promoting inventions, aiding design, and bootstrapping new discoveries in engineering and science. Given the formal underlying combinatorial representations, these merits could be automated. This paper elucidates this new concept and demonstrates its value and power in engineering design.
Article
We envision that the next generation of knowledge-based CAD systems will be characterized by four features: they will be based on cognitive accounts of design, and they will support collaborative design, conceptual design, and creative design. In this paper, we first analyze these four dimensions of CAD. We then report on a study in the design, development and deployment of a knowledge-based CAD system for supporting biologically inspired design that illustrates these four characteristics. This system, called DANE for Design by Analogy to Nature Engine, provides access to functional models of biological systems. Initial results from in situ deployment of DANE in a senior-level interdisciplinary class on biologically inspired design indicates its usefulness in helping designers conceptualize design of complex systems, thus promising enough to motivate continued work on knowledge-based CAD for biologically inspired design. More importantly from our perspective, DANE illustrates how cognitive studies of design can inform the development of CAD systems for collaborative, conceptual, and creative design, help assess their use in practice, and provide new insights into human interaction with knowledge-based CAD systems.
Article
The term data augmentation refers to methods for constructing iterative optimization or sampling algorithms via the introduction of unobserved data or latent variables. For deterministic algorithms, the method was popularized in the general statistical community by the seminal article by Dempster, Laird, and Rubin on the EM algorithm for maximizing a likelihood function or, more generally, a posterior density. For stochastic algorithms, the method was popularized in the statistical literature by Tanner and Wong's Data Augmentation algorithm for posterior sampling and in the physics literature by Swendsen and Wang's algorithm for sampling from the Ising and Potts models and their generalizations; in the physics literature, the method of data augmentation is referred to as the method of auxiliary variables. Data augmentation schemes were used by Tanner and Wong to make simulation feasible and simple, while auxiliary variables were adopted by Swendsen and Wang to improve the speed of iterative simulation. In general, however, constructing data augmentation schemes that result in both simple and fast algorithms is a matter of art in that successful strategies vary greatly with the (observed-data) models being considered. After an overview of data augmentation/auxiliary variables and some recent developments in methods for constructing such
Article
C-K theory is a unified Design theory and was first introduced in 2003 (Hatchuel and Weil 2003). The name “C-K theory” reflects the assumption that Design can be modelled as the interplay between two interdependent spaces with different structures and logics: the space of concepts (C) and the space of knowledge (K). Both pragmatic views of Design and existing Design theories define Design as a dynamic mapping process between required functions and selected structures. However, dynamic mapping is not sufficient to describe the generation of new objects and new knowledge which are distinctive features of Design. We show that C-K theory captures such generation and offers a rigorous definition of Design. This is illustrated with an example: the design of Magnesium-CO2 engines for Mars explorations. Using C-K theory we also discuss Braha and Reich’s topological structures for design modelling (Braha and Reich 2003). We interpret this approach as special assumptions about the stability of objects in space K. Combining C-K theory and Braha and Reich’s models opens new areas for research about knowledge structures in Design theories. These findings confirm the analytical and interpretative power of C-K theory.
Article
In this article, we discuss the potential benefits, the requirements and the challenges involved in patent image retrieval and subsequently, we propose a framework that encompasses advanced image analysis and indexing techniques to address the need for content-based patent image search and retrieval. The proposed framework involves the application of document image pre-processing, image feature and textual metadata extraction in order to support effectively content-based image retrieval in the patent domain. To evaluate the capabilities of our proposal, we implemented a patent image search engine. Results based on a series of interaction modes, comparison with existing systems and a quantitative evaluation of our engine provide evidence that image processing and indexing technologies are currently sufficiently mature to be integrated in real-world patent retrieval applications.
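A framework combining textual metadata with image features, as described above, reduces at its core to a filter-then-rank loop: restrict candidates by metadata, then order the survivors by feature distance. The sketch below illustrates only that skeleton; the patent records, keywords, and feature vectors are all invented for the example.

```python
import math

# Minimal content-based retrieval loop: filter patents by textual metadata,
# then rank the survivors by image feature distance. All records and
# vectors are made-up illustrations, not real patent data.
patents = [
    {"id": "US001", "keywords": {"robot", "arm"},  "feature": [0.9, 0.1, 0.0]},
    {"id": "US002", "keywords": {"engine"},        "feature": [0.1, 0.8, 0.1]},
    {"id": "US003", "keywords": {"robot", "grip"}, "feature": [0.7, 0.2, 0.1]},
]

def retrieve(query_vec, required_keyword, db):
    """IDs of patents containing the keyword, nearest image feature first."""
    candidates = [p for p in db if required_keyword in p["keywords"]]
    def dist(p):
        return math.sqrt(sum((a - b) ** 2
                             for a, b in zip(query_vec, p["feature"])))
    return [p["id"] for p in sorted(candidates, key=dist)]

print(retrieve([1.0, 0.0, 0.0], "robot", patents))
```

Real systems replace the linear scan with an index over the feature vectors, but the filter-then-rank structure stays the same.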