Figure - uploaded by Shuo Jiang
Source publication
The patent database is often used by designers to search for inspirational stimuli for innovative design opportunities because of the large size, extensive variety, and massive quantity of design information contained in patent documents. Growing work on design-by-analogy has adopted various vectorization approaches for associating design docume...
Contexts in source publication
Context 1
... COM had been successfully applied in several prior studies [80,81]. The relevancy rate of the patents in each domain is reported in Table 1 [79,82]. Note that some irrelevant patents involved in this data set can also be viewed as far-field stimuli for designers in the design-by-analogy process. ...
Context 2
... alternative network parameters in the training might potentially lead to better results. In addition, while very often images can be self-explanatory and specific enough to provide inspiration, sometimes it might be difficult for designers to make sense of an image alone. In such cases, adding related text content to the image might aid the sense making of the image. ...

Table 4  The ID and titles of the patents containing the patent images in Fig. 9

  Image   Patent ID   Title
  Query   US4227853   Manipulator wrist tool interface
  1       US5467889   Nestable elastic fuel tank and method for making same
  2       US4251040   Wind driven apparatus for power generation
  3       US6623235   Robot arm edge gripping device for handling substrates…
  4       US7571526   Multi-blade router tool, edger with multi-blade router tool…
  5       US6837462   Boom load alleviation using visual means
  6       US8844860   Foldable rise and stare vehicle
  7       US6231002   System and method for defending a vehicle
  8       US8777165   Aircraft fuel system
  9       US6116845   Apparatus for supporting a workpiece for transfer

Fig. 10  Top nine most proximate patent images for the given query image in the database as retrieved by the pre-trained ResNet50 model

Citations
... In addition, Zhang and Jin proposed an unsupervised deep learning model, Sketch-pix2seq, to extract shape features from Quickdraw sketches, creating a latent space that enables defining visual similarities and searching for analogical sketches [74]. Jiang et al. developed a CNN-based model to create feature vectors representing patent images, which combine visual and technological information to enhance visual stimuli retrieval [37]. While these models present promising methods to support image-based analogy-driven design, the specialized nature of these models reduces their practical appeal to designers seeking exposure to out-of-distribution inspirations. ...
With recent advancements in the capabilities of Text-to-Image (T2I) AI models, product designers have begun experimenting with them in their work. However, T2I models struggle to interpret abstract language and the current user experience of T2I tools can induce design fixation rather than a more iterative, exploratory process. To address these challenges, we developed Inkspire, a sketch-driven tool that supports designers in prototyping product design concepts with analogical inspirations and a complete sketch-to-design-to-sketch feedback loop. To inform the design of Inkspire, we conducted an exchange session with designers and distilled design goals for improving T2I interactions. In a within-subjects study comparing Inkspire to ControlNet, we found that Inkspire supported designers with more inspiration and exploration of design ideas, and improved aspects of the co-creative process by allowing designers to effectively grasp the current state of the AI to guide it towards novel design intentions.
... Additionally, these approaches predominantly focus on patent textual information, neglecting the technical insights embedded in patent images. Prior studies on patent images have concentrated on image retrieval [22], dataset construction [23], and model optimization [24], largely overlooking the influence of deep semantic information in patent images on infringement analysis. Effectively utilizing semantic knowledge and patent images is a promising approach for addressing cases where textual content is dissimilar but images exhibit high similarity. ...
Patent infringement analysis (PIA) is a critical task in patent circumvention design. It aims to identify the likelihood of infringement for target technologies to enhance product innovation, serving as a crucial measure for technical protection. Traditional PIA processes rely on examiners' subjective judgments of a patent's technical advantages and textual similarity, leaving infringement results heavily dependent on personal experience. Previous similarity calculation models based on keywords or textual content do not exploit the knowledge of functions, structures, and features within patents and their interrelations, resulting in inaccurate assessments of technological infringement for similar patents. Furthermore, the potential of patent images in infringement analysis remains underutilized. To overcome these issues, a patent knowledge graph (PKG) driven patent similarity calculation model fusing graph similarity (GS) and image similarity (IS) is proposed to predict the probability of patent infringement using patent text and structure images as multimodal data. First, an ontology model based on requirement-function-structure-location (RFSL) features is constructed. Four entity types are extracted from patent texts and images using a fine-tuned named entity recognition (NER) model combined with semantic relation analysis. Second, syntactic matching rules for eight types of entity relationships are constructed to extract triples, mapping patent texts into graph networks via the PKG. Finally, a Graph Neural Network (GNN) and a Convolutional Neural Network (CNN) are integrated to calculate the overall similarity between a newly filed patent and a comparison patent and output the infringement probability. A case study of steel pipe welding device design validates the proposed approach, and the comparison results confirm the potential of fusing GS and IS for PIA.
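The final fusion step described in this abstract can be illustrated with a minimal sketch. Note that the weights, steepness, and threshold below are hypothetical placeholders; the paper's actual model learns the GS/IS combination through its integrated GNN and CNN components rather than a fixed formula.

```python
import math

def infringement_probability(graph_sim, image_sim, w_graph=0.6, w_image=0.4,
                             steepness=6.0, threshold=0.5):
    """Illustrative fusion of graph similarity (GS) and image similarity (IS)
    into an infringement probability. All parameters here are hypothetical;
    the paper learns the fusion with GNN/CNN models."""
    fused = w_graph * graph_sim + w_image * image_sim
    # Logistic squashing around a decision threshold, so high combined
    # similarity maps to a probability near 1 and low similarity near 0.
    return 1.0 / (1.0 + math.exp(-steepness * (fused - threshold)))

# A pair with both high graph and high image similarity scores above 0.5.
p_similar = infringement_probability(0.85, 0.9)
p_dissimilar = infringement_probability(0.1, 0.1)
```

This captures the key design idea of the abstract: a patent pair whose text is dissimilar but whose images are highly similar still receives a nonzero fused score, whereas a purely text-based model would miss it.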
... In addition to heuristic reasoning for designing knowledge entities, Chen et al. [33] proposed a method to inspire designers to think (QCue) based on the human reflective process, which mainly promotes thinking by answering users' questions and asking them rhetorical questions. Jiang et al. [34] proposed a convolutional neural network for extracting the feature vectors of patent images for image classification and retrieval. Bai et al. [35] proposed a DMGAN model for transforming sketch features into real image features to retrieve the realistic counterparts of their design concepts. ...
Large language models (LLMs) and Crowd Intelligent Innovation (CII) are reshaping the field of engineering design and becoming a new design context. Generative generic-field design can solve more general design problems innovatively by integrating multi-domain design knowledge. However, there is a lack of knowledge representations and design process models in line with design cognition in the new context, and it is urgent to develop generative generic-field design methods that improve the feasibility, innovation, and empathy of design results. This study proposes a method based on design cognition and knowledge reasoning. First, through problem formulation, a generative generic-field design framework and knowledge base are constructed. Second, a knowledge-based discrete physical structure set generation method and a system architecture generation method are proposed. Finally, the application tool Intelligent Design Assistant (IDA) is developed, verified, and discussed through an engineering design case. The design results and discussion show that the design scheme is feasible and reflects empathy for fuzzy original design requirements. Therefore, the method proposed in this paper is an effective technical scheme for generative generic-field engineering design in line with design cognition in the new context.
... The integration of patents' textual and visual information presents a promising opportunity for the ED process. This integration can significantly contribute to (1) supporting creative thinking and concept generation with design stimuli (Jiang et al., 2021); (2) developing a more systemic understanding of design artifacts (Jiang et al., 2022); and (3) improving the performance of patent retrieval and prior-art mapping tasks, thereby facilitating designers' access to a broader range of existing solutions and mitigating the risk of design fixation (Atherton et al., 2018). In the context of patent documents, many international patent offices provide free access to their patent databases, organizing a huge amount of technical information from around the globe in a single structured database. ...
Images provide concise representations of design artifacts and have emerged as the primary mode of communication among innovators, engineers, and designers. The advance of Artificial Intelligence tools that integrate image and textual information can significantly support the Engineering Design process. In this paper, we create five different datasets combining both images and text of patents, and we develop a set of text-based metrics to assess the quality of text for multimodal applications. Finally, we discuss the challenges arising in the development of multimodal patent datasets.
... • Technical Drawing Classification. Sketch classification methods [57][58][59] have been proposed to recognize sketch images from the Web. However, these datasets contain a limited number of object categories and viewpoints. ...
Recent advances in computer vision (CV) and natural language processing have been driven by exploiting big data on practical applications. However, these research fields are still limited by the sheer volume, versatility, and diversity of the available datasets. CV tasks, such as image captioning, which has primarily been carried out on natural images, still struggle to produce accurate and meaningful captions on sketched images often included in scientific and technical documents. The advancement of other tasks such as 3D reconstruction from 2D images requires larger datasets with multiple viewpoints. We introduce DeepPatent2, a large-scale dataset, providing more than 2.7 million technical drawings with 132,890 object names and 22,394 viewpoints extracted from 14 years of US design patent documents. We demonstrate the usefulness of DeepPatent2 with conceptual captioning. We further provide the potential usefulness of our dataset to facilitate other research areas such as 3D image reconstruction and image retrieval.
... With the development of deep learning, some researchers have focused on deep learning to extract image features and improve the performance of retrieval functions. For instance, Jiang [9] extracted image features of design patents from a massive patent image collection by training a convolutional neural network to optimize the performance of the retrieval system. Besides, automatic vectorization using a novel convolutional neural network architecture, dual visual geometry group (VGG), was proposed by Lu [15]. ...
In the process of patent retrieval, the traditional content-based single-image retrieval method suffers from two main problems: (a) semantic deviation caused by text descriptions, and (b) cases where individual pixels in an image are highly similar but the whole is inconsistent. The resulting low accuracy leads to unsatisfactory retrieval results, making it difficult to obtain product design information in a timely and effective manner and reducing design efficiency. How to obtain data quickly and accurately has therefore become a challenging problem. In this paper, by analyzing the problems of the Locarno classification method in combination with the characteristics of patent images, a new improved method is proposed. First, the structural features of product parts are extracted through segmentation. Subsequently, these features are combined with multi-view image fusion to jointly determine the spatial shape of product parts. Finally, the spatial shape of key structures is confirmed to refine the search range and improve search accuracy. The feasibility and effectiveness of the proposed method are verified using a shower appearance patent as an example.
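The retrieval approaches surveyed above (CNN feature vectors, pre-trained ResNet50 embeddings, dual-VGG vectorization) share a common final step: ranking database images by vector similarity to a query. A minimal sketch of that step, assuming feature vectors have already been extracted by some model (the toy 3-D vectors and the patent IDs' pairing with them below are placeholders, not values from any paper):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, database, k=9):
    """Return the k patent IDs whose feature vectors are most
    similar to the query, ranked by cosine similarity."""
    scored = sorted(database.items(),
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [pid for pid, _ in scored[:k]]

# Toy 3-D "feature vectors" standing in for real CNN embeddings.
db = {
    "US5467889": [0.9, 0.1, 0.0],
    "US4251040": [0.1, 0.9, 0.2],
    "US6623235": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]
nearest = top_k(query, db, k=2)  # → ['US5467889', 'US6623235']
```

With real embeddings the database would hold thousands of vectors, and the same ranking logic would typically run on an approximate-nearest-neighbor index rather than a full sort.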
... Zhang and Jin (2021) used an unsupervised deep-learning model to construct a latent space for a dataset of sketches. Jiang et al. (2021) constructed a convolutional neural network-based model to derive a vector space where feature vectors embed visual and technology-related information from patent images. Kim and Maher (2023) developed a co-creative artificial intelligence (AI) partner that provides inspirational sketches related by visual and conceptual similarity to designers' sketches. ...
External sources of inspiration can promote the discovery of new ideas as designers ideate on a design task. Data-driven techniques can increasingly enable the retrieval of inspirational stimuli based on nontext-based representations, beyond semantic features of stimuli. However, there is a lack of fundamental understanding regarding how humans evaluate similarity between non-semantic design stimuli (e.g., visual). Toward this aim, this work examines human-evaluated and computationally derived representations of visual and functional similarities of 3D-model parts. A study was conducted where participants (n=36) assessed triplet ratings of parts and categorized these parts into groups. Similarity is defined by distances within embedding spaces constructed using triplet ratings and deep-learning methods, representing human and computational representations. Distances between stimuli that are grouped together (or not) are determined to understand how various methods and criteria used to define non-text-based similarity align with perceptions of 'near' and 'far'. Distinct boundaries in computed distances separating stimuli that are 'too far' were observed, which include farther stimuli when modeling visual vs. functional attributes.
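One way to quantify how well a computed embedding matches the human triplet ratings described above is the fraction of triplets the embedding reproduces. The function and the 2-D part embedding below are hypothetical illustrations, not the study's actual data or method:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_agreement(embedding, triplets):
    """Fraction of human triplet judgments (anchor, closer, farther)
    that an embedding space reproduces: the 'closer' item should sit
    nearer the anchor than the 'farther' item."""
    hits = 0
    for anchor, closer, farther in triplets:
        if euclidean(embedding[anchor], embedding[closer]) < \
           euclidean(embedding[anchor], embedding[farther]):
            hits += 1
    return hits / len(triplets)

# Hypothetical 2-D embedding of four 3D-model parts.
emb = {"gear": (0.0, 0.0), "cog": (0.1, 0.1),
       "wheel": (0.5, 0.2), "beam": (3.0, 3.0)}
ratings = [("gear", "cog", "beam"),
           ("gear", "wheel", "beam"),
           ("cog", "gear", "wheel")]
score = triplet_agreement(emb, ratings)  # → 1.0
```

An agreement near 1.0 indicates the embedding's distances align with human judgments of 'near' and 'far'; comparing scores across visual versus functional embeddings mirrors the kind of analysis the study performs.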
... Furthermore, visual representations would extend the model to account for wordform similarity in logographic languages such as Mandarin. Techniques such as singular value decomposition and artificial neural networks can be employed to extract features from images (Dharmaretnam & Fyshe, 2018; Jiang et al., 2021; Tseng & Hsieh, 2019; Vokey & Jamieson, 2014; Vokey et al., 2018; Wang et al., 2013), and these features can then be used to represent images in vector format. Therefore, techniques are available to represent both phonological and visual information. ...
Recent research on item-method directed forgetting demonstrates that forget instructions not only decrease recognition for targets, but also decrease false recognition for foils from the same semantic categories as targets instructed to be forgotten. According to the selective rehearsal account of directed forgetting, this finding suggests that remember instructions may engage elaborative rehearsal of the category-level information of items. In contrast to this explanation, Reid and Jamieson (Canadian Journal of Experimental Psychology / Revue canadienne de psychologie expérimentale, 76(2), 75–86, 2022) proposed that the differential rates of false recognition may emerge at retrieval when foils from “remember” and “forget” categories are compared to traces in memory. Using MINERVA S, an instance model of memory based on MINERVA 2 that incorporates structured semantic representations, Reid and Jamieson successfully simulated lower false recognition for foils from “forget” categories without assuming rehearsal of category-level information. In this study, we extend the directed forgetting paradigm to categories consisting of orthographically related nonwords. Presumably participants would have difficulty rehearsing category-level information for these items because they would have no pre-experimental knowledge of these categories. To simulate the findings in MINERVA S, we imported structured orthographic representations rather than semantic representations. The model not only predicted differential rates of false recognition for foils from “remember” and “forget” categories, but also predicted higher rates of false recognition overall than what was observed for semantic categories. The empirical data closely matched these predictions. These data suggest that differential rates of false recognition due to remember and forget instructions emerge at retrieval when participants compare recognition probes to traces stored in memory.
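The MINERVA 2 machinery underlying MINERVA S can be sketched compactly: a probe's similarity to each stored trace is a normalized dot product, each trace's activation is its similarity cubed (preserving sign), and echo intensity sums the activations. The four-feature vectors below are toy stand-ins for the structured orthographic representations imported in the study:

```python
def similarity(probe, trace):
    """MINERVA 2 similarity: dot product normalized by the number of
    features that are nonzero in the probe or the trace."""
    dot = sum(p * t for p, t in zip(probe, trace))
    n = sum(1 for p, t in zip(probe, trace) if p != 0 or t != 0)
    return dot / n if n else 0.0

def echo_intensity(probe, memory):
    """Activation is similarity cubed (sign-preserving); echo intensity
    is summed activation across all stored traces. Higher intensity
    supports an 'old' recognition decision."""
    return sum(similarity(probe, trace) ** 3 for trace in memory)

# Toy traces for two studied items, with features coded as -1, 0, +1.
memory = [[1, -1, 0, 1],
          [1, 1, -1, 0]]
old_probe = [1, -1, 0, 1]   # matches a stored trace exactly
new_probe = [-1, 1, 1, -1]  # unrelated foil
```

Because foils from "remember" categories overlap more with stored traces than foils from "forget" categories, they yield higher echo intensities, which is how the model produces differential false recognition at retrieval without any rehearsal assumption.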
... Employing verbal, textual, and visual sources of inspiration might help to get beyond this conceptual production roadblock [7][8][9]. Recently, AI-powered software has also been used to create inspiration for designers. For example, Wang et al. [10] created a data-dependent idea network employing resources from the web and scientific publications. ...
... Creative ideas and concepts produced during the conceptual design phase contribute significantly to the final product's success [31,32]. Although conceptual design is a cognitive process, verbal, written, and visual sources of inspiration are generally used to produce creative ideas and concepts [7][8][9]. One such source is Arthur Koestler's "bisociation" model, which creates inspiration in the designer's mind by bringing together different concepts [33]. ...
New, creative, and innovative ideas generated in the early stages of the design process are very important for developing better and original products. Human designers may become overly attached to certain design ideas, which hinders the thinking process toward generating new concepts and can prevent the creation of ideal designs. Finding original design ideas requires a creative mind, knowledge, experience, and talent. In addition, verbal, written, and visual sources of inspiration can be helpful for generating ideas and concepts. In this study, a visual integration model has been created using a data-supported Artificial Intelligence (AI) method to generate creative design ideas. A generative adversarial network (GAN) model is proposed that produces new creative product images inspired by nature by combining images of a target object and a biological object. The model has been successfully applied to an aircraft design problem, tested, and evaluated. The sketches obtained with the generative design model can inspire the designer to find new and creative design ideas and variants. This approach can increase the quality of the ideas produced and make the idea and concept production process easy, simple, and quick.
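The combination of target-object and biological-object imagery happens inside the GAN's learned latent space. As a loose illustration only (the blending weight and the feature vectors below are hypothetical, and a real GAN learns this mapping adversarially rather than by linear interpolation), the idea of mixing the two sources can be sketched as:

```python
def blend_features(target_vec, bio_vec, alpha=0.7):
    """Convex blend of two feature vectors: alpha weights the target
    object, (1 - alpha) the biological inspiration source. A stand-in
    for the GAN's learned combination, not the paper's method."""
    return [alpha * t + (1 - alpha) * b for t, b in zip(target_vec, bio_vec)]

# Hypothetical 3-D feature vectors for an aircraft and a bird.
aircraft = [0.8, 0.2, 0.0]
bird = [0.0, 0.6, 1.0]
hybrid = blend_features(aircraft, bird)
```

Varying alpha trades off fidelity to the target product against the strength of the biological inspiration, which is the knob such a generative model effectively gives the designer.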