Failed and Successful Cases of Table Creation Using GPT-4. The table consists of five parts: four legs and a tabletop. Although GPT-4 composes the table correctly, without human intervention it outputs a floating tabletop.

Source publication
Preprint
The advancement of Large Language Models (LLMs), including GPT-4, provides exciting new opportunities for generative design. We investigate the application of this tool across the entire design and manufacturing workflow. Specifically, we scrutinize the utility of LLMs in tasks such as: converting a text-based prompt into a design specification, tr...

Contexts in source publication

Context 1
... visualizing the 3D table, however, the relative positioning of each pair of boxes was not always accurate. We noticed that the tabletop appeared to be suspended in the air, not in contact with the legs, as shown in Figure 4. This difficulty, also observed in our 2D tests (Figure 3), pertains to GPT-4's understanding of mathematical concepts. ...
Context 2
... indicated the necessary translations for the misplaced boxes, acknowledging that it would take several prompts to rectify the issue otherwise. After correcting the floating tabletop, the table appeared as intended, as demonstrated in Figure 4. Therefore, to create a table, it only required two prompts, significantly streamlining the procedure for generating a basic table. ...
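The floating-tabletop failure and its correction by translation can be illustrated with a minimal sketch. It reuses the box(x,y,z,w,h,d) signature quoted in Context 4 (center and size); everything else is an assumption made for illustration: the y axis is taken as vertical, the dimensions are arbitrary, and the names Box, make_table, vertical_gap and snap_tabletop are hypothetical and do not come from the source publication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    # Center of the box.
    x: float
    y: float
    z: float
    # Size along x, y, z (assumption: y is the vertical axis).
    w: float
    h: float
    d: float

    @property
    def bottom(self) -> float:
        return self.y - self.h / 2

    @property
    def top(self) -> float:
        return self.y + self.h / 2


def box(x, y, z, w, h, d) -> Box:
    """Create a box whose center is (x, y, z) and size is (w, h, d)."""
    return Box(x, y, z, w, h, d)


def make_table(top_w=1.2, top_d=0.8, top_h=0.05, leg_h=0.7, leg_t=0.05):
    """Return (legs, tabletop): four legs standing on y = 0 and a tabletop resting on them."""
    legs = [
        box(sx * (top_w / 2 - leg_t / 2), leg_h / 2,
            sz * (top_d / 2 - leg_t / 2), leg_t, leg_h, leg_t)
        for sx in (-1, 1)
        for sz in (-1, 1)
    ]
    tabletop = box(0.0, leg_h + top_h / 2, 0.0, top_w, top_h, top_d)
    return legs, tabletop


def vertical_gap(legs, tabletop) -> float:
    """Distance between the tabletop's underside and the highest leg top (0 means contact)."""
    return tabletop.bottom - max(leg.top for leg in legs)


def snap_tabletop(legs, tabletop) -> Box:
    """Translate the tabletop vertically so that it rests on the legs."""
    gap = vertical_gap(legs, tabletop)
    return box(tabletop.x, tabletop.y - gap, tabletop.z,
               tabletop.w, tabletop.h, tabletop.d)
```

A check such as vertical_gap makes the "floating" condition measurable, so the required translation can be computed directly rather than found through repeated prompting.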
Context 3
... choice provides compact and comprehensive representations, capturing intricate details accurately. Leveraging the Python library trimesh, we effectively manage and process the shape data extracted from the natural language input (Figure 40). ...
Context 4
... experiment focused on assembling a wooden box using a specific set of tools and materials. In Figure 44, we presented the prompt for generating machine-readable instructions, which involved creating a set of functions to ...
[Figure text fused into the excerpt: Task: generate Python code based on the following rules. Input: a few functions of box(x,y,z,w,h,d) which creates a box whose center is (x,y,z) and size is (w,h,d) and their parameters. ...]
Context 5
... were tested using different input styles (e.g., method of design description) and requested output forms (e.g., direct classification or function creation). Demonstrative examples are shown in Figure 45. We did not test all combinations of design style and output requests but focused on key comparisons and types. ...
Context 6
... began with a simple input design in text form (DS1) and a request for direct evaluation in calculated form (RF1) with an additional binary output asking whether a chair of a given design could support a given load. The specific prompt is included in Figure 46. GPT-4 immediately demonstrated the capacity to handle ambiguity well, assuming a type of wood (oak) and producing numerical material properties for that material when both were unspecified. ...
Context 7
... evaluated the failure point by comparing the yield stress to compressive stress, computed as one quarter of the applied load over the cross-section of a chair leg. This is included in the chat snippet shown in Figure 46. However, in text form it outputted 94,692.2 ...
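The check described in this excerpt, compressive stress taken as one quarter of the applied load over the cross-section of one leg and compared with the yield stress, can be sketched as follows. This is not the code GPT-4 produced; the function name, the SI units and the numeric values in the usage note are placeholders chosen for illustration.

```python
def leg_compressive_check(applied_load_n, leg_cross_sectional_area_m2, yield_stress_pa):
    """Return (stress_pa, supports_load) for a four-legged chair.

    Assumes the load is shared equally by the four legs and considers pure
    compression only (no buckling, bending, or joint failure).
    """
    if leg_cross_sectional_area_m2 <= 0:
        raise ValueError("leg cross-sectional area must be positive")
    load_per_leg_n = applied_load_n / 4.0
    stress_pa = load_per_leg_n / leg_cross_sectional_area_m2
    return stress_pa, stress_pa < yield_stress_pa
```

For example, leg_compressive_check(1000.0, 0.04 * 0.04, 40e6) evaluates a 1000 N load on legs with a 4 cm square section against a placeholder yield stress of 40 MPa. A real assessment would need measured material properties and the additional failure modes mentioned in the later contexts.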
Context 8
... was able to readily add multiple types of failure without error, also incorporating bending failure of the seat, and excessive stress on the back using simple beam bending and structural mechanics equations. This multi-part failure assessment is included in Figure 46. It further automatically generated a function that could intake a parametric chair design with sensible feature parameters like leg_cross_sectional_area, seat_thickness and seat_material_bending_strength, allowing versatile use of this evaluation. ...
Context 9
... we requested GPT-4 to create a function to compute the spoon's breaking strength. Since it had been inadvertently primed by the long preceding discussion of spoon geometry, it proposed a strength evaluation using the basic heuristic of whether the spoon is within a standard size range (Figure 47). GPT-4 had to be prompted specifically for a yield analysis before offering a mechanics-based equation. ...
Context 10
... such, we propose the workflow for rigorous performance evaluation using GPT-4 to begin with a text-based discussion of the design (DS1 or DS2 with RF1) to understand the relevant features, with no other preceding text in that chat, followed by the development of equations with enough sophistication for the use case, presented in the form of functions for rapid assessment of an input design (RF2). This workflow is depicted in Figure 48, along with additional steps to ideally validate the final result. ...
Context 11
... internal knowledge, it pattern-matches and does not reason at a level to generate the most correct or sophisticated analysis, and will tend to generate more simple rather than more complex equation-based analysis unless specifically walked through refining the code. However, it is capable of more sophisticated text-based discussion, which is why we have found that beginning with text and proceeding ...
[Fig. 47. Chat History Errors and Correction for Spoon Mechanics. When analyzing mechanics of a spoon after discussing dimensions in the preceding chat, GPT-4 generated a poor heuristic for spoon breaking from geometry alone; with very specific correction in the same chat, it recovered.]
Context 12
[Fig. 47 caption, continued: ... analyzing mechanics of a spoon after discussing dimensions in the preceding chat, GPT-4 generated a poor heuristic for spoon breaking from geometry alone; with very specific correction in the same chat, it recovered.]
... why we have found that beginning with text and proceeding to functions provides a more effective workflow, as in Figure 48. ...
Context 13
... next explored the assessment of a dynamic electronic device, a quadcopter, as an example of using the workflow of Figure 48. GPT-4 was provided with specifications for the quadcopter that included battery voltage, battery capacity, total weight, and the dimensions of the copter (DS1). ...
Context 14
... this evaluation, GPT-4 did not initially include the constraint that the voltage of the controller needed to stay constant, even though this would be obvious to someone familiar with the domain. This means that seemingly "obvious" considerations need to be explicitly included in the prompt in order for a feasible output ...
[Fig. 48. Suggested Performance Workflow. Performance analysis proceeds smoothly when GPT-4 first discusses the design and tradeoffs in text form, then creates methods to assess performance, before applying them to the design in question, with iterations within and between sections as ...]
Context 15
... primary focus was determining the likelihood of a chair breaking when subjected to external forces. Figure 49 lists the response and final code generated by GPT-4. With the application of FEM through the external library FEniCS, GPT-4 evaluates the von Mises stress, a crucial parameter in material failure prediction. ...
Context 16
... stress distribution visualization in Figure 49 is performed on the chair previously designed by GPT-4 in Figure 9 and is the output of GPT-4's code rendered in Paraview (which GPT-4 also gives assistance to use), as well as on a chair mesh found from other sources. The result reveals a susceptibility to high stress at the back attachment section of the chair design proposed by GPT-4, as seen in Figure 9. ...
Context 17
... a follow-up experiment (Fig. 54), GPT-4 is asked to perform the same optimization task via Python code, which enables it to use an external library. It chooses L-BFGS-B, which is a reasonable, standard, and easily accessible (though not state-of-the-art) solver for continuous valued problems. It does not, however, provide gradients that can expedite the computation ...
Context 18
... using the L-BFGS-B method with the scipy.optimize.minimize function, as in the provided Python code, would be a good approach for finding the minimum of the function within the given bounds.
[Fig. 54. GPT-4 Informs User on Choice of Optimization Method. When prompted to choose an alternative optimization method in Python to the Wolfram Alpha plugin, GPT-4 provides a script and a reasonable explanation for the chosen ...]
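As a rough illustration of the approach discussed in Contexts 17 and 18, the sketch below calls scipy.optimize.minimize with method="L-BFGS-B" on a bounded problem. The objective, its bounds and the starting point are hypothetical stand-ins, since the function optimized in the source publication is not shown in these excerpts. The optional jac argument shows how an analytic gradient could be supplied; when it is omitted, SciPy falls back to finite-difference estimates.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical objective: a shifted quadratic bowl. Its unconstrained minimum
# (1, -2) lies outside the bounds below, so the solver must stop at the boundary.
def objective(v):
    x, y = v
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


def gradient(v):
    x, y = v
    return np.array([2.0 * (x - 1.0), 2.0 * (y + 2.0)])


# Box constraints, one (lower, upper) pair per variable (illustrative values).
BOUNDS = [(0.0, 3.0), (-1.0, 3.0)]


def solve(x0=(2.5, 2.5), use_gradient=True):
    """Minimize the objective within BOUNDS using L-BFGS-B."""
    return minimize(
        objective,
        np.asarray(x0, dtype=float),
        method="L-BFGS-B",
        jac=gradient if use_gradient else None,  # None -> finite differences
        bounds=BOUNDS,
    )
```

With these illustrative bounds, the constrained minimizer is x = 1, y = -1. Calling solve(use_gradient=False) solves the same problem without the analytic gradient, which typically costs additional function evaluations.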
Context 19
... GPT-4 must be prompted again to enforce the bounds consistently throughout the algorithm (specifically, in the crossover and selection operators). Results can be found in Figure 64. Through this example, we conclude that GPT-4 has the potential to aid users in both a) understanding the trade-offs involved in different candidate designs, and b) providing pointers to a reasonable algorithm that can help navigate that space. ...
Context 20
... was successfully accomplished using GPT-4. The detailed process is elaborated in Section 6.3, and the selected parts are shown in Figure 74 (left). ...
Context 21
... resulting fabricated copter frame not only meets the required dimensions, but also balances the strength and weight, necessary for optimum flight performance. We visualize the printed frame in Figure 74 (middle). ...
Context 22
... four motors are attached using screws. All elements are affixed firmly and stably, resulting in a sturdy copter ready for flight, as shown in Figure 74 (right). ...
Context 23
... Whereas the other dimensions of the solid are defined by the sketch primitives. End of the example. Additional Constraints Use exposed design variables whenever you can, and as few as possible. Write code in syntactically correct python, knowing that you have the functions createSketch, circle, rectangle, cap and extrude and the default planes.
[Fig. 84. A Sketch-Based CAD DSL Prompt with a Global Coordinate System. Our prompt used for the sketch-based CAD experiments with a global coordinate ...]

Similar publications

Preprint
Large language models (LLMs) have revolutionized natural language processing with their exceptional capabilities. However, deploying LLMs on resource-constrained edge devices presents significant challenges due to computational limitations, memory constraints, and edge hardware heterogeneity. This survey summarizes recent developments in edge LLMs...

Citations

... The advent of generative artificial intelligence (GenAI) has ushered in a wave of innovation in the field of additive manufacturing (AM), where the boundaries between the physical and digital worlds blur, giving rise to what has been termed Industry 5.0 [1][2][3]. At the forefront of this transformation are Large Language Models (LLMs), major advances in natural language processing that have ignited a paradigm shift in how we approach design and manufacturing, particularly in engineering, biomedicine, and biotechnological applications [4][5][6]. In recent years, the development and proliferation of LLMs, such as GPT-4, Gemini, Llama, and Microsoft Co-Pilot, have unleashed unprecedented capabilities for understanding and generating human-like text and images [7][8][9][10]. ...
... GPT-4 has demonstrated its ability to formulate design spaces, set objectives, and define constraints. It can also select suitable search algorithms for given problems, highlighting its utility as a foundational component in creating inverse design systems (Makatura et al., 2023). ...
Book
In “AI-Driven Architecture: Pioneering the Digital Frontier”, explore how artificial intelligence is reshaping architecture and urban design, pushing the limits of what’s possible. From AI-supported design processes and evolving transparency in algorithms to policy impacts on smart transportation and new paradigms in architectural education, this book delves into the intersection of technology and creativity. Discover how AI is integrated into design processes, drives innovations in conceptual design, and even re-imagines the concept of infinity in architecture. Featuring case studies, thought-provoking insights, and practical examples, this book offers a comprehensive guide to the transformative power of AI in shaping the built environment.
... With the rise of groundbreaking tools such as ChatGPT, generative artificial intelligence (AI) has attracted significant attention, revolutionizing how AI is applied in various fields, including engineering design [1], [2]. This trend has seen the adoption of AI techniques in numerous design activities such as topology optimization, material design, design synthesis, and product design [3], [4], [5]. ...
... Current literature throws up ideas on utilizing LLMs, e.g., [31], or VFMs, e.g., [28][29][30][32], in the industrial domain; little is known about how to enable VFMs to perform effectively in specific use cases. Besides having suitable datasets, training with the data demands specific strategies. ...
Preprint
In recent years, the upstream of Large Language Models (LLMs) has also encouraged the computer vision community to work on substantial multimodal datasets and train models at scale in a self-/semi-supervised manner, resulting in Vision Foundation Models (VFMs) such as Contrastive Language-Image Pre-training (CLIP). The models generalize well and perform outstandingly on everyday objects or scenes, even on downstream tasks, i.e., tasks the model has not been trained on, while their application in specialized domains, such as an industrial context, is still an open research question. Here, fine-tuning the models or transfer learning on domain-specific data is unavoidable when aiming for adequate performance. In this work, we, on the one hand, introduce a pipeline to generate the Industrial Language-Image Dataset (ILID) based on web-crawled data; on the other hand, we demonstrate effective self-supervised transfer learning and discuss downstream tasks after training on the cheaply acquired ILID, which does not necessitate human labeling or intervention. With the proposed approach, we contribute by transferring approaches from state-of-the-art research around foundation models, transfer learning strategies, and applications to the industrial domain.
... With the development of generative AI, in particular large language models (LLMs), design has shifted in many ways. Broadly, the potential of LLMs in design lies in enabling five key tasks: converting text prompts to design specifications, converting designs to manufacturing instructions, creating design spaces and variations, calculating performance, and exploring performance-based design solutions [14]. Additionally, in the early design phases, a student designer in need of better representations and appreciations of an imagined product can choose to visualise concepts in high fidelity without the need for prototyping or expertise in visualisation techniques, such as rendering, and to vary the aesthetics of those concept designs through chatbot-type interfaces. ...
Article
Despite the power of large language models (LLMs) in various cross-modal generation tasks, their ability to generate 3D computer-aided design (CAD) models from text remains underexplored due to the scarcity of suitable datasets. Additionally, there is a lack of multimodal CAD datasets that include both reconstruction parameters and text descriptions, which are essential for the quantitative evaluation of the CAD generation capabilities of multimodal LLMs. To address these challenges, we developed a dataset of CAD models, sketches, and image data for representative mechanical components such as gears, shafts, and springs, along with natural-language descriptions collected via Amazon Mechanical Turk. Using CAD programs as a bridge, we facilitate the conversion of textual output from LLMs into precise 3D CAD designs. To enhance the text-to-CAD generation capabilities of GPT models and demonstrate the utility of our dataset, we developed a pipeline to generate fine-tuning training data for GPT-3.5. We fine-tuned four GPT-3.5 models with various data sampling strategies based on the length of a CAD program. We evaluated these models using parsing rate and Intersection over Union (IoU) metrics, comparing their performance to that of GPT-4 without fine-tuning. The new knowledge gained from the comparative study on the four different fine-tuned models provided us with guidance on the selection of sampling strategies to build training datasets in fine-tuning practices of LLMs for text-to-CAD generation, considering the trade-off between part complexity, model performance, and cost.
Article
Conceptual design is an essential stage in the design process, and its ultimate success largely depends on designers’ creativity. Both physical and digital prototypes are commonly adopted by designers to support ideation and creativity, providing intuitive perception and rapid iteration, respectively. In recent advancements, large-scale generation models are able to offer data-enabled creativity support by generating high-quality solutions comparable to human designers. This opens up an imaginary space for designers and brings new possibilities for design tools. In this study, we proposed a hybrid prototype method that synergistically combines physical models and generative artificial intelligence (AI) in the conceptual design stage. Correspondingly, we developed a hybrid prototype system to implement the proposed method. We conducted a comparative user study with 45 designers who completed a design task using the physical prototype method, standalone generative AI, and the hybrid prototype method, respectively. Our results verified the effectiveness of the hybrid prototype method and investigated its mechanism in supporting creativity. Finally, we discussed the application value and optimisation space of the hybrid prototype method.