Wojciech Matusik’s research while affiliated with Massachusetts Institute of Technology and other places


Publications (400)


Modular Self-Reconfigurable Continuum Robot for General Purpose Loco-Manipulation
  • Article

February 2025

·

39 Reads

IEEE Robotics and Automation Letters

·

Haokai Xu

·

·

[...]

·

Yue Chen

Modular Self-Reconfigurable Robots offer exceptional adaptability and versatility through reconfiguration, but traditional rigid robot designs lack the compliance necessary for effective interaction with complex environments. Recent advancements in modular soft robots address this shortcoming with enhanced flexibility; however, their designs lack the capability of active self-reconfiguration and heavily rely on manual assembly. In this paper, we present a modular self-reconfigurable soft continuum robotic system featuring a continuum backbone and an omnidirectional docking mechanism. This design enables each module to independently perform loco-manipulation and self-reconfiguration. We then propose a kinetostatic model and conduct a geometrical docking range analysis to characterize the robot's performance. The reconfiguration process and the distinct motion gait for each configuration are also developed, including rolling, crawling, and snake-like undulation. Experimental demonstrations show that both single and multiple connected modules can achieve successful loco-manipulation, adapting effectively to various environments. The design is open sourced at https://missinglight.github.io/assets/project/msrcr.html .


A micro-ChromaDot array with AI integration for the detection of multiple biomarkers in a small portable device

January 2025

·

7 Reads

With the rapid growth of digital healthcare, diagnosis, prognosis, and monitoring of chronic and acute diseases at home are increasingly in demand. In light of the intricate biological processes underlying many diseases, a single biomarker is often insufficient for disease diagnosis, whereas multiplexed measurements of biomarkers remain a significant challenge at point-of-care (POC). Herein, we introduce a micro-ChromaDot array (McDa) designed to qualify a panel of biomarkers at POC in a one-biomarker-one-dot fashion. The McDa colorimetric signals were engraved from a 3D microneedle array with chem-spatial signal amplification cascade, which allowed seamless and precise data acquisition and analysis by smartphone. This home-use and cost-effective platform achieved similar specificity and superior sensitivity to conventional enzyme-linked immunoassay (ELISA), a clinical assay that is time-consuming and typically requires a fully equipped laboratory and trained personnel. Five biomarkers of clinical sera, collected from a cohort of systemic lupus erythematosus (SLE) patients (n = 42) and healthy controls (n = 34), were measured by McDa and modeled with deep learning. By leveraging the support vector machines (SVM) model combined with 5-fold cross-validation, we demonstrated that a selected panel of three biomarkers effectively discriminated SLE patients from healthy controls with a remarkable 97% specificity, representing an impressive 70% improvement over the clinical standard of the single biomarker antinuclear antibodies (ANA) test, while sustaining the same 93% sensitivity. This platform can easily adapt to assess other biomarkers with a mere drop of blood or any other body fluids and holds great promise to transform POC.
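The reported rates can be related to the cohort sizes with a little arithmetic. The sketch below is illustrative only: it assumes that 39 of the 42 SLE patients and 33 of the 34 healthy controls were classified correctly. These counts are not stated in the abstract; they are chosen because they round to the reported 93% sensitivity and 97% specificity.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual positives (patients) that are detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of actual negatives (healthy controls) that are cleared."""
    return true_neg / (true_neg + false_pos)


# Cohort sizes taken from the abstract.
N_SLE = 42
N_HEALTHY = 34

# ASSUMED counts of correct classifications (not given in the abstract).
ASSUMED_TRUE_POS = 39
ASSUMED_TRUE_NEG = 33

sens = sensitivity(ASSUMED_TRUE_POS, N_SLE - ASSUMED_TRUE_POS)
spec = specificity(ASSUMED_TRUE_NEG, N_HEALTHY - ASSUMED_TRUE_NEG)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

With cohorts this small, a single reclassified sample moves sensitivity by about 2.4 percentage points and specificity by about 2.9, which is worth keeping in mind when reading the headline figures.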


VLMaterial: Procedural Material Generation with Large Vision-Language Models

January 2025

·

2 Reads

Procedural materials, represented as functional node graphs, are ubiquitous in computer graphics for photorealistic material appearance design. They allow users to perform intuitive and precise editing to achieve desired visual appearances. However, creating a procedural material given an input image requires professional knowledge and significant effort. In this work, we leverage the ability to convert procedural materials into standard Python programs and fine-tune a large pre-trained vision-language model (VLM) to generate such programs from input images. To enable effective fine-tuning, we also contribute an open-source procedural material dataset and propose to perform program-level augmentation by prompting another pre-trained large language model (LLM). Through extensive evaluation, we show that our method outperforms previous methods on both synthetic and real-world examples.


FIG. 2. Physical explanation of the scaling for intrinsic fracture energy of stretchable networks. (a) A case study simulates a notched triangular network of strands with linear mechanics (K_2/K_1 ≈ 1) and loads from the undeformed state (i) until bridging strand fracture and (ii) then quasistatically reduces artificial forces on the ends of the broken strand (iii) until the network reaches equilibrium (iv). (b) The integration of the tracked strand loading force (red) and nonlocal energy release (blue) as a function of length between strand ends explains the critical energy release rate G_c to break the bridging strand. This quantity aligns with the measured Γ_0/M and scales with f_f L_f in elastic networks. (c) A second case study repeats the procedure for a network of strands with high nonlinearity (K_2/K_1 ≈ 10^4) but the same f_f and L_f. (d) While the single-strand energy (U_strand, red) is much smaller than f_f L_f, the total integration of the single strand and nonlocal contributions counterbalance and scale with f_f L_f. (e) Simulation results depict extension from triangular (n_loop = 3) to square (n_loop = 4) and hexagonal (n_loop = 6) lattices for strands with high nonlinearity (K_2/K_1 ≈ 10^4). The measured α parameter scales with the relaxation length or difference between (n_loop − 1)L_f and L_f [i.e., α ∼ (n_loop − 2)]. All results are numerically derived from simulations. (f) Measured Γ_0/M normalized by f_f L_f, which gives α, is plotted against K_2/K_1 across lattices.
Scaling Law for Intrinsic Fracture Energy of Diverse Stretchable Networks
  • Article
  • Full-text available

January 2025

·

153 Reads

·

1 Citation

Physical Review X

Networks of interconnected materials permeate throughout nature, biology, and technology due to exceptional mechanical performance. Despite the importance of failure resistance in network design and utility, no existing physical model effectively links strand mechanics and connectivity to predict bulk fracture. Here, we reveal a scaling law that bridges these levels to predict the intrinsic fracture energy of diverse stretchable networks. Simulations and experiments demonstrate its remarkable applicability to a breadth of strand constitutive behaviors, topologies, dimensionalities, and length scales. We show that local strand rupture and nonlocal energy release contribute synergistically to the measured intrinsic fracture energy in networks. These effects coordinate such that the intrinsic fracture energy scales independent of the energy to rupture a strand; it instead depends on the strand rupture force, breaking length, and connectivity. Our scaling law establishes a physical basis for fracture of homogeneous networks with uniform strand mechanics and lattice connectivity throughout. The scaling also extends generally for fabricating tough materials from homogeneous networks across multiple length scales. Published by the American Physical Society 2025
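The FIG. 2 caption for this article states that the measured Γ_0/M normalized by f_f L_f gives a prefactor α, and that for highly nonlinear strands α scales with n_loop − 2. A minimal sketch of that bookkeeping follows. It assumes a proportionality constant of exactly 1 (the caption gives only the scaling, not the constant), and the function names are hypothetical.

```python
def relaxation_length(n_loop: int, l_f: float) -> float:
    """Difference between (n_loop - 1) * L_f and L_f, as described in the caption."""
    return (n_loop - 1) * l_f - l_f


def alpha_scaling(n_loop: int) -> float:
    """Relaxation length in units of L_f, i.e. n_loop - 2 (scaling only)."""
    return relaxation_length(n_loop, 1.0)


def fracture_energy_estimate(n_loop: int, f_f: float, l_f: float, c: float = 1.0) -> float:
    """Estimate of Gamma_0 / M as c * alpha * f_f * L_f.

    c is an ASSUMED proportionality constant; the source reports a scaling
    relation, not an equality.
    """
    return c * alpha_scaling(n_loop) * f_f * l_f


# Lattices named in the caption: triangular, square, hexagonal.
for name, n_loop in [("triangular", 3), ("square", 4), ("hexagonal", 6)]:
    print(f"{name:10s} n_loop = {n_loop}  alpha ~ {alpha_scaling(n_loop):.0f}")
```

Note that the strand rupture energy does not appear anywhere in the estimate, which matches the abstract's point that the intrinsic fracture energy depends on rupture force, breaking length, and connectivity instead.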


Figure 3: Typical workflow using WiReSens Toolkit to develop tactile sensing applications. Users (A) set parameters via a JSON file, (B) flash a base firmware program to the tactile sensing device using the Arduino library to specify what on-device methods should run once configured, and (C) use the Python library to run various methods on the sensing devices and the receiver according to the JSON configuration.
Figure 4: (A) Hardware open-sourced by WiReSens Toolkit including an ESP32 microcontroller, zero potential readout circuit, and adaptive module and (B) Schematic of general zero potential readout circuit (left) with additional opamp and digital potentiometer for adaptivity (red).
Figure 7: Characterization of digital potentiometer and calibration. (A) Average and standard deviation of ADC readout in the region of applied force (N) for different digital potentiometer resistance values R_pot, in Ω. (B) Average ADC readout in the region of applied force before and after calibration during low and high-pressure application cycles. Low-pressure calibration maximizes pressure resolution (blue and yellow curves) and high-pressure calibration avoids saturation (red and brown curves).
Figure 8: Characterization of intermittent sending performance. (A) Simulated and observed average current draw (mA) from a tactile sensing device as a function of the percentage of data transmitted during BLE operations. (B) Visualization of objective function used to select intermittent sending parameters. (C) Average ADC readout, received packet count, and current draw over time during periods of no applied pressure (Idle) and repeated pressure (Active). Low current drain indicates power is saved during Idle periods.
Figure 9: Wireless Musical Gloves: (A) Depiction of tactile sensing array, with readout circuit affixed to the arm via velcro straps (B) Average ADC readout in the Thumb, Index, Middle, Ring, and Little finger regions of one glove during "Mary Had a Little Lamb", before and after calibration, showing higher sensitivity after calibration.
WiReSens Toolkit: An Open-source Platform towards Accessible Wireless Tactile Sensing

November 2024

·

42 Reads

Tactile sensors present a powerful means of capturing, analyzing, and augmenting human-environment interactions. Accelerated by advancements in design and manufacturing, resistive matrix-based sensing has emerged as a promising method for developing scalable and robust tactile sensors. However, the development of portable, adaptive, and long lasting resistive tactile sensing systems remains a challenge. To address this, we introduce WiReSens Toolkit. Our platform provides open-source hardware and software libraries to configure multi-sender, power-efficient, and adaptive wireless tactile sensing systems in as fast as ten minutes. We demonstrate our platform's flexibility by using it to prototype several applications such as musical gloves, gait monitoring shoe soles, and IoT-enabled smart home systems.


High Throughput Li-Ceramic Battery Electrolyte Property Optimization Based on Efficient Sample Space Exploration

November 2024

·

6 Reads

ECS Meeting Abstracts

Solid-state batteries (SSBs) are safer, cheaper, and higher energy density alternatives to conventional Li-ion batteries, of which the global production capacity has reached 150 GWh/year. However, development of new materials and their optimization to predicted ideal structure-property characteristics can typically take >10 years of capital-intensive research to find the best solidification and dopant strategies. Typical solid-state manufacture considers high temperature synthesis based on powder sintering for Co-free cathode material integration, which is time and labor intensive, produces a large CO2 footprint, and incurs high costs. In contrast, wet-chemical fabrication methods like sequential deposition synthesis (SDS) can realize faster achievements in direct liquid precursor to solidified electrolyte translation. Li7La3Zr2O12 (LLZO) is a promising solid electrolyte that has seen much interest in the battery field and has been demonstrated to be manufacturable with thickness between 1-15 µm via SDS. In this work, we demonstrate for the first time a high throughput experimental platform, computer-controlled SDS deposition system to rapidly fabricate LLZO and alter its structure-property characteristics with precursor and deposition modulations. Dopant amounts are adjusted in real time to easily control film composition and test a wide parameter field. In-situ annealing is achieved via an advanced heating element implemented directly into the chamber. Raman spectroscopy exhibits sufficient sensitivity to the phases under study, is a contactless and nondestructive technique, and has a low data acquisition time allowing for high-volume analysis coupled to high throughput synthesis.
A quantification of Raman spectra is presented to supply a Bayesian optimization framework for greater experimental efficiency by selecting interesting regions by phase and other 2nd order characteristics of the experimental space to selectively sample for desired phases and, importantly, their high-dimensional phase boundaries. The capabilities of the new experimental framework are discussed for further work in accelerated solid-state ceramic materials discovery and we employ here careful considerations on the synthesis methods and computational approaches chosen towards the plethora of ceramic process options. This framework is capable of rapidly synthesizing solid-state electrolytes with a variety of composition and deposition conditions to optimize their properties so they can be rapidly deployed in next-generation batteries in significantly shorter times than traditional methods allow.


Sensor2Text: Enabling Natural Language Interactions for Daily Activity Tracking Using Wearable Sensors

November 2024

·

9 Reads

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies

Visual Question-Answering, a technology that generates textual responses from an image and natural language question, has progressed significantly. Notably, it can aid in tracking and inquiring about daily activities, crucial in healthcare monitoring, especially for elderly patients or those with memory disabilities. However, video poses privacy concerns and has a limited field of view. This paper presents Sensor2Text, a model proficient in tracking daily activities and engaging in conversations using wearable sensors. The approach outlined here tackles several challenges, including low information density in wearable sensor data, insufficiency of single wearable sensors in human activities recognition, and model's limited capacity for Question-Answering and interactive conversations. To resolve these obstacles, transfer learning and student-teacher networks are utilized to leverage knowledge from visual-language models. Additionally, an encoder-decoder neural network model is devised to jointly process language and sensor data for conversational purposes. Furthermore, Large Language Models are also utilized to enable interactive capabilities. The model showcases the ability to identify human activities and engage in Q&A dialogues using various wearable sensor modalities. It performs comparably to or better than existing visual-language models in both captioning and conversational tasks. To our knowledge, this represents the first model capable of conversing about wearable sensor data, offering an innovative approach to daily activity tracking that addresses privacy and field-of-view limitations associated with current vision-based solutions.


Procedural Material Generation with Reinforcement Learning

November 2024

·

15 Reads

·

1 Citation

ACM Transactions on Graphics

Modern 3D content creation heavily relies on procedural assets. In particular, procedural materials are ubiquitous in the industry, but their manipulation remains challenging. Previous work [Hu et al. 2023] conditionally generates procedural graphs that match a given input image. However, the parameter generation step limits how accurately the generated graph matches the input image, due to a reliance on supervision with scarcely available procedural data. We propose to improve parameter prediction accuracy for image-conditioned procedural material generation by leveraging reinforcement learning (RL) and present the first RL approach for procedural materials. RL circumvents the limited availability of procedural data, the domain gap between real and synthetic materials, and the need for end-to-end differentiable loss functions. Given a target image, we retrieve a procedural material and use an RL-trained transformer model to predict a set of parameters that reconstruct the target image as closely as possible. We show that using RL significantly improves parameter prediction to match a given target image compared to supervised methods on both synthetic and real target images.


Medial Skeletal Diagram: A Generalized Medial Axis Approach for Compact 3D Shape Representation

November 2024

·

6 Reads

·

2 Citations

ACM Transactions on Graphics

We propose the Medial Skeletal Diagram, a novel skeletal representation that tackles the prevailing issues around skeleton sparsity and reconstruction accuracy in existing skeletal representations. Our approach augments the continuous elements in the medial axis representation to effectively shift the complexity away from the discrete elements. To that end, we introduce generalized enveloping primitives, an enhancement over the standard primitives in the medial axis, which ensure efficient coverage of intricate local features of the input shape and substantially reduce the number of discrete elements required. Moreover, we present a computational framework for constructing a medial skeletal diagram from an arbitrary closed manifold mesh. Our optimization pipeline ensures that the resulting medial skeletal diagram comprehensively covers the input shape with the fewest primitives. Additionally, each optimized primitive undergoes a post-refinement process to guarantee an accurate match with the source mesh in both geometry and tessellation. We validate our approach on a comprehensive benchmark of 100 shapes, demonstrating the sparsity of the discrete elements and superior reconstruction accuracy across a variety of cases. Finally, we exemplify the versatility of our representation in downstream applications such as shape generation, mesh decomposition, shape optimization, mesh alignment, mesh compression, and user-interactive design.
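The standard medial-axis primitives that this abstract contrasts against are balls centered on skeleton points. As a point of reference only, the sketch below measures how much of a sampled shape a set of such balls covers. It does not implement the generalized enveloping primitives proposed in the paper, and the helper names and the toy plate data are hypothetical.

```python
import math

Point = tuple[float, float, float]
Ball = tuple[Point, float]  # (center, radius)


def is_covered(point: Point, balls: list[Ball], tol: float = 1e-9) -> bool:
    """True if the point lies inside at least one ball."""
    return any(math.dist(point, center) <= radius + tol for center, radius in balls)


def coverage(points: list[Point], balls: list[Ball]) -> float:
    """Fraction of sample points covered by the union of balls."""
    return sum(is_covered(p, balls) for p in points) / len(points)


# Toy thin plate: a 5 x 5 grid of samples on the unit square at z = 0.
plate = [(i * 0.25, j * 0.25, 0.0) for i in range(5) for j in range(5)]

# In a thin plate the ball radius is capped by the thickness, so a single
# small ball covers very little of the surface.
one_ball = [((0.5, 0.5, 0.0), 0.1)]

# Covering the whole plate with equally small balls takes one per sample.
many_balls = [(p, 0.1) for p in plate]

print(coverage(plate, one_ball), coverage(plate, many_balls))
```

The gap between one ball and twenty-five illustrates why isotropic primitives inflate the number of discrete elements on flat regions, which is the sparsity problem the abstract sets out to reduce.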


Figure 3: Predictive performance of MoleVers, averaged over 5 splits, when finetuned on two assays with varying dataset size: (a) CHEMBL5291763, (b) CHEMBL2328568 (Zdrazil et al., 2024).
Ablation studies of our pretraining strategy. We can see that combining both pretraining stage 1 and stage 2 gives the best performance on the downstream datasets.
Impact of pretraining (stage 1) dataset diversity, measured by the number of training samples. The downstream performance of MoleVers improves as the number of training samples in-
Two-Stage Pretraining for Molecular Property Prediction in the Wild

November 2024

·

18 Reads

Accurate property prediction is crucial for accelerating the discovery of new molecules. Although deep learning models have achieved remarkable success, their performance often relies on large amounts of labeled data that are expensive and time-consuming to obtain. Thus, there is a growing need for models that can perform well with limited experimentally-validated data. In this work, we introduce MoleVers, a versatile pretrained model designed for various types of molecular property prediction in the wild, i.e., where experimentally-validated molecular property labels are scarce. MoleVers adopts a two-stage pretraining strategy. In the first stage, the model learns molecular representations from large unlabeled datasets via masked atom prediction and dynamic denoising, a novel task enabled by a new branching encoder architecture. In the second stage, MoleVers is further pretrained using auxiliary labels obtained with inexpensive computational methods, enabling supervised learning without the need for costly experimental data. This two-stage framework allows MoleVers to learn representations that generalize effectively across various downstream datasets. We evaluate MoleVers on a new benchmark comprising 22 molecular datasets with diverse types of properties, the majority of which contain 50 or fewer training labels reflecting real-world conditions. MoleVers achieves state-of-the-art results on 20 out of the 22 datasets, and ranks second among the remaining two, highlighting its ability to bridge the gap between data-hungry models and real-world conditions where practically-useful labels are scarce.
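Masked atom prediction, the first of the two stage-1 tasks named in the abstract, hides some atoms and asks the model to recover them. The sketch below covers only the input-corruption step. The 15% mask ratio, the `[MASK]` token, and the representation of a molecule as a plain list of element symbols are all assumptions; the abstract specifies none of them.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed placeholder token


def mask_atoms(atoms: list[str], mask_ratio: float = 0.15, seed: int = 0):
    """Replace a random subset of atoms with MASK_TOKEN.

    Returns the corrupted atom list and a dict mapping each masked index
    to the original symbol, which serves as the prediction target.
    """
    if not atoms:
        return [], {}
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_mask = max(1, round(mask_ratio * len(atoms)))  # mask at least one atom
    masked = list(atoms)
    targets = {}
    for i in sorted(rng.sample(range(len(atoms)), n_mask)):
        targets[i] = masked[i]
        masked[i] = MASK_TOKEN
    return masked, targets


# Ethanol written as a flat list of element symbols (illustrative input).
ethanol = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]
masked, targets = mask_atoms(ethanol)
print(masked, targets)
```

A model trained on this task would receive `masked` as input and be scored on recovering the symbols in `targets`; no labels beyond the molecule itself are needed, which is what makes the stage usable on large unlabeled datasets.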


Citations (52)


... The Lake-Thomas [35] fracture criterion often falls short of providing accurate quantitative predictions for fracture toughness in elastomers. As such, several works have attempted to further enhance the conceptual understanding of fracture in elastomers as well as the robustness of predictive capabilities beyond the simple scenario that the Lake-Thomas theory portrays [36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53]. Even though there has been significant progress in the field, the need for advanced damage modeling frameworks is crucial to better predict mechanical failure in elastomers while accounting for all mechanisms at play, especially towards accounting for diffuse damage mechanisms. ...

Reference:

A chain stretch-based gradient-enhanced model for damage and fracture in elastomers
Scaling Law for Intrinsic Fracture Energy of Diverse Stretchable Networks

Physical Review X

... For example, planar components such as the body of a guitar, the seat of a chair, and the base of a fertility statue are currently oversampled because the coverage capability is limited by the medial balls that are isotropic, as shown in Fig. 1. Future work could explore the use of different types of bounding volume primitives, such as slab meshes [LWS*15], ellipsoids [LCWK07], or medial skeletal diagrams [GWM23] instead of medial balls, to achieve a more compact representation with fewer skeletal points, while maintaining the coverage optimization goal. ...

Medial Skeletal Diagram: A Generalized Medial Axis Approach for Compact 3D Shape Representation
  • Citing Article
  • November 2024

ACM Transactions on Graphics

... Hu et al. (2023) subsequently conditions MatFormer on input text descriptions or images. Li et al. (2024a) further applies reinforcement learning (RL) fine-tuning to improve parameter predictions. Instead of training a custom network from scratch, we leverage that procedural material graphs in Blender (Blender, 2024a) can be transpiled into Python programs and fine-tune a VLM for image-conditioned program generation. ...

Procedural Material Generation with Reinforcement Learning
  • Citing Article
  • November 2024

ACM Transactions on Graphics

... There are two lines of prior approaches that have similarities to our proposed task, namely methods that focus on the geometry of individual parts and inter-part relationships [8,9,22,52,55,56], and methods such as MEPNet [46] that learns the assembly in specific conditions, such as LEGO shapes. The former rely on priors like peg-hole joints [24] or sequence generation [51] to reduce search complexity but are generative and may yield unstable assemblies due to occlusions [22]. The latter often simplifies the task by assuming that parts are provided step-by-step, e.g., LEGO manuals. ...

Category-Level Multi-Part Multi-Joint 3D Shape Assembly
  • Citing Conference Paper
  • June 2024

... The multilayer-perceptron (MLP)-based convolutional neural networks (CNNs) have been widely used in image classification due to their superior local contextual modeling capabilities all the time [19,23,24,25,26]. While recently proposed Kolmogorov-Arnold networks (KAN) are believed to be viable alternatives for MLP based neural networks [27,28]. Because of their internal similarity to splines and their external similarity to MLP, KANs are able to optimize learned features with remarkable accuracy, in addition to being able to learn new features. ...

KAN 2.0: Kolmogorov-Arnold Networks Meet Science

... Even with only an image of the target layout, the robot can determine a valid construction sequence by imagining the disassembly process and reversing it [11]. Furthermore, components can be represented as nodes, allowing assembly sequence planning through high-level graph-based representations of assembly characteristics [5,14,15,16]. Involving robotic constraints is essential for experimental applications [3,14], though limited attention has been paid to integrating these constraints with ASP. ...

ASAP: Automated Sequence Planning for Complex Robotic Assembly with Physical Feasibility
  • Citing Conference Paper
  • May 2024

... Portrait relighting has been explored in both 2D [19,22,28,29,33,43,46,52,57,59] and 3D domains [3,6,32,48,49,51,61], with 2D image-based approaches being more relevant to our work. Since 2D portrait relighting is under-constrained, various priors have been proposed, such as morphable models [4] as 3D face priors in [42], explicit inverse rendering in [2,40], and a style transfer approach for relighting in [41]. ...

Lite2Relight: 3D-aware Single Image Portrait Relighting
  • Citing Conference Paper
  • July 2024

... Rev. Biomed. Eng., 2024, 26, 331-355 with permission of Annual Reviews[12] ...

Electronic Skin: Opportunities and Challenges in Convergence with Machine Learning
  • Citing Article
  • July 2024

Annual Review of Biomedical Engineering

Ja Hoon Koo

·

Young Joong Lee

·

Hye Jin Kim

·

[...]

·

Hyoyoung Jeong

... In addition, with superior capabilities in information comprehension, particularly by means of the advanced self-attention mechanism and parallel processing capabilities for efficient identification of internal relationships, large language models (LLMs) hold significant potential in establishing PSP for AM. Moreover, the interactive capabilities of LLMs through text-based or voice-based [57] prompts facilitate the seamless integration of the entire workflow, from design through manufacturing to performance evaluation [58]. This capability would significantly improve the efficiency and decision-making processes in AM. ...

How Can Large Language Models Help Humans in Design And Manufacturing? Part 1: Elements of The LLM-Enabled Computational Design and Manufacturing Pipeline
  • Citing Article
  • May 2024

... Process task guidance requires the development of methods and technology for AI assistants that can help technicians perform complex tasks [14]. Task guidance with AI assistance in manufacturing remains a challenging problem [36]. ...

How Can Large Language Models Help Humans in Design And Manufacturing? Part 2: Synthesizing an End-To-End LLM-Enabled Design and Manufacturing Workflow