Publications: 208
Reads: 35,414
Citations: 6,861
Publications (208)
Mixture-of-Experts (MoE) is widely adopted to deploy Large Language Models (LLMs) on edge devices with limited memory budgets. Although MoE is, in theory, an inborn memory-friendly architecture requiring only a few activated experts to reside in the memory for inference, current MoE architectures cannot effectively fulfill this advantage and will y...
Large language models (LLMs) have demonstrated exceptional performance in understanding and generating semantic patterns, making them promising candidates for sequential recommendation tasks. However, when combined with conventional recommendation models (CRMs), LLMs often face challenges related to high inference costs and static knowledge transfe...
Recommender systems (RS) are increasingly vulnerable to shilling attacks, where adversaries inject fake user profiles to manipulate system outputs. Traditional attack strategies often rely on simplistic heuristics, require access to internal RS data, and overlook the manipulation potential of textual reviews. In this work, we introduce Agent4SR, a...
Placement is a critical task with high computation complexity in VLSI physical design. Modern analytical placers formulate the placement objective as a nonlinear optimization task, which suffers a long iteration time. To accelerate and enhance the placement process, recent studies have turned to deep learning-based approaches, particularly leveragi...
Large Language Model (LLM)-based user agents have emerged as a powerful tool for improving recommender systems by simulating user interactions. However, existing methods struggle with cross-domain scenarios due to inefficient memory structures, leading to irrelevant information retention and failure to account for social influence factors such as p...
Current recommendation systems powered by large language models (LLMs) often underutilize their reasoning capabilities due to a lack of explicit logical structuring. To address this limitation, we introduce CoT-Rec, a framework that integrates Chain-of-Thought (CoT) reasoning into LLM-driven recommendations by incorporating two crucial processes: u...
Recommender systems often suffer from popularity bias, where frequently interacted items are overrepresented in recommendations. This bias stems from propensity factors influencing training data, leading to imbalanced exposure. In this paper, we introduce a Fair Sampling (FS) approach to address this issue by ensuring that both users and items are...
Denoising-based diffusion models have attained impressive image synthesis; however, their applications on videos can lead to unaffordable computational costs due to the per-frame denoising operations. In pursuit of efficient video generation, we present a Diffusion Reuse MOtion (Dr. Mo) network to accelerate the video-based denoising process. Our c...
Inverse lithography technology (ILT) is an advanced resolution enhancement technology (RET) approach that pushes the limits of current process conditions to achieve smaller feature sizes. This paper presents an efficient ILT framework centered on the concept of solving mask optimization problem from approximation to precision. Firstly, low-resoluti...
This paper introduces Atelier, a large language model (LLM)-based framework for analog circuit design to address the issues of data scarcity and the substantial domain-specific knowledge required in this field. Atelier integrates general-purpose LLMs with a high-quality, compact knowledge base to fulfill the considerable knowledge requirements of a...
Recently, there has been increased emphasis on privacy-preserving computation technologies such as homomorphic encryption (HE) and Zero-knowledge proof (ZKP). Modular multiplication is a critical component for both HE and ZKP. Variable bit-width is a must for many applications of privacy-preserving computation, due to variable bit-width requirements...
Sequential recommendation methods can capture dynamic user preferences from user historical interactions to achieve better performance. However, most existing methods only use past information extracted from user historical interactions to train the models, leading to deviations in user preference modeling. Besides past information, future info...
Language models (LMs) only pretrained on a general and massive corpus usually cannot attain satisfying performance on domain-specific downstream tasks, and hence, applying domain-specific pretraining to LMs is a common and indispensable practice. However, domain-specific pretraining can be costly and time-consuming, hindering LMs' deployment in rea...
Studying the evolution of online news communities is essential for improving the effectiveness of news recommender systems. Traditionally, this has been done through empirical research based on static data analysis. While this approach has yielded valuable insights for optimizing recommender system designs, it is limited by the lack of appropriate...
Video generation using diffusion-based models is constrained by high computational costs due to the frame-wise iterative diffusion process. This work presents a Diffusion Reuse MOtion (Dr. Mo) network to accelerate latent video generation. Our key discovery is that coarse-grained noises in earlier denoising steps have demonstrated high motion consi...
A typical VLSI design flow is divided into separated front-end logic synthesis and back-end physical design (PD) stages, which often require costly iterations between these stages to achieve design closure. Existing approaches face significant challenges, notably in utilizing feedback from physical metrics to better adapt and refine synthesis opera...
Multi-Scalar Multiplication (MSM) is a computationally intensive task that operates on elliptic curves based on GF(P). It is commonly used in Zero-knowledge proof (ZKP), where it accounts for a significant portion of the computation time required for proof generation. In this paper, we present PriorMSM, an efficient acceleration architecture for...
Developing chatbots as personal companions has long been a goal of artificial intelligence researchers. Recent advances in Large Language Models (LLMs) have delivered a practical solution for endowing chatbots with anthropomorphic language capabilities. However, it takes more than LLMs to enable chatbots that can act as companions. Humans use their...
Floorplanning is a complex physical design problem that produces initial locations of movable objects, the quality of which has a great impact on downstream tasks such as placement and routing. To improve the efficacy of floorplanning, machine learning techniques have recently been recruited for help. However, the application-specific location cons...
The design space exploration of contemporary microprocessors faces a significant challenge of high computational cost. In this context, we introduce Prior-boosted GRL, a novel framework for the design space exploration of microprocessor microarchitectures, underpinned by graph embeddings. Using graph representation learning, Prior-boosted G...
Bayesian optimization is more efficient in automatically synthesizing operational amplifier (opamp) topologies compared to conventional methods. However, the design space for behavior-level opamp topologies involves numerous connections that are difficult to comprehend, and evaluating each topology incurs substantial computational costs. To tackle...
Zero-knowledge proof (ZKP) plays a significant role in privacy protection technology. However, the proof generation phase requires considerable time and hardware resources. In this phase, Number Theoretic Transform or Inverse Number Theoretic Transform (NTT/INTT) in polynomial computation, as well as Multiple Scalar Multiplication (MSM), are bottle...
Designing high-performance $\Delta$-$\Sigma$ modulators is a challenging task, often involving a time-consuming, manual topology search process. We present an automated topology synthesis method for $\Delta$-$\Sigma$ modulators that significantly improves efficiency in the search for reliable modulator topologies. Our bi-level Bayesia...
The operational amplifier is a key building block in analog systems. However, the design process of the operational amplifier is time-consuming and heavily depends on engineers' experience. This article presents OPAMP-Generator, an analog operational amplifier generator, which automates the full design flow from user-defined specifications to GDSI...
Click-through rate (CTR) prediction is widely used in academia and industry. Most CTR tasks fall into a feature embedding & feature interaction paradigm, where the accuracy of CTR prediction is mainly improved by designing practical feature interaction structures. However, recent studies have argued that the fixed feature embedding learned only thr...
Despite their prevalence in deep-learning communities, over-parameterized models convey high demands of computational costs for proper training. This work studies the fine-grained, modular-level learning dynamics of over-parameterized models to attain a more efficient and fruitful training strategy. Empirical evidence reveals that when scaling down...
The process of reading has attracted decades of scientific research. Work in this field primarily focuses on using eye gaze patterns to reveal cognitive processes while reading. However, eye gaze patterns suffer from limited resolution, jitter noise, and cognitive biases, resulting in limited accuracy in tracking cognitive reading states. Moreover,...
Recommender systems are important for providing personalized services to users, but the vast amount of collected user data has raised concerns about privacy (e.g., sensitive data), security (e.g., malicious data) and utility (e.g., toxic data). To address these challenges, recommendation unlearning has emerged as a promising approach, which allows...
Understanding the evolution of online news communities is essential for designing more effective news recommender systems. However, due to the lack of appropriate datasets and platforms, the existing literature is limited in understanding the impact of recommender systems on this evolutionary process and the underlying mechanisms, resulting in sub-...
Safeguarding personal information is paramount for healthcare data sharing, a challenging issue without any silver bullet thus far. We study the prospect of a recent deep-learning advance, dataset condensation (DC), in sharing healthcare data for AI research, and the results are promising. The condensed data abstracts original records and irreversib...
In dynamic interaction graphs, user-item interactions usually follow heterogeneous patterns, represented by different structural information, such as user-item co-occurrence, sequential information of user interactions and the transition probabilities of item pairs. However, the existing methods cannot simultaneously leverage all three structural i...
The collaborative filtering (CF) problem with only user-item interaction information can be solved by graph signal processing (GSP), which uses low-pass filters to smooth the observed interaction signals on the similarity graph to obtain the prediction signals. However, the interaction signal may not be sufficient to accurately characterize user in...
This work pursues the optimization of over-parameterized deep models for superior training efficiency and test performance. We first theoretically emphasize the importance of two properties of over-parameterized models, i.e., the convergence gap and the generalization gap. Subsequent analyses unveil that these two gaps can be upper-bounded by the r...
Our goals are to better understand dog cognition, and to support others who share this interest. Existing investigation methods predominantly rely on human-manipulated experiments to examine dogs' behavioral responses to visual stimuli such as human gestures. As a result, existing experimental paradigms are usually constrained to in-lab environment...
Chip floorplanning has long been a critical task with high computation complexity in the physical implementation of VLSI chips. Its key objective is to determine the initial locations of large chip modules with minimized wirelength while adhering to the density constraint, which in essence is a process of constructing an optimized mapping from circ...
Hydro-fracture geometry prediction is of great practical importance for optimizing construction parameters and evaluating stimulation effects. Existing physical simulation methods are computationally intensive. Deep learning-based methods offer fast model inference, yet typically require a large amount of field data for accurate model training and...
Disentangled feature representation is essential for data-efficient learning. The feature space of deep models is inherently compositional. Existing β-VAE-based methods, which only apply disentanglement regularization to the resulting embedding space of deep models, cannot effectively regularize such compositional feature space, resulting in unsati...
Emotion recognition in smart eyewear devices is valuable but challenging. One key limitation of previous works is that the expression-related information like facial or eye images is considered as the only evidence of emotion. However, emotional status is not isolated; it is tightly associated with people's visual perceptions, especially those with...
Emotion recognition in smart eyewear devices is highly valuable but challenging. One key limitation of previous works is that the expression-related information like facial or eye images is considered as the only emotional evidence. However, emotional status is not isolated; it is tightly associated with people's visual perceptions, especially thos...
This work presents MemX: a biologically-inspired attention-aware eyewear system developed with the goal of pursuing the long-awaited vision of a personalized visual Memex. MemX captures human visual attention on the fly, analyzes the salient visual content, and records moments of personal interest in the form of compact video snippets. Accurate att...
Deep-learning-based video processing has yielded transformative results in recent years. However, the video analytics pipeline is energy intensive due to high data rates and reliance on complex inference algorithms, which limits its adoption in energy-constrained applications. Motivated by the observation of high and variable spatial redundancy and...
Deep-learning-based video processing has yielded transformative results in recent years. However, the video analytics pipeline is energy-intensive due to high data rates and reliance on complex inference algorithms, which limits its adoption in energy-constrained applications. Motivated by the observation of high and variable spatial redundancy and...
Data-driven approaches have gained increasing interest in fault detection of photovoltaic systems due to the availability of sensor data. However, the noise introduced by environmental variations and measurement variabilities poses significant challenges to effective fault detection. Furthermore, the change in electrical signal magnitude of a fault...
Collaborative filtering (CF) is a popular technique in today's recommender systems, and matrix approximation-based CF methods have achieved great success in both rating prediction and top-N recommendation tasks. However, real-world user-item rating matrices are typically sparse, incomplete and noisy, which introduce challenges to the algorithm stab...
Recommender systems have become an indispensable component in online services during recent years. Effective recommendation is essential for improving the services of various online business applications. However, serious privacy concerns have been raised on recommender systems requiring the collection of users' private information for recommendati...
Operation anomalies are common phenomena in large-scale solar farms. Effective anomaly detection and classification is essential for improving operation reliability and electricity generation. However, this is a challenging task due to the high complexity and wide variety of frequently occurring anomalies. Furthermore, existing pre-installed superv...
Gradient-based learning methods such as stochastic gradient descent are widely used in matrix approximation-based collaborative filtering algorithms to train recommendation models based on observed user-item ratings. One major difficulty in existing gradient-based learning methods is determining proper learning rates, since model convergence would...
The fast-growing wind power industry faces the challenge of reducing operation and maintenance (O&M) costs for wind power plants. Predictive maintenance is essential to improve wind turbine reliability and prolong operation time, thereby reducing the O&M cost for wind power plants. This study presents a solution for predictive maintenance of wind t...
High-resolution satellite imagery data have been widely used in geoscience and remote sensing research. Dealing with data quality issues is the first and most important step before truly making use of these high-resolution images. Scientific results derived from poor-quality data can be problematic and unreliable. In this work, we propose a novel da...
Accurate anomaly diagnosis is essential for reducing operation and maintenance (O&M) cost, while improving safety and reliability of large-scale photovoltaic (PV) systems. Although many methods have been proposed, they either require extra sensing devices or suffer from high false alarm rates. In this work, we present a cost-effective hierarchical...
During summer, melt ponds have a significant influence on Arctic sea-ice albedo. The melt pond fraction (MPF) can also be used to forecast Arctic sea ice over a certain period. It is important to retrieve accurate MPF from satellite data for Arctic research. This paper proposes a satellite MPF retrieval model based on the...
Matrix approximation (MA) is one of the most popular techniques in today's recommender systems. In most MA-based recommender systems, the problem of risk minimization should be defined, and how to achieve minimum expected risk in model learning is one of the most critical problems to recommendation accuracy. This paper addresses the expected risk m...
Recommender systems have achieved great success in recent years, and matrix approximation (MA) is one of the most popular techniques for collaborative filtering (CF) based recommendation. However, a major issue is that MA methods perform poorly at detecting strong localized associations among closely related users and items. Recently, some MA-based...
Massive amounts of remotely sensed data are being generated at an unprecedented rate, offering new opportunities for data-driven scientific discovery in the Earth sciences and related domains. However, due to the sheer volume of remotely sensed data and the lack of effective data analytics tools, most data remain in the dark, with little to no qual...
Running is one of the most popular sports with hundreds of millions of participants worldwide. Good running form is the key to fast, efficient, and injury-free running. Existing kinematic analysis technologies, such as high-speed camera systems, are expensive, difficult to operate, and exclusive to sports physiology laboratories and elite athletes....
In commercial sales and services, recommender systems have been widely adopted to predict customers’ purchase interests using their prior purchasing behaviors. Cold-start is a known challenge to existing recommendation techniques, e.g., the popular collaborative filtering method is not applicable to predict the interests of “white-space” customers...
Matrix approximation (MA) is one of the most popular techniques for collaborative filtering (CF). Most existing MA methods train user/item latent factors based on a user-item rating matrix and then use the global latent factors to model all users/items. However, globally optimized latent factors may not reflect the unique interests shared among onl...
Matrix approximation is one of the most effective methods for collaborative filtering-based recommender systems. However, the high computation complexity of matrix factorization on large datasets limits its scalability. Prior solutions have adopted co-clustering methods to partition a large matrix into a set of smaller submatrices, which can then b...
Wearables are a leading category in the Internet of Things. Compared to mainstream mobile phones, wearables target one order of magnitude form factor reduction, and offer the potential of providing ubiquitous, personalized services to end users. Aggressive reduction in size imposes serious limits on battery capacity. Wearables are equipped...
Collaborative filtering (CF) methods are widely adopted by existing recommender systems, which can analyze and predict user “ratings” or “preferences” of newly generated items based on user historical behaviors. However, privacy issues arise in this process as sensitive user private data are collected by the recommender server. Recently proposed pr...
In item-based top-N recommender systems, the recommendation results are generated based on item correlation computation among all users. Therefore, recommendation results can be used to infer the correlations among recommended items. This is not an issue as long as the total amount of queries produced by a typical user is small, and the queried ite...
Intercore communication in many-core processors presently faces scalability issues similar to those that plagued intracity telecommunications in the 1960s. Optical communication promises to address these challenges now, as then, by providing low latency, high bandwidth, and low power communication. Silicon photonic devices presently are vulnerable...
Advances in embedded systems and low-cost gas sensors are enabling a new wave of low cost air quality monitoring tools. Our team has been engaged in the development of low-cost wearable air quality monitors (M-Pods) using the Arduino platform. The M-Pods use commercially available metal oxide semiconductor (MOx) sensors to measure CO, O3, NO2, and...
Three sets of devices were simulated, designed, and laid out for fabrication in the EuroPractice shuttle program and then measured in-house after fabrication. A combination of analytical and numerical modeling is used to extract the dispersion curves that define the effective index of refraction as a function of wavelength for three different class...
Mobile devices are quickly becoming a primary medium for personal information gathering, management, and sharing. Managing personal image data on mobile platforms is a challenging problem due to large data set size, content and context diversity, heterogeneous individual usage patterns, and resource constraints. This article presents a user-centric...
An array of passive silicon-on-insulator optical devices is laid out in repeating patterns on four foundry-fabricated wafers. Physical and optical characterization shows that these microrings, racetrack resonators, and directional couplers exhibit significant variation in optical response. A device-heating experiment carried out on a numbe...