Shuai Zhang

  • Doctor of Philosophy
  • PostDoc Position at ETH Zurich

About

73
Publications
71,825
Reads
4,003
Citations
Current institution
ETH Zurich
Current position
  • PostDoc Position
Additional affiliations
February 2020 - present
ETH Zurich
Position
  • PostDoc Position
Education
October 2016 - December 2019
UNSW Sydney
Field of study
  • Computer Science
September 2010 - July 2014
Nanjing University
Field of study
  • Information Management and Information System

Publications (73)
Article
Full-text available
With the ever-growing volume of online information, recommender systems have been an effective strategy to overcome such information overload. The utility of recommender systems cannot be overstated, given their widespread adoption in many web applications, along with their potential impact to ameliorate many problems related to over-choice. In recent...
Preprint
Full-text available
In this work, we move beyond the traditional complex-valued representations, introducing more expressive hypercomplex representations to model entities and relations for knowledge graph embeddings. More specifically, quaternion embeddings, hypercomplex-valued embeddings with three imaginary components, are utilized to represent entities. Relations...
Article
Full-text available
At online retail platforms, it is crucial to actively detect the risks of transactions to improve customer experience and minimize financial loss. In this work, we propose xFraud, an explainable fraud transaction prediction framework which is mainly composed of a detector and an explainer. The xFraud detector can effectively and efficiently predict...
Conference Paper
Full-text available
In this work, we introduce Context-Aware MultiModal Learner (CaMML), for tuning large multimodal models (LMMs). CaMML, a lightweight module, is crafted to seamlessly integrate multimodal contextual samples into large models, thereby empowering the model to derive knowledge from analogous, domain-specific, up-to-date information and make grounded i...
Preprint
The question-answering (QA) capabilities of foundation models are highly sensitive to prompt variations, rendering their performance susceptible to superficial, non-meaning-altering changes. This vulnerability often stems from the model's preference or bias towards specific input characteristics, such as option position or superficial image feature...
Preprint
Full-text available
Recommender systems powered by generative models (Gen-RecSys) extend beyond classical item ranking by producing open-ended content, which simultaneously unlocks richer user experiences and introduces new risks. On one hand, these systems can enhance personalization and appeal through dynamic explanations and multi-turn dialogues. On the other hand,...
Preprint
Full-text available
We study the well-motivated problem of online distribution shift in which the data arrive in batches and the distribution of each batch can change arbitrarily over time. Since the shifts can be large or small, abrupt or gradual, the length of the relevant historical data to learn from may vary over time, which poses a major challenge in designing a...
Preprint
Full-text available
Having an LLM that aligns with human preferences is essential for accommodating individual needs, such as maintaining writing style or generating specific topics of interest. The majority of current alignment methods rely on fine-tuning or prompting, which can be either costly or difficult to control. Model steering algorithms, which modify the mod...
Preprint
Full-text available
While the Transformer architecture has achieved remarkable success across various domains, a thorough theoretical foundation explaining its optimization dynamics is yet to be fully developed. In this study, we aim to bridge this understanding gap by answering the following two core questions: (1) Which types of Transformer architectures allow Gradi...
Conference Paper
Full-text available
We present an overview of a workshop focused on the exploration of generative models within recommender systems (RS). It highlights the dual nature of these technologies: on the one hand, they offer groundbreaking opportunities for enhancing RS through improved personalization, innovative content creation, and interactive user experiences; on the o...
Conference Paper
Full-text available
Large Language Models (LLMs) have shown great ability in solving traditional natural language tasks and elementary reasoning tasks with appropriate prompting techniques. However, their ability is still limited in solving complicated science problems. In this work, we aim to push the upper bound of the reasoning capability of LLMs by proposing a co...
Preprint
How do we transfer the relevant knowledge from ever larger foundation models into small, task-specific downstream models that can run at much lower costs? Standard transfer learning using pre-trained weights as the initialization transfers limited information and commits us to often massive pre-trained architectures. This procedure also precludes c...
Preprint
Full-text available
Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by evaluating vector similarities between encoded query texts and web documents. As performance demands for ve...
Preprint
Full-text available
Google AlphaGo's win has significantly motivated and sped up machine learning (ML) research and development, which led to tremendous ML technical advances and wider adoption in various domains (e.g., finance, health, defense, and education). These advances have resulted in numerous new concepts and technologies, which are too many for people to cat...
Preprint
Full-text available
This work proposes POMP, a prompt pre-training method for vision-language models. Being memory and computation efficient, POMP enables the learned prompt to condense semantic information for a rich set of visual concepts with over twenty-thousand classes. Once pre-trained, the prompt with a strong transferable ability can be directly plugged into a...
Chapter
Recommender systems have achieved widespread success in real-life applications. Personalized recommendation can reduce customers’ effort in finding items they are interested in. It is also critical in some industries as it can increase customer stickiness and help industries to stand out from competitors. Recommender systems made a significant prog...
Preprint
Full-text available
Reasoning is a fundamental problem for computers and deeply studied in Artificial Intelligence. In this paper, we specifically focus on answering multi-hop logical queries on Knowledge Graphs (KGs). This is a complicated task because, in real-world scenarios, the graphs tend to be large and incomplete. Most previous works have been unable to create...
Preprint
Full-text available
At online retail platforms, detecting fraudulent accounts and transactions is crucial to improve customer experience, minimize loss, and avoid unauthorized transactions. Despite the variety of different models for deep learning on graphs, few approaches have been proposed for dealing with graphs that are both heterogeneous and dynamic. In this pape...
Chapter
Full-text available
Recommender systems are essential and play an increasingly important role in our daily life, ranging from entertainment to online shopping. They have great commercial value: not only can they improve the user experience by saving users time in locating related items, but they can also increase the exposure rate of long-tail items. Factorization machines (FM...
Article
Full-text available
Tree-based models and deep neural networks are two schools of effective classification methods in machine learning. While tree-based models are robust irrespective of data domain, deep neural networks have advantages in handling high-dimensional data. Adding a differentiable neural decision forest to the neural network can generally help exploit th...
Preprint
Full-text available
Recent works have demonstrated reasonable success of representation learning in hypercomplex space. Specifically, "fully-connected layers with Quaternions" (4D hypercomplex numbers), which replace real-valued matrix multiplications in fully-connected layers with Hamilton products of Quaternions, both enjoy parameter savings with only 1/4 learnable...
Preprint
Learning embedding spaces of suitable geometry is critical for representation learning. In order for learned representations to be effective and efficient, it is ideal that the geometric inductive bias aligns well with the underlying structure of the data. In this paper, we propose Switch Spaces, a data-driven approach for learning representations...
Preprint
Full-text available
Massive account registration has raised concerns on risk management in e-commerce companies, especially when registration increases rapidly within a short time frame. To monitor these registrations constantly and minimize the potential loss they might incur, detecting massive registration and predicting their riskiness are necessary. In this paper,...
Preprint
Full-text available
At online retail platforms, it is crucial to actively detect risks of fraudulent transactions to improve our customer experience, minimize loss, and prevent unauthorized chargebacks. Traditional rule-based methods and simple feature-based models are either inefficient or brittle and uninterpretable. The graph structure that exists among the heterog...
Preprint
Full-text available
Modeling user interests is crucial in real-world recommender systems. In this paper, we present a new user interest representation model for personalized recommendation. Specifically, the key novelty behind our model is that it explicitly models user interests as a hypercuboid instead of a point in the space. In our approach, the recommendation sco...
Preprint
Deep neural networks are widely used in personalized recommendation systems. Unlike regular DNN inference workloads, recommendation inference is memory-bound due to the many random memory accesses needed to lookup the embedding tables. The inference is also heavily constrained in terms of latency because producing a recommendation for a user must b...
Article
Full-text available
Metric learning based methods have attracted extensive interest in recommender systems. Current methods take a user-centric approach in metric space, ensuring that the distance between a user and a negative item is larger than that between the same user and a positive item by a fixed margin. While they ignore the relations among positive item and negative...
Article
Unlike all prior work, we investigate the notion of "unraveling metric vector spaces", i.e., deriving meaning and low-rank structure from distance or metric space. Our new model bridges two commonly adopted paradigms for recommendations - metric learning approaches and factorization-based models, distinguishing itself accordingly. More co...
Conference Paper
This paper proposes Quaternion Collaborative Filtering (QCF), a novel representation learning method for recommendation. Our proposed QCF relies on and exploits computation with Quaternion algebra, benefiting from the expressiveness and rich representation learning capability of Hamilton products. Quaternion representations, based on hypercomplex n...
Conference Paper
Full-text available
Deep learning based recommender systems have been extensively explored in recent years. However, the large number of models proposed each year poses a big challenge for both researchers and practitioners in reproducing the results for further comparisons. Although a portion of papers provides source code, they adopted different programming language...
Article
Factorization Machines (FMs) are a class of popular algorithms that have been widely adopted for collaborative filtering and recommendation tasks. FMs are characterized by their usage of the inner product of factorized parameters to model pairwise feature interactions, making them highly expressive and powerful. This paper proposes Holographic Factoriz...
Chapter
Deep learning has been widely used in many software disciplines in both academia and industry, including computer vision, speech recognition and translation, natural language processing, search engines, bioinformatics, sensor data processing, finance, etc., due to its scalability in big data environments and accuracy at higher level than ever before...
Preprint
Full-text available
Many state-of-the-art neural models for NLP are heavily parameterized and thus memory inefficient. This paper proposes a series of lightweight and memory efficient neural architectures for a potpourri of natural language processing (NLP) tasks. To this end, our models exploit computation using Quaternion algebra and hypercomplex spaces, enabling no...
Preprint
Full-text available
This paper proposes Quaternion Collaborative Filtering (QCF), a novel representation learning method for recommendation. Our proposed QCF relies on and exploits computation with Quaternion algebra, benefiting from the expressiveness and rich representation learning capability of Hamilton products. Quaternion representations, based on hypercomplex n...
Preprint
Full-text available
Deep learning based recommender systems have been extensively explored in recent years. However, the large number of models proposed each year poses a big challenge for both researchers and practitioners in reproducing the results for further comparisons. Although a portion of papers provides source code, they adopted different programming languag...
Conference Paper
Full-text available
Factorization Machines (FMs) are a class of popular algorithms that have been widely adopted for collaborative filtering and recommendation tasks. FMs are characterized by their usage of the inner product of factorized parameters to model pairwise feature interactions, making them highly expressive and powerful. This paper proposes Holographic Factoriz...
Chapter
Full-text available
Generative adversarial networks (GAN)-based approaches have been extensively investigated whereas GAN-inspired regression (i.e., numeric prediction) has rarely been studied in image and video processing domains. The lack of sufficient labeled data in many real-world cases poses great challenges to regression methods, which generally require suffici...
Chapter
Full-text available
Knowledge acquisition and exchange are generally crucial yet costly for both businesses and individuals, especially when the knowledge concerns various areas. Question Answering Communities offer an opportunity for sharing knowledge at a low cost, where community users, many of whom are domain experts, can potentially provide high-quality solutio...
Article
A Brain-Computer Interface (BCI) acquires brain signals, analyzes and translates them into commands that are relayed to actuation devices for carrying out desired actions. With the widespread connectivity of everyday devices realized by the advent of the Internet of Things (IoT), BCI can empower individuals to directly control objects such as smart...
Preprint
Full-text available
Many well-established recommender systems are based on representation learning in Euclidean space. In these models, matching functions such as the Euclidean distance or inner product are typically used for computing similarity scores between user and item embeddings. This paper investigates the notion of learning user and item representations in Hy...
Preprint
Full-text available
In this paper, we propose a novel sequence-aware recommendation model. Our model utilizes a self-attention mechanism to infer the item-item relationship from the user's historical interactions. With self-attention, it is able to estimate the relative weights of each item in user interaction trajectories to learn better representations for user's transien...
Preprint
Full-text available
Knowledge acquisition and exchange are generally crucial yet costly for both businesses and individuals, especially when the knowledge concerns various areas. Question Answering Communities offer an opportunity for sharing knowledge at a low cost, where community users, many of whom are domain experts, can potentially provide high-quality solutio...
Conference Paper
Modeling user-item interaction patterns is an important task for personalized recommendations. Many recommender systems are based on the assumption that there exists a linear relationship between users and items while neglecting the intricacy and non-linearity of real-life historical interactions. In this paper, we propose a neural network based re...
Article
Ratings are an essential criterion for evaluating the quality of movies and a critical indicator of whether a customer would watch a movie. Therefore, an important related research challenge is to predict the rating of a movie before it is released in cinema or even before it is produced. Many existing approaches fail to address this challenge beca...
Preprint
Full-text available
Random forest and deep neural network are two schools of effective classification methods in machine learning. While the random forest is robust irrespective of the data domain, the deep neural network has advantages in handling high dimensional data. In view that a differentiable neural decision forest can be added to the neural network to fully e...
Chapter
Meetup brings people with similar interests together to do things that matter to them. For example, it provides a platform for getting people who love hiking, coding, running marathons, learning foreign languages together so that they can help, teach and learn from each other. Thanks to the development of web and mobile technologies, organizing the...
Preprint
The dominant, state-of-the-art collaborative filtering (CF) methods today mainly comprise neural models. In these models, deep neural networks, e.g., multi-layered perceptrons (MLP), are often used to model nonlinear relationships between user and item representations. As opposed to shallow models (e.g., factorization-based models), deep models g...
Preprint
In the past decade, matrix factorization has been extensively researched and has become one of the most popular techniques for personalized recommendations. Nevertheless, the dot product adopted in matrix factorization based recommender models does not satisfy the inequality property, which may limit their expressiveness and lead to sub-optimal sol...
Conference Paper
Full-text available
Modeling user-item interaction patterns is an important task for personalized recommendations. Many recommender systems are based on the assumption that there exists a linear relationship between users and items while neglecting the intricacy and non-linearity of real-life historical interactions. In this paper, we propose a neural network based re...
Preprint
Full-text available
Modelling user-item interaction patterns is an important task for personalized recommendations. Many recommender systems are based on the assumption that there exists a linear relationship between users and items, while neglecting the intricacy and non-linearity of real-life historical interactions. In this paper, we propose a neural recommendation...
Preprint
Full-text available
A Brain-Computer Interface (BCI) acquires brain signals, analyzes and translates them into commands that are relayed to actuation devices for carrying out desired actions. With the widespread connectivity of everyday devices realized by the advent of the Internet of Things (IoT), BCI can empower individuals to directly control objects such as smart...
Article
In the past decade, matrix factorization has been extensively researched and has become one of the most popular techniques for personalized recommendations. Nevertheless, the dot product adopted in matrix factorization based recommender models does not satisfy the inequality property, which may limit their expressiveness and lead to sub-optimal sol...
Conference Paper
In this paper, we present a novel structure, Semi-AutoEncoder, based on AutoEncoder. We generalize it into a hybrid collaborative filtering model for rating prediction as well as personalized top-n recommendations. Experimental results on two real-world datasets demonstrate its state-of-the-art performance.
Conference Paper
Full-text available
Collaborative filtering (CF) has been successfully used to provide users with personalized products and services. However, dealing with the increasing sparseness of the user-item matrix remains a challenge. To tackle this issue, hybrid CF such as combining with content based filtering and leveraging side information of users and items has been ex...
Preprint
With the ever-growing volume of online information, recommender systems have been an effective strategy to overcome such information overload. The utility of recommender systems cannot be overstated, given their widespread adoption in many web applications, along with their potential impact to ameliorate many problems related to over-choice. In recent...
Article
Full-text available
In this paper, we first present a novel structure of AutoEncoder, namely Semi-AutoEncoder. We generalize it into a designated hybrid collaborative filtering model, which is able to predict ratings as well as to generate personalized top-N recommendations. Experimental results on two real-world datasets demonstrate its state-of-the-art performance.
Article
Collaborative filtering (CF) has been successfully used to provide users with personalized products and services. However, dealing with the increasing sparseness of the user-item matrix remains a challenge. To tackle this issue, hybrid CF such as combining with content based filtering and leveraging side information of users and items has been ex...
Article
Full-text available
Recommender systems have been actively and extensively studied over the past decades. Meanwhile, the boom of Big Data is driving fundamental changes in the development of recommender systems. In this paper, we propose a dynamic intention-aware recommender system to better help users find desirable products and services. Compared to prior...
Article
Full-text available
This paper seeks to investigate the application of semantic web technologies in information resource integration at libraries. First, the authors propose an ontology- and linked-data-driven semantic integration framework to realize (1) the integration of document resources of different types and in different formats; (2) semantic integration of...
