
Zhengyi Yang
UNSW Sydney | UNSW · School of Computer Science and Engineering
Doctor of Philosophy
About
53 Publications
6,316 Reads
423 Citations
Introduction
I am an Associate Lecturer in the School of Computer Science and Engineering, University of New South Wales (UNSW). My primary research interests are graph database systems, distributed graph processing and graph mining.
Publications (53)
Homophily, the tendency of similar nodes to connect, is a fundamental phenomenon in network science and a critical factor in the performance of graph neural networks (GNNs). While existing studies primarily explore homophily in homogeneous graphs, where nodes share the same type, real-world networks are often more accurately modeled as heterogeneou...
In recent years, data management and analytics have attracted significant attention from both academia and industry, driven by the rapid growth in the volume, velocity, and variety of data generation[...]
Graphs are a commonly used model in data mining to represent complex relationships, with nodes representing entities and edges representing relationships. However, graphs have limitations in modeling high-order relationships. In contrast, hypergraphs offer a more versatile representation, allowing edges to join any number of nodes. This capability...
This report summarizes the program and outcomes of the 3rd International Workshop on Large-Scale Graph Data Analytics (LSGDA 2024). The workshop was held in conjunction with the VLDB 2024 conference in Guangzhou, China, on August 26, 2024. The aim of the workshop was to provide a forum for researchers from academia and industry to exchange ideas, t...
As one of the most male-dominated industries, the Australian construction industry is known for low representation of women in the workforce. Previous studies have indicated that women face significant obstacles in their career development and struggle to attain managerial positions. Building on past research, this study conducted an exp...
In the context of searching a single data graph G, graph pattern matching finds all occurrences of a pattern graph Q in G, as specified by a matching rule. It is of paramount importance in many real applications such as social network analysis and cyber security, among others. A wide spectrum of studies target general graph pattern matching....
Deep learning has attracted wide attention recently because of its excellent feature representation ability and end-to-end automatic learning method. Especially in clinical medical imaging diagnosis, the semi-supervised deep learning model is favored and widely used because it can make maximum use of a limited number of labeled data and combine it...
In few-shot semantic segmentation (FSS), the key challenges are efficiently tuning the interaction between the support set and the query set and distinguishing between context, background, and interfering items. To address these challenges, we propose prototype comparison networks for one-shot segmentation (OPCN) to capture the details required for...
JIT (Just-in-Time) technology has garnered significant attention for improving the efficiency of database execution. It offers higher performance by eliminating interpretation overhead compared to traditional execution engines. LLVM serves as the primary JIT architecture and has been supported in PostgreSQL since version 11. However, recent advance...
The k nearest neighbor (kNN) join operation is a fundamental task that combines two high-dimensional databases, enabling data points in the User dataset U to identify their k nearest neighbor points from the Item dataset I. This operation plays a crucial role in various domains, including knowledge discovery, data mining, similarity search applicat...
With the rapid development of artificial intelligence, machine learning is gradually becoming popular for predictions in all walks of life. In meteorology, it is gradually competing with traditional climate predictions dominated by physical models. This survey aims to consolidate the current understanding of Machine Learning (ML) applications in we...
With the rapid development of artificial intelligence, machine learning is gradually becoming popular in predictions in all walks of life. In meteorology, it is gradually competing with traditional climate predictions dominated by physical models. This survey aims to consolidate the current understanding of Machine Learning (ML) applications in wea...
Weather and climate prediction have been crucial in human history for enabling effective agricultural planning, safeguarding against natural disasters, and facilitating strategic decision-making in various sectors. In this context, the need for accurate and timely forecasting is highly significant. Machine learning has the potential to improve the...
Given a user dataset U and an object dataset I, a kNN join query in high-dimensional space returns the k nearest neighbors of each object in dataset U from the object dataset I. The kNN join is a basic and necessary operation in many applications, such as databases, data m...
Structural Graph Clustering (SCAN) is a fundamental problem in graph analysis and has received considerable attention recently. Existing distributed solutions either lack efficiency or suffer from high memory consumption when addressing this problem in billion-scale graphs. Motivated by these, in this paper, we aim to devise a distributed algorithm...
Given a user dataset U and an object dataset I, a kNN join query in high-dimensional space returns the k nearest neighbors of each object in dataset U from the object dataset I. The kNN join is a basic and necessary operation in many applications, such as databases, data mining, computer vision, multi-media, machine learning, recommendation syste...
Given a user dataset U and an object dataset I, a kNN join query in high-dimensional space returns the k nearest neighbors of each object in dataset U from the object dataset I. The kNN join is a basic and necessary operation in many applications, such as databases, data mining, computer vision, multi-media, machine learning, recommendation systems...
Given a user dataset U and an object dataset I, a kNN join query in high-dimensional space returns the k nearest neighbors of each object in dataset U from the object dataset I. The kNN join is a basic and necessary operation in many applications, such as databases, data mining, computer vision, multi-media, machine learning, recommendation system...
k nearest neighbours (kNN) queries are fundamental in many applications, ranging from data mining, recommendation systems and the Internet of Things, to Industry 4.0 framework applications. In mining, specifically, it can be used for the classification of human activities, iterative closest point registration and pattern recognition and has also been he...
Pests and diseases are an inevitable problem in agricultural production, causing substantial economic losses yearly. The application of convolutional neural networks to the intelligent recognition of crop pest images has become increasingly popular due to advances in deep learning methods and the rise of large-scale datasets. However, the diversity...
Hop-constrained s-t simple path (HC-s-t path) enumeration is a fundamental problem in graph analysis. Existing solutions for this problem focus on unlabelled graphs and assume queries are issued without any label constraints. However, in many real-world applications, graphs are edge-labelled an...
Given a user dataset U and an object dataset I in high-dimensional space, a kNN join query retrieves, for each object in dataset U, its k nearest neighbors from the dataset I. kNN join is a fundamental and essential operation in applications from many domains such as databases, computer vision, multi-media, machine learning, recommendation systems, and m...
Uncertain graphs are graphs where each edge is assigned a probability of existence. In this paper, we study the problem of hop-constrained s-t simple path enumeration in large uncertain graphs. To the best of our knowledge, we are the first to study this problem in the literature. Specifically, we propose a lightweight index to prune candidat...
Subgraph enumeration is a fundamental problem in graph analytics, which aims to find all instances of a given query graph on a large data graph. In this paper, we propose a system called HUGE to efficiently process subgraph enumeration at scale in the distributed context. HUGE features 1) an optimiser to compute an advanced execution plan without t...
Subgraph matching is a basic operation widely used in many applications. However, due to its NP-hardness and the explosive growth of graph data, it is challenging to compute subgraph matching, especially in large graphs. In this paper, we aim at scaling up subgraph matching on a single machine using FPGAs. Specifically, we propose a CPU-FPGA co-des...
There are many real-world application domains where data can be naturally modelled as a graph, such as social networks and computer networks. Relational Database Management Systems (RDBMS) find it hard to capture the relationships and inherent graph structure of data and are inappropriate for storing highly connected data; thus, graph databases hav...
Graphs are widely used to model the intricate relationships among objects in a wide range of applications. The advance in graph data has brought significant value to artificial intelligence technologies. Recently, a number of graph database systems have been developed. In this paper, we present a comprehensive overview and empirical investigation o...
Graph pattern matching is one of the most fundamental problems in graph databases and is associated with a wide spectrum of applications. Due to its computational intensiveness, researchers have primarily devoted their efforts to improving the performance of the algorithm while constraining the graphs to have singular labels on vertices (edges) or n...
Many distributed algorithms have recently emerged that aim to solve subgraph matching at scale. Existing algorithm-level comparisons failed to provide a systematic view of the pros and cons of each algorithm, mainly due to the intertwining of strategy and optimization. In this paper, we identify four strategies and three general-purpose optimizati...
Many distributed algorithms have recently emerged that aim to solve subgraph matching at scale. Existing algorithm-level comparisons failed to provide a systematic view of distributed subgraph matching, mainly due to the intertwining of strategy and optimization. In this paper, we identify four strategies and three general-purpose optimizations fr...