Philip S. Yu

University of Illinois at Chicago | UIC · Department of Computer Science

About

1,539 Publications
349,087 Reads
87,504 Citations

Publications (1,539)
Conference Paper
This paper proposes a novel Quantum Spatial Graph Convolutional Neural Network (QSGCNN) model that can directly learn a classification function for graphs of arbitrary sizes. The main idea is to define a new quantum-inspired spatial graph convolution associated with pre-transformed fixed-sized aligned grid structures of graphs, in terms of quantum...
Article
This paper proposes a new Quantum Spatial Graph Convolutional Neural Network (QSGCNN) model that can directly learn a classification function for graphs of arbitrary sizes. Unlike state-of-the-art Graph Convolutional Neural Network (GCNN) models, the proposed QSGCNN model incorporates the process of identifying transitive aligned vertices between g...
Article
Most state-of-the-art feature selection methods tend to overlook the structural relationship between a pair of samples associated with each feature dimension, which may encapsulate useful information for refining the performance of feature selection. Moreover, they usually consider candidate feature relevancy equivalent to selected feature relevanc...
Conference Paper
Full-text available
Intent detection is one of the core components of goal-oriented dialog systems, and detecting out-of-scope (OOS) intents is also a practically important skill. Few-shot learning is attracting much attention to mitigate data scarcity, but OOS detection becomes even more challenging. In this paper, we present a simple yet effective approach, discrimi...
Article
Full-text available
Network embedding has been increasingly employed in network analysis as it can learn node representations that encode the network structure resulting from node interactions. In this paper, we propose to embed not only the network structure, but also the interaction content within which each interaction arises. The interaction content should better...
Article
Full-text available
Collaborative filtering (CF) is a widely adopted technique in recommender systems. Traditional CF models mainly focus on predicting user preferences for items in a single domain, such as the movie domain or the music domain. A major challenge for such models is data sparsity, and especially, CF cannot make accurate predictions for the cold-st...
Preprint
Full-text available
Recent years have witnessed the fast development of the emerging topic of Graph Learning based Recommender Systems (GLRS). GLRS mainly employ advanced graph learning approaches to model users' preferences and intentions as well as items' characteristics and popularity for Recommender Systems (RS). Unlike conventional RS, including con...
Article
Full-text available
Highly-available datastores are widely deployed for Internet-based applications. However, many Internet-based applications are not content with the simple data access interface provided by highly-available datastores. Distributed transaction support is demanded by applications such as massive online payment used by Alipay, PayPal or Baidu Wallet....
Preprint
Full-text available
Human knowledge provides a formal understanding of the world. Knowledge graphs that represent structural relations between entities have become an increasingly popular research direction towards cognition and human-level intelligence. In this survey, we provide a comprehensive review of knowledge graphs covering overall research topics about 1) know...
Preprint
Within-basket recommendation reduces the exploration time of users, where the user's intention of the basket matters. The intent of a shopping basket can be retrieved from both user-item collaborative filtering signals and multi-item correlations. By defining a basket entity to represent the basket intent, we can model this problem as a basket-item...
Article
Full-text available
Currently, many intelligence systems contain texts from multiple sources, e.g., bulletin board system posts, tweets and news. These texts can be “comparative” since they may be semantically correlated and thus provide us with different perspectives toward the same topics or events. To better organize the multi-sourced texts and obtain more compreh...
Article
Full-text available
Recently, recommender systems have received an increasing amount of attention from researchers due to their indispensable role in increasingly popular e-commerce websites. Although many methods have been proposed for warm-start recommendation, cold-start recommendation remains open as one of the major challenges of recommender syst...
Article
Full-text available
Traditional stock market prediction methods commonly only utilize the historical trading data, ignoring the fact that stock market fluctuations can be impacted by various other information sources such as stock-related events. Although some recent works propose event-driven prediction approaches by considering the event data, how to leverage the jo...
Preprint
Ensuring the privacy of sensitive data used to train modern machine learning models is of paramount importance in many areas of practice. One recent popular approach to studying these concerns is to use differential privacy via a "teacher-student" model, wherein the teacher provides the student with useful, but noisy, information, hopefully allowin...
Preprint
Full-text available
Information systems have widely been the target of malware attacks. Traditional signature-based malicious program detection algorithms can only detect known malware and are prone to evasion techniques such as binary obfuscation, while behavior-based approaches highly rely on the malware training samples and incur prohibitively high training cost. T...
Preprint
Cross-domain recommendation can alleviate the data sparsity problem in recommender systems. To transfer the knowledge from one domain to another, one can either utilize the neighborhood information or learn a direct mapping function. However, all existing methods ignore the high-order connectivity information in cross-domain recommendation area and...
Conference Paper
Full-text available
In the recent decade, utility mining has attracted great attention, but most of the existing studies are developed to deal with itemset-based data. Different from itemset-based data, time-ordered sequence data is more commonly seen in real-world situations. Current utility mining algorithms have limitations when dealing with sequence data...
Article
Full-text available
The main purpose of data mining and analytics is to find novel, potentially useful patterns that can be utilized in real-world applications to derive beneficial knowledge. For identifying and evaluating the usefulness of different kinds of patterns, many techniques/constraints have been proposed, such as support, confidence, sequence order, and uti...
Article
Recently, recommender systems have played a pivotal role in alleviating the problem of information overload. Latent factor models have been widely used for recommendation. The unified latent factor only represents the characteristics of users and the properties of items from the aspect of purchase history. Moreover, the latent factor models usually use th...
Article
Full-text available
As there are various data mining applications involving network analysis, network embedding is frequently employed to learn latent representations or embeddings that encode the network structure. However, existing network embedding models are only designed for a single network scenario. It is common that nodes can have multiple types of relationshi...
Conference Paper
With the increasing popularity and diversity of social media, users tend to join multiple social platforms to enjoy different types of services. User identity linkage, which aims to link identical identities across different social platforms, has attracted increasing research attention recently. Existing methods usually focus on pairwise identity...
Conference Paper
Full-text available
Network embedding, as a promising approach to network representation learning, is capable of supporting various subsequent network mining and analysis tasks, and has attracted growing research interest recently. Traditional approaches assign each node an independent continuous vector, which will cause memory overhead for large networks. In thi...
Article
Full-text available
A wide range of complex systems can be modeled as networks with corresponding constraints on the edges and nodes, which have been extensively studied in recent years. Nowadays, with the progress of information technology, systems that contain the information collected from multiple perspectives have been generated. The conventional models designed...
Conference Paper
Full-text available
Nowadays, it is common for one natural person to join multiple social networks to enjoy different kinds of services. Linking identical users across multiple social networks, also known as social network alignment, is an important and challenging research problem. Existing methods usually link social identities on the pairwise sample level, whic...
Article
Network embedding has been widely employed in networked data mining applications as it can learn low-dimensional and dense node representations from the high-dimensional and sparse network structure. While most existing network embedding methods only model the proximity between two nodes regardless of the order of the proximity, this paper proposes...
Article
Infectious diseases pose a constant and serious threat to human life. One way to prevent infectious disease spread is through active surveillance: monitoring patients to discover disease incidences before they get out of hand. However, active surveillance can be difficult to implement, especially when the monitored area is vast and resources are li...
Article
Full-text available
In recent years, various online social networks offering specific services have gained great popularity and success. To enjoy more online social services, some users can be involved in multiple social networks simultaneously. A challenging problem in social network studies is to identify the common users across networks to gain better understanding...
Preprint
Full-text available
This paper presents a novel framework, MGNER, for Multi-Grained Named Entity Recognition where multiple entities or entity mentions in a sentence could be non-overlapping or totally nested. Different from traditional approaches regarding NER as a sequential labeling task and annotate entities consecutively, MGNER detects and recognizes entities on...
Article
Stationless bike-sharing systems such as Mobike are currently becoming extremely popular in China as well as some other big cities in the world. Compared to traditional bicycle-sharing systems, stationless bike-sharing systems do not need bike stations. Users can rent and return bikes at arbitrary locations through an App installed on their smart p...
Preprint
Full-text available
With the fast development of various positioning techniques such as Global Positioning System (GPS), mobile devices and remote sensing, spatio-temporal data has become increasingly available nowadays. Mining valuable knowledge from spatio-temporal data is critically important to many real world applications including human mobility understanding, smar...
Preprint
Events happen in the real world and in real time, and they can be planned and organized occasions involving multiple people and objects. Social media platforms publish many text messages containing public events with comprehensive topics. However, mining social events is challenging due to the heterogeneous event elements in texts and explicit and...
Preprint
Full-text available
CNNs, RNNs, GCNs, and CapsNets have shown significant insights in representation learning and are widely used in various text mining tasks such as large-scale multi-label text classification. However, most existing deep models for multi-label text classification consider either the non-consecutive and long-distance semantics or the sequential seman...
Article
Full-text available
With the growing popularity of shared resources, large volumes of complex data of different types are collected automatically. Traditional data mining algorithms generally have problems and challenges including huge memory cost, low processing speed, and inadequate hard disk space. As a fundamental task of data mining, sequential pattern mining (SP...
Preprint
We incorporate self activation into influence propagation and propose the self-activation independent cascade (SAIC) model: nodes may be self activated besides being selected as seeds, and influence propagates from both selected seeds and self activated nodes. Self activation reflects the real-world scenarios such as people naturally share product...
Preprint
Privacy-preserving deep learning is crucial for deploying deep neural network based solutions, especially when the model works on data that contains sensitive information. Most privacy-preserving methods lead to undesirable performance degradation. Ensemble learning is an effective way to improve model performance. In this work, we propose a new me...
Preprint
Full-text available
In the field of data mining and analytics, utility theory from economics can bring benefits in many real-life applications. In the recent decade, a new research field called utility-oriented mining has attracted great attention. Previous studies, however, have the limitation that they rarely consider the inherent correlation of items among p...
Article
Mobile app recommendation has been an effective solution to overcoming the information overload in mobile app markets. Recent studies have demonstrated the power of neural networks in recommendation tasks, which is, however, rarely exploited for mobile apps. As one of the developments in neural networks, attention-based models have shown promising result...
Preprint
Full-text available
Online knowledge libraries refer to the online data warehouses that systematically organize and categorize the knowledge-based information about different kinds of concepts and entities. In the era of big data, the setup of online knowledge libraries is an extremely challenging and laborious task, in terms of efforts, time and expense required in t...
Conference Paper
Graph neural network, as a powerful graph representation technique based on deep learning, has shown superior performance and attracted considerable research interest. However, it has not been fully explored for heterogeneous graphs, which contain different types of nodes and links. The heterogeneity and rich semantic infor...
Conference Paper
Full-text available
Program or process is an integral part of almost every IT/OT system. Can we trust the identity/ID (e.g., executable name) of the program? To avoid detection, malware may disguise itself using the ID of a legitimate program, and a system tool (e.g., PowerShell) used by the attackers may have the fake ID of another common software, which is less sens...
Preprint
Full-text available
Graph neural network (GNN), as a powerful representation learning model on graph data, attracts much attention across various disciplines. However, recent studies show that GNN is vulnerable to adversarial attacks. How to make GNN more robust? What are the key vulnerabilities in GNN? How to address the vulnerabilities and defend GNN against the ad...
Article
Network embedding, as a promising approach to node representation learning, is capable of supporting various downstream network mining tasks, and has attracted growing research interest recently. Existing approaches mostly focus on learning the low-dimensional node representations by preserving the local or global topology information of a static netwo...
Preprint
Full-text available
High-utility sequential pattern mining is an emerging topic in the field of Knowledge Discovery in Databases. It consists of discovering subsequences having a high utility (importance) in sequences, referred to as high-utility sequential patterns (HUSPs). HUSPs can be applied to many real-life applications, such as market basket analysis, E-commerc...
Chapter
Radiology report writing is error-prone, time-consuming and tedious for radiologists. Medical reports are usually dominated by a large number of normal findings, and the abnormal findings are few but more important. Current report generation methods often fail to depict these prominent abnormal findings. In this paper, we propose a model named Atte...
Conference Paper
Predicting information diffusion in social networks has attracted substantial research efforts. For a specific user in a social network, whether to forward a contagion is impacted by complex interactions from both her neighboring users and the recent contagions she has been involved in, which is difficult to capture in a unified model. To addres...
Preprint
Full-text available
In the recent decade, utility mining has attracted great attention, but most of the existing studies are developed to deal with itemset-based data. Different from itemset-based data, time-ordered sequence data is more commonly seen in real-world situations. Current utility mining algorithms have limitations when dealing with sequence data...
Article
Heterogeneous information networks (e.g. cloud service relation networks and social networks), where multiple-typed objects are interconnected, can be structured by big graphs. A major challenge for clustering in such big graphs is the complex structures that can generate different results, carrying many diverse semantic meanings. In order to gener...
Article
Agent advising is one of the key approaches to improving agent learning performance by enabling agents to ask each other for advice. Existing agent advising approaches have two limitations. The first limitation is that all the agents in a system are assumed to be friendly and cooperative. However, in the real world, malicious agents may exist...
Article
In this paper, we propose a Distributed Intelligent Video Surveillance (DIVS) system using Deep Learning (DL) algorithms and deploy it in an edge computing environment. We establish a multi-layer edge computing architecture and a distributed DL training model for the DIVS system. The DIVS system can migrate computing workloads from the network cent...
Preprint
Question-answering plays an important role in e-commerce as it allows potential customers to actively seek crucial information about products or services to help their purchase decision making. Inspired by the recent success of machine reading comprehension (MRC) on formal documents, this paper explores the potential of turning customer reviews int...
Conference Paper
Full-text available
Nowadays, more and more customers prefer to browse and purchase products using mobile E-Commerce Apps such as Taobao and Amazon. Since merchants are usually inclined to write redundant and over-informative product titles to attract customers' attention, it is important to concisely display short product titles on the limited screen of mobile...
Preprint
Nowadays, more and more customers prefer to browse and purchase products using mobile E-Commerce Apps such as Taobao and Amazon. Since merchants are usually inclined to write redundant and over-informative product titles to attract customers' attention, it is important to concisely display short product titles on the limited screen of mobile...
Article
With the direct input–output connections, a random vector functional link (RVFL) network is a simple and effective learning algorithm for single-hidden layer feedforward neural networks (SLFNs). RVFL is a universal approximator for continuous functions on compact sets with fast learning property. Owing to its simplicity and effectiveness, RVFL has...
Preprint
Full-text available
Currently, many intelligence systems contain texts from multiple sources, e.g., bulletin board system (BBS) posts, tweets and news. These texts can be “comparative” since they may be semantically correlated and thus provide us with different perspectives toward the same topics or events. To better organize the multi-sourced texts and obtain more...
Preprint
Network embedding, as a promising approach to network representation learning, is capable of supporting various subsequent network mining and analysis tasks, and has attracted growing research interest recently. Traditional approaches assign each node an independent continuous vector, which will cause huge memory overhead for large networks. I...
Preprint
Two-stream convolutional networks have shown strong performance in video action recognition tasks. The key idea is to learn spatiotemporal features by fusing convolutional networks spatially and temporally. However, it remains unclear how to model the correlations between the spatial and temporal structures at multiple abstraction levels. First, th...
Article
Full-text available
With the development of social media, data often come from a variety of sources in different modalities. These data contain complementary information that can be used to produce better learning algorithms. Such data exhibit dual heterogeneity: On the one hand, data obtained from multiple modalities are intrinsically different; on the other hand, fe...
Article
Full-text available
Due to the availability of diverse information, online social networks are highly heterogeneous and are formally defined as Heterogeneous Information Networks (HINs). Meanwhile, users usually participate in multiple networks simultaneously, but not all of them are well-labeled. By transferring the information from the well-labeled source netwo...
Preprint
Full-text available
Utility-oriented mining, which integrates utility theory and data mining, is a useful tool for understanding economic consumer behavior. Traditional algorithms for mining high-utility patterns (HUPs) apply a single/uniform minimum high-utility threshold (minutil) to obtain the set of HUPs, but in some real-life circumstances, some specific products...
Preprint
Full-text available
Knowledge extraction from databases is a fundamental task in the database and data mining community, which has been applied to a wide range of real-world applications and situations. Different from the support-based mining models, the utility-oriented mining framework integrates utility theory to provide more informative and useful patterns. Time-...