Zhiang Wu

Nanjing Audit University · School of Information Engineering

PhD

About

Publications: 123
Reads: 28,075
Citations: 1,577
Introduction
My research interests include: big data computing, data mining and recommender systems. Also, I am interested in various industry applications of big data, especially the audit scenario.
Additional affiliations
May 2020 - December 2020
Nanjing Audit University
Position
  • Professor
December 2009 - present
Nanjing University of Finance and Economics
Position
  • Research
September 2004 - October 2009
Southeast University (China)
Position
  • PhD Student

Publications (123)
Article
Full-text available
In recent years, short text topic modeling has drawn considerable attention from interdisciplinary researchers. Various customized topic models have been proposed to tackle the semantic sparseness of short texts. Most (if not all) of them follow the bag-of-words assumption, which, however, is not adequate since word order and phrases are of...
Article
Much recent research has shed light on the development of the relation-dependent but content-independent framework for social spammer detection. This is largely because the relations among users are difficult to alter when spammers attempt to conceal their malicious intent. Our study investigates the spammer detection problem in the context of...
Preprint
Much recent research has shed light on the development of the relation-dependent but content-independent framework for social spammer detection. This is largely because the relations among users are difficult to alter when spammers attempt to conceal their malicious intent. Our study investigates the spammer detection problem in the context of...
Article
Community detection has long been a fundamental problem in network analysis. A great deal of previous research has regarded community detection as an optimization process, where a variety of internal quality metrics are typically treated as objective functions, such as modularity (Q) and weighted community clustering (WCC). However, purely optimizi...
Article
Full-text available
The association-rule-based approach is one of the most common technologies for building recommender systems and it has been extensively adopted for commercial use. A variety of techniques, mainly including eligible rule selection and multiple rules combination, have been developed to create effective recommendations. Unfortunately, little attention...
Article
Full-text available
As an e-commerce feature, personalized recommendation is invariably highly valued by both consumers and merchants. E-tourism has become one of the hottest industries with the adoption of recommendation systems. Several lines of evidence have confirmed that travel-product recommendation is quite different from traditional recommendations. Tra...
Article
Personalized itinerary recommendation is a complicated and challenging task, which aims to construct and recommend a visit sequence consisting of multiple Points of Interest (POIs) while maximizing user satisfaction and adhering to a time budget. User interests, therefore, become the most crucial element in the recommendation task,...
Article
In all areas of engineering, catastrophe assessment is an essential prerequisite for remedial action schemes. Modelers constantly push for more accurate models, and often meet goals by using increasingly complex, data mining-based blackbox models. However, system operators tend to favor interpretable models for after-the-fact preventive control (PC...
Article
In this study, we consider the purchase prediction problem in the context of e-tourism, an emerging and prevailing application in e-commerce. Although a wide array of studies have been conducted on purchase prediction, little analysis has been done on purchasing behaviors towards tourism products. Also, the design of the corresponding purchase pred...
Article
Various online content on Internet platforms or search engines is related to corporate reputation. Facing the huge amount of online content, we need a mining method that can automatically extract and analyze a large amount of network‐related information and obtain the real reliability of aspect for the content claimed by companies. In this p...
Article
Recent advances have verified that ground-truth communities exhibit several characteristics. That is, communities are overlapped and densely connected. Not only that, the organization of communities, in a general sense, is hierarchical. To capture all of these characteristics, we propose a framework based on a link embedding method. Firstly, we define cl...
Article
Link prediction is an important task in complex network analysis and can be found in many real-world applications such as recommendation systems, information retrieval, and marketing analysis of social networks. This paper focuses on studying the evolution mechanism of real-world temporal networks. Specifically, given a set of temporal links during...
Preprint
Accurate house prediction is of great significance to various real estate stakeholders such as house owners, buyers, investors, and agents. We propose a location-centered prediction framework that differs from existing work in terms of data profiling and prediction model. Regarding data profiling, we define and capture a fine-grained location profi...
Article
Full-text available
In a cloud environment, the primary way to optimize physical resources is to reuse a physical machine (PM) by consolidating complementary multiple virtual machines (VMs) on it. When considering VMs’ dynamically changing resource demands, one hot research topic revolves around reusing VM migration resources more efficiently. The challenge here is fi...
Article
Full-text available
Next-app prediction is the task of predicting the next app that a user will choose to use on the smartphone. It helps to establish a variety of intelligent personalized services, such as fast-launch UI app, intelligent user-phone interactions, etc. Since app names only provide limited semantic information, the intrinsic relation among apps cannot b...
Article
Spammers, who manipulate online reviews to promote or suppress products, are flooding online commerce. To combat this trend, there has been a great deal of research focused on detecting review spammers, most of which design diversified features and thus develop various classifiers. The widespread growth of crowdsourcing platforms has created lar...
Article
Networks have become increasingly important to model many complex systems. This powerful representation has been employed in different tasks of artificial intelligence including machine learning, expert and intelligent systems. Link prediction, a branch of network pattern recognition, is the most fundamental and essential problem for complex networ...
Article
Obtaining accurate location information of tracked objects is the cornerstone of providing high-quality location-based services (LBSs). Recently, passive localization has become an increasingly important research theme, attracting attention from both academic and industrial communities. However, existing Wi-Fi fingerprinting based passive locali...
Conference Paper
This paper aims to analyze the intrinsic characteristics of urban crime in China by quantifying the crime data in the original case record. By comparing the predicted result of the crime situation with the observation, the intrinsic characteristic and its law are validated. Firstly, a quantitative method of case information based on Chinese descrip...
Conference Paper
Full-text available
With the rapid development of tourism e-commerce, the amount of online tourist behavioral data is growing at an explosive speed. Online purchase analysis that makes full use of the behavioral data is undoubtedly crucial to achieving precision marketing. Along this line, this paper offers an empirical analysis on online purchase of tourism product...
Chapter
Full-text available
Since the relation is the main data shape of social networks, social spammer detection desperately needs a relation-dependent but content-independent framework. Some recent detection methods transform the social relations into a set of topological features, such as degree, k-core, etc. However, the multiple heterogeneous relations and the direction...
Article
The effect of different protrusion amounts on various parameters (static pressure, total pressure, sealing efficiency) was experimentally studied by measuring carbon dioxide volume fraction, so as to obtain the change rule of the sealing efficiency and the minimum flow rate of sealing air of different sealing structures. In the experiment, the expe...
Article
Nowadays, online product reviews play a crucial role in the purchase decision of consumers. A high proportion of positive reviews will bring substantial sales growth while negative reviews will cause sales loss. Driven by the immense financial profits, many spammers try to promote their products or demote their competitors’ products by posting fake...
Article
Full-text available
Since the community structure is able to reveal the potential law behind complex networks, mining hidden communities has gained particular attention from various applications. A variety of objective functions, such as Modularity, weighted clustering coefficient (WCC), etc., have been developed to characterize the cohesiveness of a community, and th...
Article
Online social media is able to convey rich and timely information about real-world events. Uncovering events on social media and sensing topics from them can acquire much valuable information, which has attracted significant research effort. However, due to the large scale of data, to detect events or topics in real time is still a challenging prob...
Article
Full-text available
Consensus clustering aims to find a single partition of data that agrees as much as possible with existing basic partitions. Given its robustness and generalizability, consensus clustering has emerged as a promising solution to find cluster structures inside heterogeneous big data rising from various application domains. In the area of fuzzy system...
Article
Full-text available
Travel products recommendation has become one of the emerging issues in the realm of recommendation systems. The widely-used collaborative filtering algorithms are usually difficult to use for recommending travel products due to a number of reasons, including (1) the content of travel products is very complex, (2) the user-item matrix is extremely...
Chapter
This chapter reviews both commercial Ad fraud detection and prevention systems and the ones developed in academia. Commercial systems mainly emphasize efficiency, so fraud detection can be achieved at pre-auction level (e.g. less than 10 ms). The systems developed in academia are often more sophisticated in their designs and mathem...
Chapter
In this chapter, we first propose a taxonomy to summarize fraud in online digital advertising. The taxonomy provides a complete view of major fraudulent activities in answering questions related to Who does What to Whom, and How. The proposed fraud taxonomy includes three major categories: placement oriented fraud, network traffic oriented fraud, a...
Chapter
This chapter provides a comprehensive review of Ad fraud in three major categories: placement fraud, traffic fraud, and action fraud, which are at different levels of online advertising. Placement fraud mainly focuses on the pages that display the Ads. Placement-oriented fraudulent activities often modify publisher pages or the web pa...
Chapter
In this chapter, we briefly describe the digital advertising ecosystem, mainly from the display advertising perspective. We will first describe the real-time bidding framework for online digital advertising, including technical platforms for publishers, advertisers, and the market place for online Ad inventory buying and selling. After that, we wil...
Chapter
Online advertising fraud represents a significant portion of deceiving actions in digital advertising systems which use numerous technologies to derive illicit returns. Even the most conservative estimation has shown that more than 10% of Ad inventory is consumed by bot or fraud impressions. Despite the fast growth of the computational advertisi...
Chapter
In this chapter, we discuss measures and benchmark datasets commonly used for Ad fraud detection. The measures include fraud detection accuracy, precision, recall, F-measure, and AUC scores which are commonly used to validate the performance of classifiers for classification. In addition, we also summarize several real-world datasets which are curr...
Article
In research on the mechanical properties of nanowires, welding methods such as FIB and EBID, which are the most common methods used to fix nanowires, reduce device lifetime and repeatability. In order to overcome the shortcomings of the welding method, a new MEMS device that requires no welding or deposition was designed in this paper. Based on the...
Article
Full-text available
MapReduce is a widely-used programming model in cloud environments for parallel processing of large-scale data sets. The combination of the high-level language with a SQL-to-MapReduce translator allows programmers to code using SQL-like declarative language, so that each program can afterwards be compiled into a MapReduce jobflow automatically. This wa...
Book
The authors systematically review methods of online digital advertising (ad) fraud and the techniques to prevent and defeat such fraud in this brief. The authors categorize ad fraud into three major categories, including (1) placement fraud, (2) traffic fraud, and (3) action fraud. It summarizes major features of each type of fraud, and also outlin...
Conference Paper
Full-text available
Identifying the right customers who intend to replace their smartphones can help to perform precision marketing and thus bring significant financial gains to cellphone retailers. In this paper, we provide a study of exploiting mobile app usage for predicting users who will change the phone in the future. We first analyze the characteristics of mobile log...
Conference Paper
Online information has become an important data source to analyze public opinion and behavior, which is significant for social management and business decisions. Web crawler systems aim to automatically download and parse web pages to extract expected online information. However, with the rapid increase of web pages and the heterogeneous page st...
Article
Based on the fractal geometry theory, considering the wettability of the fluid and gas-water two-phase flow in the capillary, a gas-water relative permeability model was established to study the gas-water two-phase flow in tight sandstone porous media. The analytic calculation formula of gas-water relative permeability was derived. The results show...
Article
Full-text available
Community detection is a classic and very difficult task in complex network analysis. With the explosion of social media, scaling community detection methods to large networks has attracted considerable recent interest. In this paper, we propose a novel SIMPLifying and Ensembling (SIMPLE) framework for parallel community detection. It e...
Article
Full-text available
Purpose – With the popularity of e-commerce, shilling attacks are becoming more rampant in online shopping websites. Shilling attackers publish mendacious ratings as well as reviews for promoting or suppressing target products. The purpose of this paper is to investigate the behavior of group shilling, a new type of shilling attack, in a real e-commerce platf...
Article
MapReduce is undoubtedly the most popular framework for large-scale processing and analysis of vast data sets in clusters of machines. To facilitate the easier use of MapReduce, SQL-like declarative languages and SQL-to-MapReduce translators have attracted increasing attention recently. The SQL-to-MapReduce translator can automatically generate th...
Chapter
Full-text available
Multi-relational networks (MRNs for short) are networks whose nodes are of a single type but are associated with each other through multiple relations. MRNs are prevalent in the real world. For example, interactions in social networks include various kinds of information diffusion: email exchange, instant messaging services and so on. Community detection i...
Conference Paper
Full-text available
Stock market prediction focuses on developing approaches to determine the future price of a stock or other financial product. The key task of stock market prediction is to determine the timing for the buying or selling of stock; undoubtedly, this is very difficult due to the high volatility and nonlinear relationships driven by short-term fluctuations...
Conference Paper
Frequent pattern mining is commonly utilized to generate combined-feature candidates, yet many are non-discriminative and thus might be useless for predictive models. In this paper, we propose to use feature combinations derived from frequent patterns to obtain more accurate multiclass classification models. Specifically, we present a novel mathema...
Conference Paper
Uncovering shilling attackers hidden in recommender systems is crucial to enhancing the robustness and trustworthiness of product recommendation. Many shilling attack detection algorithms have been proposed so far, and they exhibit complementary advantages and disadvantages towards various types of attacks. In this paper, we provide a thorough exp...
Conference Paper
Learning user/item relations is a key issue in recommender systems, and existing methods mostly measure the user/item relation from one particular aspect, e.g., historical ratings, etc. However, the relations between users/items could be influenced by multifaceted factors, so any single type of measure could get only a partial view of them. Thus it i...
Article
Full-text available
The backbone is the natural abstraction of a complex network, which can help people understand a networked system in a more simplified form. Traditional backbone extraction methods tend to include many outliers into the backbone. What is more, they often suffer from computational inefficiency: the exhaustive search of all nodes or edges is often...
Article
Full-text available
Discovering communities can promote the understanding of the structure, function and evolution in various systems. Overlapping community detection in poly-relational networks has gained much more interest in recent years, due to the fact that poly-relational networks and communities with pervasive overlap are prevalent in the real world. A plethor...
Article
Full-text available
Recent years have witnessed the explosive growth of recommender systems in various exciting application domains such as electronic commerce, social networking, and location-based services. A great many algorithms have been proposed to improve the accuracy of recommendation, but until recently the long tail problem rising from inadequate recommendat...
Article
Today, the emergence of web-based communities and hosted services such as social networking sites, wikis and folksonomies brings in tremendous freedom of web autonomy and facilitates collaboration and knowledge sharing between users. Along with the interaction between users and computers, social media is rapidly becoming an important part of our di...
Article
Ancient language manuscripts constitute a key part of the cultural heritage of mankind. As one of the most important languages, Chinese historical calligraphy work has contributed to not only the Chinese cultural heritage but also the world civilization at large, especially for Asia. To support deeper and more convenient appreciation of Chinese cal...
Conference Paper
The scale of current networked systems is becoming increasingly large, which poses significant challenges to acquiring the knowledge of the entire graph structure, and most global community detection methods often suffer from computational inefficiency. Local community detection aims at finding a community structure starting from a seed vertex wi...