Article

Efficiently Targeted Billboard Advertising Using Crowdsensing Vehicle Trajectory Data

Abstract

Different from online promotion, the outdoor billboard advertising industry suffers from a lack of audience-targeted delivery and quantitative dissemination evaluation, which undermines its impact in practice and hinders its development. To bridge this gap, in this paper, we leverage crowdsensing vehicle trajectory data to empower audience-targeted billboard advertising. More specifically, by integrating the information of mobility transition, traffic conditions (traffic volume and average speed) and advertisement semantic topics, we propose a quantitative model to quantify advertisement influence spread, with special consideration of influence overlap among mobile users. Based on it, an Influence Maximization-Targeted Billboard Advertising problem is formulated to find $k$ advertising units over spatiotemporal dimensions, with the goal of maximizing the total expected advertisement influence spread. To tackle the efficiency issue of solving a large combinatorial optimization problem, we employ a divide-and-conquer mechanism, and propose a utility evaluation-based optimal searching approach. Extensive experiments on real-world taxicab trajectories clearly validate the effectiveness and efficiency of our proposed approach.
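
The abstract's point about influence overlapping can be illustrated with a small sketch. It is not the paper's model: it assumes, purely for illustration, that each advertising unit influences a user independently with a known probability, so a user reached by several units is counted once through 1 minus the product of the miss probabilities. The names `expected_influence` and `exposure_prob` and all numbers are hypothetical.

```python
from math import prod


def expected_influence(selected_units, exposure_prob):
    """Expected number of influenced users for a set of advertising units.

    exposure_prob[unit][user] is a hypothetical probability that `user`
    is influenced by `unit`. Units are assumed to act independently
    (an assumption of this sketch, not a claim about the paper's model).
    """
    # Collect every user reachable by at least one selected unit.
    users = set()
    for unit in selected_units:
        users.update(exposure_prob[unit])

    total = 0.0
    for user in users:
        # Probability that no selected unit influences this user.
        miss = prod(1.0 - exposure_prob[unit].get(user, 0.0)
                    for unit in selected_units)
        total += 1.0 - miss
    return total


# Toy data: user u1 passes both units, user u2 passes only unit A.
exposure_prob = {
    "A": {"u1": 0.5, "u2": 0.2},
    "B": {"u1": 0.5},
}

overlap_aware = expected_influence(["A", "B"], exposure_prob)
naive_sum = sum(sum(users.values()) for users in exposure_prob.values())
print(overlap_aware, naive_sum)
```

In this toy case the overlap-aware value is lower than the naive sum of per-unit probabilities, because u1 is exposed twice but can only be influenced once.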

... The public transit networks move millions of individuals (or potential consumers) daily in Australia, but the share of transit advertising is just around 1% of the total spending in the Australian advertising market (OMA Australia 2018). One of the reasons for the low share of transit advertising is the lack of consumer profiles, i.e. it is difficult to run targeted advertising (Wang et al. 2019). On the other hand, online advertising covers 44% of the total spending in the Australian advertising market, which is because it is easier to track users' characteristics and behaviour on the Internet through developing online targeted advertising techniques. ...
... The existing literature about using smart card data for advertising purposes is limited to a few studies (Paez et al. 2011; Paez et al. 2012; Wang et al. 2019) focused on potentials of smart card data for marketing and advertising purposes. Paez et al. (2011) suggested using the smart card data for marketing purposes. ...
... Also, they discussed the potential applications of the proposed approach. According to Wang et al. (2019), one of the reasons for the failure of the out of home advertising is the lack of consumer profiles, i.e. it is difficult to run the targeted advertising, which could be addressed by using smart card data. Faroqi et al. (2019b) proposed a simple linear targeted advertising model in the public transit network, which only considered advertisements at public transit stops focusing only on the coverage of the advertisements. ...
Article
Full-text available
A great number of urban residents use the public transit network to travel and reach their destinations. While the public transit network could serve as a valuable medium for advertising purposes, the share of transit advertising in annual advertising spending is low due to the lack of passengers’ profiles. This paper proposes a targeted advertising model in the public transit network based on passengers’ profiles extracted from smart card data. The model exposes advertisements to groups of passengers in the public transit network according to their activities and trips. A targeted group includes passengers with similar activities (considering type, location, and time of the activity) and trips (considering spatial and temporal dimensions of the trip). An agglomerative hierarchical clustering method is used to discover activity-trip groups of passengers according to the defined activity and trip similarity measures. An optimization problem is formulated to allocate advertisements to all activity-trip groups aiming at maximizing the coverage and minimizing the cost of the advertisements. The Non-dominated Sorting Genetic Algorithm II (NSGA-II) is used to solve the optimization problem. A one-day smart card dataset from Brisbane, Australia is used to implement the model and examine the outcomes. Results show that at different cost intervals, solutions with high coverage can be applied to the network targeting all the activity-trip groups of passengers.
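
The trade-off between maximizing coverage and minimizing cost can be made concrete with a minimal sketch of non-dominated filtering, the dominance test at the core of multi-objective methods such as NSGA-II. This is not an NSGA-II implementation and not the cited paper's code; the candidate allocations and their `(coverage, cost)` values are made up for illustration.

```python
def pareto_front(solutions):
    """Keep the solutions that no other solution dominates.

    Each solution is a (coverage, cost) pair; coverage is maximized
    and cost is minimized.
    """
    def dominates(a, b):
        # a is at least as good as b on both objectives
        # and strictly better on at least one.
        no_worse = a[0] >= b[0] and a[1] <= b[1]
        strictly_better = a[0] > b[0] or a[1] < b[1]
        return no_worse and strictly_better

    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]


# Hypothetical advertisement allocations as (coverage, cost) pairs.
candidates = [(0.9, 100), (0.8, 60), (0.7, 80), (0.5, 30)]
front = pareto_front(candidates)
print(front)
```

Here `(0.7, 80)` is dropped because `(0.8, 60)` offers more coverage at lower cost; the remaining points are the trade-offs a decision maker would choose among at different cost levels.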
... Existing works have primarily focused on the selection of billboard locations by means of visual analytics of large-scale taxi trajectories [8], the allocation of billboard ad space to clients based on user trajectory-driven data [14] and crowd-sensed vehicular trajectory data [13]. Our work fundamentally differs from existing works as follows. ...
... Research efforts are also being made to find the optimal set of billboards (in the billboard advertisement scenario) based on the spatial trajectories of the users [13,14]. The approach proposed in [14] uses a database of trajectories and a budget constraint L to find a set of billboards within a budget such that the ads placed on the selected billboards influence the most significant number of trajectories. ...
... It divides the billboards into a set of clusters based on the overlap and maximizes the unique views (influence) in each cluster with a specific budget using dynamic programming and outputs the combined output from each cluster. The work in [13] constructs a quantitative model for maximizing the total expected advertisement influence spread. However, it does not consider the budget of the user and the billboard's cost. ...
Chapter
Full-text available
Billboard advertisement is among the dominant modes of outdoor advertisements. The billboard operator has an opportunity to improve its revenue by satisfying the advertising demands of an increased number of clients by means of exploiting the user trajectory data. Hence, we introduce the problem of billboard advertisement allocation for improving the billboard operator revenue, and propose an efficient user trajectory-based transactional framework using coverage pattern mining. Our experiments validate the effectiveness of our framework.
... In [3] and [4], the billboard locations are determined by using GPS and phone data. Besides, advertising content on the billboard is determined by the preferences of potential customers and the detour distance in [5] and [6]. In the real world, the potential customers passing the same billboard location change over time, and hence the traditional static roadside billboards do not perform well. ...
... In this paper, we adopt Mobile CrowdSensing (MCS) [9][10][11] to gather the privacy-sensitive customer profiles [5,12] such as their vehicular trajectories and preferences. For example, an MCS application may record a user's vehicular trajectories when the user finishes some sensing tasks. ...
... An et al. design an advertisement system by using Wi-Fi union mechanism in order to enhance the efficiency of advertisement. In [5], L. Wang et al. design a model to quantify advertisement influence spread and propose a utility evaluation-based optimal searching approach so that the total expected advertisement influence spread could be maximized. In [6], H. Zheng et al. investigate a promising application for Vehicular Cyber-Physical Systems (VCPS). ...
Article
Full-text available
As an effective tool, roadside digital billboard advertising is widely used to attract potential customers (e.g., drivers and passengers passing by the billboards) to obtain commercial profit for the advertiser, i.e., the attracted customers' payment. The commercial profit depends on the number of attracted customers, hence the advertiser needs to adopt an effective advertising strategy to determine the advertisement switching policy for each digital billboard to attract as many potential customers as possible. Whether a customer could be attracted is influenced by numerous factors, such as the probability that the customer could see the billboard and the degree of his/her interest in the advertisement. Besides, cooperation and competition among all digital billboards will also affect the commercial profit. Taking the above factors into consideration, we formulate the dynamic advertising problem to maximize the commercial profit for the advertiser. To address the problem, we first extract potential customers' implicit information by using the vehicular data collected by Mobile CrowdSensing (MCS), such as their vehicular trajectories and their preferences. With this information, we then propose an advertising strategy based on multi-agent deep reinforcement learning. By using the proposed advertising strategy, the advertiser could determine the advertising policy for each digital billboard and maximize the commercial profit. Extensive experiments on three real-world datasets have been conducted to verify that our proposed advertising strategy could achieve superior commercial profit compared with the state-of-the-art strategies.
... With the dramatic proliferation of sensor-equipped portable mobile devices and wireless communication, a novel sensing paradigm named Mobile Crowd Sensing (MCS) [1]-[3] has become an effective way to sense and collect data about physical environment and human society. Instead of deploying static and expensive distributed sensors, MCS utilizes smartphones equipped with a plethora of onboard sensors (e.g., accelerometer, compass, gyroscope, GPS, camera, etc.) and users' mobility to implement sensing tasks, e.g., air quality, noise level, emergent events, etc. [4]-[6]. ...
... If one user has larger social relation strength with user u_i, it is likely that the motivation to accept and complete this transferred task would be strong. On the basis
ALGORITHM 1: FPSAll Algorithm
Input: MCS tasks: T, Available users: U, Historical mobility: TD
Output: Task allocation solution: S
∀ u_i ∈ U, t_j ∈ T: compute estimated probability p(u_i, t_j)
Build bipartite graph G(U, T, E, W)
for each task t_j ∈ T do
    Identify eligible candidate users U*_{t_j}
    ...
... Input: Failed MCS task: t, Initial performer: u_i, Parameters: ε, σ, w_i, 1 ≤ i ≤ 3
Output: Task successors: u
Identify a subset of participant users U*(t)
for each user u_j in U*(t) do
    Calculate social relation strength SR(u_i, u_j)
    Measure the physical distance between user u_j and t: DS(u_j)
    Take u_j's reputation R(u_j)
    Measure u_j's integrated utility based on Eq. 15
end
Rank U* and select the one with maximum value
Update involved users' reputation and distribute rewards
...
Article
Full-text available
As an appealing sensing paradigm, Mobile Crowd Sensing (MCS), which provides a cost-efficient solution for large-scale urban sensing tasks, has gained significant attention in recent years. However, in practice, many MCS applications suffer from failures of sensing task execution, with causes ranging from the randomness and autonomy of participant users’ behavior to a lack of prior experience and monetary reward. To mitigate the impact of these failures, in this paper, we propose and study a novel problem, namely failure-aware mobile crowd sensing. To solve our problem, we devise a two-stage framework, including offline task allocation and online task transfer. Towards enhancing the task completion ratio, we propose an indeterminate fitness proportionate based task allocation approach, FPSAll, and a utility evaluation-based task transfer approach, FTASKTraf, respectively. Through extensive experiments, we demonstrate the efficiency and effectiveness of our proposed approaches on a real-world data set.
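
The phrase "fitness proportionate" refers to a general selection scheme (often called roulette-wheel selection) in which a candidate is drawn with probability proportional to its fitness. The sketch below shows only that generic scheme, not the FPSAll algorithm itself; the function name, the user identifiers, and the use of a completion probability as the fitness value are assumptions made for illustration.

```python
import random


def fitness_proportionate_select(candidates, fitness, rng):
    """Draw one candidate with probability proportional to its fitness.

    `fitness` maps each candidate to a non-negative score, for example
    a hypothetical estimated probability of completing a task.
    """
    weights = [fitness[c] for c in candidates]
    if sum(weights) <= 0:
        raise ValueError("at least one candidate needs positive fitness")
    # random.Random.choices performs the weighted draw.
    return rng.choices(candidates, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so the sketch is repeatable
users = ["u1", "u2", "u3"]
completion_prob = {"u1": 0.1, "u2": 0.6, "u3": 0.3}
chosen = fitness_proportionate_select(users, completion_prob, rng)
print(chosen)
```

Unlike always picking the highest-scoring user, a weighted draw leaves lower-scoring users some chance of being selected, which is the "indeterminate" aspect the scheme's name suggests.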
... However, to launch a location-aware promotion in geo-social networks, the first assumption may not be true due to users' diverse spatial distributions. For instance, users who are close to a target location have a greater probability to adopt the promoted product [6], [7], e.g., a restaurant or a gym and so on. Therefore, it is necessary to differentiate these potential customers from other users, and attach more importance to them; otherwise, it may direct "wrong audiences" who are not profitable. ...
... In geo-social networks, not all the users are potential customers for Q. Intuitively, users who are close to location q and interested in T # have a greater probability to adopt it [6], [7]. As a result, it is natural to attach more importance to those targeted users. ...
... ALGORITHM 3: IS-MOPSO+ Algorithm
Input: RR Set: R, population size: pop, maximum number of iterations: mt
Output: A set of Pareto solutions: S_pareto
Initialize particle population S, |S| = pop
Set local best position pos_l as initialized particle
Set initial velocity vel to the zero vector
while iteration limit mt is not reached do
    foreach particle S_i ∈ S do
        Calculate fitness of objective functions
        Update local best position p_l for each particle
    end
    Determine Pareto solutions in S
    Update external storage set S_pareto
    foreach particle S_i ∈ S do
        Select a particle randomly from S_pareto as p_g
        Update and calibrate newly generated position
        ...
Article
Full-text available
As an emerging social dynamic system, the geo-social network can be used to facilitate viral marketing through the wide spread of targeted advertising. However, unlike the traditional influence spread problem, the heterogeneous spatial distribution has to be incorporated into the geo-social network environment. Moreover, from the perspective of business managers, it is indispensable to balance the trade-off between the objective of influence spread maximization and the objective of promotion cost minimization. Therefore, these two goals need to be seamlessly combined and optimized jointly. In this paper, considering the requirements of real-world applications, we develop a multiobjective optimization based influence spread framework for geo-social networks, revealing the full view of Pareto-optimal solutions for decision makers. Based on the reverse influence sampling (RIS) model, we propose a similarity matching-based RIS sampling method to accommodate diverse users, and then transform our original problem into a weighted coverage problem. Subsequently, to solve this problem, we propose a greedy-based incremental approximation approach and a heuristic-based particle swarm optimization approach. Extensive experiments on two real-world geo-social networks clearly validate the effectiveness and efficiency of our proposed approaches.
... Wetan Loji 10. ...
... (4) There is high awareness of the audience when using the media, interests and motives become an accurate picture of usage. (5) Evaluation of media content can only be done by audiences [8] [9] [10]. Audience satisfaction can be seen from certain aspects, such as Gratification Sought (the search for satisfaction) and Gratification Obtained (obtained satisfaction). ...
Article
Full-text available
This study aimed to provide input and evaluation for the Surakarta City Communication, Information, Statistics, and Encryption Service in measuring the effectiveness of their information dissemination work programs. Conducted as a descriptive quantitative study with a sample size of N = 30, the study assessed the media consumption patterns of Surakarta City residents and the affordability of the promotional media used by the Solo City Government. The findings of this study revealed that the magazine "Solo Berseri" possesses sufficient attractiveness and informational value, meeting the gratification needs of its readers. However, the limited accessibility of this magazine hampers its effectiveness as a promotional medium. These results emphasize the importance of ensuring accessibility and distribution channels for promotional materials. Addressing the limited accessibility of "Solo Berseri" would enhance its impact as a promotional medium for the Solo City Government. Additionally, the findings suggest that the strategic placement of billboards can effectively reach the target audience and disseminate important information. Based on these findings, recommendations can be made for the Surakarta City Communication, Information, Statistics, and Encryption Service to improve their information dissemination strategies. Implementing measures to increase the accessibility of "Solo Berseri" and exploring additional affordable promotional media channels could enhance the effectiveness of their communication efforts. By incorporating these suggestions, the Surakarta City Government can enhance their communication efforts and ensure that important information reaches and resonates with the residents effectively.
... Now, in the case of any outdoor or online advertising technique, choosing the right audience for the right advertisement is a very challenging task due to the lack of an audience profile. Wang et al. [19] observed that the outdoor advertising industry suffers from a lack of audience-targeted delivery. They introduced a divide-and-conquer-based search approach to bridge this gap. ...
... We adopt this probability setting in our experiments as well, although it can be calculated in several ways depending on the needs of the application [1], [19], [9]. ...
Preprint
Full-text available
Billboard advertising is a popular out-of-home advertising technique adopted by commercial houses. Companies own billboards and offer them to commercial houses on a payment basis. Given a database of billboards with slot information, we want to determine which k slots to choose to maximize influence. We call this the INFLUENTIAL BILLBOARD SLOT SELECTION (IBSS) Problem and pose it as a combinatorial optimization problem. We show that the influence function considered in this paper is non-negative, monotone, and submodular. The incremental greedy approach based on the marginal gain computation leads to a constant factor approximation guarantee. However, this method scales very poorly when the size of the problem instance is very large. To address this, we propose a spatial partitioning and pruned submodularity graph-based approach that is divided into the following three steps: preprocessing, pruning, and selection. We analyze the proposed solution approaches to understand their time and space requirements and performance guarantees. We conduct an extensive set of experiments with real-world datasets and compare the performance of the proposed solution approaches with the available baseline methods. We observe that the proposed approaches lead to more influence than all the baseline methods within reasonable computational time.
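
The incremental greedy approach based on marginal gain can be sketched as follows. This is an illustration only: it assumes the influence of a slot set is simply the number of distinct trajectories it covers (a standard monotone submodular function, for which greedy selection is known to achieve a 1 − 1/e approximation), which may differ from the influence function in the cited work. The slot names and trajectory IDs are hypothetical.

```python
def greedy_select(slot_coverage, k):
    """Pick up to k slots by repeatedly taking the largest marginal gain.

    slot_coverage maps a slot ID to the set of trajectory IDs it covers
    (assumed influence model: count of distinct covered trajectories).
    """
    selected, covered = [], set()
    remaining = set(slot_coverage)
    for _ in range(min(k, len(slot_coverage))):
        # Marginal gain = trajectories this slot adds beyond those covered.
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(remaining),
                   key=lambda s: len(slot_coverage[s] - covered))
        if not slot_coverage[best] - covered:
            break  # no remaining slot adds any new trajectory
        selected.append(best)
        covered |= slot_coverage[best]
        remaining.remove(best)
    return selected, covered


# Toy instance: three slots, four trajectories.
slot_coverage = {
    "s1": {1, 2, 3},
    "s2": {3, 4},
    "s3": {1, 2},
}
chosen_slots, covered_trajectories = greedy_select(slot_coverage, 2)
print(chosen_slots, covered_trajectories)
```

Each round rescans every remaining slot, so the cost grows with both k and the number of slots, which is consistent with the abstract's remark that the plain greedy method scales poorly on large instances.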
... Billboard placement aims to find a limited number of billboards to maximize their influence on passengers and further increase profits. Recently, several studies [76,112,164,201,202] have investigated trajectory-driven billboard placement. Guo et al. [76] proposed the top-k trajectory influence maximization problem, which aims to find k trajectories for deploying billboards on buses to maximize the expected influence based on the audience. ...
... Zhang et al. [202] proposed a logistic influence model which solves a key shortcoming in approaches that depend on one-time impressions [201], which did not consider the relationship between the influence effect and the impression counts for a single user. Wang et al. [164] placed billboards in a road network, and applied a divide-and-conquer strategy to accelerate the processing. As we conclude in Table 7, billboard placement problem definitions always include a budget, which also holds for other site selection problems in practice, since public resource allocation must likewise be considered within a budget. ...
Preprint
Full-text available
Recent advances in sensor and mobile devices have enabled an unprecedented increase in the availability and collection of urban trajectory data, thus increasing the demand for more efficient ways to manage and analyze the data being produced. In this survey, we comprehensively review recent research trends in trajectory data management, ranging from trajectory pre-processing, storage, common trajectory analytic tools, such as querying spatial-only and spatial-textual trajectory data, and trajectory clustering. We also explore four closely related analytical tasks commonly used with trajectory data in interactive or real-time processing. Deep trajectory learning is also reviewed for the first time. Finally, we outline the essential qualities that a trajectory management system should possess in order to maximize flexibility.
... This allowed for a more focused marketing based on the habits of the groups of people. In [50], the authors leveraged crowdsensing vehicle trajectory data to empower audience-targeted billboard advertising by studying the hypothetical advertisement spread based on the semantic topic. There are several other active fields of research that are capable of supporting MCS in providing individual or collective context awareness. ...
Article
Full-text available
According to KPMG, Internet of Things (IoT) technology was among the top 10 technologies of 2019. It has been growing at a significant pace, influencing and disrupting several application domains. It is expected that by 2025, 75.44 billion devices will be connected to the Internet. These devices generate massive amounts of data which, when harnessed using the power of data science (DS) techniques and approaches such as artificial intelligence (AI) and machine learning (ML), can provide significant benefits to economy, society, and people. Examples of areas that are being disrupted are digital marketing and retail commerce services in smart cities. This paper presents a vision for Marketing 4.0 that is underpinned by disruptive digital technologies such as IoT and DS. We present an analysis of the current state of the art in IoT and DS via the three pillars of marketing: namely, people, products, and places. We propose a blueprint architecture for developing a Marketing 4.0 solution that is underpinned by IoT and DS. We conclude the paper by highlighting the open challenges that need to be addressed in order to realise the Marketing 4.0 blueprint architecture, including supporting the integration of IoT data concerning people, products, and places and using DS to make efficient and effective recommendations.
... An important trend within the OOH literature, which was observed in this bibliometric study, is the increased use of big data to segment audiences, measure campaign effectiveness, and identify optimal ad placement in complex consumer environments. Except for a few studies (e.g., Page et al. 2018;Wilson and Suh 2018), most authors using big data to address these problems are not ad management and public policy scholars per se but rather researchers from engineering, computer science, and geography, who apply their data and contextual expertise to business problems (e.g., Chmielewski and Tompalski 2017;Wang et al. 2020; Zielinska-Dabkowska and Xavia 2019). As visualized in Figure 1, the clusters to which these latter articles are principally attached (e.g., digital signage and social impact) are often located at the edges of the network map and the articles themselves at the furthest edges of their respective cluster. ...
Article
OOH advertising has received significant interest from scholars and practitioners for its synergistic potential with digital media and its lead role in many campaigns. A bibliometric analysis, which uses citation patterns and publication data, quantitatively assesses the intellectual structure of the OOH advertising literature to better facilitate knowledge dissemination for scholars and practitioners. The study reviews 343 articles found within the Scopus database and finds the literature is appreciably multidisciplinary. The impact of authors, articles, and journals within the literature is explored, and its theoretical underpinnings and concepts are identified.
... Some researchers have focused on realistic factors in the IM problem. Wang et al. (2020) proposed an IM problem in targeted billboard advertising using crowd-sensing vehicle trajectory data. Li et al. (2021) discussed the dynamic nature of propagation on diffusion model. ...
Article
Online Social Network (OSN) is one of the most popular internet services. It has also become the main source of news for many people. Despite all the benefits, OSN significantly increases the rate of rumor spreading among people. Influence Blocking Maximization (IBM) aims to limit the propagation of rumor by broadcasting anti-rumor information. In the IBM problem, the users are treated equally; however, they may have dissimilar worthiness. In this paper, we introduce Non-Uniform IBM (NU-IBM), where each user has its own weight. As a case study for NU-IBM, we present Distance-Aware IBM (DA-IBM), which determines the users’ weights based on the geographical distance from the Rumor Targeted Location (RTL). In order to handle non-identical weights for users, we develop a sampling-based method called NU-IBM-Solver. Through careful analysis of the sample size, our proposed method is able to return a (1−1/e−ϵ) approximation solution similar to the greedy algorithm. In addition to the theoretical analysis, we perform extensive experiments over four real-world networks in various conditions. The evaluations also confirm that NU-IBM-Solver is similar to greedy in terms of effectiveness, while it is thousands of times faster.
... For example, mobile technology, such as Internet usage records, global positioning system (GPS) data, Bluetooth, and social media check-ins, were used to define audience segments and exposure to OOH advertising (e.g., Huang et al. 2020b; Page et al. 2018; Zhang et al. 2020). Secondary data from taxi trips, shared bike rentals, and geotagged Twitter posts near OOH media were similarly used to determine audience measurement, audience segmentation, and media pricing (Lai, Cheng, and Lansley 2017; Sun et al. 2020; Wang et al. 2020). The value of their research lies not only with practitioners interested in new ways of measuring and segmenting audiences and in finding ideal locations for OOH media installations but also for scholars needing to operationalize difficult-to-measure constructs within OOH advertising experiments. ...
Article
Out-of-home (OOH) advertising is an important and prominent component in many advertising campaigns. Yet the medium is underresearched, and scholarly research is highly fragmented. The purpose of this systematic review is to consolidate the extant OOH advertising literature, identify gaps within and among the disparate and multidisciplinary research streams, and offer a theoretically grounded agenda to stimulate interest in OOH advertising research. The review includes 454 articles spanning 104 years. Research results are divided between ad management and public policy topics, and 20 research questions are presented to move scholarly research forward.
... Urban computing: Urban computing has a very wide research scope, aiming to address specific problems in urban life by utilizing different kinds of data, such as GPS data [9][10][11], Wi-Fi and Bluetooth data [12,13], social network data [14], crowd-sourcing temperature and humidity data [15], etc. We also focus on solving urban problems mainly based on GPS data, which can show more accurate information of the locations and trajectories of people's daily activities than other data sources. ...
Article
Full-text available
The prediction of human mobility can facilitate resolving many kinds of urban problems, such as reducing traffic congestion, and promote commercial activities, such as targeted advertising. However, the requisite personal GPS data face privacy issues. Related organizations can only collect limited data and they experience difficulties in sharing them. These data are in “isolated islands” and cannot collectively contribute to improving the performance of applications. Thus, the method of federated learning (FL) can be adopted, in which multiple entities collaborate to train a collective model with their raw data stored locally and, therefore, not exchanged or transferred. However, to predict long-term human mobility, the performance and practicality would be impaired if only some models were simply combined with FL, due to the irregularity and complexity of long-term mobility data. Therefore, we explored the optimized construction method based on the highly efficient gradient-boosting decision tree (GBDT) model with FL and propose the novel federated voting (FedVoting) mechanism, which aggregates the ensemble of differential privacy (DP)-protected GBDTs by the multiple training, cross-validation and voting processes to generate the optimal model and can achieve both good performance and privacy protection. The experiments show high accuracy in long-term predictions of special event attendance and point-of-interest visits. Compared with training the model independently for each silo (organization) and state-of-the-art baselines, the FedVoting method achieves a significant accuracy improvement, almost comparable to the centralized training, at a negligible expense of privacy exposure.
... A number of mobility analysis works that used the trajectory data warehouses (TDW) approach were published. For example, TDW was used for a recommender system for tourists [35], for road traffic analysis [36], and for finding the best location of billboard placement [37]. Further, some researchers added semantic information for TDW, such as for improving nurse productivity [38], mobility analysis [39], and modeling multiple aspect trajectories [8] to support decision making. ...
Article
Full-text available
The accessibility of devices that track the positions of moving objects has attracted many researchers in Mobility Online Analytical Processing (Mobility OLAP). Mobility OLAP makes use of trajectory data warehousing techniques, which typically include a path of moving objects at a particular point in time. The Semantic Web (SW) users have published a large number of moving object datasets that include spatial and non-spatial data. These data are available as open data and require advanced analysis to aid in decision making. However, current SW technologies support advanced analysis only for multidimensional data warehouses and Online Analytical Processing (OLAP) over static spatial and non-spatial SW data. The existing technology does not support the modeling of moving object facts, the creation of basic mobility analytical queries, or the definition of fundamental operators and functions for moving object types. This article introduces the QB4MobOLAP vocabulary, which enables the analysis of mobility data stored in RDF cubes. This article defines Mobility OLAP operators and SPARQL user-defined functions. As a result, QB4MobOLAP vocabulary and the Mobility OLAP operators are evaluated by applying them to a practical use case of transportation analysis involving 8826 triples consisting of approximately 7000 fact triples. Each triple contains nearly 1000 temporal data points (equivalent to 7 million records in conventional databases). The execution of six pertinent spatiotemporal analytics query samples results in a practical, simple model with expressive performance for the enabling of executive decisions on transportation analysis.
... We regard POI categories Γ = {π 1 , π 2 , π 3 , ..., π q } as topics in this study, where each element π i ∈ Γ denotes one POI category tag, such as outdoors, entertainment, and so on. Given the set of POI category tags Γ, task topics is represented as a distribution over the POI category tags [11], [12], e.g., ...
Article
Full-text available
With the increasing prominence of smart mobile devices, an innovative distributed computing paradigm, namely Mobile Crowdsourcing (MCS), has emerged. By directly recruiting skilled workers, MCS exploits the power of the crowd to complete location-dependent tasks. Currently, based on online social networks, a new and complementary worker recruitment mode, i.e., socially aware MCS, has been proposed to effectively enlarge worker pool and enhance task execution quality, by harnessing underlying social relationships. In this paper, we propose and develop a novel worker recruitment game in socially aware MCS, i.e., Acceptance-aware Worker Recruitment (AWR). To accommodate MCS task invitation diffusion over social networks, we design a Random Diffusion model, where workers randomly propagate task invitations to social neighbors, and receivers independently make a decision whether to accept or not. Based on the diffusion model, we formulate the AWR game as a combinatorial optimization problem, which strives to search a subset of seed workers to maximize overall task acceptance under a pre-given incentive budget. We prove its NP hardness, and devise a meta-heuristic-based evolutionary approach named MA-RAWR to balance exploration and exploitation during the search process. Comprehensive experiments using two real-world data sets clearly validate the effectiveness and efficiency of our proposed approach.
... Hence, Zhang et al. [199] further propose a logistic influence model to address it. Wang et al. [200] also consider the constraint of budget, and they use a divide-and-conquer strategy to improve the efficiency of placing billboards on road networks. Taking into account many factors (e.g., the customers' interest, the cooperation and competition among billboards) influencing the benefit of billboards, Lou et al. [201] formulate the dynamic advertising problem to maximize the commercial profit. ...
Article
Intelligent transportation (e.g., intelligent traffic lights) makes our travel more convenient and efficient. With the development of mobile Internet and positioning technologies, it is reasonable to collect spatio-temporal data and then leverage these data to achieve the goal of intelligent transportation, and here, traffic prediction plays an important role. In this paper, we provide a comprehensive survey on traffic prediction, spanning the spatio-temporal data layer to the intelligent transportation application layer. First, we split the whole research scope into four parts from the bottom up: spatio-temporal data, preprocessing, traffic prediction and traffic application. We then review existing work on the four parts. First, we summarize traffic data into five types according to their differences in the spatial and temporal dimensions. Second, we focus on four significant data preprocessing techniques: map-matching, data cleaning, data storage and data compression. Third, we focus on three kinds of traffic prediction problems (i.e., classification, generation and estimation/forecasting). In particular, we summarize the challenges and discuss how existing methods address these challenges. Fourth, we list five typical traffic applications. Lastly, we present emerging research challenges and opportunities. We believe that the survey can help practitioners understand existing traffic prediction problems and methods, which can further encourage them to solve their intelligent transportation applications.
... The message board selection is considered as a maximization problem because the objective is to maximize the visibility of the message. The idea of billboard advertising can be applied here to maximize the strength of message exposure [70]. Also, the major parking slots in a closed campus can be selected so that the messages reach as many people as possible. ...
Article
A significant amount of research work has been carried out on traffic management systems, but intelligent traffic monitoring is still an active research topic due to emerging technologies such as the Internet of Things (IoT) and Artificial Intelligence (AI). The integration of these technologies will facilitate techniques for better decision making and support urban growth. However, the existing traffic prediction methods are mostly dedicated to highway and urban traffic management, and few studies have focused on collector roads and closed campuses. Besides, reaching out to the public and establishing active connections to assist them in decision-making is challenging when the users are not equipped with any smart devices. This research proposes an IoT-based system model to collect, process, and store real-time traffic data for such a scenario. The objective is to provide real-time updates on traffic congestion and unusual traffic incidents through roadside message units and thereby improve mobility. These early-warning messages will help citizens save time, especially during peak hours. The system also broadcasts traffic updates from the administrative authorities. A prototype is implemented to evaluate the feasibility of the model, and the results of the experiments show good accuracy in vehicle detection and a low relative error in road occupancy estimation. The study is part of the Omani-funded research project investigating Real-Time Feedback for Adaptive Traffic Signals.
... Zhang et al. (2017) used bus stations' surrounding POIs to characterize the bus stations for transit advertising. Wang et al. (2019) analyzed the categorical distribution of users' destination POIs, which was further utilized to evaluate the matching degree with advertising contents. However, almost all of these studies assumed that the location carries a temporally constant semantic meaning and ignored the temporal dynamics to which the environment is often subjected. ...
Article
Digital billboards, as a new form of outdoor advertising, have gained popularity in recent years because they give advertisers control over when and where specific ads appear. However, this development also demands more complicated optimization for strategic deployments: the advertisers have to decide not only on a set of locations to display their ads, but also when to display them. The existing static optimization approaches are insufficient for matching advertisements with their intended audience in this dynamic scenario. Therefore, this research proposes three models in a workflow to mine mobile phone data and points of interest (POIs) data and to meet advertising needs in various situations. The three optimization models include a dynamic audience model to maximize the coverage of the target users, a dynamic environment model to maximize the coverage of the target environment, and a dynamic integrated model to maximize the coverage of both target audience and environment. A case study using shopping ads in Wuxue, China tests the three optimization models. The results show that the proposed models are effective in providing an optimal solution for digital billboard configuration, with greater coverage of the target audience and environment than the state-of-the-art static models.
... mobile advertising [5] and task computing [6] to the vehicles (i.e., workers) that participate in the system according to their willingness and bidding price. However, in practice, users are not willing to participate in vehicular crowdsensing applications owing to the lack of an appropriate incentive strategy and concerns about the leakage of private information when sharing data. ...
Article
Smart vehicles can cooperate in teams to perform crowdsensing tasks in smart cities. A critical challenge in this regard is to build a secure model for nondeterministic vehicle teams to achieve maximum social welfare. Although several crowdsensing models have been proposed, none of them has focused on real-time vehicle teamwork. In this study, we propose what is, to the best of our knowledge, the first secure model for nondeterministic teamwork cooperation in a vehicular crowdsensing system, called Blockchain-based Nondeterministic Teamwork Cooperation (BNTC). We model the system as a multi-conditional NP-complete problem by explicitly considering the dynamic features of task issuers and workers. To solve the problem, we propose the Winning Teams Selected (WTS) algorithm based on a reverse auction and utilize a knapsack-based method to solve the models. We consider the credit of teams when determining the payment. Thus, we propose a Credit-based Team Payment (CTP) algorithm for BNTC to maximize the welfare of the system. We also propose a general blockchain-based framework to address trust issues and security challenges to make the method suitable for use in practical applications. Based on theoretical analyses and extensive simulations, we demonstrate that the proposed model performs better than the baselines and can achieve the maximum social welfare. Implementation with Ethereum suggests our model can operate within a reasonable cost.
... Trajectory data is mobility data that contains the location as well as temporal [1] information of moving objects. With the advancement of smart devices, large volumes of mobility data are collected by way of hardware and software applications. ...
Article
Large volumes of mobility data are collected in various application domains. Enterprise applications are designed on the notion of centralised data control, where ownership of the data rests with the enterprise and not with the user. This has consequences, as evidenced by occasional privacy breaches. Trajectory mining is an important data mining problem; however, trajectory data can disclose sensitive location information about users. In this work, we propose a decentralised blockchain-enabled privacy-preserving trajectory data mining framework where ownership of the data rests with the user and not with the enterprise. We formalise privacy preservation in trajectory data mining settings, present a proposal for privacy preservation, and implement the solution as a proof-of-concept. A comprehensive experimental evaluation is conducted to assess the applicability of the system. The results show that the proposed system yields promising results for blockchain-enabled privacy preservation in user trajectory data.
... The study in [21] computes the hot zone measure by finding the locations where users stayed for more than 3 minutes within a spatial distance of 50 meters. Another study [32] uses the TD of moving objects to determine the best location for placing billboard advertising. ...
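The hot-zone criterion quoted in this snippet (a stay of more than 3 minutes within 50 meters) resembles classic stay-point detection. The sketch below is one plausible reading rather than the method of [21]: it assumes each track is a time-sorted list of `(t_seconds, x_m, y_m)` tuples in a planar coordinate system, measures distance from the first point of each candidate stay, and uses the hypothetical names `stay_points`, `max_dist_m` and `min_stay_s`.

```python
import math


def stay_points(track, max_dist_m=50.0, min_stay_s=180.0):
    """Return (mean_x, mean_y, t_arrive, t_leave) for every detected stay.

    A stay is a maximal run of consecutive points that all lie within
    `max_dist_m` of the run's first point and that spans more than
    `min_stay_s` seconds.
    """
    stays = []
    i, n = 0, len(track)
    while i < n:
        j = i + 1
        # Extend the run while points remain close to the anchor point i.
        while j < n and math.dist(track[i][1:], track[j][1:]) <= max_dist_m:
            j += 1
        if track[j - 1][0] - track[i][0] > min_stay_s:
            run = track[i:j]
            mean_x = sum(p[1] for p in run) / len(run)
            mean_y = sum(p[2] for p in run) / len(run)
            stays.append((mean_x, mean_y, track[i][0], track[j - 1][0]))
            i = j  # continue after the stay
        else:
            i += 1
    return stays


# Hypothetical track: about 200 s near the origin, then the object moves away.
track = [(0, 0, 0), (60, 10, 5), (120, 20, 0), (200, 15, 10),
         (260, 500, 500), (320, 900, 900)]
print(stay_points(track))
```

Real GPS data would first need projecting from latitude and longitude to planar coordinates, or the distance call replaced with a geodesic distance.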
Article
Advanced technologies in location acquisition allow us to track the movement of moving objects (people, planes, vehicles, animals, ships, ...) in geographical space. These technologies generate a vast amount of trajectory data (TD). Applications in many fields can utilize such TD, for example, traffic management control, social behavior analysis, wildlife migrations and movements, ship trajectories, shoppers' behavior in a mall, facial nerve trajectory, location-based services and many others. Trajectory data can mainly be handled with either Moving Object Databases (MOD) or Trajectory Data Warehouses (TDW). In this paper, we review existing studies on storing, managing, and analyzing TD using data warehouse technologies. We propose a framework that aims to provide the requirements for building a TDW. Furthermore, we discuss different applications that use the TDW and how they utilize it. We address some issues with existing TDWs and discuss future work in this field.
Article
Hospital Emergency Departments (EDs) are essential for providing emergency medical services, yet are often overwhelmed due to increasing healthcare demand. Current methods for monitoring ED queue states, such as manual monitoring, video surveillance, and front-desk registration, are inefficient, invasive, and too delayed to provide real-time updates. To address these challenges, this paper proposes a novel framework, CrowdQ, which harnesses spatiotemporal crowdsensing data for real-time ED demand sensing, queue state modeling, and prediction. By utilizing vehicle trajectory and urban geographic environment data, CrowdQ can accurately estimate emergency visits from noisy traffic flows. Furthermore, it employs queueing theory to model the complex emergency service process with medical service data, effectively considering spatiotemporal dependencies and the impact of event context on ED queue states. Experiments conducted on large-scale crowdsensing urban traffic datasets and hospital information system datasets from Xiamen City demonstrate the framework's effectiveness. It achieves an F1 score of 0.93 in ED demand identification, effectively models the ED queue state of key hospitals, and reduces the error in queue state prediction by 18.5%-71.3% compared to baseline methods. CrowdQ, therefore, offers valuable alternatives for public emergency treatment information disclosure and maximized medical resource allocation.
Chapter
Billboard advertisement is one of the dominant modes of traditional outdoor advertisements. A billboard operator manages the ad slots of a set of billboards. Normally, a user traversal is exposed to multiple billboards. Given a set of billboards, there is an opportunity to improve the revenue of the billboard operator by satisfying the advertising demands of an increased number of clients and ensuring that a user gets exposed to different ads on the billboards during the traversal. In this paper, we propose a framework to improve the revenue of the billboard operator by employing transactional modeling in conjunction with pattern mining. Our main contributions are three-fold. First, we introduce the problem of billboard advertisement allocation for improving the billboard operator revenue. Second, we propose an efficient user trajectory-based transactional framework using coverage pattern mining for improving the revenue of the billboard operator. Third, we conduct a performance study with a real dataset to demonstrate the effectiveness of our proposed framework.
Keywords: billboard advertisement, data mining, pattern mining, transactional modeling, user trajectory, ad revenue
Article
In this paper, we propose and study a novel data-driven framework for Targeted Outdoor Advertising Recommendation with a special consideration of user profiles and advertisement topics. Given an advertisement query and a set of outdoor billboards with different spatial locations and rental prices, our goal is to find a subset of billboards such that the total targeted influence is maximized under a limited budget constraint. To achieve this goal, we face two challenges: 1) it is difficult to estimate targeted advertising influence in the physical world; 2) due to NP-hardness, many common search techniques fail to provide a satisfactory solution within an acceptable time, especially for large-scale problem settings. Taking into account the exposure strength, advertisement matching degree and advertising repetition effect, we first build a targeted influence model, which characterizes how advertising influence spreads along with users' mobility. Subsequently, based on a divide-and-conquer strategy, we develop two effective approaches, i.e., a master-slave based sequential optimization method TOAR-MSS, and a cooperative co-evolution based optimization method TOAR-CC, to solve our studied problem. Extensive experiments on two real-world data sets clearly validate the effectiveness and efficiency of our proposed approaches.
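To make the budget-constrained selection problem in this abstract concrete, the sketch below uses a generic cost-benefit greedy heuristic. It is not TOAR-MSS or TOAR-CC, and it replaces the paper's targeted influence model with a much simpler stand-in: influence is assumed to be the number of distinct trajectories a billboard set covers. The names `greedy_budgeted_selection`, `coverage`, `cost` and `budget` are hypothetical.

```python
def greedy_budgeted_selection(coverage, cost, budget):
    """Pick billboards greedily by marginal coverage gain per unit cost.

    coverage: dict billboard -> set of trajectory ids it reaches (assumed model)
    cost:     dict billboard -> positive rental price
    budget:   total spend allowed
    Returns (chosen billboards in pick order, covered ids, amount spent).
    """
    chosen, covered, spent = [], set(), 0.0
    remaining = set(coverage)
    while remaining:
        best, best_ratio = None, 0.0
        for b in sorted(remaining):  # sorted for deterministic tie-breaking
            if spent + cost[b] > budget:
                continue  # unaffordable with the remaining budget
            gain = len(coverage[b] - covered)
            ratio = gain / cost[b]
            if ratio > best_ratio:
                best, best_ratio = b, ratio
        if best is None:
            break  # nothing affordable adds new coverage
        chosen.append(best)
        covered |= coverage[best]
        spent += cost[best]
        remaining.remove(best)
    return chosen, covered, spent


# Hypothetical instance: four billboards, six trajectories, budget of 7.
coverage = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {6}, "D": {1, 2, 3, 4, 5, 6}}
cost = {"A": 4, "B": 2, "C": 1, "D": 10}
chosen, covered, spent = greedy_budgeted_selection(coverage, cost, budget=7)
print(chosen, sorted(covered), spent)
```

Counting each trajectory once also captures overlap in a crude way: a billboard that only reaches already-covered trajectories contributes no marginal gain. Ratio-based greedy selection is a heuristic and can be far from optimal on some instances, which is one motivation for the more elaborate search strategies the abstract describes.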
Conference Paper
In urban informatics, traffic congestion prediction is of great importance for travel route planning and traffic management, and has received extensive attention from academia and industry. However, most previous works fail to deliver citywide traffic congestion prediction at the level of fine-grained road segments, and do not comprehensively consider strong spatial-temporal correlations. To overcome these concerns, in this paper, we propose a spatial-temporal context embedding and metric learning approach (STE-ML) to predict the traffic congestion level. In particular, our STE-ML consists of a traffic spatial-temporal context embedding component and a metric learning component. From local and global perspectives, the context embedding component can simultaneously integrate local spatial-temporal correlation features and global traffic statistics information, and compress them into a unified and abstract embedding representation. Meanwhile, the metric learning component benefits from learning a more suitable distance function tuned to the specific task. Combining these components can enhance traffic congestion prediction performance. We conduct extensive experiments on a real traffic data set to evaluate the performance of our proposed STE-ML approach, and compare it with other existing techniques. The experimental results demonstrate that the proposed STE-ML outperforms the existing methods.
Article
With the rapid development of smart devices and high-quality wireless technologies, mobile crowdsourcing (MCS) has been drawing increasing attention with its great potential in collaboratively completing complicated tasks on a large scale. A key issue toward successful MCS is participant recruitment, where an MCS platform directly recruits suitable crowd participants to execute outsourced tasks by physically traveling to specified locations. Recently, a novel recruitment strategy, namely Word-of-Mouth (WoM)-based MCS, has emerged to effectively improve recruitment effectiveness by fully exploring users' mobility traces and social relationships on geo-social networks. Against this background, we study in this paper a novel problem, namely Expected Task Execution Quality Maximization (ETEQM) for MCS in geo-social networks, which strives to search for a subset of seed users that maximizes the expected task execution quality of all recruited participants under a given incentive budget. To characterize the MCS task propagation process over geo-social networks, we first adopt a propagation tree structure to model the autonomous recruitment between the referrers and the referrals. Based on the model, we then formalize the task execution quality and devise a novel incentive mechanism by harnessing the business strategy of multi-level marketing. We formulate our ETEQM problem as a combinatorial optimization problem, and analyze its NP-hardness and high-dimensional characteristics. Based on a cooperative co-evolution framework, we propose a divide-and-conquer problem-solving approach named ETEQM-CC. We conduct extensive simulation experiments and a case study, verifying the effectiveness of our proposed approach.
Preprint
Data is the fuel of digitalization. The Internet can be understood as the largest data factory in human history, and as its most comprehensive act of self-observation. Accordingly, it has long been standard practice in online marketing and e-commerce to evaluate the digital actions of a very large number of market participants continuously and comprehensively. A still recent development, by contrast, is the use of big data streams from the web to better understand and predict people's behavior in the analog world, including in areas where opinion formation and transactions take place primarily offline. Based on a practice-oriented overview of the available data sources and how to work with them, the chapter shows how companies and researchers can use these possibilities profitably, illustrates this with a case study from a mid-sized company, and discusses the implications for market research, advertising, and sales. It concludes with 10 recommendations for practitioners. This is a preprint of the following chapter: Schoenmakers, Jan, “Mit Big Data den Markt verstehen”, published in “Marketing & Innovation 2021 Digitalität – die Vernetzung von digital und analog”, edited by Naskrent, Julia, Stumpf, Markus and Westphal, Jörg, 2021, Gabler Verlag, reproduced with permission of Springer Fachmedien Wiesbaden GmbH. The final authenticated version is available online at: http://dx.doi.org/10.1007/978-3-658-29367-3
Chapter
Data is the fuel of digitalization. The Internet can be understood as the largest data factory in human history, and as its most comprehensive act of self-observation. Accordingly, it has long been standard practice in online marketing and e-commerce to evaluate the digital actions of a very large number of market participants continuously and extensively. A still recent development, by contrast, is the use of big data streams from the web to better understand and predict people's behavior in the analog world, including in areas where opinion formation and transactions take place primarily offline. Based on a practice-oriented overview of the available data sources and how to work with them, the chapter shows how companies and researchers can use these possibilities profitably, illustrates this with a case study from a mid-sized company, and discusses the implications for market research, advertising, and sales. It concludes with ten recommendations for practitioners.
Article
Recent advances in sensor and mobile devices have enabled an unprecedented increase in the availability and collection of urban trajectory data, thus increasing the demand for more efficient ways to manage and analyze the data being produced. In this survey, we comprehensively review recent research trends in trajectory data management, ranging from trajectory pre-processing and storage to common trajectory analytic tools, such as querying spatial-only and spatial-textual trajectory data, and trajectory clustering. We also explore four closely related analytical tasks commonly used with trajectory data in interactive or real-time processing. Deep trajectory learning is also reviewed for the first time. Finally, we outline the essential qualities that a trajectory data management system should possess to maximize flexibility.
Article
Mobile CrowdSensing (MCS) is a powerful sensing paradigm, which provides sufficient social data for cognitive analytics in industrial sensing and industrial manufacturing. Sparse MCS, as a variant, only senses the data in a few subareas, and infers the unsensed ones. Existing works usually assume that the data are linearly spatio-temporally dependent. Moreover, in many cases, users not only require inference of the current data, but are also interested in predicting the near future. To address these problems, we propose a deep learning-enabled industrial sensing and prediction scheme based on Sparse MCS, which consists of two parts: matrix completion and future prediction. We first propose a Deep Matrix Factorization method to retain the non-linear temporal-spatial relationship and perform high-precision matrix completion. Then, we further propose a Nonlinear Auto-Regressive neural network and a Stacked Denoising Auto-Encoder to predict the near future. The experiments on industrial sensing data sets show the effectiveness of our methods.
Chapter
Outdoor billboard advertising is a traditional method of attracting potential customers to make commercial profits, defined as the income from attracted customers' consumption minus the cost of billboards. Existing billboard selection strategies usually prefer billboards with a large flow of customers, without considering factors such as customers' preferences and detour distance. In this paper, a billboard selection optimization problem is formulated to find the appropriate billboards so that advertisers can obtain the best commercial profits. First, we adopt a semi-Markov model to predict customers' mobility by using crowdsensing trajectory data. Then, with the consideration of customers' preferences and detour distance, two advertising strategies are proposed to address the billboard selection problem in two situations. Finally, we conduct extensive simulations based on the widely used real-world trajectory data set EPFL. The simulation results demonstrate that our advertising strategies achieve superior commercial profits compared with the state-of-the-art strategies, which matches the theoretical analysis.
Article
Public transit networks play a significant role for urban advertisers because a considerable number of residents in urban areas use public transport for their transportation needs. Automated Fare Collection (AFC) systems provide advertisers with valuable records (smart card data) of the boarding and alighting transactions of passengers in the public transit network. While the demographic attributes of passengers are not recorded by most AFC systems around the world, the systems can still help to reconstruct the activities and trips of passengers. The availability of smart card data has provided a unique opportunity to create detailed models of passengers' travel behaviour. Hence, it is now possible to develop behavioural advertising techniques in the public transit network based on ongoing activities of passengers. Behavioural advertising in the public transit network considers not only the location and time of trips, but also the duration and type of passengers' activities. This paper proposes and compares two behavioural advertising models based on the smart card data attributes. The first model, a trip-based one, targets trips of passengers, which means it indicates the maximum number of trips on which an advertisement should be viewed, according to the purpose of each trip. The second model, a passenger-based one, targets passengers by maximizing the number of passengers who will view an advertisement relevant to their trip. Both models are formulated as linear programming models. Both models are run on a case study to explicitly present the outcomes of the models and the differences between them. The outcomes of each model determine a set of prime times and locations for advertisements in the public transit network. While the trip-based model targets more relevant trips with simpler computations, the passenger-based model displays advertisements to a greater number of passengers, with more complex computations.
Article
The proliferation of mobile smart devices with ever-improving sensing capacities means that human-centric Mobile Crowdsensing Systems (MCSs) can economically provide a large-scale and flexible sensing solution. The use of personal mobile devices is a sensitive issue; therefore, it is mandatory for practical MCSs to preserve private information (the user's true identity, precise location, etc.) while collecting the required sensing data. However, well-intentioned privacy protection techniques also conceal autonomous, or even malicious, behaviors of device owners (termed self-interested), so the objectivity and accuracy of crowdsensing data can be severely threatened. The issue of data quality due to untruthful reporting in privacy-preserving MCSs has yet to be solved. Bringing together game theory, algorithmic mechanism design, and truth discovery, we develop a mechanism to guarantee and enhance the quality of crowdsensing data without jeopardizing the privacy of MCS participants. Together with solid theoretical justifications, we evaluate the performance of our proposal with extensive real-world MCS trace-driven simulations. Experimental results demonstrate the effectiveness of our mechanism both in enhancing the quality of the crowdsensing data and in eliminating the motivation of MCS participants to report untruthfully, even when their privacy is well protected.
Article
With the remarkable proliferation of smart mobile devices, mobile crowdsensing has emerged as a compelling paradigm to collect and share sensor data from the surrounding environment. In many application scenarios, due to unavailable wireless networks or expensive data transfer costs, it is desirable to offload crowdsensing data traffic onto opportunistic device-to-device (D2D) networks. However, coupling mobile crowdsensing with D2D networks raises new technical challenges caused by intermittent routing and indeterminate settings. Considering the operations of data sensing, relaying, aggregating and uploading simultaneously, in this paper we study collaborative mobile crowdsensing in opportunistic D2D networks. To address the concerns of sensing data quality, network performance and incentive budget, the Minimum-Delay-Maximum-Coverage (MDMC) problem and the Minimum-Overhead-Maximum-Coverage (MOMC) problem are formalized in order to optimally search for a complete set of crowdsensing task execution schemes over three dimensions: user, temporal and spatial. By exploiting mobility traces of users, we propose a unified graph-based problem representation framework, and transform the MDMC and MOMC problems into a connection routing searching problem on weighted directed graphs. Greedy-based recursive optimization approaches are proposed to address the two problems in a divide-and-conquer mode. Empirical evaluation on both real-world and synthetic data sets validates the effectiveness and efficiency of our proposed approaches.
Conference Paper
In this paper, we study the Multi-Round Influence Maximization (MRIM) problem, where influence propagates in multiple rounds independently from possibly different seed sets, and the goal is to select seeds for each round to maximize the expected number of nodes that are activated in at least one round. MRIM problem models the viral marketing scenarios in which advertisers conduct multiple rounds of viral marketing to promote one product. We consider two different settings: 1) the non-adaptive MRIM, where the advertiser needs to determine the seed sets for all rounds at the very beginning, and 2) the adaptive MRIM, where the advertiser can select seed sets adaptively based on the propagation results in the previous rounds. For the non-adaptive setting, we design two algorithms that exhibit an interesting tradeoff between efficiency and effectiveness: a cross-round greedy algorithm that selects seeds at a global level and achieves $1/2 - \varepsilon$ approximation ratio, and a within-round greedy algorithm that selects seeds round by round and achieves $1 - e^{-(1-1/e)} - \varepsilon \approx 0.46 - \varepsilon$ approximation ratio but saves running time by a factor related to the number of rounds. For the adaptive setting, we design an adaptive algorithm that guarantees $1 - e^{-(1-1/e)} - \varepsilon$ approximation to the adaptive optimal solution. In all cases, we further design scalable algorithms based on the reverse influence sampling approach and achieve near-linear running time. We conduct experiments on several real-world networks and demonstrate that our algorithms are effective for the MRIM task.
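For readers comparing the two non-adaptive guarantees quoted in this abstract, the within-round constant can be evaluated step by step. The arithmetic below is only a numerical check of the stated expression, with values rounded to four decimal places; the abstract reports the result as 0.46.

```latex
\begin{align*}
1 - \tfrac{1}{e} &\approx 0.6321, \\
e^{-(1 - 1/e)} &\approx e^{-0.6321} \approx 0.5315, \\
1 - e^{-(1 - 1/e)} &\approx 0.4685 < \tfrac{1}{2}.
\end{align*}
```

The within-round greedy guarantee is therefore slightly weaker than the cross-round guarantee of $1/2 - \varepsilon$, which is consistent with the efficiency-effectiveness tradeoff the abstract describes.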
Article
Context-aware applications of Vehicular Social Networks (VSNs) play a significant role in achieving the goal of green transportation by sharing driving experience to reduce gasoline consumption. One of the main challenges is to evaluate the performance of these applications, which relies on the underlying VSN mobility model. In this work, we investigate big urban traffic data to characterize essential features of urban mobility and construct large-scale green urban mobility models. We exploit road and traffic information to enhance the trip generation algorithm and the traffic assignment technique based on weighted road segments. In addition, we perform extensive observations and corrections on the OpenStreetMap data imported into Simulation of Urban Mobility (SUMO) to make it analogous to the real-world road topology. The experimental results and validation process show that the generated mobility models exhibit the realistic behavior required for analysis of context-aware applications of VSNs for green transportation systems.
Article
We present a general framework for geometric model fitting based on a set coverage formulation that caters for intersecting structures and outliers in a simple and principled manner. The multi-model fitting problem is formulated in terms of the optimization of a consensus-based global cost function, which allows us to sidestep the pitfalls of preference approaches based on clustering and to avoid the difficult trade-off between data fidelity and complexity of other optimization formulations. Two especially appealing characteristics of this method are the ease with which it can be implemented and its modularity with respect to the solver and to the sampling strategy. Only a few intelligible parameters need to be set and tuned, namely the inlier threshold and the number of desired models. In summary, the experiments show that our method compares favourably with its competitors overall, and it is always either the best performer or almost on par with the best performer in specific scenarios.
Article
With the rapid development of mobile networks and the proliferation of mobile devices, spatial crowdsourcing, which refers to recruiting mobile workers to perform location-based tasks, has gained emerging interest from both research communities and industries. In this paper, we consider a spatial crowdsourcing scenario: in addition to specific spatial constraints, each task has a valid duration, operation complexity, budget limitation and the number of required workers. Each volunteer worker completes assigned tasks while conducting his/her routine tasks. The system has a desired task probability coverage and budget constraint. Under this scenario, we investigate an important problem, namely heterogeneous spatial crowdsourcing task allocation (HSC-TA), which strives to search a set of representative Pareto-optimal allocation solutions for the multi-objective optimization problem, such that the assigned task coverage is maximized and incentive cost is minimized simultaneously. To accommodate the multi-constraints in heterogeneous spatial crowdsourcing, we build a worker mobility behavior prediction model to align with allocation process. We prove that the HSC-TA problem is NP-hard. We propose effective heuristic methods, including multi-round linear weight optimization and enhanced multi-objective particle swarm optimization algorithms to achieve adequate Pareto-optimal allocation. Comprehensive experiments on both real-world and synthetic data sets clearly validate the effectiveness and efficiency of our proposed approaches.
Article
With the proliferation of sensor-equipped portable mobile devices, Mobile Crowdsensing (MCS) using smart devices provides unprecedented opportunities for collecting enormous amounts of surrounding data. In MCS applications, a crucial issue is how to recruit appropriate participants from a pool of available users to accomplish released tasks, satisfying both resource efficiency and sensing quality. In order to meet these two optimization goals simultaneously, in this paper, we present a novel MCS task allocation framework that aligns the existing task sequence with users' moving regularity as much as possible. Based on the process of mobility repetitive pattern discovery, the original task allocation problem is converted into a pattern matching issue, and the involved optimization goals are transformed into pattern matching length and support degree indicators. To determine a trade-off between these two competing metrics, we propose greedy-based optimal assignment scheme search approaches, namely the MLP, MDP, IU1 and IU2 algorithms, with respect to matching length-preferred, support degree-preferred and integrated utility, respectively. Comprehensive experiments on a real-world open data set and a synthetic data set clearly validate the effectiveness of our proposed framework on MCS task optimal allocation.
Article
Many applications require semantic understanding of short texts, and inferring discriminative and coherent latent topics is a critical and fundamental task in these applications. Conventional topic models largely rely on word co-occurrences to derive topics from a collection of documents. However, due to the length of each document, short texts are much more sparse in terms of word co-occurrences. Recent studies show that the Dirichlet Multinomial Mixture (DMM) model is effective for topic inference over short texts by assuming that each piece of short text is generated by a single topic. However, DMM has two main limitations. First, even though it seems reasonable to assume that each short text has only one topic because of its shortness, the definition of “shortness” is subjective and the length of the short texts is dataset dependent. That is, the single-topic assumption may be too strong for some datasets. To address this limitation, we propose to model the topic number as a Poisson distribution, allowing each short text to be associated with a small number of topics (e.g., one to three topics). This model is named PDMM. Second, DMM (and also PDMM) does not have access to background knowledge (e.g., semantic relations between words) when modeling short texts. When a human being interprets a piece of short text, the understanding is not solely based on its content words, but also their semantic relations. Recent advances in word embeddings offer effective learning of word semantic relations from a large corpus. Such auxiliary word embeddings enable us to address the second limitation. To this end, we propose to promote the semantically related words under the same topic during the sampling process, by using the generalized Pólya urn (GPU) model. Through the GPU model, background knowledge about word semantic relations learned from millions of external documents can be easily exploited to improve topic modeling for short texts. 
By directly extending the PDMM model with the GPU model, we propose two more effective topic models for short texts, named GPU-DMM and GPU-PDMM. Through extensive experiments on two real-world short text collections in two languages, we demonstrate that PDMM achieves better topic representations than state-of-the-art models, measured by topic coherence. The learned topic representation leads to better accuracy in a text classification task, as an indirect evaluation. Both GPU-DMM and GPU-PDMM further improve topic coherence and text classification accuracy. GPU-PDMM outperforms GPU-DMM at the price of higher computational costs.
Article
Moving destination prediction offers an important category of location-based applications and provides essential intelligence to business and governments. In existing studies, a common approach to destination prediction is to match the given query trajectory with massive recorded trajectories by similarity calculation. Unfortunately, due to privacy concerns, budget constraints, and many other factors, in most circumstances we can only obtain a sparse trajectory dataset. In a sparse dataset, the available moving trajectories are far from enough to cover all possible query trajectories; thus the predictability of the matching-based approach decreases remarkably. Toward destination prediction with a sparse dataset, instead of searching for similar trajectories over the sparse records, we examine the changes in distance from the sampling locations to the final destination along the query trajectory. The underlying idea is intuitive: motivated directly by their travel purpose, people always get closer to the final destination during the movement. By borrowing the concept of gradient descent from optimization theory, we propose a novel moving destination prediction approach, namely MGDPre. Building upon the mobility gradient descent, MGDPre only investigates the behavior characteristics of the query trajectory itself without matching historical trajectories, and is thus applicable to sparse datasets. We evaluate our approach based on extensive experiments, using GPS trajectories generated by a sample of taxis over a 10-day period in Shenzhen city, China. The results demonstrate that our approach outperforms state-of-the-art baseline methods in effectiveness, efficiency, and scalability. 2017. Moving destination prediction using sparse dataset: A mobility gradient descent approach.
Article
The problem of formulating solutions immediately and comparing them rapidly for billboard placements has plagued advertising planners for a long time, owing to the lack of efficient tools for in-depth analyses to make informed decisions. In this study, we attempt to employ visual analytics that combines the state-of-the-art mining and visualization techniques to tackle this problem using large-scale GPS trajectory data. In particular, we present SmartAdP, an interactive visual analytics system that deals with the two major challenges including finding good solutions in a huge solution space and comparing the solutions in a visual and intuitive manner. An interactive framework that integrates a novel visualization-driven data mining model enables advertising planners to effectively and efficiently formulate good candidate solutions. In addition, we propose a set of coupled visualizations: a solution view with metaphor-based glyphs to visualize the correlation between different solutions; a location view to display billboard locations in a compact manner; and a ranking view to present multi-typed rankings of the solutions. This system has been demonstrated using case studies with a real-world dataset and domain-expert interviews. Our approach can be adapted for other location selection problems such as selecting locations of retail stores or restaurants using trajectory data.
Article
With the recent surge of location-based social networks (LBSNs), activity data of millions of users has become attainable. This data contains not only spatial and temporal stamps of user activity, but also its semantic information. LBSNs can help to understand mobile users' spatial temporal activity preference (STAP), which can enable a wide range of ubiquitous applications, such as personalized context-aware location recommendation and group-oriented advertisement. However, modeling such user-specific STAP needs to tackle high-dimensional data, i.e., user-location-time-activity quadruples, which is complicated and usually suffers from a data sparsity problem. In order to address this problem, we propose a STAP model. It first models the spatial and temporal activity preference separately, and then uses a principled way to combine them for preference inference. In order to characterize the impact of spatial features on user activity preference, we propose the notion of personal functional region and related parameters to model and infer user spatial activity preference. In order to model the user temporal activity preference with sparse user activity data in LBSNs, we propose to exploit the temporal activity similarity among different users and apply nonnegative tensor factorization to collaboratively infer temporal activity preference. Finally, we put forward a context-aware fusion framework to combine the spatial and temporal activity preference models for preference inference. We evaluate our proposed approach on three real-world datasets collected from New York and Tokyo, and show that our STAP model consistently outperforms the baseline approaches in various settings.
Article
Diffusion is a fundamental graph process, underpinning such phenomena as epidemic disease contagion and the spread of innovation by word-of-mouth. We address the algorithmic problem of finding a set of k initial seed nodes in a network so that the expected size of the resulting cascade is maximized, under the standard independent cascade model of network diffusion. Runtime is a primary consideration for this problem due to the massive size of the relevant input networks. We provide a fast algorithm for the influence maximization problem, obtaining the near-optimal approximation factor of (1 - 1/e - epsilon), for any epsilon > 0, in time O((m+n)log(n) / epsilon^3). Our algorithm is runtime-optimal (up to a logarithmic factor) and substantially improves upon the previously best-known algorithms which run in time Omega(mnk POLY(1/epsilon)). Furthermore, our algorithm can be modified to allow early termination: if it is terminated after O(beta(m+n)log(n)) steps for some beta < 1 (which can depend on n), then it returns a solution with approximation factor O(beta). Finally, we show that this runtime is optimal (up to logarithmic factors) for any beta and fixed seed size k.
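The objective in the abstract above, the expected cascade size under the independent cascade model, can be illustrated with a plain Monte Carlo estimate. The following is a minimal sketch, not the fast algorithm the abstract describes: it assumes a single uniform activation probability `p` on every edge, and the names `simulate_cascade` and `expected_spread` and the adjacency-list graph format are hypothetical choices made for illustration.

```python
import random
from typing import Dict, Iterable, List, Set


def simulate_cascade(graph: Dict[int, List[int]], seeds: Iterable[int],
                     p: float, rng: random.Random) -> Set[int]:
    """Run one independent-cascade simulation and return the activated nodes.

    Assumption: every edge shares the same activation probability p.
    """
    active: Set[int] = set(seeds)
    frontier: List[int] = list(active)
    while frontier:
        next_frontier: List[int] = []
        for u in frontier:
            # Each newly activated node gets one chance per inactive neighbour.
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def expected_spread(graph: Dict[int, List[int]], seeds: Iterable[int],
                    p: float, runs: int = 1000, seed: int = 0) -> float:
    """Estimate the expected cascade size by averaging over repeated runs."""
    rng = random.Random(seed)
    seed_list = list(seeds)
    total = sum(len(simulate_cascade(graph, seed_list, p, rng))
                for _ in range(runs))
    return total / runs


# Hypothetical toy graph: node 0 points to 1 and 2, which both point to 3.
toy_graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
estimate = expected_spread(toy_graph, [0], p=0.5)
```

With `p = 1.0` every reachable node is activated and with `p = 0.0` only the seeds are, which gives two easy sanity checks. Repeating such simulations for every candidate seed set is the cost that runtime-focused algorithms aim to avoid.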
Conference Paper
Given a water distribution network, where should we place sensors to quickly detect contaminants? Or, which blogs should we read to avoid missing important stories? These seemingly different problems share common structure: Outbreak detection can be modeled as selecting nodes (sensor locations, blogs) in a network, in order to detect the spreading of a virus or information as quickly as possible. We present a general methodology for near optimal sensor placement in these and related problems. We demonstrate that many realistic outbreak detection objectives (e.g., detection likelihood, population affected) exhibit the property of "submodularity". We exploit submodularity to develop an efficient algorithm that scales to large problems, achieving near optimal placements, while being 700 times faster than a simple greedy algorithm. We also derive online bounds on the quality of the placements obtained by any algorithm. Our algorithms and bounds also handle cases where nodes (sensor locations, blogs) have different costs. We evaluate our approach on several large real-world problems, including a model of a water distribution network from the EPA, and real blog data. The obtained sensor placements are provably near optimal, providing a constant fraction of the optimal solution. We show that the approach scales, achieving speedups and savings in storage of several orders of magnitude. We also show how the approach leads to deeper insights in both applications, answering multicriteria trade-off, cost-sensitivity and generalization questions.
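The abstract does not spell out how submodularity is exploited. One widely used way is lazy evaluation of marginal gains: because gains can only shrink as the selected set grows, a stale gain is an upper bound and most candidates never need re-evaluation. The sketch below illustrates that idea under simplifying assumptions: a unit-cost, deterministic coverage objective stands in for the outbreak detection objectives, and the function name `lazy_greedy` and the toy data are hypothetical.

```python
import heapq
from typing import Dict, List, Set, Tuple


def lazy_greedy(reach: Dict[str, Set[int]], k: int) -> Tuple[List[str], int]:
    """Select up to k nodes maximizing coverage, re-evaluating gains lazily.

    Returns the selected nodes and the number of gain re-evaluations.
    """
    covered: Set[int] = set()
    selected: List[str] = []
    evaluations = 0
    # Max-heap via negated gains; the third field records how many nodes
    # had been selected when the gain was last computed.
    heap = [(-len(items), node, 0) for node, items in reach.items()]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        neg_gain, node, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # The gain is current; stale entries are upper bounds on their
            # true gains, so no other candidate can beat this one.
            selected.append(node)
            covered |= reach[node]
        else:
            # The gain is stale: recompute it and push the entry back.
            evaluations += 1
            fresh_gain = len(reach[node] - covered)
            heapq.heappush(heap, (-fresh_gain, node, len(selected)))
    return selected, evaluations


# Hypothetical toy instance: each sensor location covers a set of events.
toy_reach = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {5, 6}, "d": {1, 2}}
chosen, recomputed = lazy_greedy(toy_reach, 2)
```

On this toy instance the second pick is settled after recomputing the gains of only two of the three remaining candidates; on large inputs the skipped evaluations are where the savings come from.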
Article
Models for the processes by which ideas and influence propagate through a social network have been studied in a number of domains, including the diffusion of medical and technological innovations, the sudden and widespread adoption of various strategies in game-theoretic settings, and the effects of "word of mouth" in the promotion of new products. Recently, motivated by the design of viral marketing strategies, Domingos and Richardson posed a fundamental algorithmic problem for such social network processes: if we can try to convince a subset of individuals to adopt a new product or innovation, and the goal is to trigger a large cascade of further adoptions, which set of individuals should we target? We consider this problem in several of the most widely studied models in social network analysis. The optimization problem of selecting the most influential nodes is NP-hard here, and we provide the first provable approximation guarantees for efficient algorithms. Using an analysis framework based on submodular functions, we show that a natural greedy strategy obtains a solution that is provably within 63% of optimal for several classes of models; our framework suggests a general approach for reasoning about the performance guarantees of algorithms for these types of influence problems in social networks.
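The natural greedy strategy mentioned above repeatedly adds the node with the largest marginal gain. The following is a minimal sketch, assuming a deterministic coverage objective (each node reaches a fixed set of users) in place of the probabilistic diffusion models studied in the paper; coverage is monotone and submodular, so the same 1 - 1/e (about 63%) guarantee applies to it. The names `coverage` and `greedy_select` and the toy data are hypothetical.

```python
from typing import Dict, List, Optional, Set


def coverage(selected: List[str], reach: Dict[str, Set[int]]) -> int:
    """Number of distinct users reached by the selected nodes."""
    covered: Set[int] = set()
    for node in selected:
        covered |= reach[node]
    return len(covered)


def greedy_select(reach: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k nodes, each time adding the largest marginal gain."""
    selected: List[str] = []
    covered: Set[int] = set()
    for _ in range(min(k, len(reach))):
        best: Optional[str] = None
        best_gain = -1
        # Sorted iteration makes tie-breaking deterministic.
        for node in sorted(reach):
            if node in selected:
                continue
            gain = len(reach[node] - covered)  # marginal gain of this node
            if gain > best_gain:
                best, best_gain = node, gain
        if best is None:
            break
        selected.append(best)
        covered |= reach[best]
    return selected


# Hypothetical toy instance: four candidate individuals and whom they reach.
toy_reach = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {5, 6}, "d": {1, 2}}
seeds = greedy_select(toy_reach, 2)
```

Here the greedy choice is `a` followed by `c`: once `a` is selected, `b` adds only one new user while `c` adds two, which is the overlap effect that marginal gains account for.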
Article
In spite of vast business potential, targeted advertising in public transportation systems is a largely unexplored research area. For instance, SBS Transit in Singapore can reach 1 billion passengers per year but the annual advertising revenue contributes less than $35 million. To bridge the gap, we propose a probabilistic data model that captures the motion patterns and user interests so as to quantitatively evaluate the impact of an advertisement among the passengers. In particular, we leverage hundreds of millions of bus/train boarding transaction records to quantitatively estimate the probability as well as the extent of a user being influenced by an ad. Based on the influence model, we study a top-k retrieval problem for bus/train ad recommendation, which acts as a primitive operator to support various advanced applications. We solve the retrieval problem efficiently to support real-time decision making. In the experimental study, we use the dataset from SBS Transit as a case study to verify the effectiveness and efficiency of our proposed methodologies.
Article
Given a social network G and a positive integer k, the influence maximization problem aims to identify a set of k nodes in G that can maximize the influence spread under a certain propagation model. With the proliferation of geo-social networks, location-aware promotion is becoming more necessary in real applications. In this paper, we study the distance-aware influence maximization (DAIM) problem, which advocates the importance of the distance between users and the promoted location. Unlike the traditional influence maximization problem, DAIM treats users differently based on their distances from the promoted location. In this situation, the k nodes selected are different when the promoted location varies. In order to handle the large number of queries and meet the online requirement, we develop two novel index-based approaches, MIA-DA and RIS-DA, by utilizing the information over some pre-sampled query locations. MIA-DA is a heuristic method which adopts the maximum influence arborescence (MIA) model to approximate the influence calculation. In addition, different pruning strategies as well as a priority-based algorithm are proposed to significantly reduce the search space. To improve the effectiveness, in RIS-DA, we extend the reverse influence sampling (RIS) model and come up with an unbiased estimator for the DAIM problem. Through carefully analyzing the sample size needed for indexing, RIS-DA is able to return a (1 − 1/e − ε)-approximate solution with probability at least 1 − δ for any given query. Finally, we demonstrate the efficiency and effectiveness of the proposed methods over real geo-social networks.
Article
In this paper, we study a novel problem of influence maximization in trajectory databases that is very useful in precise location-aware advertising. It finds the k best trajectories to be attached with a given advertisement and maximizes the expected influence among a large audience. We show that the problem is NP-hard and propose both exact and approximate solutions to find the best set of trajectories. In the exact solution, we devise an expansion-based framework that enumerates trajectory combinations in a best-first manner and propose three types of upper bound estimation techniques to facilitate early termination. In addition, we propose a novel trajectory index to reduce the influence calculation cost. To support large k, we propose a greedy solution with an approximation ratio of (1 − 1/e), whose performance is further optimized by a newly proposed cluster-based method. We also propose a threshold method that can support any approximation ratio ε ∈ (0, 1]. In addition, we extend our problem to support the scenario in which there is a group of advertisements. In our experiments, we use real datasets to construct user profiles, motion patterns and trajectory databases. The experimental results verify the efficiency of our proposed methods.
Article
Location-based social networks (LBSNs) provide people with an interface to share their locations and write reviews about interesting places of attraction. The shared locations form the crowdsourced digital footprints, in which each user has many connections to many locations, indicating user preference to locations. In this paper, we propose an approach for personalized travel package recommendation to help users make travel plans. The approach utilizes data collected from LBSNs to model users and locations, and it determines users’ preferred destinations using collaborative filtering approaches. Recommendations are generated by jointly considering user preference and spatiotemporal constraints. A heuristic search-based travel route planning algorithm was designed to generate travel packages. We developed a prototype system, which obtains users’ travel demands from mobile client and generates travel packages containing multiple points of interest and their visiting sequence. Experimental results suggest that the proposed approach shows promise with respect to improving recommendation accuracy and diversity.
Article
Advertising in social networks has become a multi-billion-dollar industry. A main challenge is to identify key influencers who can effectively contribute to the dissemination of information. Although the influence maximization problem, which finds a seed set of the k most influential users based on certain propagation models, has been well studied, it is not target-aware and cannot be directly applied to online advertising. In this paper, we propose a new problem, named Keyword-Based Targeted Influence Maximization (KB-TIM), to find a seed set that maximizes the expected influence over users who are relevant to a given advertisement. To solve the problem, we propose a sampling technique based on the weighted reverse influence set and achieve an approximation ratio of (1 − 1/e − ε). To meet the instant-speed requirement, we propose two disk-based solutions that improve the query processing time by two orders of magnitude over the state-of-the-art solutions, while keeping the theoretical bound. Experiments conducted on two real social networks confirm our theoretical findings as well as the efficiency. Given an advertisement with 5 keywords, it takes only 2 seconds to find the most influential users in a social network with billions of edges.
Article
users in a social network to maximize the expected number of users influenced by the selected users (called influence spread), has been extensively studied, existing works neglected the fact that the location information can play an important role in influence maximization. Many real-world applications such as location-aware word-of-mouth marketing have location-aware requirement. In this paper we study the location-aware influence maximization problem. One big challenge in location-aware influence maximization is to develop an efficient scheme that offers wide influence spread. To address this challenge, we propose two greedy algorithms with 1-1/e approximation ratio. To meet the instant-speed requirement, we propose two efficient algorithms with ε· (1-1/e) approximation ratio for any ε ∈ (0,1]. Experimental results on real datasets show our method achieves high performance while keeping large influence spread and significantly outperforms state-of-the-art algorithms.
Conference Paper
Influence maximization is the problem of finding a small subset of nodes (seed nodes) in a social network that could maximize the spread of influence. In this paper, we study the influence maximization problem from two angles in order to significantly reduce the running time of existing algorithms. One is to improve the original greedy algorithm of (6) and its improvement (9), and the second is to propose new degree discount heuristics for the problem. We evaluate our algorithms by experiments on two large academic collaboration graphs obtained from the online archival database arXiv.org. Our experimental results show that (a) our improved greedy algorithm achieves better running time compared with the improvement of (9) with matching influence spread, (b) our degree discount heuristics achieve much better influence spread than classic degree and centrality-based heuristics, and when tuned for a specific influence cascade model, they achieve almost matching influence spread with the greedy algorithm, and more importantly (c) the degree discount heuristics run in only milliseconds while even the improved greedy algorithms run in hours on our experiment graph with a few tens of thousands of nodes. Based on our results, we believe that fine-tuned heuristics may provide very promising solutions to the influence maximization problem with satisfying influence spread and blazingly fast running time. This is a counter argument to the conclusion of (6) that traditional heuristics cannot compete with the greedy approximation algorithm. All of our experimental data and source code will be made available soon on the first author's web site (http://research.microsoft.com/en-us/people/weic/).
Article
Viral marketing takes advantage of networks of influence among customers to inexpensively achieve large changes in behavior. Our research seeks to put it on a firmer footing by mining these networks from data, building probabilistic models of them, and using these models to choose the best viral marketing plan. Knowledge-sharing sites, where customers review products and advise each other, are a fertile source for this type of data mining. In this paper we extend our previous techniques, achieving a large reduction in computational cost, and apply them to data from a knowledge-sharing site. We optimize the amount of marketing funds spent on each customer, rather than just making a binary decision on whether to market to him. We take into account the fact that knowledge of the network is partial, and that gathering that knowledge can itself have a cost. Our results show the robustness and utility of our approach.
Article
One of the major applications of data mining is in helping companies determine which potential customers to market to. If the expected profit from a customer is greater than the cost of marketing to her, the marketing action for that customer is executed. So far, work in this area has considered only the intrinsic value of the customer (i.e., the expected profit from sales to her). We propose to model also the customer's network value: the expected profit from sales to other customers she may influence to buy, the customers those may influence, and so on recursively. Instead of viewing a market as a set of independent entities, we view it as a social network and model it as a Markov random field. We show the advantages of this approach using a social network mined from a collaborative filtering database. Marketing that exploits the network value of customers, also known as viral marketing, can be extremely effective, but is still a black art. Our work can be viewed as a step towards providing a more solid foundation for it, taking advantage of the availability of large relevant databases.
D Kempe, et al. Maximizing the spread of influence through a social network. In Proceedings of 9th SIGKDD international conference on Knowledge discovery and data mining. ACM, 2003: 137-146.
Liang Wang received the Ph.D. degree in computer science from Shenyang Institute of Automation (SIA), Chinese Academy of Sciences, Shenyang, China, in 2014. He is currently a Postdoctoral Researcher with Northwestern Polytechnical University, and an Associate Professor with Xi'an University of Science and Technology, Xi'an, China. His research interests include ubiquitous computing, mobile crowd sensing, and data mining.