About
Publications: 78
Reads: 5,295
Citations: 1,219
Publications (78)
Given the development and abundance of social media, studying the stance of social media users is a challenging and pressing issue. Social media users express their stance by posting tweets and retweeting. Therefore, the homogeneous relationship between users and the heterogeneous relationship between users and tweets are relevant for the stance de...
Air pollution poses a great threat to public health and social stability by influencing multiple emotions. In particular, the air quality in developing countries is deteriorating along with rapid industrialization and urbanization, and multiple emotions may change along with regulation updates and air quality trending. Monitoring changes in public...
Previous top-performing methods for 3D instance segmentation often maintain inter-task dependencies and the tendency towards a lack of robustness. Besides, inevitable variations of different datasets make these methods become particularly sensitive to hyper-parameter values and manifest poor generalization capability. In this paper, we address the...
With more and more news articles appearing on the Internet, discovering causal relations between news articles is very important for people to understand the development of news. Extracting the causal relations between news articles is an inter-document relation extraction task. Existing works on relation extraction cannot solve it well because of...
Text classification is a primary task in natural language processing (NLP). Recently, graph neural networks (GNNs) have developed rapidly and been applied to text classification tasks. As a special kind of graph data, the tree has a simpler data structure and can provide rich hierarchical information for text classification. Inspired by the structu...
The increasing concerns on data security limit the sharing of data distributedly stored at multiple data owners and impede the scale of spatial queries over big urban data. In response, data federation systems have emerged to perform secure queries across multiple data owners leveraging secure multi-party computation. However, existing systems are...
In deep neural networks, better results can often be obtained by increasing the complexity of previously developed basic models. However, it is unclear whether there is a way to boost performance by decreasing the complexity of such models. Intuitively, given a problem, a simpler data structure comes with a simpler algorithm. Here, we investigate t...
Following the success of convolution on non-Euclidean space, the corresponding pooling approaches have also been validated on various tasks regarding graphs. However, because of the fixed compression quota and stepwise pooling design, these hierarchical pooling methods still suffer from local structure damage and suboptimal problem. In this work, i...
The boom in social media with regard to producing and consuming information simultaneously implies the crucial role of online user influence in determining content popularity. In particular, understanding behavior variations between the influential elites and the mass grassroots is an important issue in communication. However, how their behavior va...
Robotized warehouses are deployed to automatically distribute millions of items brought by the massive logistic orders from e-commerce. A key to automated item distribution is to plan paths for robots, also known as task planning, where each task is to deliver racks with items to pickers for processing and then return the rack back. Prior solutions...
Advanced deep learning methods have been widely adopted in stock movement prediction with technical analysis (TA), while researchers prefer technical indicators to technical charts due to the divergence in quantification difficulty. In traditional TA, researchers usually utilize chart similarity to solve the quantifying problem, while chart similar...
Federated learning is a new learning paradigm that jointly trains a model from multiple data sources without sharing raw data. For the practical deployment of federated learning, data source selection is compulsory due to the limited communication cost and budget in real-world applications. The necessity of data source selection is further amplifie...
There has been a dramatic growth of shared mobility applications such as ride-sharing, food delivery, and crowdsourced parcel delivery. Shared mobility refers to transportation services that are shared among users, where a central issue is route planning. Given a set of workers and requests, route planning finds for each worker a route, i.e., a se...
Data isolation has become an obstacle to scale up query processing over big data, since sharing raw data among data owners is often prohibitive due to security concerns. A promising solution is to perform secure queries over a federation of multiple data owners leveraging secure multi-party computation (SMC) techniques, as evidenced by recent feder...
Skyline is a primitive operation in multi-objective decision applications and there is a growing demand to support such operations over a data federation, where the entire dataset is separately held by multiple data providers (a.k.a., silos). Data federations notably increase the amount of data available for data-intensive applications such as comm...
Great research efforts have been devoted to exploiting deep neural networks in stock prediction. However, long-term dependencies and chaotic properties are still two major issues that lower the performance of state-of-the-art deep learning models in forecasting future price trends. In this study, we propose a novel framework to address both issues....
We conducted a comprehensive analysis of the mutual exciting mechanism for the dynamics of stock price trends. A multi-dimensional Hawkes-model-based approach was proposed to capture the mutual exciting activities, which take the form of point processes induced by dual moving average crossovers. We first performed statistical measurements for the...
In deep neural networks, better results can often be obtained by increasing the complexity of previously developed basic models. However, it is unclear whether there is a way to boost performance by decreasing the complexity of such models. Here, based on an optimization method, we investigate the feasibility of improving graph classification perfo...
Stock prediction, with the purpose of forecasting the future price trends of stocks, is crucial for maximizing profits from stock investments. While great research efforts have been devoted to exploiting deep neural networks for improved stock prediction, the existing studies still suffer from two major issues. First, the long-range dependencies in...
Finding code given a natural language query is beneficial to the productivity of software developers. Future progress towards better semantic matching between query and code requires richer supervised training resources. To remedy this, we introduce the CoSQA dataset. It includes 20,604 labels for pairs of natural language queries and codes, each ann...
Pre-trained Transformer-based neural language models, such as BERT, have achieved remarkable results on varieties of NLP tasks. Recent works have shown that attention-based models can benefit from more focused attention over local regions. Most of them restrict the attention scope within a linear span, or confine to certain tasks such as machine tr...
The second moment method has always been an effective tool to lower bound the satisfiability threshold of many random constraint satisfaction problems. However, the calculation is usually hard to carry out and as a result, only some loose results can be obtained. In this paper, based on a delicate analysis which fully exploits the power of the secon...
The problem of session-aware recommendation aims to predict users' next click based on their current session and historical sessions. Existing session-aware recommendation methods have defects in capturing complex item transition relationships. Other than that, most of them fail to explicitly distinguish the effects of different historical sessions...
Dynamic ridesharing refers to services that arrange one-time shared rides on short notice. It underpins various real-world intelligent transportation applications such as car-pooling, food delivery and last-mile logistics. A core operation in dynamic ridesharing is the "insertion operator". Given a worker and a feasible route which contains a seque...
Predicting long-term returns is essential for getting a full view of market efficiency. It is, additionally, more challenging for both humans and algorithms, especially at the level of individual stocks, and competent solutions are still missing in previous efforts. Considering the profound connections between stock prices and consumer opinions, aski...
Purpose: To explore the clinical efficacy of ureteroscopic occluder and stone retrieval basket combined with holmium laser in the treatment of upper ureteral calculi. Materials and methods: This retrospective study included 103 patients treated with ureteroscopic holmium laser lithotripsy for upper ureteral stones. Patients were divided into two...
In recent years, there have been some platforms that have focused on recommending commodities or events to users using event-based social networks (EBSNs). Some studies have attempted to find the optimal recommendation sequence of these items, assuming that the sequence stops once the user accepts one recommendation or the item list runs out. Howev...
In this paper, we propose Patience-based Early Exit, a straightforward yet effective inference method that can be used as a plug-and-play technique to simultaneously improve the efficiency and robustness of a pretrained language model (PLM). To achieve this, our approach couples an internal-classifier with each layer of a PLM and dynamically stops...
Question Answering (QA) has shown great success thanks to the availability of large-scale datasets and the effectiveness of neural models. Recent research works have attempted to extend these successes to the settings with few or no labeled data available. In this work, we introduce two approaches to improve unsupervised QA. First, we harvest lexic...
In this paper, we introduce DropHead, a structured dropout method specifically designed for regularizing the multi-head attention mechanism, which is a key component of transformer, a state-of-the-art model for various NLP tasks. In contrast to the conventional dropout mechanisms which randomly drop units or connections, the proposed DropHead is a...
Automated evaluation of open domain natural language generation (NLG) models remains a challenge and widely used metrics such as BLEU and Perplexity can be misleading in some cases. In our paper, we propose to evaluate natural language generation models by learning to compare a pair of generated sentences by fine-tuning BERT, which has been shown t...
Conventional Generative Adversarial Networks (GANs) for text generation tend to have issues of reward sparsity and mode collapse that affect the quality and diversity of generated samples. To address the issues, we propose a novel self-adversarial learning (SAL) paradigm for improving GANs' performance in text generation. In contrast to standard GA...
Local sequence transduction (LST) tasks are sequence transduction tasks where there exists massive overlapping between the source and target sequences, such as Grammatical Error Correction (GEC) and spell or OCR correction. Previous work generally tackles LST tasks with standard sequence-to-sequence (seq2seq) models that generate output tokens from...
With the development of mobile Internet and the prevalence of sharing economy, spatial crowdsourcing (SC) is becoming more and more popular and attracts attention from both academia and industry. A fundamental issue in SC is assigning tasks to suitable workers to obtain different global objectives. Existing works often assume that the tasks in SC a...
The prevalence of mobile internet techniques stimulates the emergence of various spatial crowdsourcing applications. Certain of the applications serve for the requesters, budget providers, who submit a batch of tasks and a fixed budget to platform with the desire to search suitable workers to complete the tasks in maximum quantity. Platform lays st...
We propose a novel data synthesis method to generate diverse error-corrected sentence pairs for improving grammatical error correction, which is based on a pair of machine translation models of different qualities (i.e., poor and good). The poor translation model resembles the ESL (English as a second language) learner and tends to generate transla...
The aim of the present study was to validate the prognostic effectiveness of Sepsis-3 criteria, including sequential organ failure assessment (SOFA) and quick SOFA (qSOFA), with systemic inflammatory response syndrome (SIRS) criteria among patients with urolithiasis associated sepsis that were transferred to intensive care unit (ICU) facilities fol...
With the emergence of many crowdsourcing platforms, crowdsourcing has gained much attention. Spatial crowdsourcing is a rapidly developing extension of the traditional crowdsourcing, and its goal is to organize workers to perform spatial tasks. Route recommendation is an important concern in spatial crowdsourcing. In this paper, we define a novel p...
Background: Biomedical named entity recognition (BioNER) is a fundamental and essential task for biomedical literature mining, which affects the performance of downstream tasks. Most BioNER models rely on domain-specific features or hand-crafted rules, but extracting features from massive data requires much time and human efforts. To solve this, n...
Online reviews are feedback voluntarily posted by consumers about their consumption experiences. This feedback indicates customer attitudes such as affection, awareness and faith towards a brand or a firm and demonstrates inherent connections with a company's future sales, cash flow and stock pricing. However, the predicting power of online reviews...
Financial price series trend prediction is an essential problem which has been discussed extensively using tools and techniques of economic physics and machine learning. Time dependence and volatility issues in this problem have made Hidden Markov Model (HMM) a useful tool in predicting the states of stock market. In this paper, we present an appro...
Dimension reduction plays an important role in practical big data analysis and data mining applications. However, popular dimension reduction techniques such as Principal Component Analysis (PCA) are known to be computation-intensive, and are considered as a computation bottleneck for data processing and mining. In this paper, we propose to reduce...
When the inputs are multiple users’ top-m rankings, aggregating or integrating them into an order sequence (pRankAggreg) provides an interesting and classic research area that can be applied to information search, group recommendation, and web guidance, etc. Since the Kemeny optimal rule based rank aggregation is NP-hard, it poses a great challenge...
The prevalence of mobile internet techniques stimulates the emergence of various spatial crowdsourcing applications. Certain of the applications serve for requesters, budget providers, who submit a batch of tasks and a fixed budget to platform with the desire to search suitable workers to complete the tasks in maximum quantity. Platform lays stress...
There has been a dramatic growth of shared mobility applications such as ride-sharing, food delivery and crowdsourced parcel delivery. Shared mobility refers to transportation services that are shared among users, where a central issue is route planning. Given a set of workers and requests, route planning finds for each worker a route, i.e., a sequ...
The present study reported the clinical experience of using a PolyScope with holmium laser lithotripter in managing renal calculi in senile patients. Between December 2013 and December 2016, 157 senile patients (69.1±6.1 years old) were treated with PolyScope holmium laser lithotripsy for renal calculi at Xin Hua Hospital (Shanghai, China). The mea...
Spatial crowdsourcing emerges as a new computing paradigm with the development of mobile Internet and the ubiquity of mobile devices. The core of many real-world spatial crowdsourcing applications is to assign suitable tasks to proper workers in real time. Many works only assign a set of tasks to each worker without making the plan how to perform t...
With the rapid development of Mobile Internet, spatial crowdsourcing is gaining more and more attention from both academia and industry. In spatial crowdsourcing, spatial tasks are sent to workers based on their locations. A wide variety of tasks in spatial crowdsourcing are specialty-aware, which are complex and need to be completed by workers with d...
With the rapid development of mobile Internet, spatial crowdsourcing is gaining more and more attention from both academia and industry. In spatial crowdsourcing, spatial tasks are sent to workers based on their locations. A wide variety of tasks in spatial crowdsourcing are specialty-aware, which are complex and need to be completed by workers with d...
These days, Online To Offline (O2O) platforms have been developing rapidly because of the popularization of smart phones and Mobile Internet. Spatial crowdsourcing, a burgeoning area in O2O market, is gaining more and more attention. It is a typical spatial crowdsourcing scenario in which an employer publishes a task and some workers will help him...
The ubiquitous deployment of GPS-equipped devices and mobile networks has spurred the popularity of spatial crowdsourcing. Many spatial crowdsourcing tasks require crowd workers to collect data from different locations. Since workers tend to select locations nearby or align to their routines, data collected by workers are usually unevenly distribut...
The rapid development of mobile devices has stimulated the popularity of spatial crowdsourcing. Various spatial crowdsourcing platforms, such as Uber, gMission and Gigwalk, are becoming increasingly important in our daily life. A core functionality of spatial crowdsourcing platforms is to allocate tasks or make plans for workers to efficiently fini...
The popularity of Online To Offline (O2O) service platforms has spurred the need for online task assignment in real-time spatial data, where streams of spatially distributed tasks and workers are matched in real time such that the total number of assigned pairs is maximized. Existing online task assignment models assume that each worker is either a...
With the rapid development of mobile internet and online to offline marketing model, various spatial crowdsourcing platforms, such as Gigwalk and gMission, are getting popular. Most existing studies assume that spatial crowdsourced tasks are simple and trivial. However, many real crowdsourced tasks are complex and need to be collaboratively finishe...
Recently, crowdsourcing platforms have attracted a number of citizens to perform a variety of location-specific tasks. However, most existing approaches consider the arrangement of a set of tasks for a set of crowd workers, while few consider crowd workers arriving in a dynamic manner. Therefore, how to arrange suitable location-specific tasks to a...
Recently, with the development of mobile Internet and smartphones, the online minimum bipartite matching in real-time spatial data (OMBM) problem has become popular. Specifically, given a set of service providers with specific locations and a set of users who dynamically appear one by one, the OMBM problem is to find a maximum-cardinality matching...
Call a sequence of $k$ Boolean variables or their negations a $k$-tuple. For a set $V$ of $n$ Boolean variables, let $T_k(V)$ denote the set of all $2^k n^k$ possible $k$-tuples on $V$. Randomly generate a set $C$ of $k$-tuples by including every $k$-tuple in $T_k(V)$ independently with probability $p$, and let $Q$ be a given set of $q$ “bad” tuple assignments. An instance $I = (C, Q)$...
With the rapid development of Mobile Internet and Online To Offline (O2O) marketing model, various spatial crowdsourcing platforms, such as Gigwalk and gMission, are getting popular. Most existing studies assume that spatial crowdsourced tasks are simple and trivial. However, many real crowdsourced tasks are complex and need to be collaboratively f...
Twitter has become an important platform for reporting breaking news and instant events. However, it is almost impossible to detect events on Twitter manually due to the large volume of data and the noise in them. Though automatic event detection has been studied a lot, most works can only detect events in a fixed time window. In this paper, we pro...
We show that the union closed sets conjecture holds for tree convex sets. The union closed sets conjecture says that in every union closed set system, there is an element to appear in at least half of the members of the system. In tree convex set systems, there is a tree on the elements such that each subset induces a subtree. Our proof relies on t...