Chunxiao Xing's research while affiliated with Tsinghua University and other places
Publications (56)
The boom in media sharing platforms like TikTok, YouTube and Kwai enables ordinary users to create and share content with worldwide audiences. The most popular YouTuber can attract up to 100 million followers. Since there are multiple popular platforms, it’s quite common that a YouTuber publishes the same media to multiple platforms, or replicate...
Link prediction aims to identify potential missing triples in knowledge graphs. To get better results, some recent studies have introduced multimodal information to link prediction. However, these methods utilize multimodal information separately and neglect the complicated interaction between different modalities. In this paper, we aim at better m...
Recent studies have shown that strong Natural Language Understanding (NLU) models are prone to relying on annotation biases of the datasets as a shortcut, which goes against the underlying mechanisms of the task of interest. To reduce such biases, several recent works introduce debiasing methods to regularize the training process of targeted NLU mo...
Entity alignment aims at integrating heterogeneous knowledge from different knowledge graphs. Recent studies employ embedding-based methods by first learning the representation of Knowledge Graphs and then performing entity alignment via measuring the similarity between entity embeddings. However, they failed to make good use of the relation semant...
Document-level sentiment classification is a fundamental task in Natural Language Processing (NLP). Previous studies have demonstrated the importance of personalized sentiment classification by taking user preference and product characteristics on the sentiment ratings into consideration. The state-of-the-art approaches incorporate such information...
Indexes play an essential role in modern database engines by accelerating query processing. The new paradigm of the "learned index" has significantly changed the way index structures are designed in DBMSs. The key insight is that indexes can be regarded as learned models that predict the position of a lookup key in the dataset. While such studies s...
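To make the learned-index idea in the abstract above concrete (a model that predicts the position of a lookup key), here is a minimal sketch. It is not the method of the listed paper: it assumes a non-empty sorted list of distinct numeric keys and fits a single least-squares line, and the class name `LinearLearnedIndex` is hypothetical. The worst prediction error seen at build time bounds a short binary search around the predicted position.

```python
from bisect import bisect_left


class LinearLearnedIndex:
    """Toy learned index (illustrative only): a line maps key -> approximate position.

    Assumes `sorted_keys` is non-empty, sorted ascending, and free of duplicates.
    """

    def __init__(self, sorted_keys):
        self.keys = list(sorted_keys)
        n = len(self.keys)
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var_k = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(self.keys))
        # Least-squares fit of position as a linear function of the key.
        self.slope = cov / var_k if var_k else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case error over the stored keys bounds the search window.
        self.max_err = max(abs(self._predict(k) - p) for p, k in enumerate(self.keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of `key`, or -1 if it is not stored."""
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(n, max(0, guess - self.max_err))
        hi = min(n, max(lo, guess + self.max_err + 1))
        # Binary search only inside the error-bounded window [lo, hi).
        i = bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else -1
```

For example, `LinearLearnedIndex([2, 3, 5, 8, 13]).lookup(8)` returns the position of 8 in the list, and a key that was never stored yields -1. A production design would typically use a hierarchy of models rather than one line.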
Knowledge graph completion is the task of predicting missing relationships between entities in knowledge graphs. State-of-the-art knowledge graph completion methods are known to be primarily knowledge embedding based models, which are broadly classified as translational models and neural network models. However, both kinds of models are single-task...
This book constitutes the proceedings of the 18th International Conference on Web Information Systems and Applications, WISA 2021, held in Kaifeng, China, in September 2021. The 49 full papers and 18 short papers presented were carefully reviewed and selected from 206 submissions. The papers are grouped in topical sections on world wide web, query...
Estimating the Region of Interest (ROI) for images is a classic problem in the field of computer vision. In a broader sense, the object of ROI estimation can be generalized to the bag containing multiple data instances, i.e., identify the instances that probably arouse our interest. Under the circumstance without instance labels, generalized ROI es...
Text semantics similarity measurement is a crucial problem in many real world applications, such as text mining, information retrieval and natural language processing. It is a complicated task due to the ambiguity and variability of linguistic expression. Previous studies focus on modeling the representation of a sentence in multiple granularities...
In recent years, blockchain technology has received more and more attention. Blockchain is a storage technology for public decentralized databases. The emergence of blockchain technology makes it possible to solve the trust problem of distributed system nodes within the wide area network. This article elaborated on the current advantages and disadv...
Earth Mover’s Distance (EMD) is defined as the minimum cost to transfer the components from one histogram to the other. As a robust similarity measurement, EMD has been widely adopted in many real world applications, like computer vision, machine learning and video identification. Since the time complexity of computing EMD is rather high, it is ess...
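As an illustration of the EMD definition in the abstract above (the minimum cost of moving mass from one histogram to the other), here is a sketch for a simple special case only. It assumes two one-dimensional histograms with the same bins, equal total mass, and a ground distance of 1 between adjacent bins; under those assumptions the minimum cost equals the sum of absolute running differences. The function name `emd_1d` is hypothetical, and the general case requires solving a transportation problem, which this sketch does not do.

```python
def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms on the same bins.

    Assumes equal total mass and unit distance between adjacent bins.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have equal total mass")
    carried = 0.0  # surplus (or deficit) that must cross into the next bin
    cost = 0.0
    for a, b in zip(p, q):
        carried += a - b
        cost += abs(carried)  # moving `carried` units one bin over costs |carried|
    return cost
```

For instance, moving one unit of mass from the first bin to the third, as in `emd_1d([1, 0, 0], [0, 0, 1])`, costs 2.0 because the mass crosses two bin boundaries.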
Given a graph G, a source node s and a target node t, the discounted hitting time (DHT) of t with respect to s is the expected steps that a random walk starting from s visits t for the first time. For a query node s, the single-source DHT (SSDHT) query returns the top-k nodes with the highest DHT values from all nodes in G. SSDHT is widely adopted...
Nowadays, machine learning methods have been widely used in stock prediction. Traditional approaches assume an identical data distribution, under which a learned model on the training data is fixed and applied directly in the test data. Although such assumption has made traditional machine learning techniques succeed in many real-world tasks, the h...
Stroke patients are often associated with lower levels of heart rate variability, suggesting that autonomic dysfunction is very common in stroke patients. Recent studies have shown that heart rate variability (HRV) is an early predictor of prognosis in patients with acute stroke, but the relationship between HRV and functional status in chronic reh...
Document classification is an essential task in many real world applications. Existing approaches adopt both text semantics and document structure to obtain the document representation. However, these models usually require a large collection of annotated training instances, which are not always feasible, especially in low-resource settings. In thi...
Stock trend prediction, aiming at predicting future price trend of stocks, plays a key role in seeking maximized profit from the stock investment. Recent years have witnessed increasing efforts in applying machine learning techniques, especially deep learning, to pursue more promising stock prediction. While deep learning has given rise to signific...
Distant Supervision is a common technique for relation extraction from large amounts of free texts, but introduces wrong labeled sentences at the same time. Existing deep learning approaches mainly rely on CNN-based models. However, they fail to capture spatial patterns due to the inherent drawback of pooling operations and thus lead to suboptimal...
With the rapid development of mobile networks and smart terminals, mobile crowdsourcing has aroused the interest of relevant scholars and industries. In this paper, we propose a new solution to the problem of user selection in mobile crowdsourcing system. The existing user selection schemes mainly include: (1) find a subset of users to maximize cro...
Given two sets of objects, metric similarity join finds all similar pairs of objects according to a particular distance function in metric space. There is an increasing demand to provide a scalable similarity join framework which can support efficient query and analytical services in the era of Big Data. The existing distributed metric similarity j...
In recent years, traffic prediction has become an important and challenging problem in smart urban traffic computing, which governments can use for road planning, detecting bottleneck roads, estimating pollution emissions and so on. However, former data mining algorithms mainly address the problem by using the traditional mathema...
Set similarity search is a fundamental operation in a variety of applications. While many previous studies focus on threshold based set similarity search and join, little effort has been devoted to KNN set similarity search. In this paper, we propose a transformation based framework to solve the problem of KNN set similarity search, which given a coll...
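To clarify what a KNN set similarity query returns, here is a brute-force baseline. It is not the transformation based framework proposed in the abstract above; it simply scores every set against the query. The choice of Jaccard similarity and the names `jaccard` and `knn_set_search` are assumptions made for illustration.

```python
import heapq


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def knn_set_search(collection, query, k):
    """Return the k sets most similar to `query` as (similarity, index) pairs.

    Brute force: every set in `collection` is compared with the query.
    Ties are broken in favor of the smaller index.
    """
    q = frozenset(query)
    scored = ((jaccard(frozenset(s), q), i) for i, s in enumerate(collection))
    return heapq.nlargest(k, scored, key=lambda t: (t[0], -t[1]))
```

This baseline costs one similarity computation per set in the collection, which is the expense that index-based approaches aim to avoid.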
With the development of Global Positioning System (GPS) technology, the analysis of historical trajectories is becoming more and more important. Location Based Services (LBS) can provide the user’s location, and predicting human movement locations from historical observations over some period has several potential applications and attracts more and more att...
Massive rules processing will play a very important role in ad hoc computing. In this paper, we first give a framework for processing massive KOA medical rules. We propose two kinds of massive rules processing algorithms: one without and one with external communication. In the Knee Osteoarthritis medical area, lots of r...
Knee Osteoarthritis (KOA) is a common and frequently-occurring chronic disease. The traditional KOA diagnosis lacks personalized and systematic diagnosis and treatment models, and lacks high-quality and large-sample randomized controlled clinical studies. In this paper, we propose a kind of intelligent diagnosis and treatment method for KOA based o...
With the development of the times, traditional personal credit evaluation is facing a severe test. This paper makes an exploratory study of the practical application and development of personal credit evaluation using MicroBlog data. Credit-related indicators are extracted from previous literature on personal credit evaluation. W...
Repost cascades play a critical role in information diffusion on social media sites. They develop through a series of reposts and eventually stop. Substantial previous work has studied and predicted various aspects of repost cascades such as growth, bursts and recurrence. However, how or even whether it is possible to predict when a repost cascade will se...
With the rapid growth of Question Answering (QA) corpora, health QA systems offer patients a convenient way to obtain instant service, and the effectiveness of the answers is a very important and challenging problem to be solved. Therefore, this paper proposes a solution based on a medical knowledge base. In the process of generati...
In recent years, Data Mining techniques such as classification, clustering, association and regression have been widely used in the healthcare field to help analyze and predict disease and improve the quality and efficiency of medical services. This paper presents a web-based platform for big data analysis of healthcare using Data Mining techniques....
Objective: This study aims to investigate the self-regulation of the autonomic nervous system following the cognitive stress tests after Heart Rate Variability (HRV) biofeedback therapy in post-stroke depression (PSD) patients. Methods: Twenty-four patients with PSD were randomly divided into feedback and control groups. Feedback patients were give...
In the “China Cardiovascular Disease Report 2015”, research shows that mortality from Acute Myocardial Infarction (AMI) has increased rapidly since 2005. The mortality in 2014 was 123.92 per 100,000, which was 4.4 times higher than in 2002. Cardiovascular disease is currently the No. 1 cause of death in China, in both rural and urban areas. This paper pre...
Intel Software Guard Extensions (SGX) is an emerging trusted hardware technology. SGX enables user-level code to allocate regions of trusted memory, called enclaves, where the confidentiality and integrity of code and data are guaranteed. While SGX offers strong security for applications, one limitation of SGX is the lack of system call support ins...
This book constitutes the thoroughly refereed post-conference proceedings of the International Conference for Smart Health, ICSH 2016, held in Haikou, Hainan, China, in December 2016. The 23 full papers presented were carefully reviewed and selected from 52 submissions. They are organized around the following topics: big data and smart health; healt...
As cloud computing is widely used in various fields, more and more individuals and organizations are considering outsourcing data to the public cloud. However, the security of the cloud data has become the most prominent concern of many customers, especially those who possess a large volume of valuable and sensitive data. Although some technologies...
Health related Question Answering (QA) systems have proven to be useful to patients. However, most QA systems focus on improving system performance against a standard set of questions, but neglect the problem of designing effective user interfaces. We build a health QA system with an enhanced user interface which includes three formats – single answer...
Message-oriented middleware, especially message queues, has been widely used in web applications and services. Performance and scalability are essential in these systems; however, they often become the bottleneck. Existing message queues are not able to scale out elastically very well. This paper presents a decentralized distributed architectu...
Electronic commerce is playing a more and more important role in today's commercial activities. In this chapter, the authors propose a kind of new electronic commerce architecture in the cloud and give two kinds of new electronic commerce models. This chapter opens the discussion of why we need to design a new architecture in the cloud environment....
Citations
... Note that the methods with other enhancing techniques, such as data augmentation [2,17,50,51] or AutoML [18,37,44,47,48] are orthogonal to our approach for comparison. ...
... Such methods only consider the preceding information but ignore the succeeding information, which is insufficient to obtain the optimal result. Different from the greedy strategy, another solution is the context-wise re-ranking strategy [1,6,11,20,25], which uses a context-wise evaluation model to capture the mutual influence among items and re-predict the CTR of each item. Methods like PRM [20] directly take the initial ranking list as input and generate the optimal permutation based on the predicted value given by the context-wise model. ...
... Unfortunately, both R-Tree implementations do not disclose their source code. Finally, learned indexes [53] achieve high performance on GPUs, since they heavily depend on linear algebra operations, which have been extensively optimized on GPUs, and some GPUs even offer specialized accelerators for these operations. Regrettably, we were unable to check whether the existing learned index takes advantage of hardware acceleration, as there is no accompanying implementation available. ...
... We employ weak models to capture features useful for stance detection but not useful for intermediate sub-tasks as bias features through adversarial learning (Belinkov et al., 2019). ... processing, including visual question answering (Niu et al., 2021), natural language inference (Tian et al., 2022), question answering, named entity recognition, and text classification (Wang and Culotta, 2020; Qian et al., 2021; Choi et al., 2022). Different from existing task-agnostic counterfactual frameworks, our model incorporates an adversarial bias learning module that leverages intermediate sub-tasks of stance detection to learn bias more accurately. ...
... Data prefetching or caching are techniques that are highly used mostly in databases [12,13] in order to boost the execution performance by fetching data in the memory before it is needed. These methodologies are widely used in database systems or the prefetching of small-sized data such as websites [14,15] or Internet of Things (IoT) data [16]. ...
... If the distance of u is larger than a(u) + r(u) + l(u), which means the node u cannot be the follower, we set the status of u as closed, and use the Shrink of Algorithm 4 to update its neighbors' distance and status (Lines 12-13). Otherwise, we set the status of u as activated (Line 15) and process each listening node z that is a neighbor of u (Lines 16-21). If node z is not in the k-core and p(z) > p(u), we push z into Q and update the distances of u and z by subtracting 1. ...
... Triple classification task is intended to infer whether the triples of facts in the knowledge base are correct [36]. The classification accuracy rates of three standard data sets WN18, WN18RR and FB15K-237 are shown in Table 7. ...
... There are many applications regarding trajectory similarity computation. Several previous studies [3,8,36] aimed at accelerating the similarity search and join over trajectory data by devising index and pruning techniques following the idea from string similarity search [37-40]. Specifically, tree-based index structures [22,27], such as K-D tree or R-tree, are employed to organize the trajectories. ...
... First, most studies examined HRV in patients in the subacute (>7 days following stroke onset) or chronic phase of stroke (>6 months), whereas the most severe stroke complications, including autonomic dysfunction, appear in the acute phase of stroke. Second, most studies included only patients with minor stroke with less disability and smaller infarcts with frequent favorable outcomes and a low risk of cardiac complications, which increases proportionally to the severity of ischemic stroke and neurological deficits (Yoshimura et al., 2008; Guan et al., 2019; Zhang et al., 2019; Li et al., 2021). Third, the studies had different follow-ups, and the risk of stroke complications was the highest in the first 3 months after stroke. ...
... The method that might be used is a data clustering technique that has been successfully applied in data mining (Cheng et al., 2014). Data clustering can be done with several techniques or algorithms including single link, average link, minimum spanning tree, and k-means (Li et al., 2019; Yang, 2017). Data clustering aims to extract relationships in a dataset and to determine interesting patterns based on the similarity of the samples by grouping data objects into clusters so that objects in a cluster have a high degree of similarity and are very different from objects in other clusters. ...