Yanxiang Xu’s research while affiliated with China Academy of Chinese Medical Sciences and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (9)


A Fast Indexing Algorithm Optimization with User Behavior Pattern
  • Conference Paper

November 2012 · 14 Reads · 2 Citations · Lecture Notes in Computer Science

Yanxiang Xu · [...] · Xiang Wang

Internet users' access patterns for objects have been observed to follow Zipf's law, and this preference for network resources strongly influences real-time lookup performance in large-scale distributed systems. To guarantee search response rates with limited memory space, we develop a new object indexing and locating algorithm called Bloom filter Arrays based on Zipf-distributed user Preference (ZPBA). The algorithm uses a compact data structure to achieve high accuracy in item lookup. We give a theoretical analysis of ZPBA and then conduct experiments with a one-million-item corpus and 100,000 queries to validate the design. Comparison shows that our solution can be 77% more space-efficient than traditional Bloom-filter-based index approaches for applications with concentrated user access preferences. The algorithm demonstrates practical potential for fault-tolerant large-scale distributed indexing and item lookup.
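The core building block the abstract describes is a Bloom filter used as a compact membership index. A minimal sketch follows; the Zipf-aware sizing at the end is an illustrative assumption (hot head items get a roomier, lower-false-positive filter; the long tail shares a tighter one), not the paper's actual ZPBA allocation scheme.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter using double hashing (Kirsch-Mitzenmacher)."""

    def __init__(self, num_bits, num_hashes):
        self.m = num_bits
        self.k = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _indexes(self, item):
        # Derive k bit positions from two halves of a SHA-256 digest.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for idx in self._indexes(item):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, item):
        return all(self.bits[idx // 8] & (1 << (idx % 8))
                   for idx in self._indexes(item))

# Illustrative Zipf-aware sizing (an assumption, not the paper's formula):
# under Zipf's law a small "hot" head attracts most queries, so it is given
# more bits per item than the rarely queried "cold" tail.
hot = BloomFilter(num_bits=4096, num_hashes=7)   # few items, many lookups
cold = BloomFilter(num_bits=1024, num_hashes=3)  # many items, few lookups

for item in ["item-0", "item-1"]:
    hot.add(item)

assert "item-0" in hot  # inserted items are always found (no false negatives)
```

Bloom filters trade a small, tunable false-positive rate for large memory savings, which is what makes skewed per-partition sizing attractive when access frequencies are concentrated.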


[Tables and figures: Table 1. Dataset Descriptions · Fig. 2. The Framework of the Model · Table 2. Analysis of Experimental Results · Fig. 3. Precision and Recall of Experiment Results]
A Topic-Oriented Syntactic Component Extraction Model for Social Media
  • Chapter
  • Full-text available

January 2012 · 45 Reads · Lecture Notes in Electrical Engineering

Topic-oriented understanding extracts information from various language instances that reflects, via statistical analysis, the characteristics or trends of semantic information related to a topic. Syntax analysis and modeling are the basis of such work. Traditional syntactic formalization approaches widely used in natural language understanding cannot simply be applied to text modeling in the context of topic-oriented understanding. In this paper, we review the information extraction mode and summarize its inherent relationship with the “Subject-Predicate” syntactic structure of Aryan languages. We then propose a syntactic element extraction model based on the “topic-description” structure, which contains six kinds of core elements and satisfies the requirements of topic-oriented understanding. This paper also describes the model composition, the theoretical framework of the understanding process, the extraction method for syntactic components, and a prototype system for generating syntax diagrams. The proposed model is evaluated on the Reuters-21578 and SocialCom2009 data sets, and the results show that the recall and precision of syntactic component extraction reach 93.9% and 88%, respectively, which further justifies the feasibility of generating syntactic components from word dependencies.
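The “topic-description” split the abstract describes can be illustrated with a deliberately toy heuristic: treat the words before the first verb-like token as the topic and the remainder as the description. This is only a sketch of the idea; the verb list and the splitting rule are assumptions, not the paper's six-element model.

```python
def topic_description(sentence):
    """Toy 'topic-description' split: everything before the first
    verb-like token is the topic, the rest is the description.
    The verb list is an illustrative assumption."""
    VERBS = {"is", "are", "was", "were", "has", "have", "reflects", "extracts"}
    words = sentence.rstrip(".").split()
    for i, w in enumerate(words):
        if w.lower() in VERBS:
            return " ".join(words[:i]), " ".join(words[i:])
    return sentence, ""  # no verb found: treat the whole sentence as topic

topic, desc = topic_description("The model is evaluated on Reuters 21578.")
# topic -> "The model", desc -> "is evaluated on Reuters 21578"
```

A real system would use dependency parses rather than a word list, which is exactly the word-dependency route the abstract's conclusion points to.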


Measuring article quality in Wikipedia: Lexical clue model

October 2011 · 70 Reads · 34 Citations

Wikipedia is the most entry-abundant online encyclopedia. Studies published by Nature showed that the scientific entries in Wikipedia are of quality comparable to those in the Encyclopaedia Britannica, which are mainly maintained by experts. The manual grading of articles within a WikiProject implies that high-quality articles are usually reached grade by grade through repeated revision. Much work has therefore addressed automatically measuring article quality in Wikipedia based on assumed relationships between article quality and contributors' reputations, viewing behavior, article status, inter-article links, and so on. In this paper, a lexical-clue-based method is proposed to assess article quality in Wikipedia. The method is inspired by the idea that good articles have more regular statistical features of lexical usage than immature ones, owing to more revisions by more people. We select 8 lexical features derived from statistics on word usage in articles as factors that can reflect article quality in Wikipedia. A decision tree is trained based on the lexical clue model. Using the decision tree, our experiments on a well-labeled collection of 200 Wikipedia articles show that the method achieves more than 83% precision and recall.
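The pipeline the abstract sketches is: compute per-article lexical statistics, then feed them to a decision tree. A minimal feature extractor follows; these three statistics are illustrative stand-ins, not the paper's actual 8 features.

```python
import re

def lexical_features(text):
    """A few illustrative lexical clues (assumed features, not the
    paper's 8): statistics of word usage that tend to stabilize as
    an article is repeatedly revised."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)
    return {
        "avg_sentence_len": n_words / max(len(sentences), 1),
        "type_token_ratio": len(set(words)) / max(n_words, 1),
        "avg_word_len": sum(map(len, words)) / max(n_words, 1),
    }

feats = lexical_features(
    "Wikipedia is an encyclopedia. Articles improve with revision."
)
```

Feature dictionaries like this, one per article and paired with the WikiProject grade as the label, would form the training set for a standard decision-tree classifier.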


Social computing research map

September 2010 · 25 Reads · 4 Citations

Online society has given rise to social computing, an interdisciplinary field that takes a computational approach to the study and modelling of social interactions and communications. Measuring the field's state of the art is important work to foster its prosperity. In this paper, we develop a dual-dimension framework of focus and maturity level to characterize the discipline. For focus, we propose a research-focus spectrum lens to make sense of which aspect a study addresses, from social behaviour to computational systems. We give a benchmark to measure the maturity level of a social computing study based on Denning's computing paradigm. We use the framework to analyze the 187 papers published at SocialCom09. Based on the analysis, we map the social computing field by its distribution over research focus and maturity level.


The principles of intention computing

August 2009 · 40 Reads · 1 Citation

Intention is a mental state and a driving force of human behavior, associated with other psychological phenomena such as belief, desire, and goal. Understanding intention's characteristics and working mechanism has drawn attention from several disciplines, such as philosophy, psychology, and computer science. Although the relationship has been proven formally, there always exist risks of introducing weak correlations because of the complexity of psychological activity. Many Web applications, such as audience targeting, collaborative filtering, and intrusion detection, use the concept of intention to interpret their systems' results, but they lack a fundamental theoretical framework for tuning their performance. In this paper, we survey key concepts developed in the three disciplines and sum up five assumptions affecting intention computing. We also propose a framework for modelling intention prediction and describe our I-I-I criteria model, which can help verify an intention computing model. Finally, we conclude by highlighting the salient features of this paper.


Collaborative Filtering with Fine-grained Trust Metric

March 2009 · 16 Reads · 2 Citations

Similarity-based collaborative filtering systems are vulnerable to data sparsity, cold-start, and robustness problems. Computational trust models are promising alternatives that alleviate these problems by replacing the similarity metric with a trust metric. However, they often suffer from relying on users' explicit trust statements. A fine-grained model that computes trust from user ratings is more reasonable and less intrusive for average users. We propose a novel trust-based recommendation model for this purpose. Experiments on a large real dataset show that the proposed model outperforms the conventional collaborative filtering model in terms of MAE, coverage, and F-metric.
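One simple way to infer trust from ratings alone, with no explicit trust statements, is to measure how often a neighbor's rating would have been a good predictor on co-rated items. The agreement-fraction rule below is an illustrative assumption, not the paper's actual trust metric.

```python
def rating_trust(ratings_a, ratings_b, tolerance=1.0):
    """Illustrative fine-grained trust inferred from ratings alone:
    the fraction of co-rated items on which b's rating falls within
    `tolerance` of a's rating (rule assumed for this sketch)."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0  # no evidence, no trust
    close = sum(1 for i in common
                if abs(ratings_a[i] - ratings_b[i]) <= tolerance)
    return close / len(common)

# Hypothetical 1-5 star ratings keyed by item id.
a = {"m1": 5, "m2": 3, "m3": 4}
b = {"m1": 4, "m2": 1, "m3": 4, "m4": 2}
print(rating_trust(a, b))  # 2 of the 3 co-rated items agree within 1 star
```

In a trust-based recommender, this score would replace the similarity weight when aggregating neighbors' ratings, which sidesteps the sparsity and cold-start weaknesses of pure similarity.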


Incorporating Similarity and Trust for Collaborative Filtering

January 2009 · 13 Reads · 7 Citations

Currently, most recommender systems use collaborative filtering (CF) techniques. The main idea is to suggest new relevant items to an active user based on the judgements of other members in a like-minded community. However, these CF-based methods encounter obstacles such as sparse data, cold start, and robustness. This paper proposes to deal with these issues by associating a similarity measurement derived from users' rating patterns with a trust metric. After investigating a large dataset from Epinions.com, we find that user similarity and trust are strongly correlated. This fact also explains why using trust (instead of user similarity) can lead to very close mean prediction accuracy in a Pearson-correlation-coefficient-like recommendation algorithm. Our novel method incorporates these two factors into one unified recommendation algorithm. The experimental results indicate that a good prediction strategy can come from filtering the ratings of users who have high trust and low similarity, or vice versa.
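The two signals the abstract combines can be sketched as follows: a Pearson correlation over co-rated items for similarity, fused with a trust score. The convex combination used for the fusion is an assumption for illustration, not the paper's unified formula.

```python
from math import sqrt

def pearson(ra, rb):
    """Pearson correlation coefficient over co-rated items
    (ratings given as {item_id: rating} dicts)."""
    common = sorted(set(ra) & set(rb))
    if len(common) < 2:
        return 0.0
    xa = [ra[i] for i in common]
    xb = [rb[i] for i in common]
    ma, mb = sum(xa) / len(xa), sum(xb) / len(xb)
    num = sum((x - ma) * (y - mb) for x, y in zip(xa, xb))
    den = (sqrt(sum((x - ma) ** 2 for x in xa))
           * sqrt(sum((y - mb) ** 2 for y in xb)))
    return num / den if den else 0.0

def combined_weight(sim, trust, alpha=0.5):
    """One simple fusion (assumed, not the paper's method):
    a convex combination of similarity and trust."""
    return alpha * sim + (1 - alpha) * trust

w = combined_weight(pearson({"a": 1, "b": 2, "c": 3},
                            {"a": 2, "b": 4, "c": 6}),
                    trust=0.5)
```

The combined weight would then replace the plain Pearson weight when aggregating neighbors' ratings, letting each signal compensate where the other is sparse or unreliable.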


A Services Oriented Framework for Integrated and Customizable Collaborative Environment

September 2007 · 17 Reads · 8 Citations

How to develop a reusable and reliable framework to integrate distributed heterogeneous services is one of the most important factors affecting the successful deployment and operation of collaboratory projects. An ideal collaboratory model should integrate and customize key components such as task assignment, multimedia communication, data sharing, knowledge management, instrument access, and computation services. This paper presents a service-oriented framework that overcomes the drawbacks of one-off, hand-crafted approaches to developing collaboratory environments. The framework supports VO (virtual organization) management services and is easy to tailor. By developing a service-bus communication protocol, we achieve a lightweight framework with customizability and expandability. It can integrate self-made and third-party collaborative tools, then select and customize such tools to meet the specific collaboration-platform needs of various domains. The framework has been successfully applied to several multi-disciplinary scientific collaboratory projects in China.
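The service-bus pattern the abstract describes can be illustrated with a minimal in-process registry where loosely coupled tools register handlers and invoke each other by service name. The API below is hypothetical; the paper's actual bus protocol is not reproduced here.

```python
class ServiceBus:
    """Minimal sketch of a service bus: collaborative tools register
    handlers under names and call each other without direct coupling
    (hypothetical API for illustration)."""

    def __init__(self):
        self._services = {}

    def register(self, name, handler):
        # A tool publishes a capability under a well-known name.
        self._services[name] = handler

    def call(self, name, *args, **kwargs):
        # Any other tool invokes it by name, never by direct import.
        if name not in self._services:
            raise KeyError(f"no service registered under {name!r}")
        return self._services[name](*args, **kwargs)

bus = ServiceBus()
bus.register("chat.send", lambda room, msg: f"[{room}] {msg}")
print(bus.call("chat.send", "team-1", "hello"))  # prints "[team-1] hello"
```

Because tools only depend on service names, a third-party tool can replace a self-made one without touching any caller, which is the customizability the framework claims.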


Citations (5)


... For every article, the introductory section was used, a practice common in similar research settings [18]. This allowed us to cover texts of good overall writing quality [19] without restriction to a specific topic. The retrieved articles were subsequently saved as individual .txt files (further details in Appendix A). The individual texts' lengths range between around 100 and 800 words. These individual articles then served as the input texts to be summarized by ChatGPT. ...

Reference:

A Multimetric Approach for Evaluation of ChatGPT-Generated Text Summaries
Measuring article quality in Wikipedia: Lexical clue model
  • Citing Article
  • October 2011

... This task is often also referred to as intention recognition or intent recognition. However, several authors discourage the use of these terms due to the ambiguity of the words intention and intent, which are defined in very different ways across fields, and even within the same field (e.g., Xu et al., 2009 included the whole plan as part of what they called intention). On the other hand, the term intent recognition (or intent classification) is especially used to refer to the particular task of understanding the intention of a person from a sentence in natural language, making it not the most appropriate to refer to the more general task of goal recognition. ...

The principles of intention computing
  • Citing Article
  • August 2009

... The most popular trust measurement in a trust network is the Jaccard coefficient or its variants [7,29,34]. Without losing generality, here we use a modified normalized Jaccard coefficient as an example to explain the trust measurement. ...

Incorporating Similarity and Trust for Collaborative Filtering
  • Citing Conference Paper
  • January 2009

... This challenge requires frameworks that emphasize reusability and modularity in their components, enhancing integration and connectivity. A reusable framework of this kind to integrate distributed services for collaboration has been proposed by Tiejian et al. (2007). In this framework a web-services "bus protocol" integrated self-made and third-party collaborative tools, following a mash-up approach to meet specific framework needs, including security and management. ...

A Services Oriented Framework for Integrated and Customizable Collaborative Environment
  • Citing Conference Paper
  • September 2007