Hao Wu

Yunnan University, Yün-nan, Yunnan, China

Publications (30) · 4.44 total impact

  • 2012 4th Electronic System-Integration Technology Conference (ESTC); 09/2012
  • ABSTRACT: Advance reservation (AR) guarantees the quality of service of jobs by granting exclusive access to resources over a defined time interval. It is a challenge for the scheduler to organize available resources efficiently and to allocate them appropriately to parallel AR jobs with deadline constraints. This paper provides a slot-based data structure that organizes the available resources of multiprocessor systems for efficient search and update operations, and formulates a suite of scheduling policies for allocating resources to dynamically arriving AR requests. The performance of the scheduling algorithms was investigated by simulation under different job sizes and durations, system loads, and scheduling flexibilities. The results show that all of these factors affect the performance metrics of every scheduling algorithm; the PE-Worst-Fit algorithm achieves the highest acceptance rate of AR requests, while jobs scheduled with the First-Fit algorithm experience the lowest average slowdown. The data structure and scheduling policies can be used to organize and allocate resources for parallel AR jobs with deadline constraints in large-scale computing systems.
    The Journal of Supercomputing 03/2012; · 0.92 Impact Factor
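The slot-based organization and the fit policies above can be sketched roughly as follows. This is a minimal illustration under assumptions (a discrete timeline of slots, each holding a free-PE count; greedy start selection), not the paper's actual data structure or policies:

```python
def feasible_starts(free, need, dur, earliest, deadline):
    """Start slots in [earliest, deadline - dur] where `need` PEs are free for `dur` slots."""
    return [s for s in range(earliest, deadline - dur + 1)
            if all(free[t] >= need for t in range(s, s + dur))]

def first_fit(free, need, dur, earliest, deadline):
    """Pick the earliest feasible start (lowest slowdown for the accepted job)."""
    starts = feasible_starts(free, need, dur, earliest, deadline)
    return starts[0] if starts else None

def worst_fit(free, need, dur, earliest, deadline):
    """Pick the feasible start that leaves the largest minimum residual capacity."""
    starts = feasible_starts(free, need, dur, earliest, deadline)
    if not starts:
        return None
    return max(starts, key=lambda s: min(free[t] for t in range(s, s + dur)) - need)

def reserve(free, start, need, dur):
    """Commit the reservation by decrementing free PEs over the interval."""
    for t in range(start, start + dur):
        free[t] -= need

# A 10-slot timeline on a 4-PE system, with slots 2-3 partially booked.
free = [4, 4, 2, 2, 4, 4, 4, 4, 4, 4]
s = first_fit(free, need=3, dur=2, earliest=0, deadline=6)
w = worst_fit(free, need=3, dur=2, earliest=0, deadline=6)
```

A real implementation would index slots for logarithmic search and update; the linear scans here only show the policy logic.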
  • ABSTRACT: In a multiprocessor environment, resource reservation splits contiguous idle resources and generates resource fragments, which reduce resource utilization and the job acceptance rate. In this paper, we define the resource fragments produced by resource reservation and propose fragment-aware scheduling algorithms designed to improve the acceptance of subsequent jobs. Based on fragment awareness, we propose two algorithms, Occupation Rate Best Fit and Occupation Rate Worst Fit, and, in combination with heuristic algorithms, put forward PE Worst Fit - Occupation Rate Best Fit and PE Worst Fit - Occupation Rate Worst Fit. We not only implement and analyze the algorithms in simulation, but also study the relationship between task properties and the algorithms' performance. Experiments show that PE Worst Fit - Occupation Rate Worst Fit provides the best job acceptance rate and that Occupation Rate Worst Fit performs best on average slowdown.
    Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2012 International Conference on; 01/2012
  • Hao Wu, Yu Hua, Bo Li, Yijian Pei
    ABSTRACT: With the tremendous number of citations available in digital libraries, automatically suggesting citations to meet the information needs of researchers has become an important problem. In this paper, we propose a model that treats citation recommendation as a special retrieval task to address this challenge. First, users provide our system with a target paper and some metadata. Second, the system retrieves a set of relevant candidate citations. The candidates are then reranked using well-chosen citation evidence, such as publication-time preference, self-citation preference, co-citation preference, and publication-reputation preference; various measures are introduced to integrate this evidence. We experimented with the proposed model on an established bibliographic corpus, the ACL Anthology Network. The results show that the model is valuable in practice and that citation recommendation can be significantly improved using the proposed evidence.
    01/2012;
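The evidence-based reranking step can be sketched as a weighted combination of normalized evidence scores. The linear combination and the weights are assumptions for illustration; the abstract only names the evidence types, not how they are integrated:

```python
def rerank(candidates, weights):
    """Re-order retrieved candidates by a weighted sum of citation evidence.

    Each candidate carries a retrieval score plus evidence scores in [0, 1]
    for the preferences named in the paper: publication time, self-citation,
    co-citation, and publication reputation.
    """
    def combined(c):
        return sum(weights[k] * c[k] for k in weights)
    return sorted(candidates, key=combined, reverse=True)

# Hypothetical weights and candidates: "B" is a weaker lexical match but
# scores highly on the citation evidence, so it is promoted.
weights = {"retrieval": 0.4, "time": 0.2, "self_cite": 0.1,
           "co_cite": 0.2, "reputation": 0.1}
candidates = [
    {"id": "A", "retrieval": 0.9, "time": 0.1, "self_cite": 0.0, "co_cite": 0.2, "reputation": 0.3},
    {"id": "B", "retrieval": 0.7, "time": 0.9, "self_cite": 0.5, "co_cite": 0.8, "reputation": 0.6},
]
ranked = rerank(candidates, weights)
```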
  • Hao Wu, Yijian Pei, Bo Li
    ABSTRACT: The name-ambiguity problem brings many challenges to scholar search. It has attracted much attention in the research community, and various disambiguation algorithms combining different citation features have been proposed; however, there is still significant room for improvement. In this paper, we propose an unsupervised two-step method for name disambiguation when an end user performs a scholar search. In the first step, the returned author's citations are blocked using co-authorship relations; in the second step, these blocks are merged by classical hierarchical agglomerative clustering. We test various linkage criteria and pairwise distances during clustering to identify the best components for disambiguating citations, and we propose several approaches to improve disambiguation performance at each step. Experiments show that our method outperforms the best state-of-the-art work by 15% on the same recognized dataset, without the need for any training.
    01/2012;
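The two steps can be sketched as follows: block citations that share a co-author, then agglomeratively merge blocks by content similarity. The Jaccard keyword similarity, single linkage, and stopping threshold are assumptions chosen for a compact sketch; the paper evaluates several linkage criteria and distances:

```python
def coauthor_blocks(citations):
    """Step 1: group citations that share at least one co-author."""
    blocks = []
    for cit in citations:
        overlapping = [b for b in blocks if b["authors"] & cit["coauthors"]]
        for b in overlapping:
            blocks.remove(b)
        merged = {"authors": set(cit["coauthors"]), "members": [cit["id"]]}
        for b in overlapping:
            merged["authors"] |= b["authors"]
            merged["members"] += b["members"]
        blocks.append(merged)
    return blocks

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def merge_blocks(blocks, keywords, threshold):
    """Step 2: agglomerative merging with single linkage over keyword
    similarity, stopping when no pair of clusters reaches the threshold."""
    clusters = [list(b) for b in blocks]
    def linkage(c1, c2):
        return max(jaccard(keywords[i], keywords[j]) for i in c1 for j in c2)
    while len(clusters) > 1:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = max(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[i], clusters[j]) < threshold:
            break
        clusters[i] += clusters.pop(j)
    return clusters

# Hypothetical citations returned for one author name.
citations = [
    {"id": 1, "coauthors": {"J. He"}, "kw": {"topic", "model"}},
    {"id": 2, "coauthors": {"J. He", "Y. Pei"}, "kw": {"topic", "pagerank"}},
    {"id": 3, "coauthors": {"B. Li"}, "kw": {"image", "corner"}},
    {"id": 4, "coauthors": {"Z. Chen"}, "kw": {"topic", "model", "pagerank"}},
]
keywords = {c["id"]: c["kw"] for c in citations}
blocks = coauthor_blocks(citations)
clusters = merge_blocks([b["members"] for b in blocks], keywords, threshold=0.5)
```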
  • ABSTRACT: Feature selection is a key step in image registration, and its success has a fundamental effect on image matching. Corners determine the contour characteristics of the target image, and the number of corners is far smaller than the number of image pixels, so corners are a good feature for image registration. Considering both algorithm speed and registration accuracy, this paper proposes an improved Harris corner detection method for effective image registration. The method avoids the corner-clustering phenomenon that can occur during corner detection, so the detected corner points are distributed more reasonably and image registration becomes faster. Experiments also show that the registration results are satisfactory and reach a reasonable match.
    Information Networking and Automation (ICINA), 2010 International Conference on; 11/2010
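One common way to avoid corner clustering is to keep only the strongest responses within each image tile. The tiling scheme below is an assumption for illustration; the abstract does not specify the paper's actual anti-clustering criterion:

```python
def spread_corners(corners, tile, per_tile=1):
    """Keep at most `per_tile` strongest corners per tile x tile cell, so the
    detected corners spread over the image instead of clustering locally."""
    buckets = {}
    for x, y, strength in corners:
        buckets.setdefault((x // tile, y // tile), []).append((strength, x, y))
    kept = []
    for cell in buckets.values():
        cell.sort(reverse=True)              # strongest response first
        kept.extend((x, y) for _, x, y in cell[:per_tile])
    return sorted(kept)

# Four corner responses crowd into one 16x16 tile; one lies far away.
corners = [(10, 10, 0.9), (11, 10, 0.8), (10, 12, 0.7), (12, 11, 0.6), (50, 40, 0.5)]
kept = spread_corners(corners, tile=16)
```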
  • Conference Paper: none
    Bioinf; 07/2010
  • ABSTRACT: Resource co-allocation, which simultaneously allocates multiple resources to one application, is a crucial technology affecting the utility and quality of service of large-scale distributed environments. This paper concentrates on guaranteeing the QoS of co-allocation jobs via advance reservation and investigates the performance of two typical scheduling algorithms with and without advance reservation. Simulations show that advance reservation effectively improves the QoS of co-allocation.
    Computer Modeling and Simulation, International Conference on; 01/2010; 3:24-27.
  • ABSTRACT: Backfilling is well known in parallel job scheduling to increase system utilization and user satisfaction over traditional non-backfilling scheduling: small jobs from the back of the queue are allowed to execute before larger jobs that arrived earlier, while resources can be reserved to protect the larger jobs from starvation. This paper proposes a relaxed backfill scheduling mechanism supporting multiple reservations and investigates its effectiveness in reducing the average waiting time and average slowdown of jobs through simulations with real traces. Unlike existing relaxed scheduling, which restricts the maximum number of reservations to one, the new mechanism supports the relaxation of multiple reservations and schedules efficiently by avoiding chain reactions when relaxing the start times of multiple existing reservations. Experimental results suggest that although the performance of both relax-based and strict backfilling depends on the accuracy of runtime estimates, reservation depths, traces, and system load alike, the former is more flexible and generally more effective in reducing average waiting time and average slowdown, without loss of utilization.
    2010 International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2010, Wuhan, China, 8-11 December, 2010; 01/2010
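The strict-versus-relaxed distinction can be sketched with a single admission check. This minimal sketch assumes per-reservation slack values and omits the capacity bookkeeping and the chain-reaction avoidance that the paper's mechanism handles:

```python
def can_backfill(job_len, now, reservations):
    """Decide whether a queued job of length `job_len` may start now.

    `reservations` are (start, slack) pairs for already-reserved jobs.
    Strict backfilling (slack 0) requires the job to finish before every
    reserved start; the relaxed variant lets each reservation slip by at
    most its remaining slack.
    """
    finish = now + job_len
    return all(finish <= start + slack for start, slack in reservations)

# A 10-unit job at t=0 is rejected by strict backfilling (reservation at t=9)
# but accepted when that reservation may slip by 2 time units.
strict_ok = can_backfill(10, 0, [(9, 0)])
relaxed_ok = can_backfill(10, 0, [(9, 2)])
```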
  • ABSTRACT: As a new task in expertise retrieval, finding research communities for scientific guidance and research cooperation has become increasingly important. However, existing community-discovery algorithms consider only graph structure, without considering context such as knowledge characteristics, so detecting research communities cannot be addressed simply by applying existing methods directly. In this paper, we propose a hierarchical discovery strategy that rapidly locates the core of a research community and then incrementally extends it. In particular, when expanding the local community, a node is selected by considering both its connection strength and its expertise divergence with respect to the candidate community, which prevents intellectually irrelevant nodes from spilling into the current community. Experiments on the ACL Anthology Network show that our method is effective.
    Advanced Intelligent Computing Theories and Applications, 6th International Conference on Intelligent Computing, ICIC 2010, Changsha, China, August 18-21, 2010. Proceedings; 01/2010
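The expansion step can be sketched as a greedy loop whose admission score combines connection strength and expertise divergence. The specific score (fraction of a node's neighbours inside the community, minus a Jaccard-based divergence from the community's expertise) is an assumption for illustration:

```python
def expand_community(core, graph, expertise, lam=1.0, threshold=0.0):
    """Grow a community from its core: repeatedly admit the frontier node whose
    connection strength to the community, minus lam times its expertise
    divergence from it, is largest and above threshold."""
    community = set(core)
    while True:
        frontier = {n for m in community for n in graph[m]} - community
        if not frontier:
            break
        def gain(n):
            strength = sum(1 for m in graph[n] if m in community) / len(graph[n])
            centroid = set().union(*(expertise[m] for m in community))
            divergence = 1 - len(expertise[n] & centroid) / len(expertise[n] | centroid)
            return strength - lam * divergence
        best = max(frontier, key=gain)
        if gain(best) <= threshold:
            break
        community.add(best)
    return community

# Hypothetical collaboration graph: node 3 is well connected and topically
# close to the core {1, 2}; nodes 4 and 5 work on an unrelated topic.
graph = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2}, 4: {2, 5}, 5: {4}}
expertise = {1: {"ir", "topic"}, 2: {"ir", "topic"}, 3: {"ir"},
             4: {"bio"}, 5: {"bio"}}
community = expand_community({1, 2}, graph, expertise)
```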
  • Hao Wu, Jun He, Yijian Pei
    ABSTRACT: In this article, we propose to apply the topic model and topic-level eigenfactor (TEF) algorithm to assess the relative importance of academic entities including articles, authors, journals, and conferences. Scientific impact is measured by the biased PageRank score toward topics created by the latent topic model. The TEF metric considers the impact of an academic entity in multiple granular views as well as in a global view. Experiments on a computational linguistics corpus show that the method is a useful and promising measure to assess scientific impact. © 2010 Wiley Periodicals, Inc.
    Journal of the American Society for Information Science and Technology 01/2010; 61:2274-2287. · 2.01 Impact Factor
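Topic-biased PageRank, the core ingredient of the topic-level score, can be sketched as a power iteration whose teleport distribution follows the topic's weight over documents. This is a generic personalized-PageRank sketch (assuming every node has out-links), not the paper's exact TEF formulation:

```python
def topic_pagerank(links, topic_weight, damping=0.85, iters=50):
    """Personalized PageRank: random jumps land according to `topic_weight`
    (e.g., P(doc | topic) from a topic model) instead of uniformly."""
    nodes = list(topic_weight)
    total = sum(topic_weight.values())
    teleport = {n: w / total for n, w in topic_weight.items()}
    rank = dict(teleport)
    for _ in range(iters):
        new = {n: (1 - damping) * teleport[n] for n in nodes}
        for src, outs in links.items():
            share = damping * rank[src] / len(outs)   # spread rank along citations
            for dst in outs:
                new[dst] += share
        rank = new
    return rank

# Three papers in a citation cycle; paper "c" is strongly topic-associated,
# so the biased walk concentrates rank on it.
links = {"a": ["b"], "b": ["c"], "c": ["a"]}
rank = topic_pagerank(links, {"a": 0.1, "b": 0.1, "c": 0.8})
```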
  • Hao Wu, Yijian Pei, Jiang Yu
    ABSTRACT: As a retrieval task, expert finding has recently attracted much attention, and various methods have been proposed to rank expert candidates against a topical query. The most efficient is the document-based approach, which treats supporting documents as a "bridge" and ranks candidates by the co-occurrence of topic and candidate mentions in the supporting documents. However, such methods model the relevance between queries and candidates at the surface word level and lack the capability to capture the hidden semantic association between them. In this paper, we propose a hidden-topic-analysis approach to estimating the relevance between queries and candidates. It models the query and the supporting documents as a word-topic-document association instead of the word-document association of the language model. In addition, prior knowledge of the supporting documents is considered to favor expert ranking. Empirical results on a metadata corpus demonstrate that the model effectively captures the semantic association between queries and candidates and thus improves the performance of expert finding.
    8th IEEE/ACIS International Conference on Computer and Information Science, IEEE/ACIS ICIS 2009, June 1-3, 2009, Shanghai, China; 01/2009
  • ABSTRACT: This paper introduces a simulated joint robot-hand system based on Java3D, which can be used as a Java application on a stand-alone computer or as a Java applet in a network environment. When the system runs, an authorized user can control the simulated robot hand with a flexible input device such as a joystick, and the status of the robot hand is displayed dynamically in 3D simulation mode. After each operation, the operation data is recorded into a database automatically and a log is built, so that past operations can be reviewed and analyzed.
    01/2009;
  • Hao Wu, Yijian Pei
    ABSTRACT: Studies of citations are being carried out comprehensively as citation data increasingly becomes available electronically on the Web. Most metrics observe scientific quality in a global view rather than in multiple fine-grained views. In this paper, we suggest applying a topic model and an adaptive PageRank algorithm to assess the relative importance of scientific objects, including articles, authors, conferences, and journals. Scientific quality is measured by a PageRank metric aggregated toward topics; this metric considers the impact of a paper in both a global view and a local view. Experiments on the ACL Anthology bibliographic corpus show that our method is a useful measure for observing scientific quality from multiple views.
    Proceedings of the 2nd International Conference on BioMedical Engineering and Informatics, BMEI 2009, October 17-19, 2009, Tianjin, China; 01/2009
  • Hao Wu, Yijian Pei, Jiang Yu
    ABSTRACT: The problem of academic expert finding is concerned with finding experts in a named research field. It has many real-world applications and has recently attracted much attention. However, existing methods are not versatile enough for the special needs of academic areas, where co-authorship and citation relations play important roles in judging researchers' achievements. In this paper, we propose and develop a flexible data schema and a topic-sensitive co-PageRank algorithm combined with a topic model to solve this problem. The main idea is to measure authors' authority with topic bias over their social networks and citation networks, and then to recommend expert candidates for the question. To infer the association between authors and topics, we draw a probability model from latent Dirichlet allocation (LDA). We further propose several techniques, such as inferring the topics of interest in a query and integrating ranking metrics to order the candidates. Our experiments show that the proposed strategies are all effective in improving retrieval accuracy.
    Frontiers of Computer Science in China 01/2009; 3:445-456. · 0.27 Impact Factor
  • ABSTRACT: Intelligent retrieval that best satisfies users' search intentions remains a challenging problem due to the inherent complexity of real-world semantic web applications. A search request usually contains not only vagueness or imprecision but also personalized information goals. This paper presents a novel approach that formulates a search request by tightly combining fuzziness with the user's subjective weighting of importance over multiple search properties. A ranking mechanism based on this weighted fuzzy query representation is proposed: it generates a set of "degrees of relevance", overall scores that reflect both the fuzzy predicates and the user's personalized preferences in the search request. Moreover, the ranking method is general and unique rather than arbitrary, so search results are properly ordered by their relevance to the search intention. Experimental results show that our approach effectively captures users' information goals and produces much better search results than existing approaches.
    Knowledge-Based Systems. 10/2008;
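The weighted fuzzy ranking can be sketched as membership degrees aggregated by user-assigned importance weights. The trapezoidal membership shapes and the weighted-average aggregation are assumptions for illustration; the paper's actual operators may differ:

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x for a fuzzy predicate with shape a <= b <= c <= d:
    0 outside [a, d], 1 on [b, c], linear on the shoulders."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def degree_of_relevance(item, predicates, weights):
    """Weighted average of fuzzy membership degrees: one overall score that
    reflects both the fuzzy predicates and the user's importance weights."""
    total = sum(weights.values())
    return sum(weights[p] * mu(item[p]) for p, mu in predicates.items()) / total

# A hypothetical "cheap and fairly recent" query where price matters twice
# as much to this user as the year does.
predicates = {
    "price": lambda v: trapezoid(v, 0, 0, 60, 120),          # "cheap"
    "year":  lambda v: trapezoid(v, 2000, 2006, 2008, 2008), # "fairly recent"
}
weights = {"price": 2.0, "year": 1.0}
score = degree_of_relevance({"price": 90, "year": 2007}, predicates, weights)
```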
  • ABSTRACT: The semantic web contains not only resources but also the heterogeneous relationships among them, which sharply distinguishes it from the current web. As the semantic web grows, specialized search techniques become significant. In this paper, we present RSS, a framework for ranked semantic search on the semantic web. In this framework, the heterogeneity of relationships is fully exploited to determine the global importance of resources. In addition, search results can be greatly expanded with the entities most semantically related to the query, providing users with properly ordered semantic search results by combining global ranking values with the relevance between resources and the query. The proposed semantic search model, which supports inference, is very different from traditional keyword-based search methods. Moreover, RSS is distinguished from many current methods of accessing semantic web data in that it applies novel ranking strategies to avoid returning search results in disorder. Experimental results show that the framework is feasible and produces a better ordering of semantic search results than directly applying the standard PageRank algorithm to the semantic web.
    Information Processing & Management. 01/2008;
  • ABSTRACT: Web services are service endpoints in a Service-Oriented Architecture (SOA). If the SOA paradigm succeeds, there will soon be several thousand services that can be used to compose required applications; for this, the services must first be discovered. However, the existing infrastructure for service publishing and management is built on centralized registries (mostly UDDI) and lacks the flexibility to support personalized user requirements such as service evaluation and service recommendation. In this paper, we suggest a framework that exploits link analysis to assist web-service discovery. The main idea is to measure the importance of a service within the service network and then recommend services to end users; along the way, the user's personal requirements are also considered.
    Advances in Grid and Pervasive Computing, Third International Conference, GPC 2008, Kunming, China, May 25-28, 2008. Proceedings; 01/2008
  • ABSTRACT: A novel approach called Aeneas, based on the execution state of distributed programs, is proposed in this paper for the real-time performance analysis of distributed programs with reliability constraints. In Aeneas, there are two important factors: the available data files and the transmission paths of each available data file. Algorithms are designed to find all the transmission paths of each data file needed while the program executes, count the transmission time of each path, derive the aggregate expression of transmission time, and calculate the fastest and slowest response times of distributed programs with reliability constraints. A series of experiments justifies the feasibility and availability of this approach; the results show that it is feasible and efficient for evaluating the real-time performance of distributed software with reliability constraints.
    Cluster Computing 01/2007; 10:175-186. · 0.78 Impact Factor
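The path-enumeration step can be sketched with a DFS over the transfer network and a min/max aggregation per needed file. This sketch assumes file transfers run in parallel and that the program waits for the last file; it ignores the reliability weighting the paper incorporates:

```python
def all_paths(graph, src, dst, seen=None):
    """Enumerate every simple transmission path from src to dst (DFS)."""
    seen = seen or [src]
    if src == dst:
        return [list(seen)]
    paths = []
    for nxt in graph.get(src, {}):
        if nxt not in seen:
            paths.extend(all_paths(graph, nxt, dst, seen + [nxt]))
    return paths

def path_time(graph, path):
    """Total transmission time along one path (sum of link times)."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))

def response_bounds(graph, replicas, target):
    """Fastest/slowest response: each file takes its quickest/slowest path and
    replica; the program waits for the last file to arrive."""
    per_file = []
    for hosts in replicas.values():
        times = [path_time(graph, p) for h in hosts for p in all_paths(graph, h, target)]
        per_file.append((min(times), max(times)))
    return max(t for t, _ in per_file), max(t for _, t in per_file)

# Hypothetical link times; file f1 is replicated on n1 and n2, f2 only on n2.
graph = {"n1": {"n3": 2}, "n2": {"n3": 5, "n4": 1}, "n4": {"n3": 2}}
replicas = {"f1": ["n1", "n2"], "f2": ["n2"]}
fastest, slowest = response_bounds(graph, replicas, "n3")
```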
  • ABSTRACT: In this short paper, we present our solution for indexing, storing, and retrieving domain knowledge. The main principle exploits Lucene to index the domain knowledge under the guidance of the domain schema. We present methods for mapping the domain knowledge structure onto the Lucene index structure, for storing and updating the indices, and for translating RDF-based queries into Lucene queries.
    Proceedings of the 2007 ACM Symposium on Applied Computing (SAC), Seoul, Korea, March 11-15, 2007; 01/2007
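The core mapping idea (one field-based document per RDF subject, properties as fields) can be sketched without Lucene itself. The dict-based index and the triple data below are illustrative assumptions standing in for Lucene's Document/Field structures:

```python
def triples_to_docs(triples):
    """Map RDF triples (subject, predicate, object) to one document per
    subject, with each predicate becoming a multi-valued field."""
    docs = {}
    for s, p, o in triples:
        docs.setdefault(s, {}).setdefault(p, []).append(o)
    return docs

def field_query(docs, field, value):
    """Answer a simple (?, field, value) triple pattern as a field query."""
    return sorted(uri for uri, d in docs.items() if value in d.get(field, []))

# Hypothetical domain knowledge.
triples = [
    ("ex:wu", "rdf:type", "ex:Researcher"),
    ("ex:wu", "ex:interest", "semantic search"),
    ("ex:pei", "rdf:type", "ex:Researcher"),
    ("ex:pei", "ex:interest", "image registration"),
]
docs = triples_to_docs(triples)
hits = field_query(docs, "ex:interest", "semantic search")
```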