Hao Wu

Yunnan University, Yün-nan, Yunnan, China

Publications (26) · 1.52 Total impact

  • ABSTRACT: Advance reservation (AR) is important for guaranteeing the quality of service of jobs by allowing exclusive access to resources over a defined time interval. It is a challenge for the scheduler to organize available resources efficiently and to allocate them appropriately for parallel AR jobs with deadline constraints. This paper provides a slot-based data structure that organizes the available resources of multiprocessor systems in a way that enables efficient search and update operations, and formulates a suite of scheduling policies to allocate resources for dynamically arriving AR requests. The performance of the scheduling algorithms was investigated by simulations with different job sizes and durations, system loads and scheduling flexibilities. Simulation results show that job sizes and durations, system load and scheduling flexibility affect the performance metrics of all the scheduling algorithms; the PE-Worst-Fit algorithm achieves the highest acceptance rate of AR requests, while jobs scheduled with the First-Fit algorithm experience the lowest average slowdown. The data structure and scheduling policies can be used to organize and allocate resources for parallel AR jobs with deadline constraints in large-scale computing systems.
    03/2012;
  • ABSTRACT: In a multiprocessor environment, resource reservation splits continuous idle resources and generates resource fragments, which reduce resource utilization and the job acceptance rate. In this paper, we define the resource fragments produced by resource reservation and propose fragment-aware scheduling algorithms designed to improve the acceptance of subsequent jobs. Based on resource fragment awareness, we propose two algorithms, Occupation Rate Best Fit and Occupation Rate Worst Fit; in combination with heuristic algorithms, PE Worst Fit - Occupation Rate Best Fit and PE Worst Fit - Occupation Rate Worst Fit are also put forward. We implemented and analyzed the algorithms in simulation and also studied the relationship between task properties and the algorithms' performance. Experiments showed that PE Worst Fit - Occupation Rate Worst Fit provides the best job acceptance rate and Occupation Rate Worst Fit has the best performance on average slowdown.
    Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2012 International Conference on; 01/2012
  • Hao Wu, Yijian Pei, Bo Li
    ABSTRACT: The name ambiguity problem poses many challenges to scholar search. This problem has attracted much attention in the research community, and various disambiguation algorithms combining different citation features have been proposed. However, there is still significant room for improvement. In this paper, we propose an unsupervised two-step method to deal with the name disambiguation problem that arises when an end user makes a scholar search. In the first step, the returned author's citations are blocked by using the co-authorship relation; in the second step, these blocks are merged by the classical hierarchical agglomerative clustering method. We test various linkage criteria and pairwise distances during hierarchical clustering, and find the best components to disambiguate citations. We also propose some approaches to improve the disambiguation performance in each step. According to experiments, our method outperforms a leading state-of-the-art method by 15% on the same recognized dataset, without the need for any training.
    01/2012;
  • Hao Wu, Yu Hua, Bo Li, Yijian Pei
    ABSTRACT: With the tremendous amount of citations available in digital libraries, how to suggest citations automatically to meet the information needs of researchers has become an important problem. In this paper, we propose a model that treats citation recommendation as a special retrieval task to address this challenge. First, users provide a target paper with some metadata to our system. Second, the system retrieves a relevant candidate citation set. Then the candidate citations are reranked by well-chosen citation evidence, such as publication time preference, self-citation preference, co-citation preference and publication reputation preference. In particular, various measures are introduced to integrate the evidence. We experimented with the proposed model on an established bibliographic corpus, the ACL Anthology Network; the results show that the model is valuable in practice, and that citation recommendation can be significantly improved using the proposed evidence.
    01/2012;
  • ABSTRACT: Feature selection is a key step in image registration, and its success has a fundamental effect on image matching. Corners determine the contour characteristics of the target image, and the number of corners is far smaller than the number of image pixels, so corners can be a good feature for image registration. Considering both the speed and the accuracy of image registration, the paper proposes an improved Harris corner detection method for effective image registration. This method effectively avoids the corner clustering that occurs during the corner detection process, so the detected corner points are distributed more reasonably and image registration becomes faster. The experiments also showed that the effect of image registration is satisfactory and reaches a reasonable match.
    Information Networking and Automation (ICINA), 2010 International Conference on; 11/2010
  • ABSTRACT: Bioinformatics has been developing for about thirty years. In the past ten years especially, the field has advanced by leaps and bounds and produced many research works. Both novices and established scholars would like to survey the state of research in this field and gain an intuitive, quantified understanding of it. This article aims to use new machine learning techniques to mine the literature in the field of bioinformatics, discover the important research topics, quantify the evolution of these topics and show their trends.
    Bioinformatics and Biomedical Engineering (iCBBE), 2010 4th International Conference on; 07/2010
  • ABSTRACT: Resource co-allocation, which simultaneously allocates multiple resources to one application, is one of the crucial technologies affecting the utility and quality of service of large-scale distributed environments. This paper concentrated on the problem of guaranteeing the QoS of co-allocation jobs via advance reservation and investigated the performance of two typical scheduling algorithms with and without advance reservation. Simulations have shown that advance reservation is effective in improving the QoS of co-allocation.
    Computer Modeling and Simulation, International Conference on. 01/2010; 3:24-27.
  • ABSTRACT: Backfilling is well known in parallel job scheduling for increasing system utilization and user satisfaction over traditional non-backfilling scheduling algorithms: it allows small jobs from the back of the queue to execute before larger jobs that arrived earlier, while resources can be reserved to protect the latter from starvation. This paper proposed a relaxed backfill scheduling mechanism supporting multiple reservations, and investigated its effectiveness in reducing the average waiting time and average slowdown of jobs by using simulations with real traces. Unlike existing relaxed scheduling, which restricts the maximum number of reservations to one, this new mechanism supports the relaxation of multiple reservations and schedules efficiently by avoiding chain reactions when relaxing the start times of multiple existing reservations. Experimental results suggest that although the performance of both the relaxed backfilling and the strict backfilling depends on the accuracy of runtime estimates, reservation depths, traces and system load, the former is more flexible and generally more effective in reducing the average waiting time and average slowdown of jobs, without loss of utilization.
    2010 International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2010, Wuhan, China, 8-11 December, 2010; 01/2010
  • ABSTRACT: As a new task of expertise retrieval, finding research communities for scientific guidance and research cooperation has become more and more important. However, the existing community discovery algorithms only consider graph structure, without considering the context, such as knowledge characteristics. Therefore, detecting research communities cannot be addressed simply by direct application of existing methods. In this paper, we propose a hierarchical discovery strategy which rapidly locates the core of the research community and then incrementally extends the community. In particular, when expanding the local community, it selects a node by considering both its connection strength and its expertise divergence with respect to the candidate community, to prevent intellectually irrelevant nodes from spilling into the current community. The experiments on the ACL Anthology Network show that our method is effective.
    Advanced Intelligent Computing Theories and Applications, 6th International Conference on Intelligent Computing, ICIC 2010, Changsha, China, August 18-21, 2010. Proceedings; 01/2010
  • Hao Wu, Jun He, Yijian Pei
    JASIST. 01/2010; 61:2274-2287.
  • Hao Wu, Yijian Pei, Jiang Yu
    ABSTRACT: As a retrieval task, expert finding has recently attracted much attention, and various methods have been proposed to rank expert candidates against a topical query. The most efficient approach is the document-based method, which treats supporting documents as a “bridge” and ranks the candidates based on the co-occurrences of topic and candidate mentions in the supporting documents. However, such methods model the relevance between query and candidates at a much lower and hence less ambiguous level. They lack the capability to capture the hidden semantic association between queries and candidates. In this paper, we propose an approach based on hidden topic analysis to estimate the relevance between query and candidates. It models query and supporting document as a word-topic-document association instead of the word-document association in the language model. In addition, the prior knowledge of supporting documents is considered to favor expert ranking. The empirical results on a metadata corpus demonstrate that the model can effectively capture the semantic association between queries and candidates, and thus improves the performance of expert finding.
    8th IEEE/ACIS International Conference on Computer and Information Science, IEEE/ACIS ICIS 2009, June 1-3, 2009, Shanghai, China; 01/2009
  • Hao Wu, Yijian Pei
    ABSTRACT: Citation studies are being carried out extensively with the increasing amount of electronic citation data on the Web. Most of the metrics observe scientific quality in a global view instead of in multiple fine-grained views. In this paper, we suggest applying a topic model and an adaptive PageRank algorithm to assess the relative importance of scientific objects including articles, authors, conferences and journals. The scientific quality is measured by an aggregation PageRank metric towards some topics. This metric considers the impact of a paper in both a global view and a local view. The experiments on the ACL Anthology bibliographic corpus show that our method is a useful measure for observing scientific quality from multiple views.
    Proceedings of the 2nd International Conference on BioMedical Engineering and Informatics, BMEI 2009, October 17-19, 2009, Tianjin, China; 01/2009
  • Hao Wu, Yijian Pei, Jiang Yu
    ABSTRACT: The problem of academic expert finding is concerned with finding the experts in a named research field. It has many real-world applications and has recently attracted much attention. However, the existing methods are not versatile and are not suited to the special needs of academic areas, where co-authorship and citation relations play important roles in judging researchers’ achievements. In this paper, we propose and develop a flexible data schema and a topic-sensitive co-pagerank algorithm combined with a topic model for solving this problem. The main idea is to measure the authors’ authorities by considering topic bias based on their social networks and citation networks, and then to recommend expert candidates for the questions. To infer the association between authors and topics, we draw a probability model from the latent Dirichlet allocation (LDA) model. We further propose several techniques, such as inferring the topics of interest of the query and integrating ranking metrics to order the practices. Our experiments show that the proposed strategies are all effective in improving the retrieval accuracy.
    Frontiers of Computer Science in China 01/2009; 3:445-456. · 0.27 Impact Factor
  • ABSTRACT: This paper introduces a simulated joint robot-hand system based on Java3D, which can be used as a Java application on a standalone computer or as a Java Applet in a network environment. When the system is executed, the authorized user can control the simulated robot hand with a flexible input device such as a joystick, and the status of the robot hand is displayed dynamically in 3D-simulation mode. After the operation, the operation data is recorded into a database automatically and a log is built, so that past operations can be checked or analyzed.
    01/2009;
  • ABSTRACT: Web services are service endpoints in Service Oriented Architecture (SOA). If the SOA paradigm succeeds, there will soon be several thousand services that can be used for composing required applications. For this, these services must first be discovered. However, the existing infrastructures for service publishing and management are built on centralized registries (mostly UDDI). They lack the flexibility to support personalized user requirements, such as service evaluation and service recommendation. In this paper, we suggest a framework that exploits a link analysis mechanism to assist web services discovery. The main idea is to measure the importance of a service within the services network and then recommend services to end users. In addition, the user’s private requirements are considered.
    Advances in Grid and Pervasive Computing, Third International Conference, GPC 2008, Kunming, China, May 25-28, 2008. Proceedings; 01/2008
  • ABSTRACT: A novel approach, called Aeneas, which is based on the execution state of distributed programs, is proposed in this paper for the real-time performance analysis of distributed programs with reliability constraints. In Aeneas, there are two important factors: the available data files and the transmission paths of each available data file. Algorithms are designed to find all the transmission paths of each data file needed while the program executes, count the transmission time for each transmission path, derive the aggregate expression of transmission time, and calculate the fastest and slowest response times of distributed programs with reliability constraints. In order to validate the feasibility and the availability of this approach, a series of experiments has been carried out. The results show that it is feasible and efficient to evaluate the real-time performance of distributed software with reliability constraints.
    Cluster Computing 01/2007; 10:175-186. · 0.78 Impact Factor
  • ABSTRACT: In this short paper, we present our solution for indexing, storing and retrieving domain knowledge. The main principle is to exploit Lucene to index the domain knowledge under the guidance of the domain schema. The methods for mapping the domain knowledge structure into the Lucene index structure, storing and updating the indices, and translating RDF-based queries into Lucene queries are presented.
    Proceedings of the 2007 ACM Symposium on Applied Computing (SAC), Seoul, Korea, March 11-15, 2007; 01/2007
  • ABSTRACT: Access to scientific literature information is an important as well as time-consuming daily task for scientific researchers. Current methods of retrieval are usually limited to keyword-based searching using information retrieval techniques. In this paper, we present SemreX, which implements efficient large-scale literature retrieval and browsing with a single access point based on semantic Web technologies. The concept of semantic association is proposed to reveal explicit or implicit relationships between semantic entities. It is combined with an ontology-based information visualization technique to help researchers retrieve semantically relevant information, as well as context relationships that can capture the user's current search intentions while preserving an overall picture of scientific knowledge.
    e-Business Engineering, 2006. ICEBE '06. IEEE International Conference on; 11/2006
  • ABSTRACT: Service Oriented Architecture (SOA) and Peer-to-Peer (P2P) computing share many common characteristics. It is believed that the combination of the two emerging techniques is a very promising method for promoting web services (WS). Because service discovery plays a key role in the integration, a P2P-based framework for managing service knowledge and locating services is proposed here. In this paper, the principles, construction and maintenance of the service semantic overlay architecture are described, and the way the semantic overlay facilitates the discovery of service resources is illustrated. To exploit the advantages of semantic web services, Service Ontology, which is considered as the service semantic model, is employed to describe services. Service discovery includes two phases: searching on the service semantic overlay, and local discovery in the peer’s service repository. Various solutions have been proposed to realize those two phases. Furthermore, tests are carried out to evaluate service discovery on the architecture.
    Journal of Computer Science and Technology 01/2006; 21(4):582-591. · 0.48 Impact Factor
  • Hao Wu, Hai Jin
    ABSTRACT: Peer-to-Peer (P2P) systems are a new paradigm for information sharing, and some systems have been deployed successfully. It has been argued that current P2P systems suffer from a lack of semantics. Therefore, combining P2P solutions with Semantic Web technologies for knowledge sharing has become a new trend. SemreX is a P2P-based, semantic-enabled knowledge management system for sharing reference metadata. In SemreX, we need to handle heterogeneous literature formats and present a shared understanding of publication knowledge. Meanwhile, peers in SemreX require some compromises with respect to the use of semantic knowledge models for self-description. In this paper, we propose metadata models that combine features of ontology, for encoding and aligning semantic information from references, and for a flexible description of the knowledge located in a peer. We describe these models and discuss their roles in the SemreX environment as well as their creation and application.
    Advances in Grid and Pervasive Computing, First International Conference, GPC 2006, Taichung, Taiwan, May 3-5, 2006, Proceedings; 01/2006