Zheng Xu

Sichuan Fire Research Institute of the Ministry of Public Security, Hua-yang, Sichuan, China

Publications (25) · 15.95 Total impact

  • ABSTRACT: In this paper, the problem of generating temporal semantic context for concepts is studied. The goal is to annotate a concept with temporal, concise, and structured information that reflects the explicit and faceted meanings of the concept. The temporal semantic context can help users learn and understand unfamiliar or newly emerged concepts. The proposed temporal semantic context structure integrates features from dictionary, Wikipedia, and LinkedIn web sites. A general method is proposed to generate the temporal semantic context of a concept by constructing its associated words, associated concepts, context sentences, context graph, and context communities. Empirical experiments on three datasets (a Q-A dataset, a LinkedIn dataset, and a Wikipedia dataset) show that the proposed algorithm is effective and accurate. Unlike manually generated context repositories such as LinkedIn and Wikipedia, the proposed method generates the context automatically and does not need prior knowledge such as an ontology or a hierarchical knowledge base. The proposed method is applied to trend analysis, faceted exploration, and query suggestion. These applications demonstrate the effectiveness of the proposed temporal semantic context in many web mining tasks.
    Journal of Network and Computer Applications 08/2014; · 1.77 Impact Factor
  • ABSTRACT: Big data is an emerging paradigm applied to datasets whose size is beyond the ability of commonly used software tools to capture, manage, and process within a tolerable elapsed time. In particular, the data volume of all video surveillance devices in Shanghai, China is up to 1 TB every day. Thus, it is important to describe the video content accurately and to enable the organizing and searching of potential videos in order to detect and analyze related surveillance events. Unfortunately, raw data and low-level features cannot meet the needs of video-based tasks. In this paper, a semantic-based model is proposed for representing and organizing video big data. The proposed surveillance video representation method defines a number of concepts and their relations, which users can use to annotate related surveillance events. The defined concepts include persons, vehicles, and traffic signs, which can be used for annotating and representing video traffic events unambiguously. In addition, the spatial and temporal relations between objects in an event are defined, which can be used for annotating and representing the semantic relations between objects in related surveillance events. Moreover, a semantic link network is used for organizing video resources based on their associations. In the application, one case study is presented to analyze the surveillance big data.
    Journal of Systems and Software 07/2014; · 1.14 Impact Factor
  • ABSTRACT: In this paper, we study the problem of mining temporal semantic relations between entities. The goal is to mine and annotate a semantic relation with temporal, concise, and structured information, which can reveal the explicit, implicit, and diverse semantic relations between entities. The temporal semantic annotations can help users learn and understand unfamiliar or newly emerged semantic relations between entities. The proposed temporal semantic annotation structure integrates features from IEEE and Renlifang. We propose a general method to generate the temporal semantic annotation of a semantic relation between entities by constructing its connection entities, lexical syntactic patterns, context sentences, context graph, and context communities. Empirical experiments on two datasets, a LinkedIn dataset and a movie star dataset, show that the proposed method is effective and accurate. Unlike manually generated annotation repositories such as Wikipedia and LinkedIn, the proposed method mines the semantic relations between entities automatically and does not need prior knowledge such as an ontology or a hierarchical knowledge base. The proposed method can be used in several applications, which demonstrates the effectiveness of the proposed temporal semantic relations in many web mining tasks.
    Future Generation Computer Systems 07/2014; 37:468–477. · 2.64 Impact Factor
  • ABSTRACT: An explosive growth in the volume, velocity, and variety of the data available on the Internet has been witnessed recently. The data, originating from multiple types of sources including mobile devices, sensors, individual archives, social networks, the Internet of Things, enterprises, cameras, software logs, and health data, have led to one of the most challenging research issues of the big data era. In this paper, Knowle, an online news management system built upon the semantic link network model, is introduced. Knowle is a data management system centered on news events. The core elements of Knowle are news events on the Web, which are linked by their semantic relations. Knowle is a hierarchical data system with three layers: the bottom layer (concepts), the middle layer (resources), and the top layer (events). The basic blocks of the Knowle system (news collection, resource representation, semantic relation mining, and semantic linking of news events) are given. Knowle does not require data providers to follow semantic standards such as RDF or OWL; it is a semantics-rich, self-organized network. It reflects various semantic relations of concepts, news, and events. Moreover, in the case study, Knowle is used for organizing and mining health news, which shows its potential to form the basis for designing and developing a big-data-analytics-based innovation framework in the health domain.
    Future Generation Computer Systems 04/2014; · 2.64 Impact Factor
  • ABSTRACT: Relatedness measurement between multimedia such as images and videos plays an important role in computer vision and is a basis for many multimedia-related applications including clustering, searching, recommendation, and annotation. Recently, with the explosion of social media, users can upload media data and annotate content with descriptive tags. In this paper, we aim at measuring the semantic relatedness of Flickr images. Firstly, four information-theory-based functions are used to measure the semantic relatedness of tags. Secondly, the integration of tag pairs based on a bipartite graph is proposed to remove noise and redundancy. Thirdly, the order information of tags is added to measure the semantic relatedness, which emphasizes the tags with high positions. Data sets including 1000 images from Flickr are used to evaluate the proposed method. Two data mining tasks, clustering and searching, are performed with the proposed method, which shows its effectiveness and robustness. Moreover, some applications such as searching and faceted exploration are introduced using the proposed method, which shows that it has broad prospects for web-based tasks.
    The Scientific World Journal 01/2014; 2014:758089. · 1.73 Impact Factor
  • ABSTRACT: Association relations between concepts are a class of simple but powerful regularities in binary data, which play important roles in enterprises and organizations with huge amounts of data. However, although a large number of association relations can easily be mined from databases, it was recognized early in the knowledge discovery literature that most of them are of no interest to the user, since existing objective and subjective methods scarcely take semantics into consideration. In this paper, the semantic discrimination capability (SDC) of an association relation is first measured based on the discrimination value model. The formula of SDC, integrating both statistical and graph features, is proposed based on five different strategies. The high correlation coefficient of the proposed method against the discrimination value shows that the proposed SDC measure is accurate. Moreover, an application using SDC for document clustering is carried out, which shows that SDC has broad prospects for data-related tasks such as document clustering. Copyright © 2013 John Wiley & Sons, Ltd.
    Concurrency and Computation Practice and Experience 01/2014; 26(2). · 0.85 Impact Factor
  • ABSTRACT: Big data is an emerging paradigm applied to datasets whose size is beyond the ability of commonly used software tools to capture, manage, and process within a tolerable elapsed time. In particular, the data volume of all video surveillance devices in Shanghai, China is up to 1 TB every day. Thus, it is important to describe the video content accurately and to enable the organizing and searching of potential videos in order to detect and analyze related traffic events. Unfortunately, raw data and low-level features cannot meet the needs of video-based tasks. In this paper, we propose a semantic-based model for representing and organizing video big data. The proposed method defines a number of concepts and their relations, which users can use to annotate video traffic events. The defined concepts include people, vehicles, and traffic signs, which users can apply to annotate and represent video traffic events unambiguously. In addition, we define the spatial and temporal relations in event and concept definitions, which users can apply to annotate and represent the semantic relations between objects in video traffic events. Moreover, a semantic link network is used for organizing video resources based on their associations. In the application, we illustrate two systems that use the proposed method for annotating and searching video resources.
    Proceedings of the 2013 IEEE 16th International Conference on Computational Science and Engineering; 12/2013
  • ABSTRACT: Online popular events, which are constructed from news stories using the techniques of Topic Detection and Tracking (TDT), bring convenience to users who want to see what is going on through the Internet. Recently, the web has become an important event information provider and poster due to its real-time, open, and dynamic features. However, it is difficult to detect events because of the huge scale and dynamics of the Internet. In this paper, we define the novel problem of investigating impact factors for event detection. We define five impact factors: the number of increased web pages, the number of increased keywords, the number of communities, the average clustering coefficient, and the average similarity of web pages. These five impact factors contain statistical and content information about an event. Empirical experiments on real datasets including Google Zeitgeist and Google Trends show that the number of web pages and the average clustering coefficient can be used to detect events. Some strategies integrating the number of web pages and the average clustering coefficient are also employed. The evaluations on a real dataset show that the proposed function integrating the number of web pages and the average clustering coefficient can be used for event detection efficiently and correctly.
    Proceedings of the 2013 IEEE 16th International Conference on Computational Science and Engineering; 12/2013
  • ABSTRACT: Recent research shows that videos "in the wild" are growing at a staggering rate. The rapid increase in the number of video resources has brought an urgent need to develop intelligent methods to organize video events. In this paper, we use the Association Link Network (ALN) model for organizing video resources from the Web. Association Link Network is a kind of semantic link network designed to establish associated relations among various resources (e.g., Web pages or documents in a digital library), aiming at extending a loosely connected network with no semantics (e.g., the Web) to an association-rich network. Since cognitive science holds that associated relations can make a resource more comprehensible to users, the motivation of ALN is to organize the associated resources loosely distributed on the Web to effectively support Web intelligence activities such as browsing, knowledge discovery, and publishing. The tags and surrounding texts of video resources are used to represent the semantic content. The relatedness between tags and surrounding texts is implemented in the Association Link Network model. Two data sets, from YouTube and Flickr, are used to evaluate the proposed method. The experimental results show that the proposed method can measure the association relations accurately and robustly.
    2013 Ninth International Conference on Semantics, Knowledge and Grids (SKG); 10/2013
  • ABSTRACT: Association Link Network (ALN) aims to establish associated relations among various resources. By extending the hyperlink network of the World Wide Web to an association-rich network, ALN is able to effectively support Web intelligence activities such as Web browsing, Web knowledge discovery, and publishing. Since existing methods for building semantic links on Web resources cannot effectively and automatically organize loose Web resources, effective Web intelligence activities are still challenging. In this paper, a discovery algorithm for associated resources is first proposed to build the original ALN for organizing loose Web resources. Second, three schemas for constructing the kernel ALN and the connection-rich ALN (C-ALN) are developed gradually to optimize the organizing of Web resources. After that, properties of different types of ALN are discussed, which show that C-ALN performs well in supporting Web intelligence activities. Moreover, an evaluation method is presented to verify the correctness of C-ALN for semantic links on documents. Finally, an application using C-ALN to organize Web services is presented, which shows that C-ALN is an effective and efficient tool for building semantic links on the resources of Web services.
    IEEE Transactions on Automation Science and Engineering 08/2011; · 1.67 Impact Factor
  • ABSTRACT: Queries to Web search engines are usually short and ambiguous, which provides insufficient information about users' needs for effectively retrieving relevant Web pages. To address this problem, query suggestion is implemented by most search engines. However, existing methods (e.g. Google's ‘Search related to’ and Yahoo's ‘Also Try’) do not balance the trade-off between accuracy and computational complexity appropriately. In this paper, the recommended words are extracted from the search results of the query, which guarantees the real-time performance of query suggestion. A scheme for ranking words based on semantic similarity presents a list of words as the query suggestion results, which ensures the accuracy of query suggestion. Moreover, the experimental results show that the proposed method significantly improves the quality of query suggestion over some popular Web search engines (e.g. Google and Yahoo). Finally, an offline experiment that compares the accuracy of snippets in capturing the number of words in a document is performed, which increases confidence in the proposed method. Copyright © 2010 John Wiley & Sons, Ltd.
    Concurrency and Computation Practice and Experience 07/2011; 23:1101-1113. · 0.85 Impact Factor
  • ABSTRACT: Semantic similarity measures play important roles in many Web-related tasks such as Web browsing and query suggestion. Because taxonomy-based methods cannot deal with continually emerging words, Web-based methods have recently been proposed to solve this problem. Because of the noise and redundancy hidden in Web data, robustness and accuracy are still challenges. In this paper, we propose a method integrating page counts and snippets returned by Web search engines. Then, the semantic snippets and the number of search results are used to remove noise and redundancy in the Web snippets (a ‘Web-snippet’ includes the title, summary, and URL of a Web page returned by a search engine). After that, a method integrating page counts, semantic snippets, and the number of already displayed search results is proposed. The proposed method does not need any human-annotated knowledge (e.g., ontologies) and can be applied to Web-related tasks (e.g., query suggestion) easily. A correlation coefficient of 0.851 against the Rubenstein–Goodenough benchmark dataset shows that the proposed method outperforms the existing Web-based methods by a wide margin. Moreover, the proposed semantic similarity measure significantly improves the quality of query suggestion compared with some page-count-based methods. Copyright © 2011 John Wiley & Sons, Ltd.
    Concurrency and Computation: Practice and Experience. 01/2011; 23:2496-2510.
  • IEEE T. Automation Science and Engineering. 01/2011; 8:482-494.
  • ABSTRACT: As the scale of information about web events grows over time, it is extremely difficult and challenging to grasp the semantics of web events manually, because of the limited time and energy of human beings. Herein, we propose a method to map a web event to a keyword-level association link network (KALN) for deep analysis of the semantics of web events, such as their evolution semantics. Firstly, the original KALN is constructed at a given time by traditional data mining technologies. Then, the hierarchical KALN, consisting of the Theme Layer Network, the Backbone Layer Network, and the Tidbit Layer Network, is built from the original KALN using information entropy to identify the different semantic levels of the web event, including stable semantics, sub-stable semantics, and unstable semantics. With the semantic analysis of the hierarchical KALN, humans can easily gain a thorough understanding of the web event. Finally, experiments show that our method can effectively capture the different semantic levels of web events.
    IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing, DASC 2011, 12-14 December 2011, Sydney, Australia; 01/2011
  • ABSTRACT: Association Link Network (ALN) is proposed to establish association relations among Web resources, aiming at extending the hyperlink-based Web to an association-rich network and effectively supporting Web intelligence activities, such as Web-based learning. However, it is difficult to build the ALN in one pass by direct computation because of the huge and rapidly increasing number of learning resources on the Web. Thus, how to rapidly and accurately acquire the association relations between newly arriving and existing learning resources has become a challenge in the incremental building process of ALN. In this paper, a new algorithm is developed for incrementally updating ALN to support the dynamic management of learning resources that increase with time.
    Advances in Web-Based Learning - ICWL 2011 - 10th International Conference, Hong Kong, China, December 8-10, 2011. Proceedings; 01/2011
  • International Journal of Web Services Research 01/2011; 8:29-46. · 0.18 Impact Factor
  • ABSTRACT: The semantically associated network on the Web is a Semantic Link Network built by mining the associated relations between Web pages. The associated link from page A to page B indicates that users who have browsed page A are likely to also browse page B. This paper explores the statistical properties of the associated network on the Web. Web pages of a specific domain are automatically downloaded by a Web crawler to build an associated network. We analyze the associated network at different domain thresholds and classify the topology into three states: the original state, the kernel state, and the final state. A mathematical model is built to study the in-degree distribution, the out-degree distribution, and the total-degree distribution for both the kernel state and the final state. By tuning the model parameters to reasonable values, we obtain distinct power-law forms for the three degree distributions with exponents that agree well with the statistical data. The proposed model not only describes the evolving processes of the associated network on the Web, but also provides a theoretical basis for complex applications such as semantic community discovery, intelligent browsing, and recommendation. Copyright © 2009 John Wiley & Sons, Ltd.
    Concurrency and Computation Practice and Experience 01/2010; 22:767-787. · 0.85 Impact Factor
  • ABSTRACT: Web personalized services alleviate the burden of information overload by providing the right information to meet an individual user’s needs. How to obtain and represent the knowledge needed by users is a key issue. This paper presents Web Knowledge Flow (WKF) to represent the specific knowledge on Web pages and a model of Interactive Computing with Semantics (ICS) to provide a feasible means of generating WKF. Firstly, Objective WKF (OWKF) and Real-time WKF (RWKF) are proposed to satisfy staged and real-time user interests. Secondly, the generation algorithm of WKF is proposed based on the Semantic Link Network. Thirdly, the “interactive point” is introduced to detect the moment when user interests change, which ensures the dynamics of WKF. Experimental results demonstrate that ICS can effectively capture the change of user interests and that the generated WKF can satisfy user requirements accurately.
    New Generation Computing 01/2010; 28:113-120. · 0.80 Impact Factor
  • ABSTRACT: Keyword-based research on service discovery focuses on direct matching of the user's requirements and often neglects relations between services. While techniques based on conventional semantics offer many kinds of relations, considerable time is spent on reasoning. In this paper, we utilize E-FCM (Element Fuzzy Cognitive Map) to describe services, because E-FCM can keep as much semantic information as possible and E-FCMs can be created automatically for web services. Furthermore, instead of reasoning, the semantic relations among E-FCMs are built by computation, so semantic relations among services can be found quickly. We focus on the associated semantic relations among services because complex applications always consist of services with associated functions. The associated link network (ALN) is constructed upon associated relations to generate associated web service flows, which can be used to create complex applications, thereby improving discovery efficiency and the utilization of services.
    Web Services, 2009. ICWS 2009. IEEE International Conference on; 08/2009
  • ABSTRACT: Similarity Knowledge Flow (SKF) is a kind of scientific workflow, providing an effective technique and theoretical support for intelligent browsing in the Web and e-Science environment. In this paper, a Semantic Link Network (SLN) based SKF generation method is proposed. First, the topics are represented by Element Fuzzy Cognitive Maps. Second, the semantic values of concepts-keywords and relations are calculated. Third, semantic similarity degrees between topics are calculated to build the SLN based on the semantic values of concepts and their relations in Element Fuzzy Cognitive Maps. In this way, similar relations at the keyword level are extended to the topic level. With the help of SLN and based on the user's demand, SKF is generated as the browsing path of topics to guide user browsing behaviors. Finally, the semantic value of SKF is defined as a criterion to evaluate the browsing path of topics. Experimental results show that the browsing path of topics is easily activated by the SKF generated from the SLN. The proposed method shows very good prospects in the fields of Web services and e-Science applications. Copyright © 2009 John Wiley & Sons, Ltd.
    Concurrency and Computation Practice and Experience 01/2009; 21:2018-2032. · 0.85 Impact Factor