Yan Zhang’s research while affiliated with Shandong University of Finance and Economics and other places


Publications (3)


Novel Method of Web Database Redundancy Computing for Web Data Sources Selection
  • Article

December 2013 · 11 Reads · Information Technology Journal

Yan Zhang · Rui Zhang · Peiguang Lin

With the rapidly increasing number of Web databases (WDBs), a core issue in Web data integration is selecting the most appropriate combination of databases to query, so that more targeted data can be obtained at a lower cost. In this study, to reduce redundant data from different sources, we propose a novel method of computing Web database redundancy in order to select suitable Web data sources for given keywords. We first propose a Web database feature representation model; then, based on sample data from the sources, we put forward a deep Web redundancy computing method that considers three different data types: text, numeric and categorical attributes. Experiments show that the method achieves the desired objectives and meets the demands of the integrated system well.


Enrich Web Entity Schema Based on Integrated Annotation

November 2013 · 17 Reads · 2 Citations

Web integration systems (WIS) need to effectively collect web objects belonging to a specific domain from different websites. In most WIS, entity schemas are defined beforehand by domain experts. Because of the inherent diversity and variability of the web, it is hard to model a web entity comprehensively in advance; furthermore, wrong annotations occur when object values from different websites are aligned into the WIS. To overcome these limitations, we propose an integrated annotation method that combines a matching strategy with machine learning to dynamically discover synonyms for predefined attribute labels and new attribute labels for a specified type of web entity. Experimental results on real-world data in the book and job domains show that the proposed approach is effective in enriching the web entity schema and thereby improves the data collection process in a WIS.