ABSTRACT: Collaborative filtering is a major technique in recommendation systems. Most existing collaborative filtering algorithms use only rating data as their prediction input. Social tags, now widely used in web applications, reflect not only a user's personality but also an item's properties and semantic meaning. We design an algorithmic framework, called IBeST, that extends item-based collaborative filtering with social tags. IBeST covers the whole lifecycle of item similarity measurement based on social tags and improves item-based results in four phases: dataset preprocessing, metadata injection, algorithm selection and optimization, and similarity weight selection. The calculated similarity is then used in the item-based algorithm. Our experiment uses the MovieLens dataset with 10M ratings and 100k tags. IBeST produces better rating predictions than baseline item-based algorithms and provides a feasible, loosely coupled way to use social tags in item-based recommendation systems.
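The similarity weight selection phase can be pictured with a small sketch that blends a rating-based item similarity with a tag-based one. The cosine measure, the function names, and the weight `alpha` are assumptions made for illustration; the abstract does not state which similarity measures or weights IBeST actually uses.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def blended_similarity(ratings_i, ratings_j, tags_i, tags_j, alpha=0.5):
    """Weighted mix of rating-based and tag-based item similarity.

    ratings_*: dict mapping user -> rating for one item.
    tags_*:    dict mapping tag -> count for one item.
    alpha:     hypothetical weight in [0, 1]; alpha=1 ignores tags.
    """
    rating_sim = cosine(ratings_i, ratings_j)
    tag_sim = cosine(tags_i, tags_j)
    return alpha * rating_sim + (1 - alpha) * tag_sim
```

Keeping the two similarities separate until the final weighted sum is one way to keep the tag component loosely coupled from the rating component, since either can be swapped out without touching the other.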
ABSTRACT: Users of a digital library often have difficulty formulating a query expression that represents their information requirements exactly. Query suggestion recommends query expressions and helps the user build a proper one. Query suggestion for digital libraries requires higher precision and novelty ratios than query suggestion for web search engines. Based on case studies, we found four main types of query suggestion in digital library environments: spelling suggestion, hot keyword suggestion, personalized suggestion, and semantic suggestion. To date, however, these approaches can hardly ensure the high precision and novelty ratios that digital library users expect. This paper proposes an improved query suggestion approach for digital libraries whose main advantages lie in computing semantic relations, finding hot concepts, and ranking candidate concepts. Semantic similarity between the user's input and a candidate keyword is calculated by Relative Information Loss (RIL). Hot keywords are identified by a new algorithm that involves clicked time, novelty clicked time, result record number, and novelty result record. The rank degree of a keyword is evaluated by its RIL and hot degree. Finally, a software component that uses the DBpedia Ontology, Jena, Ajax, and SPARQL in line with these improvements was developed and deployed on a digital library named iDLib. The better user experience with the new query suggestion component demonstrates the feasibility and efficiency of the improvement.
Workshop Proceedings of the 35th Annual IEEE International Computer Software and Applications Conference, COMPSAC Workshops 2011, Munich, Germany, 18-22 July 2011; 01/2011
ABSTRACT: A design is said to be super-simple if the intersection of any two blocks has at most two elements. In statistical planning of experiments, super-simple designs are the ones providing samples with maximum intersection as small as possible. Super-simple GDDs are useful in constructing super-simple BIBDs. The existence of super-simple (4,λ)-GDDs has been determined for λ=2–6. In this paper, we investigate the existence of a super-simple (4,9)-GDD of group type g^u and show that such a design exists if and only if u≥4, g(u−2)≥18 and u(u−1)g^2≡0 (mod 4).
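The definition and the existence condition in this abstract translate directly into two small checks. The sketch below is illustrative only: the function names are hypothetical, and the second function just evaluates the stated condition rather than constructing a design.

```python
from itertools import combinations

def is_super_simple(blocks):
    """True if every pair of distinct blocks shares at most two points."""
    return all(len(set(a) & set(b)) <= 2 for a, b in combinations(blocks, 2))

def gdd_4_9_exists(g, u):
    """Evaluate the condition stated in the abstract for a super-simple
    (4,9)-GDD of group type g^u:
    u >= 4, g(u - 2) >= 18 and u(u - 1)g^2 = 0 (mod 4)."""
    return u >= 4 and g * (u - 2) >= 18 and (u * (u - 1) * g * g) % 4 == 0
```

For example, `is_super_simple([(0, 1, 2, 3), (0, 1, 4, 5)])` holds because the two blocks share only the points 0 and 1, while adding a block such as `(0, 1, 2, 4)` would violate the property.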
Fuel and Energy Abstracts 01/2011; 141(9):3231-3243.
ABSTRACT: Massive rule processing has attracted increasing attention in recent years. First, we propose a rule description language that can express all kinds of rules in structured natural language. We design a set of graphical symbols for rule nodes. We also propose a rule traffic flow model and a rule cost model. Through these models, it is easier to process massive numbers of rules and optimize them.
ABSTRACT: In distributed file systems, the integrated management of large and small files is very important for application performance. Based on the Google File System, we present a distributed file system framework that improves the management of geospatial objects. By exploiting access locality and the spatial relationships among geospatial objects, we extend the metadata in the master node and add spatial indices in the data nodes. An optimized strategy is also proposed to unify the management of spatial data from multiple sources. Our experience shows that the method is feasible for managing geospatial data.
ABSTRACT: Log data is critical to applications, and the management and analysis of log data is a hot research topic. Existing log management is normally tightly integrated with the applications themselves, which may lead to problems with performance, local storage efficiency, security, and lack of real-time functionality. To solve these problems, we present a SaaS method that shifts the writing of log data from local disk to the cloud, where log management and analysis are also performed. We analyze two architectures for implementing this method: Shift-Log-by-WebService and Shift-Log-by-ActiveMQ. Initial experiments show the efficiency of the latter. In the future, this tool can be applied to application systems based on web and database systems to improve their performance.
Workshop Proceedings of the 34th Annual IEEE International Computer Software and Applications Conference, COMPSAC Workshops 2010, Seoul, Korea, 19-23 July 2010; 01/2010
ABSTRACT: Collaborative Filtering (CF) is widely used on the Internet in recommender systems to find items that fit users' interests by exploring users' opinions expressed on other items. However, CF algorithms face two challenges: recommendation accuracy and data sparsity. In this paper, we address the accuracy problem with a deviation adjustment approach in item-based CF. Its main idea is to add a constant value to every prediction for each user or each item to correct the uniform error between the predicted and actual ratings of that user or item. Our deviation adjustment approach can also be used in other kinds of CF algorithms. For data sparsity, we improve similarity computation by filling some blank ratings with a user's average rating to decrease the sparsity of the data. We run experiments with our optimized similarity computation and deviation adjustment on the MovieLens data set. The results show that these methods generate better predictions than the baseline CF algorithm.
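A minimal sketch of per-user deviation adjustment follows. It assumes the constant is estimated as the mean of (actual − predicted) over each user's known ratings and that ratings lie on a 1 to 5 scale; the abstract does not specify either choice, and the function names are hypothetical.

```python
from collections import defaultdict

def user_deviations(train):
    """Estimate one constant offset per user.

    train: iterable of (user, predicted, actual) triples.
    Returns a dict mapping user -> mean of (actual - predicted),
    which is an assumed way of estimating the constant.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for user, predicted, actual in train:
        sums[user] += actual - predicted
        counts[user] += 1
    return {user: sums[user] / counts[user] for user in sums}

def adjust(user, predicted, deviations, lo=1.0, hi=5.0):
    """Add the user's offset to a prediction and clamp to the rating scale.
    Users without an estimated offset are left unadjusted."""
    return min(hi, max(lo, predicted + deviations.get(user, 0.0)))
```

The same two functions work per item instead of per user if the first element of each triple is an item identifier.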
ABSTRACT: Web news articles play an important role in the stock market. Sentiment classification of news articles can help investors make investment decisions more efficiently. In this paper, we implemented an approach to Chinese new-word detection using an N-gram model and applied the result to Chinese word segmentation and sentiment classification. Appraisal theory was introduced into sentiment analysis, and Naive Bayes, K-nearest Neighbor, and Support Vector Machine were used as classification algorithms. Our method was applied to a Chinese stock news data set. The best accuracy across all experiments reaches 82.9%. Additionally, we developed a prototype system to demonstrate our work.
ABSTRACT: This study investigated the prevalence of the precore G1896A mutation in Chinese patients with hepatitis B e antigen (HBeAg)-negative HBV infection and its relation to serum HBV pre-S1 antigen. The overall prevalence of the precore G1896A mutation was 72.6% in HBeAg-negative Chinese patients with detectable serum HBV DNA. The prevalence of the precore G1896A mutation was significantly higher in Chinese HBeAg-negative patients with chronic hepatitis B than in inactive HBV carriers with detectable serum HBV DNA. Serum pre-S1 and the precore G1896A mutation were simultaneously detected in most Chinese HBeAg-negative patients.
Brazilian Journal of Microbiology 10/2009; 40(4):965-71. · 0.76 Impact Factor
ABSTRACT: In statistical planning of experiments, super-simple designs are the ones providing samples with maximum intersection as small as possible. Super-simple designs are also useful in constructing codes and designs such as superimposed codes and perfect hash families. The existence of super-simple (v,4,λ)-BIBDs has been determined for λ=2–6. In this paper, we investigate the existence of a super-simple (v,4,9)-BIBD and show that such a design exists if and only if v≡0,1 (mod 4) and v≥20. Applications of the results to optical orthogonal codes are also mentioned.
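To make the parameters concrete, the sketch below computes the standard BIBD counts (replication number r = λ(v−1)/(k−1) and block count b = λv(v−1)/(k(k−1))) and evaluates the existence condition quoted in the abstract. The function names are hypothetical, and nothing here constructs a design.

```python
def bibd_counts(v, k, lam):
    """Return (r, b) for a (v, k, lam)-BIBD, or None when the standard
    divisibility conditions on r and b fail."""
    if (lam * (v - 1)) % (k - 1) != 0:
        return None
    if (lam * v * (v - 1)) % (k * (k - 1)) != 0:
        return None
    r = lam * (v - 1) // (k - 1)            # blocks through each point
    b = lam * v * (v - 1) // (k * (k - 1))  # total number of blocks
    return r, b

def super_simple_v_4_9_exists(v):
    """Condition stated in the abstract for a super-simple (v,4,9)-BIBD:
    v = 0 or 1 (mod 4) and v >= 20."""
    return v % 4 in (0, 1) and v >= 20
```

For the smallest admissible order, `bibd_counts(20, 4, 9)` gives r = 57 and b = 285.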
Journal of Statistical Planning and Inference - J STATIST PLAN INFER. 01/2009; 139(10):3612-3624.
ABSTRACT: Wiki technology is popular not only on the Internet but also on enterprise intranets. However, with the rapid growth of pages in wikis, a lot of useful information may be hidden in them and difficult to grasp. If a wiki has functional modules that capture and make good use of this information, it becomes more powerful, more suitable for enterprises, and of greater benefit to them. We choose an external program as the better solution and implement it. Empirically, it gives a wiki some computational capabilities, a monitoring mechanism, and better collaboration by distilling, processing, and centralizing the wiki information.
Advances in Data and Web Management, Joint International Conferences, APWeb/WAIM 2009, Suzhou, China, April 2-4, 2009, Proceedings; 01/2009
ABSTRACT: The boom of the Internet and the convenience of e-Commerce have created great demand for Internet booking applications. Internet booking applications face two challenges: high-volume concurrent processing and complex business processing. To meet these challenges, we designed an extensible framework for Internet booking applications, which includes an Internet Booking Engine, a rule engine, and three kinds of servers. The architecture of an e-Commerce application usually consists of a web server, an application server, and a database server. The efficiency of the system is strongly influenced by the distribution of functions among these servers and the communication methods between them. Three different architectures were designed and implemented. The Internet Booking Engine includes two core online booking modules: searching tickets and booking tickets. When dealing with complex and flexible business logic, the rule engine reasons over the business facts to reach a result and takes the corresponding action for a specific application by following the business rules. The Internet Booking Engine can be extended, both vertically and horizontally, to different Internet booking applications.
Web Information Systems and Applications Conference. 01/2009;
ABSTRACT: The Internet brings more flexibility and possibilities to software. Developing and maintaining software conveniently and easily is a big challenge for developers of Internet applications. The purpose of this paper is to create a systematic framework for object-oriented design (OOD) and to map between the program and reality. This paper investigates things and the relations and interactions between them, and defines a set of concepts: universal, particular, perspective, perspective transfer, nature, feature, rule, and triggering mechanism. These concepts are used to explain the basic structure of the framework. Some important problems in object-oriented programming (OOP) and aspect-oriented programming (AOP) are investigated in detail. A prototype system has been implemented and tested. The experiments indicate that the new approach is realistic and efficient.
ABSTRACT: To compare the clinical performance of a real-time PCR assay with the COBAS Amplicor Hepatitis B Virus (HBV) Monitor test for quantitation of HBV DNA in serum samples.
The reference sera of the Chinese National Institute for the Control of Pharmaceutical and Biological Products and the National Center for Clinical Laboratories of China, and 158 clinical serum samples were used in this study. The linearity, accuracy, reproducibility, assay time, and costs of the real-time PCR were evaluated and compared with those of the COBAS Amplicor test.
The intra-assay and inter-assay variations of the real-time PCR ranged from 0.3% to 3.8% and 1.4% to 8.1%, respectively. The HBV DNA levels measured by the real-time PCR correlated very well with those obtained with the COBAS Amplicor test (r = 0.948). The real-time PCR HBV DNA kit was much cheaper and had a wider dynamic range.
The real-time PCR assay is an excellent tool for monitoring of HBV DNA levels in patients with chronic hepatitis B.
World Journal of Gastroenterology 02/2008; 14(3):479-83. · 2.55 Impact Factor
ABSTRACT: Traditional annotation technologies focus on the content of resources, and annotation is normally done by system managers. Some Web applications allow users to annotate aspects of resources beyond their content; we call this phenomenon pervasive annotation. This paper defines pervasive annotation as a correlated procedure of tag assignment, rating, description, and usage status marking applied to Web resources. It investigates these pervasive annotation technologies and discusses their use in personalized services such as recommendation and presentation. We design and implement a pervasive annotation system (PAS) using service component architecture (SCA). The application of this system shows that PAS helps users manage Web resources more conveniently and efficiently through pervasive annotation, and obtain recommendations of required Web resources through information filtering.
ABSTRACT: Generally, every Website provides its own resource upload and download modules for users to share resources. However, the resources uploaded to Websites' servers often lack unified and professional management. What's more, as the number of users of the upload and download modules increases, they put heavy bandwidth pressure on the Website's server. To solve these problems, this paper designs and implements a Web resource supporting platform, WFMS (Web File Management System), which provides a unified interface for multiple Websites to upload, download, and manage resources professionally, and reduces development costs and bandwidth pressure as well.
ABSTRACT: Collaborative filtering has been very successful in both applications and research. In real situations, different users may have different influences on other users' decisions, and authoritative users usually play more important roles. However, few existing collaborative filtering algorithms consider the authority of users. In this paper, we present the concepts of global and domain authorities of users and apply them in collaborative filtering algorithms. We design experiments and discuss the effects of global and domain authorities. The initial results show that our method can improve the performance of collaborative filtering algorithms.
Digital Libraries: Universal and Ubiquitous Access to Information, 11th International Conference on Asian Digital Libraries, ICADL 2008, Bali, Indonesia, December 2-5, 2008. Proceedings; 01/2008
ABSTRACT: Throughout the development of programming languages, some problems have always existed, such as the AOP problem in procedure-oriented and object-oriented languages. These problems appear frequently in software and are difficult to cope with. Many of them stem from limitations in the theoretical foundations of these programming languages, so they cannot be solved simply by applying specific technologies. A new theoretical framework is needed to solve these problems from the ground up. Our goal is to create a theoretical framework for discussing the causes of these problems, finding a way to solve them, and providing some instructive suggestions. This paper first introduces the universal theory, analyzes the phenomena caused by the misuse of universals, and then gives some feasible solutions. After that, an experimental but practical platform is presented. Finally, we discuss related work and conclude.
Information Science and Engineering, International Symposium on. 01/2008; 2:49-52.