ABSTRACT: Collaborative filtering is one of the most popular recommendation techniques. While the quality of recommendations has improved significantly in recent years, most approaches suffer from poor efficiency and scalability. In this paper, we study several factors that affect the performance of a k-Nearest Neighbors algorithm, and we propose a distributed architecture that significantly improves both throughput and response time. Two techniques for distributing recommender systems, user partition and item partition, were proposed and evaluated using a simulation model. We found that user partition is generally better, with a faster response time and higher throughput.
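The abstract does not define user partition, so the following is only a sketch under one plausible reading: the rating data is split by user across nodes, each node returns its local nearest neighbours of the target user, and a coordinator merges them. The cosine similarity, the data layout, and the names `local_top_k` and `global_top_k` are all assumptions made for illustration, not the paper's design.

```python
import heapq
from math import sqrt
from typing import Dict, List, Tuple

# Assumed layout: each partition maps user id -> {item id: rating}.
Profile = Dict[str, float]
Partition = Dict[str, Profile]


def cosine(a: Profile, b: Profile) -> float:
    """Cosine similarity between two rating profiles (0.0 if no overlap)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def local_top_k(partition: Partition, target: Profile, k: int) -> List[Tuple[float, str]]:
    """Work done on one node: score only the users stored in this partition."""
    scored = [(cosine(target, profile), user) for user, profile in partition.items()]
    return heapq.nlargest(k, scored)


def global_top_k(partitions: List[Partition], target: Profile, k: int) -> List[Tuple[float, str]]:
    """Coordinator: merge the per-node candidates into the overall k nearest users."""
    candidates: List[Tuple[float, str]] = []
    for partition in partitions:  # on a real cluster these calls would run in parallel
        candidates.extend(local_top_k(partition, target, k))
    return heapq.nlargest(k, candidates)
```

Because every node returns its own best k candidates, the merged result is the same as a single-node search over all users, while each node only scans its own share of the data.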
ABSTRACT: Performance evaluation of an IR system is a key point in the development of any search engine, especially on the Web. To deliver the performance users are accustomed to, Web search engines are built on large-scale distributed systems, and optimising their performance is an important topic in the literature. The main methods found in the literature for analysing the performance of a distributed IR system are an analytical model, a simulation model, and a real search engine. With an analytical or simulation model, some details may be missing, which produces differences between the real and the estimated performance. With a real system, the results are more precise, but the resources required to build a large-scale search engine are excessive. In this paper we propose to study performance by building a scaled-down version of a search engine, using virtualization tools to create a realistic distributed system. Scaling down a distributed IR system maintains the behaviour of the whole system while reducing the hardware requirements. This allows virtualization tools to be used to build a large-scale distributed system on just a small cluster of computers.
ABSTRACT: In this work we present a series of collaborative filtering algorithms known for their simplicity and efficiency. The efficiency of these algorithms was compared with that of other, more representative collaborative filtering algorithms. The results show that their response times are better than those of the rest by at least two orders of magnitude, both in training and when making predictions. Furthermore, the quality of the predictions is similar to that of the other algorithms, and even better when dealing with low-density training sets.
ABSTRACT: Most of today's web sources do not provide suitable interfaces for software programs to interact with them. Many researchers have proposed highly effective techniques to address this problem. Nevertheless, ad-hoc solutions are still frequent in real-world web automation applications. Arguably, one of the reasons for this situation is that most proposals have focused on query wrappers, which transform a web source into a special kind of database in which some queries can be executed using a query form and return result sets composed of structured data records. Although the query wrapper model is often useful, it is not appropriate for applications that make decisions according to the data retrieved, or for processes that use forms that can be modelled as insert/update/delete operations. This article proposes a new language for defining web automation processes, based on a wide range of real-world web automation tasks used by corporations from different business areas.
Preview · Article · Jan 2008 · JOURNAL OF UNIVERSAL COMPUTER SCIENCE
ABSTRACT: A substantial subset of Web data has an underlying structure. For instance, the pages obtained in response to a query executed through a Web search form are usually generated by a program that accesses structured data in a local database and embeds them into an HTML template. For software programs to gain full benefit from these “semi-structured” Web sources, wrapper programs must be built to provide a “machine-readable” view over them. Since Web sources are autonomous, they may undergo changes that invalidate the current wrapper, so automatic maintenance is an important issue. Wrappers must perform two tasks: navigating through Web sites and extracting structured data from HTML pages. While several works have addressed the automatic maintenance of data extraction tasks, the problem of maintaining the navigation sequences remains, to the best of our knowledge, unaddressed. In this paper, we propose a set of novel techniques to fill this gap.
No preview · Article · Dec 2007 · Data & Knowledge Engineering
ABSTRACT: Although there are quite a few open source monitoring applications, they have not yet reached the necessary level of maturity. Many users face significant problems when deploying a monitoring system for their networks. In this paper we compare the most popular open source monitoring tools and analyze their main limitations. As a solution to these problems we propose a new monitoring tool that incorporates several notable improvements, such as centralized configuration via the web, support for monitoring templates, a hierarchical structure of objects to handle the management information, and support for centralized and distributed monitoring schemes. We describe its architecture in detail and show its use in a real environment, which makes it possible to verify the importance of the improvements that have been developed.
ABSTRACT: Wrapping existing Web applications into portals protects investment and improves user experience. Most current portlet-based portal servers provide a bridge portlet that makes it possible to "portletize" a single Web page, that is, to wrap the whole page or a set of its regions as a portlet. They use an annotation-based approach to specify the regions of the page that must be extracted. This approach does not scale well when a whole application is to be portletized, since it requires each page to be annotated manually. This paper describes the design of a bridge portlet that automatically adapts pages according to the space available in the portlet's window. The bridge portlet delegates page adaptation to a framework that uses a chain of user-configurable "transformers". Each transformer implements an automatic page adaptation technique. Experiments show that our approach is effective.
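The transformer chain described above might be sketched as follows. This is a minimal illustration, not the paper's framework: the page model (a list of text blocks), the width measure, and the names `adapt`, `collapse_whitespace`, and `truncate_blocks` are all hypothetical, and stopping as soon as the page fits is an assumed policy.

```python
from typing import Callable, List

# Hypothetical page model: a page is a list of text blocks, and its
# "width" is the length of its longest block.
Page = List[str]
Transformer = Callable[[Page, int], Page]


def page_width(page: Page) -> int:
    """Width of the widest block, or 0 for an empty page."""
    return max((len(block) for block in page), default=0)


def collapse_whitespace(page: Page, max_width: int) -> Page:
    """Gentle adaptation: squeeze runs of whitespace into single spaces."""
    return [" ".join(block.split()) for block in page]


def truncate_blocks(page: Page, max_width: int) -> Page:
    """Aggressive adaptation: cut every block down to the available width."""
    return [block[:max_width] for block in page]


def adapt(page: Page, max_width: int, chain: List[Transformer]) -> Page:
    """Apply the configured transformers in order until the page fits."""
    for transform in chain:
        if page_width(page) <= max_width:
            break  # assumed policy: stop once the page fits the window
        page = transform(page, max_width)
    return page
```

Calling `adapt(page, 20, [collapse_whitespace, truncate_blocks])` tries the gentler transformer first and only falls back to truncation if the page is still too wide. Reordering or replacing the entries of the list is what "user-configurable" would mean in this sketch.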
ABSTRACT: This document describes the VAIN (Virtual Active IP Node) architecture, which enables users to deploy new network services based on virtual active networks, and explains how it solves the challenge of segmenting the incoming traffic that crosses nodes towards the services while preserving the original objective of protocol independence (Tennenhouse, 1996). Our solution is based on network expressions that use all the semantics contained in each incoming packet, without needing to know the inner structure of the protocols. The VAIN architecture has been developed to respond to the challenges posed by electronic commerce, specifically those regarding collaborative environments and marketplaces. To achieve this objective we have considered the following goals: first, a three-layer conceptualization; second, a transparent deployment and integration with existing infrastructures; and third, a strategy of network traffic distribution based on all the information within the input packets, which is named "expressions based distribution".
Full-text · Article · Jan 2005 · International Journal of Web Based Communities
ABSTRACT: In this study, we analyse the interconnection network of a distributed Information Retrieval (IR) system by simulating a switched network versus a shared access network. The results show that a switched network improves performance, especially in a replicated system, because it prevents saturation of the network, particularly when a large number of query servers is used.
Preview · Article · Jan 2005 · Lecture Notes in Computer Science
ABSTRACT: Modern communication infrastructures are formed by devices that are able to process the traffic that crosses them. In this context, an important task inside each active device is to classify the incoming traffic and send it towards its service. We report an advance over other implementations: a classifier able to perform independent packet classification with excellent performance using runtime code generation techniques. This classifier is used inside VAIN (virtual active independent node), our platform for building active networks. We also introduce some aspects of VAIN to explain how ITCAN performs its tasks in this environment.
ABSTRACT: This report describes the VAIN (Virtual Active IP Network) architecture, developed in response to the challenges posed by electronic commerce, specifically in collaborative environments and marketplaces. For this development we considered the following goals: a three-layer conceptualization; a transparent deployment and integration with existing infrastructures; and a strategy of network traffic distribution based on all the information in the input packets, which we call "patterns based distribution". As guest code, VAIN mainly uses an interpreter of the intermediate code of the .NET architecture, although other intermediate codes could also be used. VAIN sits immediately above the link layer, can be extended to any other similar protocol, and is independent of upper-layer protocols, whether or not they exist at present. Our architecture is also polymorphic, since it allows its behavior to be changed transparently and can virtually emulate other architectures concurrently without affecting its functionality.
ABSTRACT: We introduce an alarm management system based on the management by
delegation (MbD) paradigm, which provides the operator with an integrated
and homogeneous environment in which different types of alarms exist.
The platform chosen was Java owing to its special features (code
mobility, platform independence, distributed capabilities, etc.). This
system provides the programmer with a flexible, modular and robust
environment where the functionality of the system can be increased
dynamically without having to alter any part of it. This system
overcomes most of the limitations inherent to centralised systems. Some
of the key characteristics of the system are: protocol integration, the
use of an RDBMS to enhance the information about alarms, multi-user
monitoring through an intuitive GUI applet with several permission
ABSTRACT: We address the problem of integrating proprietary managed
technology in a corporate TMN system by using TMN-based platforms
support facilities. A prototype that integrates a proprietary managed
PDH network in a fully TMN corporate management system has been designed
and developed using three different TMN platforms in parallel: Solstice
Enterprise Manager, NetView/6000 TMN Support Facility and OpenView DM.
This experimental prototype helped us (1) to understand how the new
emerging management platforms support the engineering of solutions to
integrate proprietary protocols and (2) to identify potential problems
that can arise when trying to apply the platform functionality to the
real network elements.