Conference Paper

Complex-network theoretic clustering for identifying groups of similar listeners in p2p systems

Abstract and Figures

This article presents an approach to automatically create virtual communities of users with similar music preferences in a distributed system. Our goal is to create personalized music channels for these communities using the content that their members share in peer-to-peer networks. To extract these communities, a complex-network-theoretic approach is chosen. A fully connected graph of users is created using epidemic protocols. We show that the created graph sufficiently converges to a graph created with a centralized algorithm after a small number of protocol iterations. To find suitable techniques for creating user communities, we analyze graphs created from real-world recommender datasets and identify specific properties of these datasets. Based on these properties, different graph-based community-extraction techniques are chosen and evaluated. We select a technique that exploits the identified properties to create clusters of music listeners. The suitability of this technique is validated using a music dataset and two large movie datasets. On a graph of 6,040 peers, the selected technique assigns at least 85% of the peers to optimal communities, and obtains a mean classification error of less than 0.05% over the remaining peers that are not assigned to the best community.
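The abstract does not spell out the epidemic protocol, so the following is only a minimal sketch of one possible gossip scheme, not the authors' method. It assumes Jaccard similarity over sets of liked items, a fixed view size `k`, and a hypothetical `gossip_round` in which two peers merge their views and each keeps its `k` most similar candidates. It then measures how closely the gossiped views match neighbour lists computed centrally.

```python
import random

def jaccard(a, b):
    """Similarity of two sets of liked items (an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def top_k(peer, candidates, profiles, k):
    """Keep the k candidates most similar to `peer`; ties are broken by peer id."""
    ranked = sorted(
        (c for c in candidates if c != peer),
        key=lambda c: (-jaccard(profiles[peer], profiles[c]), c),
    )
    return set(ranked[:k])

def gossip_round(views, profiles, k, rng):
    """Each peer swaps views with one neighbour; both keep their best k."""
    for peer in sorted(views):
        partner = rng.choice(sorted(views[peer]))
        merged = views[peer] | views[partner] | {peer, partner}
        views[peer] = top_k(peer, merged, profiles, k)
        views[partner] = top_k(partner, merged, profiles, k)

def mean_overlap(views, profiles, k):
    """Average fraction of each view that matches the centralized top-k."""
    peers = sorted(views)
    total = sum(len(views[p] & top_k(p, peers, profiles, k)) for p in peers)
    return total / (k * len(peers))

rng = random.Random(7)
n_peers, n_items, k = 40, 30, 4
# Synthetic profiles: each peer likes 8 random items (illustrative data only).
profiles = {p: set(rng.sample(range(n_items), 8)) for p in range(n_peers)}
# Every peer starts with k random neighbours.
views = {
    p: set(rng.sample([q for q in range(n_peers) if q != p], k))
    for p in range(n_peers)
}
before = mean_overlap(views, profiles, k)
for _ in range(10):
    gossip_round(views, profiles, k, rng)
after = mean_overlap(views, profiles, k)
print(f"overlap with centralized neighbours: {before:.2f} -> {after:.2f}")
```

Because a peer never drops a neighbour that belongs to its true top-k, the overlap with the centralized result cannot decrease from round to round in this sketch. How fast it rises depends on the data and on the exchange rule.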
... In this way, (i) the availability of the resource profiles increases; (ii) nodes in A may contact a random node b ∈ B to perform queries without compromising the efficiency of the results; and (iii) the possible liability of providing access to a resource is shared among different nodes. There are many existing proposals of an epidemic protocol for locating resources in distributed systems [2, 5, 6, 7]. Indeed, nodes in B save the resource description of nodes in A, but since they have a user profile as well, it is possible to use this profile to organize nodes in B according to their interest, as in [7]. ...
... For the sake of clarity, we next include an example of this process. As a first approach, consider a vector without encryption. Equations (3)-(7) show these same operations with an encrypted selection vector, where the products and additions are performed on the encrypted text, as (5) shows. The result, finally, is the integer b_j, which only the user who owns the private key of the Paillier cryptosystem is able to decrypt. ...
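The equations referenced in this excerpt are not reproduced here, so the following is only an illustrative sketch of how an encrypted selection vector can work with the Paillier cryptosystem. It uses the simplified generator g = n + 1 and deliberately tiny primes, so it is insecure. The function names (`keygen`, `encrypt`, `decrypt`, `select_encrypted`) and the sample values are assumptions, not taken from the cited work.

```python
import math
import random

def keygen(p=293, q=433):
    """Toy key generation with tiny primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu is its inverse.
    mu = pow(lam, -1, n)
    return n, (lam, mu)

def encrypt(n, m, rng):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """Recover the plaintext with the private key (lam, mu)."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def select_encrypted(n, enc_selection, values):
    """Compute E(sum_i s_i * b_i) as prod_i E(s_i)^(b_i), without decrypting."""
    n2 = n * n
    acc = 1
    for c, b in zip(enc_selection, values):
        acc = (acc * pow(c, b, n2)) % n2
    return acc

rng = random.Random(1)
n, priv = keygen()
values = [12, 7, 30, 5]    # plaintext integers b_i held by the responding node
selection = [0, 0, 1, 0]   # the requester secretly selects index j = 2
enc_selection = [encrypt(n, s, rng) for s in selection]
enc_result = select_encrypted(n, enc_selection, values)
result = decrypt(n, priv, enc_result)
print(result)  # 30
```

The responding node only ever sees ciphertexts, so it cannot tell which entry was selected; only the holder of the private key can read b_j.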
Article
In this chapter, we describe how nodes in the Internet of Things can configure themselves automatically and offer personalized services to users while protecting their privacy. We show how privacy protection can be achieved by means of a use case. We describe DocCloud, a recommender system where users get content recommended by other users based on their personal affinities. To do this, their things connect together based on the affinities of their owners, creating a social network of similar things, and then provide the recommender system on top of this network. We present the architecture of DocCloud and analyze the security mechanisms that the system includes. Specifically, we study the properties of plausible deniability and anonymity of the recommenders and intermediate nodes. In this way, nodes can recommend products to customers while denying any knowledge of the product they are recommending or of their participation in the recommendation process.
... In online rating systems, users can rate objects. Many online rating systems actually have a two-mode nature, and recommender systems based on online rating systems are usually modeled as user-object bipartite networks [1, 2, 3, 4]. Moreover, experience shows that in online rating systems a high rating means the user likes the object and a low rating means the user dislikes it. ...
... Others have proposed methods using user-item interactions to build clusters of users and/or items, but only used them to analyze the data, not to produce recommendations [11], [12]. ...
... In this study we set out to explore social network analytical methods that might capture users' reading diversity. While network-theoretic methods have been widely used in peer-to-peer networks to identify similar peers for recommendation purposes [8], [9], to the best of our knowledge they have not been applied to items within an individual user's bookshelf for the purpose of representing his/her preference structure. In this study, five different similarity measures and a simple clustering method, components identification, were used; the latter is admittedly a rather basic network clustering method. ...
Conference Paper
Usage data available through social media provides a great many opportunities to capture users’ preferences. Using books saved in users’ online bookshelves, the study set out to explore social network analytical methods to capture the diversity of a reader’s reading interests. “Reading diversity” denotes how widely scattered one’s reading interests are. Drawing data from aNobii, a social networking site for booklovers, users’ reading diversity was defined by the number of components in the co-ownership network of the books in their bookshelves. Five book-book similarity measures were proposed, and their clustering results were tested against users’ self-assessed reading diversity in order to identify the best-suited similarity measure and threshold for such a task. One of the proposed similarity measures produced a clustering result that was significantly correlated with users’ self-assessed diversity. Furthermore, a multiple regression analysis showed that the proposed measure was able to provide explanatory power for reading diversity over and above merely counting the number of books in the bookshelf.
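As a rough illustration of counting components in a thresholded book co-ownership network, the sketch below links two books when the Jaccard similarity of their owner sets reaches a threshold, then counts connected components with a union-find structure. The abstract does not name its five similarity measures, so Jaccard, the function names, and the sample data are all assumptions.

```python
from itertools import combinations

def jaccard(owners_a, owners_b):
    """Share of owners that two books have in common (an assumed measure)."""
    union = owners_a | owners_b
    return len(owners_a & owners_b) / len(union) if union else 0.0

def count_components(bookshelf, owners, threshold):
    """Number of connected components among the books on one user's shelf."""
    parent = {b: b for b in bookshelf}

    def find(b):
        # Follow parent links to the component root, compressing the path.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    for a, b in combinations(bookshelf, 2):
        if jaccard(owners[a], owners[b]) >= threshold:
            parent[find(a)] = find(b)
    return len({find(b) for b in bookshelf})

# Hypothetical data: which user ids own each book across the whole site.
owners = {
    "scifi1": {1, 2, 3},
    "scifi2": {1, 2, 4},
    "cook1": {5, 6},
    "cook2": {5, 6, 7},
    "poetry": {8},
}
shelf = list(owners)
print(count_components(shelf, owners, threshold=0.4))  # 3
print(count_components(shelf, owners, threshold=0.6))  # 4
```

Raising the threshold removes weaker links, so the component count can only stay the same or grow, which is why the choice of threshold matters for the diversity score.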
... The power-law coefficient is about 1.22. Another property of a BA network is that it has a larger clustering coefficient and a smaller average distance [21]. So, we can also compute the clustering coefficient and average distance of the network. ...
Article
The ratings in many user-object online rating systems can reflect whether users like or dislike the objects, and in some online rating systems users can directly choose whether to like an object. These systems can therefore be represented by signed bipartite networks, but the original unsigned node evaluation algorithm cannot be used directly on signed networks. This paper proposes the Signed PageRank algorithm for signed bipartite networks to evaluate the object and user nodes at the same time. Based on the global information, the nodes can be sorted by their Signed PageRank values in descending order, and the result is the SR Ranking. The authors analyze the characteristics of the top and bottom nodes of real networks and find that, for objects, the SR Ranking can provide a more reasonable ranking that combines the degree and rating of a node; the algorithm can also help to identify users with specific rating patterns. By discussing the location of negative edges and the sensitivity of the object SR Ranking to negative edges, the authors also show that negative edges play an important role in the algorithm and explain why bad reviews are more important in real networks.
Chapter
This chapter presents the different evaluation methods for a recommender system. We introduce the existing metrics, as well as the pros and cons of each method. This chapter is the background for the following Chaps. 6 and 7, where the proposed metrics are used in real, large size, recommendation datasets.
Conference Paper
We present an approach to automatically create virtual communities of users with similar music tastes. Our goal is to create personalized music channels for these communities in a distributed way, so that they can for example be used in peer-to-peer networks. To find suitable techniques for creating these communities we analyze graphs created from real-world recommender datasets and identify specific properties of these datasets. Based on these properties we select and evaluate different graph-based community-extraction techniques. We select a technique that exploits identified properties to create clusters of music listeners. We validate the suitability of this technique using a music dataset and a large movie dataset. On a graph of 6,040 peers, the selected technique assigns at least 85% of the peers to optimal communities, and obtains a mean classification error of less than 0.05 over the remaining peers that are not assigned to the best community.
Article
We introduce personalization on Tribler, a peer-to-peer (P2P) television system. Personalization allows users to browse programs much more efficiently according to their taste. It also enables building social networks that can improve the performance of current P2P systems considerably, by increasing content availability, trust and the realization of proper incentives to exchange content. This paper presents a novel scheme, called BuddyCast, that builds such a social network for a user by exchanging user interest profiles using exploitation and exploration principles. Additionally, we show how the interest of a user in TV programs can be predicted from zapping behavior by the introduced user-item relevance models, thereby avoiding the explicit rating of TV programs. Further, we present how the social network of a user can be used to realize a truly distributed recommendation of TV programs. Finally, we demonstrate a novel user interface for the personalized peer-to-peer television system that encompasses a personalized tag-based navigation to browse the available distributed content. The user interface also visualizes the social network of a user, thereby increasing community feeling, which increases trust amongst users and in the available content and creates incentives to exchange content within the community.
Article
Systems as diverse as genetic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature was found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
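The two mechanisms named above, growth and preferential attachment, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `barabasi_albert`, the initial clique, and the parameters `n`, `m` and `seed` are choices made for this example rather than details given in the abstract.

```python
import random
from collections import Counter

def barabasi_albert(n, m, seed=0):
    """Grow a graph to n vertices; each new vertex attaches to m existing
    vertices chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    edges = []
    # Start from a clique of m + 1 vertices so every vertex has degree > 0.
    for u in range(m + 1):
        for v in range(u):
            edges.append((u, v))
    # Each vertex appears in `endpoints` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw.
    endpoints = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

edges = barabasi_albert(n=2000, m=2)
degree = Counter(x for e in edges for x in e)
print("edges:", len(edges))
print("min degree:", min(degree.values()), "max degree:", max(degree.values()))
```

Most vertices keep a degree close to `m`, while a few early vertices accumulate far more links; that heavy tail is the signature of the scale-free distribution the abstract describes.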
Article
According to a fundamental result of Erdős and Rényi, the structure of a random graph $G_M$ changes suddenly when $M \sim n/2$: if $M = \lfloor cn \rfloor$ and $c > \frac{1}{2}$, then a.e. $G_M$ has a giant component: a component of order $(1 - \alpha_c + o(1))n$, where $\alpha_c$ ...
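The sudden change around M ≈ n/2 can be observed empirically. The sketch below draws M distinct random edges on n vertices and tracks component sizes with a union-find structure. The function name `largest_component` and the chosen values of n, c and the seed are assumptions made for illustration.

```python
import random

def largest_component(n, m, seed=0):
    """Size of the largest component of a random graph with n vertices and
    m distinct edges drawn uniformly at random."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = set()
    while len(chosen) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u == v:
            continue  # no self-loops
        edge = (min(u, v), max(u, v))
        if edge in chosen:
            continue  # no repeated edges
        chosen.add(edge)
        ru, rv = find(u), find(v)
        if ru != rv:
            # Union by size: attach the smaller tree under the larger one.
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
    return max(size[find(x)] for x in range(n))

n = 5000
for c in (0.25, 0.5, 1.0):
    m = int(c * n)
    print(f"c = {c}: largest component has {largest_component(n, m)} of {n} vertices")
```

For c well below 1/2 the largest component stays tiny relative to n, while for c well above 1/2 a single component contains a constant fraction of all vertices.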
Article
Collaborative filters help people make choices based on the opinions of other people. GroupLens is a system for collaborative filtering of netnews, to help people find articles they will like in the huge stream of available articles. News reader clients display predicted scores and make it easy for users to rate articles after they read them. Rating servers, called Better Bit Bureaus, gather and disseminate the ratings. The rating servers predict scores based on the heuristic that people who agreed in the past will probably agree again. Users can protect their privacy by entering ratings under a pseudonym, without reducing the effectiveness of the score prediction. The entire architecture is open: alternative software for news clients and Better Bit Bureaus can be developed independently and can interoperate with the components we have developed.
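The heuristic that people who agreed in the past will probably agree again is commonly realized as a correlation-weighted prediction. The sketch below is one such formulation, not a confirmed account of the Better Bit Bureau's implementation: it weights each other user's deviation from their own mean rating by the Pearson correlation with the target user. The function names and the sample ratings are hypothetical.

```python
import math

def pearson(ra, rb):
    """Pearson correlation over the articles that both users rated."""
    common = sorted(set(ra) & set(rb))
    if len(common) < 2:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    den = math.sqrt(
        sum((ra[i] - ma) ** 2 for i in common)
        * sum((rb[i] - mb) ** 2 for i in common)
    )
    return num / den if den else 0.0

def predict(ratings, user, item):
    """Predict `user`'s score for `item` from correlated users' deviations."""
    mean_u = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        w = pearson(ratings[user], r)
        mean_o = sum(r.values()) / len(r)
        num += w * (r[item] - mean_o)
        den += abs(w)
    return mean_u + num / den if den else mean_u

# Hypothetical ratings on a 1-5 scale, stored under pseudonyms.
ratings = {
    "ann": {"a1": 5, "a2": 1, "a3": 4},
    "bob": {"a1": 5, "a2": 2, "a3": 4, "a4": 5},
    "cat": {"a1": 1, "a2": 5, "a3": 2, "a4": 1},
}
score = predict(ratings, "ann", "a4")
print(f"predicted score for ann on a4: {score:.2f}")
```

Here "bob" agrees with "ann" and liked article a4, while "cat" consistently disagrees with "ann" and disliked it, so both push the prediction above ann's average rating. Negative correlations are informative in this scheme, not just positive ones.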