Article

Automatic news recommendations via aggregated profiling

Authors:
  • iMinds - Ghent University
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Today, people have only limited, valuable leisure time, which they want to fill as well as possible according to their own interests, while broadcasters want to produce and distribute news items as quickly and as targeted as possible. Developing news stories can be characterised as dynamic, chained, and distributed events, and it is therefore important to aggregate, link, enrich, recommend, and distribute these news event items in a targeted way to each individual, interested user. In this paper, we show how personalised recommendation and distribution of news events, described using an RDF/OWL representation of the NewsML-G2 standard, can be enabled by automatically categorising and enriching news event metadata via smart indexing and linked open datasets available on the Web of Data. The recommendations, based on a global, aggregated profile that also takes into account the (dis)likes of peer friends, are finally fed to the user via a personalised RSS feed. The ultimate goal is to provide an open, user-friendly recommendation platform that equips the end user with a tool to access useful news event information beyond basic information retrieval. At the same time, we provide the (inter)national community with standardised mechanisms to describe and distribute news event and profile information.


... We have already mentioned the commercial VLX-Stories [14] system. Reference [44] extends the news production workflow at VRT (Vlaamse Radio- en Televisieomroep), a national Belgian broadcaster, to support personalised news recommendation and dissemination via RSS feeds. A semantic version of the IPTC's NewsML-G2 A72 standard is proposed as a unifying (meta-)data model for dynamic distributed news event information. ...
... Case studies and examples: Other papers present realistic examples based on industrial experience. For example, the MediaLoep project [10] (involving many of the authors behind Reference [44], and Reference [9], to be presented later) discusses how to improve retrieval and increase reuse of previously broadcast multimedia news items at VRT, the national Belgian broadcaster, both as background information and as reusable footage. The paper reports experiences with collecting descriptive metadata from different news production systems; integrating the metadata using a semantic data model; and connecting the data model to other semantic data sets to enable more powerful semantic search. ...
... Social media and the Web: Several main papers use social media and other web resources as input, such as Twitter, A110 Wikinews, A48 Wikipedia, A49 and regular HTML-based web sites. To support personalised news recommendation and dissemination, the extension of VRT's news workflow mentioned earlier [44] uses OpenID A41 and OAuth A9 for identification and authentication. In this way, the system can compile user profiles based on data from multiple social-media accounts, using ontologies such as FOAF and SIOC A23 to interoperate user data. ...
Article
Full-text available
ICT platforms for news production, distribution, and consumption must exploit the ever-growing availability of digital data. These data originate from different sources and in different formats; they arrive at different velocities and in different volumes. Semantic knowledge graphs (KGs) are an established technique for integrating such heterogeneous information. They are therefore well aligned with the needs of news producers and distributors, and are likely to become increasingly important for the news industry. This paper reviews the research on using semantic knowledge graphs for production, distribution, and consumption of news. The purpose is to present an overview of the field; to investigate what it means; and to suggest opportunities and needs for further research and development.
... Content-based filtering (CBF) uses the item as a reference point. The user's profile or other data, such as her preferences stated in search terms, constitute the second reference point [3,4]. The user's data are matched to descriptions of various items. ...
... Collaborative filtering (CF) has another rationale. The list of recommended items derives from the analysis of preferences of other people with interests similar to those of the user [3,4,6,9,10,13-15]. Preferences in the community of users with similar interests constitute the single reference point in CF. ...
... The 'cold start' problem refers to the scarcity and sparsity of information. At any given moment, a significant number of relevant items simply remain unrated by the user or her reference group [3,15,17]. The user and her fellows are either unaware of the existence of many potentially relevant items or unwilling to invest their effort and time in forming an opinion of them. ...
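The two filtering rationales summarised in these excerpts can be sketched in a few lines of Python. This is a minimal illustration with hypothetical toy data and function names, not the implementation of any cited system:

```python
import math

# Toy item descriptions and hypothetical topic weights, for illustration only.
ITEMS = {
    "politics-1": {"politics": 1.0, "economy": 0.4},
    "sports-1":   {"sports": 1.0},
    "economy-1":  {"economy": 1.0, "politics": 0.2},
}

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def content_based(profile, items, n=2):
    """CBF: the item is the reference point; match the user's profile to item descriptions."""
    return sorted(items, key=lambda i: cosine(profile, items[i]), reverse=True)[:n]

def collaborative(user, ratings, n=2):
    """CF: score items the user has not seen by the ratings of similar users."""
    others = {u: r for u, r in ratings.items() if u != user}
    sims = {u: cosine(ratings[user], r) for u, r in others.items()}
    seen = set(ratings[user])
    scores = {}
    for u, r in others.items():
        for item, val in r.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sims[u] * val
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

With an empty or near-empty `ratings` matrix, `collaborative` returns nothing useful, which is exactly the cold-start and sparsity issue the excerpts describe.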
Article
Full-text available
This paper discusses the feasibility and benefits of incorporating coefficients of inter-coder agreement (Krippendorff’s α, Bennett, Alpert and Goldstein’s S, Scott’s π and Cohen’s κ) into recommender systems. It is argued that with their help it is possible to increase the accuracy of users’ assessment of various items (texts, but also potentially images, movies, music and goods). Chance-corrected measures of similarity also allow for the detection of similarly minded users in a more accurate manner. Results of small-scale empirical tests inform the discussion. Predictions made using chance-corrected measures of similarity are compared with those based on more conventional measures of similarity, the cosine coefficient and the Jaccard index.
... Although these systems filter the relevant content and improve the user's satisfaction, some challenges remain; for instance, not disturbing the user with the feedback-collecting mechanism [11], monotonous recommendations [12], or the cold-start problem [2,13]. Regarding the first challenge, many current recommending solutions still rely on the explicit feedback provided by the user, typically user ratings [14][15][16][17][18]. However, not every user is willing to provide ratings, especially when ratings are optional. ...
... Bagherifard et al. [20] also use both implicit and explicit feedback in a hybrid approach for movie recommendations, though their solution uses ontologies. A hybrid and ontology-based approach using ratings for news recommendations is surveyed in [17]. Also, Agarwal and Singhal [21] propose a solution based on a domain ontology, which uses explicit and implicit data of users; the registered user provides the explicit information, while the implicit information includes mouse behavior and user session data. ...
... However, collaborative filtering alone suffers from the sparsity problem. On the other hand, content-based filtering is not very effective in recommending news items, because users usually read only a small part of them [17]. Therefore, our approach proposes combining collaborative filtering, which provides diversity, with content-based filtering, which mitigates the cold-start problem, and addresses the sparsity problem through aggregation and clustering techniques. ...
Article
Full-text available
This paper addresses the problem of automatically customizing the sending of notifications in a non-disturbing way. For this purpose, we never ask the user for preferences; instead, we use only implicit feedback. We then build a hybrid filter that combines text-mining content filtering and collaborative filtering to predict the top-N notifications that are of most interest to each user. The content-based filter clusters notifications to find content with topics for which the user has shown interest. The collaborative filter increases diversity by discovering new topics of interest for the user, because these are of interest to other users with similar concerns. The paper reports the results of measuring the performance of this recommender, and includes a validation of the topic-based approach used for content selection. Finally, we demonstrate how the recommender uses implicit feedback to unobtrusively personalize the content delivered to each user.
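The combination step of such a hybrid filter can be sketched as a weighted blend of the two score sources. This is a simplification under stated assumptions: `alpha` and the toy scores are hypothetical, not values from the paper:

```python
def hybrid_top_n(content_scores, collab_scores, alpha=0.6, n=3):
    """Blend content-based and collaborative scores and return the top-N items.

    alpha weighs the content-based side (topics the user already reads);
    (1 - alpha) weighs the collaborative side that injects diversity
    from users with similar concerns.
    """
    items = set(content_scores) | set(collab_scores)
    blended = {i: alpha * content_scores.get(i, 0.0)
                  + (1 - alpha) * collab_scores.get(i, 0.0)
               for i in items}
    return sorted(items, key=blended.get, reverse=True)[:n]
```

With equal weights, an item liked only by similar users ("c" below) can still enter the top-N, which is how the collaborative side surfaces new topics.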
... Use cases such as security services or internal organizations that create profiles to evaluate various characteristics of their employees are mentioned. Profiling individuals for content recommendation, such as news recommendations, has been used for many years (Mannens et al., 2013). Automatic detection of fake profiles on social media platforms such as Instagram and Twitter is another widespread use case for people profiling using data mining and clustering techniques (Khaled et al., 2018). ...
Conference Paper
Full-text available
Creating employee questionnaires, surveys or evaluation forms for people to understand various aspects such as motivation, improvement opportunities, satisfaction, or even potential cybersecurity risks is a common practice within organizations. These surveys are usually not tailored to the individual and have a set of predetermined questions and answers. The objective of this paper is to design AI agents that are flexible and adaptable in choosing the survey content for each individual according to their personality. The developed framework is open source, generic and can be adapted to many use cases. For the evaluation, we present a real-world use case of detecting potentially inappropriate behavior in the workplace. In this case, the AI agents that create the personalized surveys act similarly to a human recruiter. The results obtained are promising and suggest that the decision algorithms for content selection approaches are similar to a real human resource manager in our use case.
... Goodreads and LibraryThing online platforms aim at identifying communities of readers using the book titles in their reading lists as an indication of how similar readers' interests are (Han et al. 2019; Thelwall and Kousha 2017). Two major approaches guide those efforts, collaborative filtering and content-based methods (Mannens et al. 2013; Oleinik 2022; Wang et al. 2018; Yang 2018). Collaborative filtering has the premise that if A and B have similar interests and A likes book X, then B may like the same title. ...
Article
Full-text available
The article discusses a Bayesian measure of association, B-index, and compares it with the other existing measures of agreement, association, and similarity, both chance-corrected and non-corrected: Scott’s π, Krippendorff’s α, Cohen’s κ, Bennett, Alpert & Goldstein’s S, Cosine similarity, and the Jaccard similarity coefficient. PageRank adapted to particularities of annotation is also added to this list. Two versions of B-index are considered: with the informative and non-informative priors. An algorithm for calculating B-index written in pseudocode is provided. Particular attention is devoted to the uses of those measures in content analysis, communication studies, computational linguistics, psychology, computer science and network science. Real-world data gathered using an online platform for content analysis allowed comparing the behavior of all eight measures included in the scope of analysis. Three short texts (164 data points/sentences in total) were coded by 66 annotators. The behaviors of B-index with the non-informative prior and Bennett, Alpert & Goldstein’s S have some common patterns.
... Mannens et al. [127], for example, propose a recommendation method that mitigates sparsity by complementing binary consumption values in the matrix with "potential consumption" values between 0 and 1 based on a collaborative filtering algorithm, which is then re-executed until the matrix is dense enough. Furthermore, to reduce the uncertainty introduced by this probabilistic approach, they also suggest post-filtering the news articles based on Linked Data sources. ...
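The densification loop described in this excerpt can be sketched as follows. This is a deliberately simplified illustration of the idea (fill unknown cells with predicted "potential consumption" values and repeat until a density threshold is met), not the cited algorithm; the similarity measure and threshold are assumptions:

```python
def overlap_sim(a, b):
    """Similarity of two users over the cells both have filled (1.0 = identical)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    return sum(1.0 - abs(a[k] - b[k]) for k in shared) / len(shared)

def densify(matrix, items, threshold=0.9, max_rounds=10):
    """Fill unknown cells with 'potential consumption' values in [0, 1]
    predicted from similar users, re-running until the matrix is dense enough."""
    m = {user: dict(row) for user, row in matrix.items()}  # leave input intact
    for _ in range(max_rounds):
        filled = sum(len(row) for row in m.values())
        if filled / (len(m) * len(items)) >= threshold:
            break
        for user, row in m.items():
            for item in items:
                if item in row:
                    continue  # a known (binary) consumption value stays as-is
                votes = [(overlap_sim(row, other), other[item])
                         for peer, other in m.items()
                         if peer != user and item in other]
                weight = sum(w for w, _ in votes)
                if weight > 0:
                    row[item] = sum(w * v for w, v in votes) / weight
    return m
```

A post-filtering step over Linked Data sources, as the excerpt mentions, would then prune the probabilistically filled cells before recommendation.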
Article
More and more people read the news online, e.g., by visiting the websites of their favorite newspapers or by navigating the sites of news aggregators. However, the abundance of news information that is published online every day through different channels can make it challenging for readers to locate the content they are interested in. The goal of News Recommender Systems (NRS) is to make reading suggestions to users in a personalized way. Due to their practical relevance, a variety of technical approaches to building such systems have been proposed over the last two decades. In this work, we review the state of the art in designing and evaluating news recommender systems over the last ten years. One main goal of the work is to analyze which particular challenges of news recommendation (e.g., short item lifetimes and recency aspects) have been well explored and which areas still require more work. Furthermore, in contrast to previous surveys, the paper specifically discusses methodological questions and today's academic practice of evaluating and comparing different algorithmic news recommendation approaches based on accuracy measures.
... Other works combine Linked Data based algorithms with other techniques of recommendation to improve the results. These techniques include collaborative filtering [75][76][77][78], information aggregation [79][80][81] and statistical methods like Vector Space Model (VSM) [62,77], Random Indexing (RI) [72], implicit feedback [77], Latent Dirichlet Allocation (LDA) [82], and structure-based statistical semantics [83]. De Graaff et al. [84] proposed a knowledge-based recommender system that derives the user interests from the user's social media profile, which is enriched with information from DBpedia. ...
Thesis
Full-text available
Nowadays, people can easily obtain a huge amount of information from the Web, but often they have no criteria to discern it. This issue is known as information overload. Recommender systems are software tools to suggest interesting items to users and can help them to deal with a vast amount of information. Linked Data is a set of best practices to publish data on the Web, and it is the basis of the Web of Data, an interconnected global dataspace. This thesis discusses how to discover information useful for the user from the vast amount of structured data, and notably Linked Data available on the Web. The work addresses this issue by considering three research questions: how to exploit existing relationships between resources published on the Web to provide recommendations to users; how to represent the user and his context to generate better recommendations for the current situation; and how to effectively visualize the recommended resources and their relationships. To address the first question, the thesis proposes a new algorithm based on Linked Data which exploits existing relationships between resources to recommend related resources. The algorithm was integrated into a framework to deploy and evaluate Linked Data based recommendation algorithms. In fact, a related problem is how to compare them and how to evaluate their performance when applied to a given dataset. The user evaluation showed that our algorithm improves the rate of new recommendations, while maintaining a satisfying prediction accuracy. To represent the user and their context, this thesis presents the Recommender System Context ontology, which is exploited in a new context-aware approach that can be used with existing recommendation algorithms. The evaluation showed that this method can significantly improve the prediction accuracy. 
As regards the problem of effectively visualizing the recommended resources and their relationships, this thesis proposes a visualization framework for DBpedia (the Linked Data version of Wikipedia) and mobile devices, which is designed to be extended to other datasets. In summary, this thesis shows how it is possible to exploit structured data available on the Web to recommend useful resources to users. Linked Data were successfully exploited in recommender systems. Various proposed approaches were implemented and applied to use cases of Telecom Italia.
... Other works combine Linked Data based algorithms with other techniques of recommendation in order to improve the results. These techniques include collaborative filtering [10,14,16,18], information aggregation [2,9,12] and statistical methods like Random Indexing (RI) [23], Vector Space Model (VSM) [1,16], Latent Dirichlet Allocation (LDA) [11], implicit feedback [16] and structure-based statistical semantics [3]. De Graaff et al. [5] proposed a knowledge-based recommender system that derives the user interests from the user's social media profile, which is enriched with information from DBpedia. ...
Conference Paper
Full-text available
The Web of Data is an interconnected global dataspace in which discovering resources related to a given resource, and recommending relevant ones, is still an open research area. This work describes a new recommendation algorithm based on structured data published on the Web (Linked Data). The algorithm exploits existing relationships between resources by dynamically analyzing both the categories to which they belong and their explicit references to other resources. A user study conducted to evaluate the algorithm showed that it provides more novel recommendations than other state-of-the-art algorithms while keeping a satisfying prediction accuracy. The algorithm has been applied in a mobile application to recommend movies by relying on DBpedia (the Linked Data version of Wikipedia), although it could be applied to other datasets on the Web of Data.
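The core scoring idea of this abstract, combining shared categories with explicit cross-references between resources, can be sketched as follows. The data structures, example resources, and the `link_weight` value are hypothetical stand-ins for what would actually be queried from DBpedia:

```python
# Toy Linked Data extracts: resource -> categories, resource -> linked resources.
CATEGORIES = {
    "Heat_(1995_film)":  {"Crime_films", "Films_directed_by_Michael_Mann"},
    "Collateral_(film)": {"Crime_films", "Films_directed_by_Michael_Mann"},
    "Toy_Story":         {"Animated_films"},
}
LINKS = {
    "Heat_(1995_film)":  {"Collateral_(film)"},
    "Collateral_(film)": set(),
    "Toy_Story":         set(),
}

def relatedness(seed, candidate, categories=CATEGORIES, links=LINKS,
                link_weight=2.0):
    """Score a candidate by shared categories plus explicit cross-links."""
    shared = categories[seed] & categories[candidate]
    cross = (candidate in links[seed]) + (seed in links[candidate])
    return len(shared) + link_weight * cross

def recommend(seed, n=2, categories=CATEGORIES):
    """Rank all other resources by relatedness to the seed resource."""
    ranked = sorted((r for r in categories if r != seed),
                    key=lambda r: relatedness(seed, r), reverse=True)
    return ranked[:n]
```

In a real deployment both dictionaries would be populated dynamically from SPARQL queries over the dataset, rather than held in memory.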
... In [24], a semantic version of the NewsML-G2 standard (a news XML format) is presented, which enriches news event metadata via smart indexing and linked open datasets. ...
Conference Paper
This article is a review on news retrieval and mining research areas in recent years based on a qualitative approach. It addresses news retrieval and mining in four main categories of News Retrieval and Extraction, News Content Analysis, News Propagation Analysis, and News Visualization. Each indicated category entails various research areas that have been investigated through several studies. This study depicts the immense extent of news retrieval and mining, the interconnected methods, tools, and theoretical foundations as well as the evaluation methods and the results. The study helps to gain a better understanding of news mining research areas.
... [Flattened table fragment from the citing survey: it lists the surveyed systems by reference, type (RS, Exploratory Search, SPARQL queries, Query-based RS) and domain (e.g., POI, Movies, Music, News, Events, Books, Travel, Museal objects), followed by the SKOSRec prototype; the LOD and A-D indicator columns did not survive extraction.] ...
Technical Report
Full-text available
Recommender systems (RS) have become an important tool in many e-commerce applications. They help users navigate through rich information spaces. While they often improve user experience and search results, many content based systems work with insufficient data. Thus, recent research has focused on enhancing item feature information with data from the Linked Open Data (LOD) cloud. The generation of Linked Data-enhanced item descriptions requires a considerable amount of pre-processing which can be a barrier for short update cycles. In addition, item similarity computation is often carried out in offline mode. Thus, Linked Data recommender systems (LDRS) are usually bound to a predefined set of item features and offer limited opportunities to tune the recommendation model on a frequent basis. This is a considerable drawback since knowledge bases on the web offer rich and up-to-date information sources, which are well suited for personalization tasks. Above that, they contain statistical information on occurrences of resources, which can be used to measure relevance. This paper introduces the prototype SKOS Recommender (SKOSRec), which produces scalable on-the-fly recommendations through SPARQL-like queries from Linked Data repositories. The SKOSRec query language enables users to obtain constraint-based, aggregation-based and cross-domain recommendations, such that results can be adapted to specific business or customer requirements.
... Similarly, recommendation-based systems, in which users are provided targeted content based on their preferences, could be efficiently built using graph databases. As an example, news broadcasters could create an aggregated global profile of a user, link it with their preferences for events and news, and effectively feed personalized RSS feeds to users using a graph database like AllegroGraph [68]. ...
Article
Full-text available
Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volume of data. Because of the large number and diversity of existing NoSQL and NewSQL solutions, it is difficult to comprehend the domain and even more challenging to choose an appropriate solution for a specific task. Therefore, this paper reviews NoSQL and NewSQL solutions with the objective of: (1) providing a perspective in the field, (2) providing guidance to practitioners and researchers to choose the appropriate data store, and (3) identifying challenges and opportunities in the field. Specifically, the most prominent solutions are compared focusing on data models, querying, scaling, and security related capabilities. Features driving the ability to scale read requests and write requests, or scaling data storage are investigated, in particular partitioning, replication, consistency, and concurrency control. Furthermore, use cases and scenarios in which NoSQL and NewSQL data stores have been used are discussed and the suitability of various solutions for different sets of applications is examined. Consequently, this study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and nonexistence of standardized query languages.
Chapter
Background information about Russia's war in Ukraine can be found in this chapter. The reader will be directed to the Revolution of Dignity of 2013–2014 in Ukraine that immediately preceded Russia's invasion. Three 'stories' of the war, national (the Ukraine story), imperial (the Russia story), and geopolitical (it has several versions), can be better understood by considering the belligerents' history of courte durée, the period from 2014 to 2022, and their history of longue durée. The case of Putin, the historian (his frequent historical references and arguments), is discussed. The chapter also contains a brief overview of the political and media landscapes in the countries covered by this study.
Article
Investments into new energy solution systems, for example into producing carbon-neutral fuels, are increasing, but tools for the feasibility studies of such capital investments are limited. Various contemporaneous attempts to reduce dependence on fossil energy sources are needed, and a power-to-x (P2X) solution, which is part of the hydrogen economy, can be seen as one opportunity. However, many hydrogen economy solutions have not yet been proven to be economically profitable, but they could be if the investment projects were considered from a broader perspective than the company level and an economic perspective alone. In previous research, a three-stage, economically and technologically emphasized feasibility study (FS) framework was created, and the early results indicate that P2X investments can achieve economic feasibility with an investor IRR of over 12%, and could offer profitable solutions towards a carbon-neutral future. However, the framework did not recognize the full potential of P2X through sustainability, and therefore a new, extended version of the framework is needed. The objective of this paper is to create an expanded sustainable feasibility study (SFS) framework from the FS framework to support P2X investments. As a result, an SFS framework is created that considers the feasibility of investment projects beyond the economic perspective by adding all three dimensions of sustainability: economic, environmental, and social. The three stages of the framework are ecosystem profiling, business model description, and profitability modelling. The paper was developed using the design science research (DSR) methodology and a literature review.
Article
Full-text available
The algorithms underpinning information retrieval shape its outcomes and have epistemological, social and political consequences. On the one hand, the Web search algorithms place a specific actor—the Web librarian (cataloguer), the document’s creator, the expert (“authority”), the user or the service provider (developer and operator of a search engine)—in the position of a decision-maker. Each of them has distinctive criteria of relevance in information retrieval. On the other hand, the application of those criteria determines what information the user receives. Content-based search places emphasis on the contents of retrievable documents whereas collaborative search shifts the focus of attention to opinions of experts and other users. The outcomes of content-based and collaborative searches diverge as a result. Depending on the information provided to the user, the development of her knowledge and socialization proceeds differently. A plea for customized Web search is made. It is argued that the user should be given an opportunity for selecting a combination of content-based and collaborative search that matches her interests and the context of a search query.
Book
Full-text available
Linked Data principles have led to semantically interlinking and connecting different resources at the data level, regardless of structure, authoring, location, etc. Data made available on the Web using Linked Data has resulted in a global data space called the Web of Data. Moreover, thanks to the efforts of the scientific community and the W3C Linked Open Data (LOD) project, more and more data have been published on the Web of Data, helping its growth and evolution. This book studies RS that use Linked Data as a source for generating recommendations, exploiting the large number of available resources and the relationships between them. First, a comprehensive state of the art is presented in order to identify and study frameworks and algorithms for RS that rely on Linked Data. Second, a framework named AlLied, which makes available implementations of the most used algorithms for resource recommendation based on Linked Data, is described. This framework is intended for using and testing the recommendation algorithms in various domains and contexts, and for analyzing their behavior under different conditions. Accordingly, the framework is suitable for comparing the results of these algorithms in both performance and relevance, and for enabling the development of innovative applications on top of it.
Thesis
Backgrounds: The increase in the amount of structured data published using the principles of Linked Data means that it is now more likely to find resources in the Web of Data that describe real-life concepts. However, discovering resources related to any given resource is still an open research area. This thesis studies Recommender Systems (RS) that use Linked Data as a source for generating recommendations, exploiting the large number of available resources and the relationships among them. Aims: The main objective of this study was to propose a recommendation technique for resources considering semantic relationships between concepts from Linked Data. The specific objectives were: - Define semantic relationships derived from resources, taking into account the knowledge found in Linked Data datasets. - Determine semantic similarity measures based on the semantic relationships derived from resources. - Propose an algorithm to dynamically generate automatic rankings of resources according to the defined similarity measures. Methodology: It was based on the recommendations of the Project Management Institute and the Integral Model for Engineering Professionals (Universidad del Cauca): the first for managing the project, and the second for developing the experimental prototype. Accordingly, the main phases were: - Conceptual base generation, for identifying the main problems, objectives and the project scope. A Systematic Literature Review was conducted for this phase, which highlighted the relationships and similarity measures among resources in Linked Data, and the main issues, features, and types of RS based on Linked Data. - Solution development, which consisted of designing and developing the experimental prototype for testing the algorithms studied in this thesis. The main results obtained were: - The first Systematic Literature Review on RS based on Linked Data. - A framework to execute and analyze recommendation algorithms based on Linked Data.
- A dynamic algorithm for resource recommendation based on the knowledge of Linked Data relationships. - A comparative study of algorithms for RS based on Linked Data. - Two implementations of the proposed framework: one with graph-based algorithms and the other with machine learning algorithms. - The application of the framework to various scenarios to demonstrate its feasibility within the context of real applications. Conclusions: - The proposed framework proved useful for developing and evaluating different configurations of algorithms to create novel RS based on Linked Data, suitable to users' requirements, applications, domains and contexts. - The layered architecture of the proposed framework also supports the reproducibility of results for the research community. - Linked Data based RS are well suited to presenting explanations of the recommendations, because of the graph structure of the datasets. - Graph-based algorithms take advantage of intrinsic relationships among resources in Linked Data; nevertheless, their execution time is still an open issue. Machine learning algorithms are also suitable: they provide functions for dealing with large amounts of data, so they can help to improve the performance (execution time) of the RS. However, most of them need a training phase that requires a priori knowledge of the application domain in order to obtain reliable results. - A logical evolution of RS based on Linked Data is the combination of graph-based and machine learning algorithms to obtain accurate results while keeping execution times low. However, research and experimentation are still needed to explore more techniques from the vast number of machine learning algorithms and to determine the most suitable ones for dealing with Linked Data.
Conference Paper
Full-text available
As more people prefer to read news online, newspapers are focusing on personalized news presentation. In this study, we investigate the prediction of an article's position based on analysis of the article's content using different text-analytics methods. The evaluation is performed in 4 main scenarios using articles from different time frames. The results of the analysis show that an article's freshness plays an important role in the prediction of a new article's position. The results of this work also provide insight into how to find an optimised solution to automate the process of assigning a new article the right position. We believe that these insights may further be used in developing content-based news recommender algorithms.
Conference Paper
Newspapers work with a large volume of information that needs to be described adequately. To this end, the "title", "keywords" and "description" tags are widely used in the source code of online news. However, these are not sufficiently descriptive. Thus, metadata standards have emerged in order to facilitate interoperability and deepen description. Currently, HTML tags and various standards coexist in the journalism sector, with varying degrees of adoption. The source code of international general-news dailies is analysed, and an in-depth literature review of metadata standards is carried out. The purpose is to establish which standards exist and to evaluate their use in the source code of a sample of newspapers. To this end, the semantic-content metadata in the source code are identified. In addition, the MetadadosHTML software is developed. The conclusions highlight the large gap between the standards covered in the literature and those found in source code. In the former, the most referenced are the NewsML and NITF formats, implemented by some media outlets and press agencies, at least internally. In contrast, in source code the most common are schema.org and two schemes for displaying information on social networks: Open Graph Protocol (used by Facebook) and Twitter Cards. This shows the coexistence of various metadata standards in the media field and highlights the lack of uniformity in their use. To achieve the ideal of content interoperability, Semantic Web technologies must be used. In this regard, the trend should be towards defining ontologies or RDF vocabularies for the different proposals analysed. Available at: http://www.iskoiberico.org/wp-content/uploads/2015/11/43_Ba%C3%B1os.pdf
Article
Full-text available
With the rapid growth of user-created content and the wide use of community-based websites, content recommendation systems have attracted the attention of users. However, most recommendation systems have limitations in properly reflecting each user's characteristics, and difficulty in recommending appropriate content to users. Therefore, we propose a content recommendation method using Friend-Of-A-Friend (FOAF) and Social Network Analysis (SNA). First, we extract user tags and characteristics using FOAF and generate graphs from the collected data. Next, we extract common characteristics from the content, along with hot tags, using SNA, and recommend appropriate content to users. To verify the method, we analyzed an experimental social network. The experiments confirmed that as more users are added to the social network, the quality of the recommendations increases, in comparison to an item-based method. Additionally, we can provide users with more relevant content recommendations.
Article
Full-text available
Recommender systems leverage product and community information to target products to consumers. Researchers have developed collaborative recommenders, content-based recommenders, and a few hybrid systems. We propose a unified probabilistic framework for merging collaborative and content-based recommendations. We extend Hofmann’s (1999) aspect model to incorporate three-way co-occurrence data among users, items, and item content. The relative influence of collaboration data versus content data is not imposed as an exogenous parameter, but rather emerges naturally from the given data sources. However, global probabilistic models coupled with standard EM learning algorithms tend to drastically overfit in the sparse data situations typical of recommendation applications. We show that secondary content information can often be used to overcome sparsity. Experiments on data from the ResearchIndex library of Computer Science publications show that appropriate mixture models incorporating secondary data produce significantly better quality recommenders than k-nearest neighbors (k-NN). Global probabilistic models also allow more general inferences than local methods like k-NN.
Article
Full-text available
The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions-the Web of Data. In this article we present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. We describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.
Article
Full-text available
Collaborative filtering or recommender systems use a database about user preferences to predict additional topics or products a new user might like. In this paper we describe several algorithms designed for this task, including techniques based on correlation coefficients, vector-based similarity calculations, and statistical Bayesian methods. We compare the predictive accuracy of the various methods in a set of representative problem domains. We use two basic classes of evaluation metrics. The first characterizes accuracy over a set of individual predictions in terms of average absolute deviation. The second estimates the utility of a ranked list of suggested items. This metric uses an estimate of the probability that a user will see a recommendation in an ordered list. Experiments were run for datasets associated with 3 application areas, 4 experimental protocols, and the 2 evaluation metrics for the various algorithms. Results indicate that for a wide range of conditions, Bayesian networks with decision trees at each node and correlation methods outperform Bayesian-clustering and vector-similarity methods. Between correlation and Bayesian networks, the preferred method depends on the nature of the dataset, nature of the application (ranked versus one-by-one presentation), and the availability of votes with which to make predictions. Other considerations include the size of database, speed of predictions, and learning time.
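The correlation-based approach compared above can be sketched compactly. The following is a minimal, self-contained illustration with toy ratings (all users, items, and values are hypothetical, not drawn from the paper's datasets): a prediction is the active user's mean rating plus the correlation-weighted, mean-centred deviations of neighbours.

```python
import math

# Toy ratings (hypothetical users and items), scale 1-5.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 5},
    "carol": {"a": 1, "b": 5, "c": 2},
    "dave":  {"a": 5, "b": 2},
}

def pearson(u, v):
    """Pearson correlation over the items both users rated."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    ru = [ratings[u][i] for i in common]
    rv = [ratings[v][i] for i in common]
    mu, mv = sum(ru) / len(ru), sum(rv) / len(rv)
    num = sum((a - mu) * (b - mv) for a, b in zip(ru, rv))
    den = math.sqrt(sum((a - mu) ** 2 for a in ru)) * \
          math.sqrt(sum((b - mv) ** 2 for b in rv))
    return num / den if den else 0.0

def predict(user, item):
    """Mean rating plus correlation-weighted neighbour deviations."""
    mu = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        w = pearson(user, other)
        mo = sum(ratings[other].values()) / len(ratings[other])
        num += w * (ratings[other][item] - mo)
        den += abs(w)
    return mu + num / den if den else mu
```

The vector-similarity and Bayesian alternatives mentioned in the abstract differ only in how the neighbour weights `w` are obtained.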
Article
Full-text available
We describe a graphical model for probabilistic relationships--an alternative to the Bayesian network--called a dependency network. The graph of a dependency network, unlike a Bayesian network, is potentially cyclic. The probability component of a dependency network, like a Bayesian network, is a set of conditional distributions, one for each node given its parents. We identify several basic properties of this representation and describe a computationally efficient procedure for learning the graph and probability components from data. We describe the application of this representation to probabilistic inference, collaborative filtering (the task of predicting preferences), and the visualization of acausal predictive relationships.
Conference Paper
Full-text available
For easing the exchange of news, the International Press Telecommunication Council (IPTC) has developed the NewsML Architecture (NAR), an XML-based model that is specialized into a number of languages such as NewsML G2 and EventsML G2. As part of this architecture, specific controlled vocabularies, such as the IPTC News Codes, are used to categorize news items together with other industry-standard thesauri. While news is still mainly in the form of text-based stories, these are often illustrated with graphics, images and videos. Media-specific metadata formats, such as EXIF, DIG35 and XMP, are used to describe the media. The use of different metadata formats in a single production process leads to interoperability problems within the news production chain itself. It also excludes linking to existing web knowledge resources and impedes the construction of uniform end-user interfaces for searching and browsing news content. In order to allow these different metadata standards to interoperate within a single information environment, we design an OWL ontology for the IPTC News Architecture, linked with other multimedia metadata standards. We convert the IPTC NewsCodes into a SKOS thesaurus and we demonstrate how the news metadata can then be enriched using natural language processing and multimedia analysis and integrated with existing knowledge already formalized on the Semantic Web. We discuss the method we used for developing the ontology and give rationale for our design decisions. We provide guidelines for re-engineering schemas into ontologies and formalize their implicit semantics. In order to demonstrate the appropriateness of our ontology infrastructure, we present an exploratory environment for searching and browsing news items.
Conference Paper
Full-text available
There exist a number of similarity-based recommendation communities, within which similar users' opinions are collected by users' agents to make predictions of their opinions on a new item. Similarity-based recommendation communities suffer from some significant limitations, such as scalability and susceptibility to the noise. In this paper, we propose a trust-based community to overcome these limitations. The trust-based recommendation community incorporates trust into the domain of item recommendation. Experimental results based on a real dataset show that trust-based community manages to outperform its similarity-based counterpart in terms of prediction accuracy, coverage, and robustness in the presence of noise.
Conference Paper
Full-text available
Most existing recommender systems employ collaborative filtering (CF) techniques in making projections about which items an e-service user is likely to be interested in, i.e. they identify correlations between users and recommend items which similar users have liked in the past. Traditional CF techniques, however, have difficulties when confronted with sparse rating data, and cannot cope at all with time-specific items, like events, which typically receive their ratings only after they have finished. Content-based (CB) algorithms, which consider the internal structure of items and recommend items similar to those a user liked in the past, can partly make up for that drawback, but the collaborative feature is totally lost on them. In this paper, modelling user and item similarities as fuzzy relations, which allow the graded/uncertain information in the domain to be flexibly reflected, we develop a novel, hybrid CF-CB approach whose rationale is concisely summed up as "recommending future items if they are similar to past ones that similar users have liked", and which surpasses related work in the same spirit.
Conference Paper
Full-text available
Collaborative Filtering (CF), the prevalent recommendation approach, has been successfully used to identify users that can be characterized as "similar" according to their logged history of prior transactions. However, the applicability of CF is limited due to the sparsity problem, which refers to a situation that transactional data are lacking or are insufficient. In an attempt to provide high-quality recommendations even when data are sparse, we propose a method for alleviating sparsity using trust inferences. Trust inferences are transitive associations between users in the context of an underlying social network and are valuable sources of additional information that help dealing with the sparsity and the cold-start problems. A trust computational model has been developed that permits to define the subjective notion of trust by applying confidence and uncertainty properties to network associations. We compare our method with the classic CF that does not consider any transitive associations. Our experimental results indicate that our method of trust inferences significantly improves the quality performance of the classic CF method.
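The transitive trust associations described above can be approximated with a simple propagation rule. This sketch (toy trust values and hypothetical users; it omits the paper's confidence/uncertainty model) multiplies trust along a chain so that longer paths carry less weight:

```python
# Direct trust statements (hypothetical users), values in [0, 1].
trust = {
    "alice": {"bob": 0.9},
    "bob":   {"carol": 0.8},
    "carol": {},
}

def inferred_trust(src, dst, seen=None):
    """Best-path transitive trust: multiply edge weights so that
    longer chains carry less confidence; cycles are cut via `seen`."""
    if seen is None:
        seen = {src}
    direct = trust.get(src, {})
    if dst in direct:
        return direct[dst]
    best = 0.0
    for mid, w in direct.items():
        if mid not in seen:
            best = max(best, w * inferred_trust(mid, dst, seen | {mid}))
    return best
```

Here alice's inferred trust in carol is 0.9 × 0.8 = 0.72, an additional weight usable where direct rating overlap is too sparse for similarity-based CF.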
Article
Full-text available
Collaborative filtering is one of the most widely adopted and successful recommendation approaches. Unlike approaches based on intrinsic consumer and product characteristics, CF characterizes consumers and products implicitly by their previous interactions. The simplest example is to recommend the most popular products to all consumers. Researchers are advancing CF technologies in such areas as algorithm design, human-computer interaction design, consumer incentive analysis, and privacy protection.
Article
Full-text available
News production is characterised by complex and dynamic workflows as it is important to produce and distribute news as soon as possible and in an audiovisual quality as good as possible. In this paper, we present news production as it has been implemented at the Flemish Radio and Television (Vlaamse radio en televisie, VRT). Driven by the dynamic nature of news content, the VRT news department is optimized for short cycle times and characterised by a highly parallel production process, i.e. product engineering (news bulletin composition or "organise" and story-editing or "construct message"), various material procurement ("create media asset") processes, mastering ("publish"), and the dis...
Conference Paper
Full-text available
Adaptive web sites may offer automated recommendations generated through any number of well-studied techniques including collaborative, content-based and knowledge-based recommendation. Each of these techniques has its own strengths and weaknesses. In search of better performance, researchers have combined recommendation techniques to build hybrid recommender systems. This chapter surveys the space of two-part hybrid recommender systems, comparing four different recommendation techniques and seven different hybridization strategies. Implementations of 41 hybrids including some novel combinations are examined and compared. The study finds that cascade and augmented hybrids work well, especially when combining two components of differing strengths.
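The simplest of the hybridization strategies surveyed, a weighted hybrid, can be illustrated in a few lines (the weight and scores below are purely illustrative, not taken from the chapter):

```python
# Illustrative weighted hybrid: blend a content-based score and a
# collaborative score with a fixed weight (alpha is hypothetical).
def weighted_hybrid(content_score, collab_score, alpha=0.6):
    return alpha * collab_score + (1 - alpha) * content_score

def rank(item_ids, content, collab, alpha=0.6):
    """Rank items by their blended score, best first."""
    return sorted(item_ids, reverse=True,
                  key=lambda i: weighted_hybrid(content[i], collab[i], alpha))
```

Cascade and augmented hybrids, which the study finds strongest, instead chain the two components: one recommender refines or feeds features into the other rather than averaging scores.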
Article
Full-text available
Item-based Collaborative Filtering (CF) algorithms have been designed to deal with the scalability problems associated with traditional user-based CF approaches without sacrificing recommendation or prediction accuracy. Item-based algorithms avoid the bottleneck in computing user-user correlations by first considering the relationships among items and performing similarity computations in a reduced space. Because the computation of item similarities is independent of the methods used for generating predictions, multiple knowledge sources, including structured semantic information about items, can be brought to bear in determining similarities among items. The integration of semantic similarities for items with rating- or usage-based similarities allows the system to make inferences based on the underlying reasons for which a user may or may not be interested in a particular item. Furthermore, in cases where little or no rating (or usage) information is available (such as in the case of newly added items, or in very sparse data sets), the system can still use the semantic similarities to provide reasonable recommendations for users. In this paper, we introduce an approach for semantically enhanced collaborative filtering in which structured semantic knowledge about items, extracted automatically from the Web based on domain-specific reference ontologies, is used in conjunction with user-item mappings to create a combined similarity measure and generate predictions. Our experimental results demonstrate that the integrated approach yields significant advantages both in terms of improving accuracy, as well as in dealing with very sparse data sets or new items.
Article
Full-text available
We investigate the use of dimensionality reduction to improve performance for a new class of data analysis software called "recommender systems". Recommender systems apply knowledge discovery techniques to the problem of making product recommendations during a live customer interaction. These systems are achieving widespread success in E-commerce nowadays, especially with the advent of the Internet. The tremendous growth of customers and products poses three key challenges for recommender systems in the E-commerce domain. These are: producing high quality recommendations, performing many recommendations per second for millions of customers and products, and achieving high coverage in the face of data sparsity. One successful recommender system technology is collaborative filtering, which works by matching customer preferences to other customers in making recommendations. Collaborative filtering has been shown to produce high quality recommendations, but the performance degrades with ...
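The dimensionality-reduction idea can be sketched with a truncated SVD over a small, hypothetical (and, for simplicity, fully observed) rating matrix; a real system would additionally have to handle missing entries:

```python
import numpy as np

# Hypothetical, fully observed 4x4 user-item rating matrix.
R = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

def low_rank_predict(R, k=2):
    """Truncated SVD: keep the k largest singular values and
    reconstruct a smoothed rating matrix of the same shape."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

R_hat = low_rank_predict(R, k=2)
```

The reconstructed `R_hat` serves as the prediction matrix: working in the reduced k-dimensional space is what addresses the scalability and sparsity challenges the abstract lists.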
Article
Full-text available
Information filtering agents and collaborative filtering both attempt to alleviate information overload by identifying which items a user will find worthwhile. Information filtering (IF) focuses on the analysis of item content and the development of a personal user interest profile. Collaborative filtering (CF) focuses on identification of other users with similar tastes and the use of their opinions to recommend items. Each technique has advantages and limitations that suggest that the two could be beneficially combined. This paper shows that a CF framework can be used to combine personal IF agents and the opinions of a community of users to produce better recommendations than either agents or users can produce alone. It also shows that using CF to create a personal combination of a set of agents produces better results than either individual agents or other combination mechanisms. One key implication of these results is that users can avoid having to select among ag...
Article
Full-text available
The explosive growth of the world-wide-web and the emergence of e-commerce has led to the development of recommender systems---a personalized information filtering technology used to identify a set of N items that will be of interest to a certain user. User-based Collaborative filtering is the most successful technology for building recommender systems to date, and is extensively used in many commercial recommender systems. Unfortunately, the computational complexity of these methods grows linearly with the number of customers that in typical commercial applications can grow to be several millions. To address these scalability concerns item-based recommendation techniques have been developed that analyze the user-item matrix to identify relations between the different items, and use these relations to compute the list of recommendations. In this paper we present one such class of item-based recommendation algorithms that first determine the similarities between the various ite...
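An item-based algorithm of the kind described can be sketched as follows: compute item-item cosine similarities from the user-item matrix, then score each unseen item by its similarity to the items the user has already rated (toy data with hypothetical identifiers, not the paper's datasets):

```python
import math

# Toy user-item rating matrix (hypothetical identifiers).
ratings = {
    "u1": {"i1": 5, "i2": 4, "i3": 1},
    "u2": {"i1": 4, "i2": 5},
    "u3": {"i3": 5, "i4": 4},
    "u4": {"i1": 5, "i2": 5, "i4": 1},
}

def item_vector(item):
    """Column of the user-item matrix for one item."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(i, j):
    vi, vj = item_vector(i), item_vector(j)
    num = sum(vi[u] * vj[u] for u in set(vi) & set(vj))
    den = math.sqrt(sum(x * x for x in vi.values())) * \
          math.sqrt(sum(x * x for x in vj.values()))
    return num / den if den else 0.0

def top_n(user, n=2):
    """Score unseen items by similarity to the user's rated items."""
    seen = ratings[user]
    all_items = {i for r in ratings.values() for i in r}
    scores = {c: sum(cosine(c, j) * seen[j] for j in seen)
              for c in all_items - set(seen)}
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Because the item-item similarities depend only on the rating matrix, they can be precomputed offline, which is the scalability advantage the abstract emphasises over user-based methods.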
Chapter
This chapter discusses content-based recommendation systems, i.e., systems that recommend an item to a user based upon a description of the item and a profile of the user's interests. Content-based recommendation systems may be used in a variety of domains ranging from recommending web pages, news articles, restaurants, television programs, and items for sale. Although the details of various systems differ, content-based recommendation systems share in common a means for describing the items that may be recommended, a means for creating a profile of the user that describes the types of items the user likes, and a means of comparing items to the user profile to determine what to recommend. The profile is often created and updated automatically in response to feedback on the desirability of items that have been presented to the user.
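A minimal content-based loop in the spirit described, a bag-of-words user profile matched against item descriptions by cosine similarity, might look like this (the items and descriptions are invented for illustration):

```python
import math
from collections import Counter

# Hypothetical item descriptions.
items = {
    "article1": "election vote parliament coalition",
    "article2": "football match goal league",
    "article3": "election campaign debate vote",
}

def bow(text):
    """Bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(c1, c2):
    num = sum(c1[t] * c2[t] for t in c1)
    den = math.sqrt(sum(v * v for v in c1.values())) * \
          math.sqrt(sum(v * v for v in c2.values()))
    return num / den if den else 0.0

def profile(liked):
    """User profile: aggregated bag-of-words of liked items."""
    p = Counter()
    for i in liked:
        p += bow(items[i])
    return p

def recommend(liked):
    """Unseen item whose description best matches the profile."""
    p = profile(liked)
    return max((cosine(p, bow(d)), i) for i, d in items.items()
               if i not in liked)[1]
```

Updating the profile on positive/negative feedback, as the chapter describes, would amount to adding or down-weighting the corresponding term counts.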
Article
Recommender systems assist and augment a natural social process. In a typical recommender system people, provide recommendations as inputs, which tile system then aggregates and directs to appropriate recipients. In some cases, the primary transformation is in the aggregation; in others, the system's value lies in its ability to make good matches between recommenders and those seeking recommendations. This special section includes descriptions of five recommender systems. A sixth article analyzes incentives for provision of recommendations. Recommender systems introduce two interesting incentive problems. First, once one has established a profile of interests, it is easy to free ride by consuming evaluations provided by others. Second, if anyone can provide recommendations, content owners may generate mountains of positive recommendations for their own materials and negative recommendations for their competitors. Recommender systems also raise concerns about personal privacy.
Article
News production is characterized by complex and dynamic workflows in which it is important to produce and distribute news items as fast as possible. In this paper, we show how personalized distribution and consumption of news items can be enabled by automatically enriching news metadata with open linked datasets available on the Web of data, thus providing a more pleasant experience to fastidious consumers where news content is presented within a broader historical context. Further, we present a faceted browser that provides a convenient way for exploring news items based on an ontology of NewsML-G2 and rich semantic metadata.
Conference Paper
Recommender systems are used to suggest customized products to users. Most recommender algorithms create collaborative models by taking advantage of web user profiles. In the last years, in the area of recommender systems, the Netflix contest has been very attractive for researchers. However, many recent papers on recommender systems present results evaluated with the methodology used in the Netflix contest, also in domains where the objectives are different from the contest (e.g., the top-N recommendation task). In this paper we do not propose new recommender algorithms but, rather, we compare different aspects of the official Netflix contest methodology based on RMSE and hold-out with methodologies based on k-fold and classification accuracy metrics. We show, with case studies, that different evaluation methodologies lead to totally contrasting conclusions about the quality of recommendations.
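The two families of metrics contrasted here, error-based (RMSE) and classification/top-N accuracy (e.g. hit rate), are easy to state in code (the held-out data below is purely illustrative):

```python
import math

# Hypothetical held-out ratings and top-N lists.
actual    = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 2.5]

def rmse(actual, predicted):
    """Root mean squared error over held-out ratings."""
    return math.sqrt(sum((a - p) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))

def hit_rate(recommended, relevant):
    """Fraction of users whose held-out item appears in their top-N list."""
    hits = sum(1 for recs, rel in zip(recommended, relevant) if rel in recs)
    return hits / len(relevant)
```

The paper's point is that an algorithm minimising the first metric need not maximise the second, so the evaluation protocol must match the task.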
Article
Recommendation algorithms are best known for their use on e-commerce Web sites, where they use input about a customer's interests to generate a list of recommended items. Many applications use only the items that customers purchase and explicitly rate to represent their interests, but they can also use other attributes, including items viewed, demographic data, subject interests, and favorite artists. At Amazon.com, we use recommendation algorithms to personalize the online store for each customer. The store radically changes based on customer interests, showing programming titles to a software engineer and baby toys to a new mother. There are three common approaches to solving the recommendation problem: traditional collaborative filtering, cluster models, and search-based methods. Here, we compare these methods with our algorithm, which we call item-to-item collaborative filtering. Unlike traditional collaborative filtering, our algorithm's online computation scales independently of the number of customers and number of items in the product catalog. Our algorithm produces recommendations in real-time, scales to massive data sets, and generates high quality recommendations.
Article
Grouping people into clusters based on the items they have purchased allows accurate recommendations of new items for purchase: if you and I have liked many of the same movies, then I will probably enjoy other movies that you like. Recommending items based on similarity of interest (a.k.a. collaborative filtering) is attractive for many domains: books, CDs, movies, etc., but does not always work well. Because data are always sparse (any given person has seen only a small fraction of all movies) much more accurate predictions can be made by grouping people into clusters with similar movies and grouping movies into clusters which tend to be liked by the same people. Finding optimal clusters is tricky because the movie groups should be used to help determine the people groups and vice versa. We present a formal statistical model of collaborative filtering, and compare different algorithms for estimating the model parameters including variations of K-means clustering and Gibbs...
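The cluster-then-recommend idea can be sketched with a few K-means-style refinement passes over toy rating vectors (hypothetical data; a real implementation would iterate to convergence and, as the abstract notes, alternate user and item clusterings):

```python
import math

# Toy user rating vectors over three movies (hypothetical data).
users = {
    "u1": [5, 4, 1],
    "u2": [4, 5, 2],
    "u3": [1, 2, 5],
    "u4": [2, 1, 4],
}

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(centroids):
    """Assign each user to the nearest centroid."""
    return {u: min(range(len(centroids)), key=lambda c: dist(v, centroids[c]))
            for u, v in users.items()}

def update(assignment, k):
    """Recompute each centroid as the mean of its members."""
    cents = []
    for c in range(k):
        members = [users[u] for u, cl in assignment.items() if cl == c]
        cents.append([sum(col) / len(members) for col in zip(*members)])
    return cents

centroids = [users["u1"], users["u3"]]  # seed with two users
for _ in range(5):                      # a few refinement rounds
    centroids = update(assign(centroids), 2)
```

Once users are clustered, predictions for a user can fall back on the centroid of their cluster, which is how clustering combats sparsity.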
Article
…combines the coverage and speed of content-filters with the depth of collaborative filtering. We apply our research approach to an online newspaper, an as yet untapped opportunity for filters useful to the widespread news-reading populace. We present the design of our filtering system and describe the results from preliminary experiments that suggest merits to our approach. 1 Introduction That we are in the age of information is evident quite clearly in newspapers as an information source. Nearly everywhere in North America you can have 1/2 dozen newspapers delivered to your doorstep, each with hundreds of new articles each day. Near...
Article
Recommender systems improve access to relevant products and information by making personalized suggestions based on previous examples of a user's likes and dislikes. Most existing recommender systems use social filtering methods that base recommendations on other users' preferences. By contrast, content-based methods use information about an item itself to make suggestions. This approach has the advantage of being able to recommended previously unrated items to users with unique interests and to provide explanations for its recommendations. We describe a content-based book recommending system that utilizes information extraction and a machine-learning algorithm for text categorization. Initial experimental results demonstrate that this approach can produce accurate recommendations.
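A text-categorization recommender of the general kind described can be sketched with a multinomial naive Bayes like/dislike classifier over book descriptions (toy training data invented for illustration; this is not the authors' system):

```python
import math
from collections import Counter

# Toy training data: descriptions of books the user rated (hypothetical).
liked = ["wizard magic quest dragon", "dragon knight quest"]
disliked = ["stock market finance crash", "finance banking market"]

def counts(docs):
    """Word counts over a list of documents."""
    c = Counter()
    for d in docs:
        c.update(d.split())
    return c

def log_score(c, vocab_size, words):
    """Multinomial naive Bayes log-likelihood with add-one smoothing."""
    n = sum(c.values())
    return sum(math.log((c[w] + 1) / (n + vocab_size)) for w in words)

def classify(text):
    """Label a new description by the higher class log-likelihood."""
    pos, neg = counts(liked), counts(disliked)
    vocab = len(set(pos) | set(neg))
    words = text.split()
    return ("like" if log_score(pos, vocab, words) >
                      log_score(neg, vocab, words) else "dislike")
```

As the abstract notes, such a content-based model can score previously unrated items and its per-word weights double as an explanation of the recommendation.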
Semantically enhanced collaborative filtering on the web. http://www.springerlink.com/content/y8bd5n544j91wc8w
  • B Mobasher
  • X Jin
  • Zhou
Clustering methods for collaborative filtering. Menlo Park, California, pp 114–129. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.33.4026
  • L Ungar
  • Foster
The art, science and business of recommendation engines
  • A Iskold
Iskold A (2007) The art, science and business of recommendation engines. Available at http:// www.readwriteweb.com/archives/recommendation_engines.php
Automatic information enrichment in news production. In: Proceedings of the 10th international workshop on image analysis for multimedia interactive services
  • Mannens
Mannens E et al (2009) Automatic information enrichment in news production. In: Proceedings of the 10th international workshop on image analysis for multimedia interactive services, London, United Kingdom, pp 61–64
File-based broadcast workflows: on MAM systems and their integration demands
  • De Geyter
Do metrics make recommender algorithms? In: International conference on advanced information networking and applications workshops
  • E Campochiaro
Using social applications in ad campaigns
  • S Corcoran
Advanced video coding for generic audiovisual services
  • ITU-T
  • ISO/IEC
Material exchange format (MXF)-file format specification
  • SMPTE
Clustering methods for collaborative filtering
  • C Cornelis
Clustering methods for collaborative filtering In: Proceedings of the workshop on recommendation systems
  • L Ungar
  • D Foster