Conference Paper

ArnetMiner: extraction and mining of academic social networks

DOI: 10.1145/1401890.1402008
Conference: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Source: DBLP
  • Source
    ABSTRACT: Link prediction is an important task in Social Network Analysis. This problem refers to predicting the emergence of future relationships between nodes in a social network. Our work focuses on a supervised machine learning approach for link prediction. Here, the target attribute is a class label indicating the existence or absence of a link between a node pair. The predictor attributes are metrics computed from the network structure, describing the given pair. Most work on supervised prediction considers only unweighted networks. In this light, our aim is to investigate the relevance of using weights to improve supervised link prediction. Link weights express the 'strength' of relationships and could bring useful information for prediction. However, the relevance of weights for unsupervised approaches to link prediction was not always verified (in some cases, the performance was even harmed). Our preliminary results on supervised prediction on a co-authorship network were satisfactory when weights were considered, which encourages further analysis.
  • Source
    ABSTRACT: The Information Inference Framework presented in this paper provides a general-purpose suite of tools enabling the definition and execution of flexible and reliable data processing workflows whose nodes offer application-specific processing capabilities. The IIF is designed for the purpose of processing big data, and it is implemented on top of Apache Hadoop-related technologies to cope with scalability and high-performance execution requirements. As a proof of concept we will describe how the framework is used to support linking and contextualization services in the context of the OpenAIRE infrastructure for scholarly communication.
    Procedia Computer Science 11/2014
  • Source
    ABSTRACT: Motivated by viral marketing, the problem of influence maximization in social networks has attracted much attention in recent years and has been the subject of several studies. However, almost all of these studies focus on progressive influence models, such as the independent cascade (IC) and linear threshold (LT) models, which cannot capture the reversibility of actions. In this paper, we present the Heat Conduction (HC) model, which is a non-progressive influence model and has favorable real-world interpretations. We also show that HC unifies, generalizes, and extends the existing non-progressive models, such as the Voter model and non-progressive LT [1]. In addition, we tackle the influence maximization problem for HC, which is proved to be NP-hard, with a scalable and provably near-optimal solution; we prove that the influence spread is submodular under HC and apply the greedy method. To the best of our knowledge, we are the first to present a scalable solution for influence maximization under the non-progressive LT model, as a special case of the HC model. Our fast and efficient algorithm benefits from two key properties of the proposed HC framework, for which we establish closed-form expressions for the influence function computation and the greedy seed selection. Through extensive experiments on several real and synthetic networks, we validate the efficacy of our algorithm and demonstrate that it outperforms the state-of-the-art methods in terms of both influence spread and scalability.
