Article

An analysis of the collaboration network of the International Conference on Conceptual Modeling at the Age of 40


Abstract

The International Conference on Conceptual Modeling celebrated 40 years of existence at its 38th edition held in Salvador, Brazil, on 4–7 November 2019. As one of the most traditional and well-known conferences in the database area, it has its origins in the Entity-Relationship Model proposed by Peter P. Chen in 1975. To celebrate such an accomplishment, this article goes over the ER history from distinct perspectives. Overall, we investigate the complete ER collaboration network built on bibliographic data collected from DBLP, comprising its 38 editions held from 1979 to 2019. We analyze several aspects regarding the evolution of its network metrics, such as degree, clustering coefficient and average shortest path, over the four decades. In particular, we analyze the role of the most engaged ER authors, the number of distinct authors, institutions and published papers, and the evolution of some of the most frequent terms presented in the titles of its papers, as well as the influence and impact of the prominent ER authors.
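
The three metrics named in the abstract (degree, clustering coefficient and average shortest path) can be illustrated on a toy co-authorship network. The sketch below is not the authors' pipeline: the paper list is hypothetical, the function names are chosen here for illustration, and the average shortest path is taken over pairs of authors that can reach each other, which is only one of several possible conventions.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy data: each paper is a list of author names (not real ER data).
papers = [
    ["A", "B", "C"],
    ["C", "D"],
    ["D", "E"],
]

def build_coauthorship_graph(papers):
    """Return an adjacency map: author -> set of co-authors."""
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        # Every pair of authors on the same paper is linked by an edge.
        for a, b in combinations(set(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def degree(graph, node):
    """Number of distinct co-authors of a node."""
    return len(graph[node])

def clustering_coefficient(graph, node):
    """Fraction of pairs of the node's neighbours that are themselves linked."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))

def average_shortest_path(graph):
    """Mean BFS distance over all ordered pairs of mutually reachable authors."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in graph[current]:
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

graph = build_coauthorship_graph(papers)
print(degree(graph, "C"), clustering_coefficient(graph, "C"), average_shortest_path(graph))
```

In this toy network, author C has three co-authors, of which only one pair (A and B) has also published together, so C's clustering coefficient is 1/3. Tracking such values per decade is one way to read the "evolution of network metrics" described above.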


... In this way, a conceptual model can serve as a social artifact with respect to the need to capture a shared conceptualization of a group [81]. Much research attempts to understand and characterize the development and application of conceptual modeling (e.g., [137,123,159,50,126,36]). The field of conceptual modeling has evolved over the past four decades and has been influenced by many disciplines including programming languages, software engineering, requirements engineering, database systems, ontologies, and philosophy. ...
... Conceptual modeling activities have been broadly applied in the development of information systems over a wide range of domains for varied purposes [50]. Activities and topics related to conceptual modeling have evolved over the past four decades [123,101]. Notably, Jaakkola and Thalheim [104] highlight the importance of modeling, especially with the current emphasis on the development of artificial intelligence (AI) and machine learning (ML) tasks. Other research has also proposed the need for conceptual modeling to support machine learning and, in general, combining conceptual modeling with artificial intelligence [81,137,123,159,50]. ...
... Notably, Jaakkola and Thalheim [104] highlight the importance of modeling, especially with the current emphasis on the development of artificial intelligence (AI) and machine learning (ML) tasks. Other research has also proposed the need for conceptual modeling to support machine learning and, in general, combining conceptual modeling with artificial intelligence [81,137,123,159,50]. Conceptual models are a "lens" through which humans gain an "intuitive, easy to understand, meaningful, direct and natural mental representation of a domain" [81]. ...
Preprint
Both conceptual modeling and machine learning have long been recognized as important areas of research. With the increasing emphasis on digitizing and processing large amounts of data for business and other applications, it would be helpful to consider how these areas of research can complement each other. To understand how they can be paired, we provide an overview of machine learning foundations and development cycle. We then examine how conceptual modeling can be applied to machine learning and propose a framework for incorporating conceptual modeling into data science projects. The framework is illustrated by applying it to a healthcare application. For the inverse pairing, machine learning can impact conceptual modeling through text and rule mining, as well as knowledge graphs. The pairing of conceptual modeling and machine learning in this way should help lay the foundations for future research.
Article
Both conceptual modeling and machine learning have long been recognized as important areas of research. With the increasing emphasis on digitizing and processing large amounts of data for business and other applications, it would be helpful to consider how these areas of research can complement each other. To understand how they can be paired, we provide an overview of machine learning foundations and development cycle. We then examine how conceptual modeling can be applied to machine learning and propose a framework for incorporating conceptual modeling into data science projects. The framework is illustrated by applying it to a healthcare application. For the inverse pairing, machine learning can impact conceptual modeling through text and rule mining, as well as knowledge graphs. The pairing of conceptual modeling and machine learning in this way should help lay the foundations for future research.
... Jaakkola and Thalheim [18] examine the progression of data models and their role in information systems development, highlighting the importance of modeling. Lima et al. [23] trace the evolution of conceptual models over four decades to identify emerging areas of interest. Härer and Fill [16] investigate the evolution of research topics related to conceptual modeling based on a bibliometric analysis of conceptual modeling research; their two major observations are that topics were related to the technical aspects of modeling and to business processes. ...
... Vision contributions may describe a general vision for some aspect of CM, discuss philosophical issues surrounding CM or CMR, or survey some aspect of the field. Our paper is an example of this type; others include Lima et al. [23] and Härer and Fill [16]. ...
Chapter
To contribute to the ongoing discussions related to understanding and organizing the field of conceptual modeling, this paper presents a reference framework for articulating conceptual modeling research. The framework accommodates the diverse nature of conceptual modeling research contributions. The framework can describe many styles of research, including empirical research. The framework was inspired by, and is able to characterize, a large set of published papers in conceptual modeling. The framework allows researchers and reviewers to acknowledge the contributions of their work. Using the framework to describe a research paper also promotes meaningful discussion among reviewers and readers.
... They concluded that conceptual modeling research is well established and note that conceptual modeling could be applied in humanities, law, and the natural sciences. Lima et al. [25] analyzed the collaboration network of authors who published in the first 40 years of the International Conference on Conceptual Modeling (the ER conference) using various graph metrics, e.g. degree, clustering coefficient, and average shortest path. ...
... Cosentino et al. (Cosentino et al., 2016) focused on authors and papers in conferences to automate conference analytics. Lima et al. (Lima et al., 2020) analyzed the collaboration of authors over forty years of participation in the International Conference on Conceptual Modeling. ...
... To guide the evolution of conceptual modeling research, various frameworks have been developed [131,146,158,184,196,236] where the authors consider the state of the art in theory and practice. Most of these efforts do not engage in a comprehensive survey of conceptual modeling publications. ...
Article
Conceptual modeling is an important part of information systems development and use that involves identifying and representing relevant aspects of reality. Although the past decades have experienced continuous digitalization of services and products that impact business and society, conceptual modeling efforts are still required to support new technologies as they emerge. This paper surveys research on conceptual modeling over the past five decades and shows how its topics and trends continue to evolve to accommodate emerging technologies, while remaining grounded in basic constructs. We survey over 5,300 papers that address conceptual modeling topics from the 1970s to the present, which are collected from 35 multidisciplinary journals and conferences, and use them as the basis from which to analyze the progression of conceptual modeling. The important role that conceptual modeling should play in our evolving digital world is discussed, and future research directions proposed.
... Indeed, Guarino [29] has long recognized that all information systems have ontologies that are not explicit, but embedded in parts of the systems. For conceptual modeling, research on ontologies has become an important and active area of inquiry over the past fifteen years [46]. ...
Article
Since the first version of the Entity–Relationship (ER) model proposed by Peter Chen over forty years ago, both the ER model and conceptual modeling activities have been key success factors for modeling computer-based systems. During the last decade, conceptual modeling has been recognized as an important research topic in academia, as well as a necessity for practitioners. However, there are many research challenges for conceptual modeling in contemporary applications such as Big Data, data-intensive applications, decision support systems, e-health applications, and ontologies. In addition, there remain challenges related to the traditional efforts associated with methodologies, tools, and theory development. Recently, novel research is uniting contributions from both the conceptual modeling area and the Artificial Intelligence discipline in two directions. The first concerns how conceptual modeling can aid in the design of Artificial Intelligence (AI) and Machine Learning (ML) algorithms. The second concerns how Artificial Intelligence and Machine Learning can be applied in model-based solutions, such as model-based engineering, to infer and improve the generated models. For the first time in the history of Conceptual Modeling (ER) conferences, we encouraged the submission of papers based on AI and ML solutions in an attempt to highlight research from both communities. In this paper, we present some of the important topics in current research in conceptual modeling. We introduce the selected best papers from the 37th International Conference on Conceptual Modeling (ER’18) held in Xi’an, China and summarize some of the valuable contributions made based on the discussions of these papers. We conclude with suggestions for continued research.
Article
Scientific agenda setting is critical at all levels of research, but can be strongly influenced by structural path dependencies of the science system itself. In this article we examine how knowledge production is shaped by interconnected path dependencies using the field of tropical marine sciences as a global case study. We use scientometric analysis methods on an original data set of 1328 peer-reviewed journal publications to examine publication trends including a co-authorship network analysis, links between author origin and research locations as well as a quantitative analysis of terminology use over space (i.e., region) and time. Scientometric findings are analytically discussed through a conceptual framework premised on theories of path dependency. Findings and critical analysis highlight how tropical marine science provides a prominent global example of how North American, European and Australian science programs predominantly shape knowledge production of the global science system, generating critical reflection on the path dependencies these create on current and likely future knowledge production and science agendas. Similar dependencies face other fields of science, and thus this study provides broadly relevant quantitative observational empirical findings supplemented with a critical social science analysis of how a transcultural Science and Technology Studies lens is useful for unpacking the webs of path dependencies driving, inhibiting and/or shaping global knowledge production, placing meaning and context over observed empirical trends.
Article
The analysis presented here focuses on mapping, based on publication output, the scientific collaboration of African based researchers and the role of the South African research community as a channel for within- and intercontinental collaborations. We have selected 10 scientific fields, namely, Tropical Medicine, Parasitology, Infectious Disease, Ecology, Water Resources, Immunology, Zoology, Plant Sciences, Agricultural and Food Sciences, and Psychology to gain a clear picture of the aforementioned scientific activity. As a first step, we created cooperation networks and visualized them on world-maps. In addition, centrality measures of the network were calculated to see the frequency of involvement regarding different countries, with a focus on South Africa, in the collaboration process. Furthermore, first and last authorship positions of the publications were summed to highlight the influence of the selected authors on the direction of and resources provided to the publications. Finally, the most prominent funding organizations and their focus on the selected fields were singled out. Through combining these steps of analysis, we gained an accurate picture of the level of involvement of the South African research community in within- and intercontinental scientific collaboration.
Article
Australia is a vast country with an average distance of 1911 km between its eight state capital cities. The quantitative impact of this distance on collaboration practices between Australian universities and between different types of Australian universities has not been examined previously and hence our knowledge about the spatial distribution effects, if any, on collaboration practices and opportunities is very limited. The aim of the study reported here was therefore to analyse the effect of distance on the collaboration activities of humanities, arts and social science scholars in Australia, using co-authorship as a proxy for collaboration. In order to do this, gravity models were developed to determine the distance effects on external collaboration between universities in relation to geographic region and institutional alliance of 25 Australian universities. Although distance was found to have a weak impact on external collaboration, the strength of the research publishing record within a university (internal collaboration) was found to be an important factor in determining external collaboration activity levels. This finding would suggest that increasing internal collaboration within universities could be an effective strategy to encourage external collaboration between universities. This strategy becomes even more effective for universities that are further away from each other. Establishing a hierarchical structure of different types of universities within a region can optimise the location advantage in the region to encourage knowledge exchange within that region. The stronger network could also attract more collaboration between networks.
Article
The Brazilian Symposium on Databases (SBBD) celebrated its 30th edition in October 2015. As the database community has evolved over the years, so has the data analysis area. To celebrate such accomplishments, this article goes over the SBBD history from distinct social perspectives. Overall, we investigate the complete SBBD co-authorship network built from bibliographic data of SBBD’s 30 editions, from 1986 to 2015, and analyze several network metrics, considering the network evolution over the three decades. In particular, we analyze the progress of the most engaged SBBD authors, the number of distinct authors, institutions, and published papers, and the evolution of some of the most frequent terms presented in the titles of the papers, as well as the influence and impact of the most prominent SBBD authors.
Conference Paper
Conferences have become primary sources of dissemination in computer science research, in particular, in the software engineering and database fields. Assessing the quality, scope and community of conferences is therefore crucial for any researcher. However, digital libraries and online bibliographic services offer little help on this, thus providing only basic metrics. Researchers are instead forced to resort to the tedious task of manually browsing different sources (e.g., DBLP, Google Scholar or conference sites) to gather relevant information about a given venue. In this paper we propose a conceptual schema providing a holistic view of conference-related information (e.g., authors, papers, committees and topics). This schema is automatically and incrementally populated with data available online. We show how this schema can be used as a single information source for a variety of complex queries and metrics to characterize the ER conference. Our approach has been implemented and made available online.
Article
Quantitative and qualitative studies of scientific performance provide a measure of scientific productivity and represent a stimulus for improving research quality. Whatever the goal (e.g., hiring, firing, promoting or funding), such analyses may inform research agencies on directions for funding policies. In this article, we perform a data-driven assessment of the performance of top Brazilian computer science researchers considering three central dimensions: career length, number of students mentored, and volume of publications and citations. In addition, we analyze the researchers’ publishing strategy, based upon their area of expertise and their focus on venues of different impact. Our findings demonstrate that it is necessary to go beyond counting publications to assess research quality and show the importance of considering the peculiarities of different areas of expertise while carrying out such an assessment.
Article
Research productivity assessment is increasingly relevant for allocation of research funds. On one hand, this assessment is challenging because it involves both qualitative and quantitative analysis of several characteristics, most of them subjective in nature. On the other hand, current tools and academic social networks make bibliometric data web-available to everyone for free. Those tools, especially when combined with other data, are able to create a rich environment from which information on research productivity can be extracted. In this context, our work aims at characterizing the Brazilian Computer Science graduate programs and the relationships among them. We (i) present views of the programs from different perspectives, (ii) rank the programs according to each perspective and a combination of them, (iii) show correlation between assessment metrics, (iv) discuss how programs relate to one another, and (v) infer aspects that boost programs' research productivity. The results indicate that programs with a higher insertion in the coauthorship network topology also possess a higher research productivity between 2004 and 2009.
Article
The Brazilian symposium on computer networks and distributed systems (SBRC) reached its 30th edition as the paramount scientific event in the area of computer networks and distributed systems in Brazil. Faced with this opportune moment in the event’s history, we here study the collaboration network established among authors who have jointly published in the symposium. Towards that end, we collected bibliographic data from all 30 editions, and built the co-authorship network of the event. We then analyzed the network structural features and evolution throughout its history. Our results reveal the main kind of co-author relationship among authors, show the most prominent communities within SBRC, the regions of Brazil that attract the most authors, the researchers with central roles in the network as well as the importance of inter-state collaborations. Finally, we align our results with historical facts that may have had a key impact on the symposium success.
Article
The Brazilian Lattes Platform is an important academic/resume dataset that registers all of the academic activity of researchers associated with different major knowledge areas. Currently, the activity of over a million researchers has been registered in this dataset. The academic information collected in this dataset is used to evaluate, analyze, and document the scientific production of research groups. Information about the interactions between Brazilian researchers in the form of co-authorships, however, has not been analyzed. In this paper we identified and characterized Brazilian academic co-authorship networks of researchers registered in the Lattes Platform, using topological properties of graphs. For this purpose, we explored (i) strategies to develop a very large Lattes curricula dataset, (ii) an algorithm for identifying automatic co-authorships based on bibliographic information, and (iii) topological metrics to investigate interactions among researchers. The aim of our study was to characterize co-authorship networks to gain an in-depth understanding of the network structures and dynamics (social behavior) among researchers in all available Brazilian major knowledge areas. In this study, we evaluated information from a total of 1,131,912 researchers associated with the 8 major Brazilian knowledge areas: Agricultural Sciences; Biological Sciences; Exact and Earth Sciences; Humanities; Applied Social Sciences; Health Sciences; Engineering; and Linguistics, Letters and Arts.
Conference Paper
The demand for quality assessment criteria and associated evaluation methods in academia is increasing and has been the focus of many studies in the last decade. This growth arises from the pursuit of academic excellence and the need to support the decision making of funding agencies. The high pressure from such a scenario requires objectively defined quality criteria. In this paper, we develop an assessment procedure for evaluating graduate programs based on the internal collaborations among their research groups. These collaborations are evaluated through analysis of co-authorship networks based on novel metrics of social interaction. Furthermore, our procedure is easily reproduced and may be customized for evaluating any set of research groups. Our experiments show that the rankings provided by our metrics agree with the baseline (the official ranking defined by a national agency).
Conference Paper
In social network analysis, a k-clique is a relaxed clique, i.e., a k-clique is a quasi-complete sub-graph. A k-clique in a graph is a sub-graph where the distance between any two vertices is no greater than k. The visualization of a small number of vertices can be easily performed in a graph. However, when the number of vertices and edges increases the visualization becomes incomprehensible. In this paper, we propose a new graph mining approach based on k-cliques. The concept of relaxed clique is extended to the whole graph, to achieve a general view, by covering the network with k-cliques. The sequence of k-clique covers is presented, combining small world concepts with community structure components. Computational results and examples are presented.
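
The k-clique definition given in this abstract (every pair of vertices at distance no greater than k) can be checked directly with breadth-first search. The sketch below is only an illustration of that definition, not the graph mining approach proposed in the paper. It assumes that distances are measured in the whole graph rather than inside the candidate sub-graph, and the function names and the example path graph are hypothetical.

```python
from collections import deque
from itertools import combinations

def bfs_distances(graph, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in graph[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist

def is_k_clique(graph, vertices, k):
    """True if every pair in `vertices` is at distance <= k in `graph`.

    Assumption: distances are taken in the full graph, so a shortest
    path may pass through vertices outside the candidate set.
    """
    for u, v in combinations(vertices, 2):
        d = bfs_distances(graph, u).get(v)
        if d is None or d > k:  # unreachable or too far apart
            return False
    return True

# Hypothetical example: the path graph 1 - 2 - 3 - 4.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

print(is_k_clique(path, [1, 2, 3], 2))     # vertices 1 and 3 are 2 hops apart
print(is_k_clique(path, [1, 2, 3, 4], 2))  # vertices 1 and 4 are 3 hops apart
```

With k = 1 the check reduces to an ordinary clique test, which is why the abstract calls a k-clique a relaxed clique.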
Conference Paper
We analyze knowledge production in Computer Science by means of coauthorship networks. For this, we consider 30 graduate programs of different regions of the world, being 8 programs in Brazil, 16 in North America (3 in Canada and 13 in the United States), and 6 in Europe (2 in France, 1 in Switzerland and 3 in the United Kingdom). We use a dataset that consists of 176,537 authors and 352,766 publication entries distributed among 2,176 publication venues. The results obtained for different metrics of collaboration social networks indicate the process of knowledge production has changed differently for each region. Research is increasingly done in teams across different fields of Computer Science. The size of the giant component indicates the existence of isolated collaboration groups in the European network, contrasting to the degree of connectivity found in the Brazilian and North-American counterparts. We also analyzed the temporal evolution of the social networks representing the three regions. The number of authors per paper experienced an increase in a time span of 12 years. We observe that the number of collaborations between authors grows faster than the number of authors, benefiting from the existing network structure. The temporal evolution shows differences between well-established fields, such as Databases and Computer Architecture, and emerging fields, like Bioinformatics and Geoinformatics. The patterns of collaboration analyzed in this paper contribute to an overall understanding of Computer Science research in different geographical regions that could not be achieved without the use of complex networks and a large publication database.
Article
This article reports on a study conducted to assess the quality of the top Brazilian Computer Science graduate programs. The study is based on data from DBLP and considers the scientific production of these programs in the triennium 2004–2006. A comparison of the scientific production of the Brazilian programs against that of reputable programs in North America and Europe indicates that the former compares well with these programs, both in terms of publication rate and number of graduates. The study also shows that the Brazilian programs follow international publication ratios of more than two conference papers per journal article. These results are a clear indication that the Computer Science field has reached maturity in Brazil.
Article
This paper analyzes the distribution of some characteristics of computer scientists in Brazil according to region and gender. A computer scientist is defined as a faculty member of a graduate-level computer science department. Under this definition, there were 886 computer scientists in Brazil in November 2006.
Article
The structure, dynamics, and importance of the social network of collaboration among scientists have already been studied, sometimes yielding counter-intuitive conclusions. In this paper we investigate the role played by people who served as PC (Program Committee) members in the network formed by members of the Brazilian computer science community and their co-authors. Some characteristics of this network are compared with those reported in similar studies involving other scientific collaboration networks. As a result, we show that apart from the evidence of Milgram’s phenomenon (six degrees of separation), there is no other community with completely similar patterns (among those used for comparison). This is probably due to the unique characteristics of the target network. For instance, its members do not necessarily interact with each other in terms of co-authorship since they belong to different sub-areas of computer science. There is strong evidence that the clusters in this network are connected by non-Brazilian members. Moreover, nodes with high degrees have little connection to Brazilian authors.
Article
Collaboration networks are social networks in which relationships represent some kind of professional collaboration. The study of collaboration networks can help identify individuals or groups that are important or influential within a given community. We start this work by characterizing the structural properties of the scientific collaboration network in the area of Computer Science. In particular, we consider the global network (all individuals) and the Brazilian network (individuals affiliated with Brazilian institutions) and establish a direct comparison between them. Our empirical results indicate that despite exhibiting features found in most social networks, these two networks also have some interesting differences. We then present a novel approach to rank individuals within a group in the network (as opposed to ranking all individuals) using solely their relationships. Intuitively, the importance assigned to an individual by our metric is proportional to the intensity of its relationship to the outside of the group. We use the proposed approach and other classical metrics to rank individuals of the Brazilian network and compare the results with the ranking of the Research Fellowship Program of CNPq (an agency of the Brazilian Ministry of Science and Technology). The direct comparison indicates the effectiveness of the proposed approach in identifying influential researchers, in particular when considering top ranked individuals. We then extend the proposed approach to rank small groups of individuals (as opposed to single individuals). We apply this and other classical metrics to rank graduate programs in Computer Science in Brazil and compare the results with the ranking of graduate programs provided by CAPES (an agency of the Brazilian Ministry of Education). Our results indicate that the proposed method can effectively identify influential groups such as well-established graduate programs in Brazil.
Article
Many complex systems in nature and society can be described in terms of networks capturing the intricate web of connections among the units they are made of. A key question is how to interpret the global organization of such networks as the coexistence of their structural subunits (communities) associated with more highly interconnected parts. Identifying these a priori unknown building blocks (such as functionally related proteins, industrial sectors and groups of people) is crucial to the understanding of the structural and functional properties of networks. The existing deterministic methods used for large networks find separated communities, whereas most of the actual networks are made of highly overlapping cohesive groups of nodes. Here we introduce an approach to analysing the main statistical features of the interwoven sets of overlapping communities that makes a step towards uncovering the modular structure of complex systems. After defining a set of new characteristic quantities for the statistics of communities, we apply an efficient technique for exploring overlapping communities on a large scale. We find that overlaps are significant, and the distributions we introduce reveal universal features of networks. Our studies of collaboration, word-association and protein interaction graphs show that the web of communities has non-trivial correlations and specific scaling properties.
Article
Identifying female CS scientists by combining a robust bibliographic database and name filtering tools.
Article
Knowing the driving factors and understanding researcher behaviors from the dynamics of collaborations over time offers useful insights, for example by helping funding agencies design research grant policies. We present a longitudinal network analysis of collaborations observed through co-authorship over 15 years. Since co-authors may influence researchers to change their interests, we focus on researchers who could become influencers and propose a stochastic actor-oriented model of bipartite (two-mode) author-topic networks built from article metadata. Information on the scientific fields or topics of articles, which could represent the interests of researchers, is often unavailable in the metadata. This topic-absence issue differentiates this work from other studies on collaboration dynamics based on title-abstract metadata and author properties. Therefore, our work also includes procedures to extract and map clustered keywords as a substitute for research-interest topics. The next step is to generate panel waves of co-author networks and bipartite author-topic networks for the longitudinal analysis. The proposed model is used to find the driving factors of co-authoring collaboration, with a focus on researcher behavior in interest changes. This paper investigates these dynamics in an academic social network setting using selected metadata of publicly available crawled articles in the interrelated domains of “natural language processing” and “information extraction”. Based on the evidence of network evolution, researchers tend to conform in their co-authoring behavior when publishing articles and exploring topics. Our results indicate that the processes of selection and influence in forming co-author ties exert some level of social pressure on researchers. We also discuss how this co-author pressure accelerates changes in the interests and behaviors of researchers.
Conference Paper
sonSQL is a MySQL variant that aims to be the default database system for social network data. It uses a conceptual schema called sonSchema to translate a social network design into logical tables. This paper introduces sonSchema, shows how it can be instantiated, and illustrates social network analysis for sonSchema datasets. Experiments show such SQL-based analysis brings insight into community evolution, cluster discovery and action propagation.
Article
To identify and analyze the collaboration patterns between scientists in the computer science discipline, we constructed a new coauthorship network by extracting the article information for the whole year of 2012 from the DBLP library. The nodes are scientists, and two scientists are connected if they have coauthored a paper. We study the network in depth using complex network theory. The empirical analysis shows that the productivity of authors follows a power-law distribution and the size of collaboration follows an approximately exponential distribution. Moreover, we find that the network is a small-world network and has no apparent scale-free property. Finally, four different indicators are employed to study the impact of authors.
Conference Paper
This paper proposes a conceptual modelling paradigm for network analysis applications, called the Network Analytics ER model (NAER). Not only data requirements but also query requirements are captured by the conceptual description of network analysis applications. This unified analytical framework allows us to flexibly build a number of topology schemas on the basis of the underlying core schema, together with a collection of query topics that describe topological results of interest. In doing so, we can alleviate many issues in network analysis, such as performance, semantic integrity and dynamics of analysis.
Conference Paper
With the advent of social media there is an ever increasing amount of unstructured data that can be analyzed to obtain insights. Two prominent examples are sentiment analysis and the discovery of correlated concepts. A convenient representation of information in such scenarios is in terms of concepts extracted from the unstructured data, and measures, such as sentiment scores, associated with these concepts. Typically, social media analysis reports these concepts and their associated measures. We argue that much richer insights can be obtained through the use of OLAP-style multidimensional analysis. It is fairly straightforward to see how to add traditional dimension hierarchies such as time and geography, and to analyze the data along these dimensions using traditional OLAP operations such as roll-up; for instance, to answer queries of the form "What was the average sentiment for X in Europe during the past month?" However, it is trickier to answer queries of the form "What was the average sentiment for concepts related to X in Europe during the past month?" We introduce a conceptual modeling framework that extends traditional multidimensional models and OLAP operators to address the new set of requirements for data extracted from social media. In this model, we organize data along both traditional dimensions (we call these metadata dimensions) and concept dimensions, which model relationships among concepts using parent-child hierarchies. Specifically: (i) we allow operations on parent-child hierarchies to be treated in a uniform way as operations on traditional dimension hierarchies; (ii) to model the rich relationships that can exist among concepts, we extend the parent-child hierarchies to be rooted level-DAGs rather than simply trees; and (iii) we introduce new equivalence classes that allow us to reason with "similar" concepts in new ways. 
We show that our modeling and operator framework facilitates multidimensional analysis to gain further insights from social media data than is possible with existing methods.
Article
In this paper we investigate the co-authorship graph obtained from all papers published at SIGMOD between 1975 and 2002. We find some interesting facts, for instance, the identity of the authors who, on average, are "closest" to all other authors at a given time. We also show that SIGMOD's co-authorship graph is yet another example of a small world, a graph topology which has received a lot of attention recently. A companion web site for this paper can be found at http://db.cs.ualberta.ca/coauthorship.
Article
Some facts and figures representing 30 years of history of the ACM Symposium on Principles of Database Systems (PODS) are presented. A community is defined as the maximal union of k-cliques that can be reached from each other through a series of adjacent k-cliques. Looking at the names inside communities, it is found that cultural-geographic origin is a strong community-builder. PODS is influenced by and influences other fields of research. Such influence could be measured by examining the conference venues in which PODS researchers publish. While the number of PODS papers does not drastically change over the years, the database theory pie decreases from 15% to 5%, in favor of, for instance, database systems, which increased from 23% to 32%. The success of a community is also reflected by the number of people attending its main event even when they do not have a paper.
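The community definition in this abstract (a maximal union of k-cliques reachable through adjacent k-cliques) can be illustrated with a small sketch. It assumes that two k-cliques are "adjacent" when they share k-1 nodes, which is the usual convention in clique percolation but is not stated in the abstract. The function name `k_clique_communities` and the toy graph are hypothetical, and the brute-force clique enumeration is only practical for very small graphs.

```python
from itertools import combinations

def k_clique_communities(graph, k):
    """Group k-cliques into communities; graph maps node -> set of neighbours."""
    # Enumerate every k-subset of nodes and keep the fully connected ones.
    cliques = [frozenset(c) for c in combinations(sorted(graph), k)
               if all(v in graph[u] for u, v in combinations(c, 2))]

    # Union-find over cliques: adjacent cliques end up in the same set.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        # Assumption: adjacency means sharing exactly k-1 nodes.
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # A community is the union of all nodes of the cliques in one set.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(groups.values(), key=sorted)

# Toy graph: two triangles sharing an edge, plus a third triangle
# that touches them only at node 4.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

print(k_clique_communities(graph, 3))
```

In the toy graph, node 4 belongs to both resulting communities, which shows how this definition allows communities to overlap.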
Article
A data model, called the entity-relationship model, is proposed. This model incorporates some of the important semantic information about the real world. A special diagrammatic technique is introduced as a tool for database design. An example of database design and description using the model and the diagrammatic technique is given. Some implications for data integrity, information retrieval, and data manipulation are discussed. The entity-relationship model can be used as a basis for unification of different views of data: the network model, the relational model, and the entity set model. Semantic ambiguities in these models are analyzed. Possible ways to derive their views of data from the entity-relationship model are presented.
Article
Using data from computer databases of scientific papers in physics, biomedical research, and computer science, we have constructed networks of collaboration between scientists in each of these disciplines. In these networks two scientists are considered connected if they have coauthored one or more papers together. We have studied many statistical properties of our networks, including numbers of papers written by authors, numbers of authors per paper, numbers of collaborators that scientists have, typical distance through the network from one scientist to another, and a variety of measures of connectedness within a network, such as closeness and betweenness. We further argue that simple networks such as these cannot capture the variation in the strength of collaborative ties and propose a measure of this strength based on the number of papers coauthored by pairs of scientists, and the number of other scientists with whom they worked on those papers. Using a selection of our results, we suggest a variety of possible ways to answer the question, "Who is the best connected scientist?"
Article
Networks of coupled dynamical systems have been used to model biological oscillators, Josephson junction arrays, excitable media, neural networks, spatial games, genetic control networks and many other self-organizing systems. Ordinarily, the connection topology is assumed to be either completely regular or completely random. But many biological, technological and social networks lie somewhere between these two extremes. Here we explore simple models of networks that can be tuned through this middle ground: regular networks 'rewired' to introduce increasing amounts of disorder. We find that these systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs. We call them 'small-world' networks, by analogy with the small-world phenomenon (popularly known as six degrees of separation). The neural network of the worm Caenorhabditis elegans, the power grid of the western United States, and the collaboration graph of film actors are shown to be small-world networks. Models of dynamical systems with small-world coupling display enhanced signal-propagation speed, computational power, and synchronizability. In particular, infectious diseases spread more easily in small-world networks than in regular lattices.
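The two quantities this abstract contrasts, clustering and characteristic path length, can be computed on a toy example. The sketch below builds a ring lattice and then adds two long-range shortcuts. Adding edges is a simplification of the rewiring procedure the abstract describes, chosen here so the result is deterministic. All function names and the parameters (20 nodes, 4 neighbours each) are illustrative assumptions.

```python
from collections import deque

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours (k even)."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            graph[i].add(j)
            graph[j].add(i)
    return graph

def average_clustering(graph):
    """Mean fraction of a node's neighbour pairs that are themselves linked."""
    total = 0.0
    for nbrs in graph.values():
        d = len(nbrs)
        if d < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in graph[u])
        total += 2.0 * links / (d * (d - 1))
    return total / len(graph)

def characteristic_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

lattice = ring_lattice(20, 4)

# Copy the lattice and add two shortcuts across the ring.
shortcut = {node: set(nbrs) for node, nbrs in lattice.items()}
for a, b in [(0, 10), (5, 15)]:
    shortcut[a].add(b)
    shortcut[b].add(a)

for name, g in [("lattice", lattice), ("with shortcuts", shortcut)]:
    print(name, average_clustering(g), characteristic_path_length(g))
```

Because shortcuts only add edges, path lengths can never grow, and in this example the mean path length drops while clustering stays fairly high.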
Article
Using computer databases of scientific papers in physics, biomedical research, and computer science, we have constructed networks of collaboration between scientists in each of these disciplines. In these networks two scientists are considered connected if they have coauthored one or more papers together. Here we study a variety of nonlocal statistics for these networks, such as typical distances between scientists through the network, and measures of centrality such as closeness and betweenness. We further argue that simple networks such as these cannot capture variation in the strength of collaborative ties and propose a measure of collaboration strength based on the number of papers coauthored by pairs of scientists, and the number of other scientists with whom they coauthored those papers.
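The collaboration-strength measure mentioned here depends on how many papers a pair coauthored and how many other authors shared each paper. One common formalization, assumed in this sketch because the abstract does not give a formula, adds 1/(n-1) to a pair's weight for every paper with n authors that they share. The function name `collaboration_strength` and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def collaboration_strength(papers):
    """Return {(author_a, author_b): weight} for a list of author lists."""
    weights = defaultdict(float)
    for authors in papers:
        unique = sorted(set(authors))
        n = len(unique)
        if n < 2:
            continue  # single-author papers create no collaboration tie
        for a, b in combinations(unique, 2):
            # Assumed weighting: each paper contributes 1 / (n - 1) per pair,
            # so ties from papers with many authors count for less.
            weights[(a, b)] += 1.0 / (n - 1)
    return dict(weights)

papers = [["Ana", "Bo"], ["Ana", "Bo", "Cy"], ["Cy"]]
for pair, weight in sorted(collaboration_strength(papers).items()):
    print(pair, weight)
```

Under this assumption, a pair that wrote a two-author paper together is tied more strongly than a pair that only appeared together on a three-author paper.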
Article
By using data from three bibliographic databases in biology, physics, and mathematics, respectively, networks are constructed in which the nodes are scientists, and two scientists are connected if they have coauthored a paper. We use these networks to answer a broad variety of questions about collaboration patterns, such as the numbers of papers authors write, how many people they write them with, what the typical distance between scientists is through the network, and how patterns of collaboration vary between subjects and over time. We also summarize a number of recent results by other authors on coauthorship patterns.
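The basic counts listed in this abstract (papers per author, collaborators per author, authors per paper) follow directly from a list of author lists. The sketch below is a minimal illustration; the function name `coauthorship_stats` and the sample records are hypothetical and do not come from any of the databases mentioned.

```python
from collections import defaultdict

def coauthorship_stats(papers):
    """Compute simple collaboration counts from a list of author lists."""
    papers_per_author = defaultdict(int)
    collaborators = defaultdict(set)
    for authors in papers:
        unique = set(authors)
        for author in unique:
            papers_per_author[author] += 1
            # Two scientists are connected if they coauthored a paper.
            collaborators[author] |= unique - {author}
    mean_authors_per_paper = sum(len(set(p)) for p in papers) / len(papers)
    return dict(papers_per_author), dict(collaborators), mean_authors_per_paper

papers = [["A", "B"], ["B", "C", "D"], ["A"]]
counts, collaborators, mean_size = coauthorship_stats(papers)
print(counts)
print({author: sorted(peers) for author, peers in collaborators.items()})
print(mean_size)
```

The `collaborators` mapping is also an adjacency structure for the coauthorship network, so it can be passed to graph algorithms such as breadth-first search to measure distances between scientists.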
Article
I propose the index h, defined as the number of papers with citation number ≥h, as a useful index to characterize the scientific output of a researcher. Keywords: citations, impact, unbiased.
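The definition above can be made concrete with a short sketch. It reads the definition as "the largest h such that h papers have at least h citations each"; the function name `h_index` and the sample citation counts are illustrative only.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))  # 4: four papers have at least 4 citations
```

Sorting in descending order means the loop can stop at the first paper whose citation count falls below its rank.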
Article
Long a matter of folklore, the "small-world phenomenon" (the principle that we are all linked by short chains of acquaintances) was inaugurated as an area of experimental study in the social sciences through the pioneering work of Stanley Milgram in the 1960's. This work was among the first to make the phenomenon quantitative, allowing people to speak of the "six degrees of separation" between any two people in the United States. Since then, a number of network models have been proposed as frameworks in which to study the problem analytically. One of the most refined of these models was formulated in recent work of Watts and Strogatz; their framework provided compelling evidence that the small-world phenomenon is pervasive in a range of networks arising in nature and technology, and a fundamental ingredient in the evolution of the World Wide Web. But existing models are insufficient to explain the striking algorithmic component of Milgram's original findings: that individuals using local information are collectively very effective at actually constructing short paths between two points in a social network. Although recently proposed network models are rich in short paths, we prove that no decentralized algorithm, operating with local information only, can construct short paths in these networks with non-negligible probability. We then define an infinite family of network models that naturally generalizes the Watts-Strogatz model, and show that for one of these models, there is a decentralized algorithm capable of finding short paths with high probability. More generally, we provide a strong characterization of this family of network models, showing that there is in fact a unique model within the family for which decentralized algorithms are effect...
Article
Complex networks describe a wide range of systems in nature and society, much quoted examples including the cell, a network of chemicals linked by chemical reactions, or the Internet, a network of routers and computers connected by physical links. While traditionally these systems were modeled as random graphs, it is increasingly recognized that the topology and evolution of real networks is governed by robust organizing principles. Here we review the recent advances in the field of complex networks, focusing on the statistical mechanics of network topology and dynamics. After reviewing the empirical data that motivated the recent interest in networks, we discuss the main models and analytical tools, covering random graphs, small-world and scale-free networks, as well as the interplay between topology and the network's robustness against failures and attacks. Comment: 54 pages, submitted to Reviews of Modern Physics
A. Barabási, Network Science, Cambridge University Press, 2016.
S. Wasserman, K. Faust, Social Network Analysis: Methods and Applications, Structural Analysis in the Social Sciences, vol. 8, Cambridge University Press, 2007.
C. Chen, I.-Y. Song, W. Zhu, Trends in citation analysis of the ER conference papers (1979-2005), in: Proceedings of the 11th International Conference on the International Society for Scientometrics and Informetrics, Society for Scientometrics and Informetrics, Leuven, Belgium, 2007, pp. 189-200.
H. Lima, T.H.P. Silva, M.M. Moro, R.L.T. Santos, W. Meira Jr., A.H.F. Laender, Aggregating productivity indices for ranking researchers across multiple areas, in: Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries, 2013, pp. 97-106.
A.F. Smeaton, G. Keogh, C. Gurrin, K. McDonald, T. Sødring, Analysis of papers from twenty-five years of SIGIR conferences: What have we been doing for the last quarter of a century?, SIGIR Forum 37 (1) (2003) 49-53.
X. Liu, J. Bollen, M.L. Nelson, H. Van de Sompel, Co-authorship networks in the digital library research community, Inf. Process. Manage. 41 (6) (2005) 1462-1480.
Mirella M. Moro is an associate professor in the Computer Science Department at UFMG (Belo Horizonte, Brazil). She holds a Ph.D. in Computer Science (University of California, Riverside (UCR), 2007), as well as an M.Sc. and a B.Sc. in Computer Science (UFRGS, Brazil). She was a member of the ACM Education Council and the Education Director of SBC (Brazilian Computer Society, 2009-2015), the editor-in-chief of the electronic magazine SBC Horizontes (2008-2012), and an associate editor of JIDM (2010-2012). Mirella has been working with research in Computer Science in the area of Databases since 1997. Her research interests include social network analysis, query optimization, and hybrid database modeling. Her recent publications include papers in prestigious venues such as Scientometrics, Data & Knowledge Engineering, ACM Hypertext, IEEE/WIC/ACM Web Intelligence, TPDL, JCDL, as well as JIDM and SBBD. She is also an advocate for increasing women's participation in Computer Science, coordinating projects such as BitGirls.