Thesis

Digital traces, sociology and Twitter: between false promises and real potential

Abstract

We are told that society changes. It evolves toward a more fluid, active and horizontal form of socialisation (Bauman, 2000; Castells, 2009; Sheller & Urry, 2006; Urry, 2000; Wittel, 2001). As much as a change in the social itself, this is also a change in the conception sociology gives to sociality and to society at large. There is a shift in social theory, with the recent rise of post-demographic perspectives and postmodernism (Latour, 2005, 2011, 2013; Ruppert, Law, & Savage, 2013). Alongside these 'post' changes, another revolution takes place in the use of new technology, and mainly of the Web. These new uses are associated with an explosion in the quantity of digital traces, often labelled Big Data. This Big Data paradigm brings radical changes in research that are perfectly suited to computational researchers and other data scientists, and they are taking full advantage of it (Hey, Tansley, & Tolle, 2009). This new landscape, in society and in research, puts pressure on sociology, and on other social science fields, to find adequate answers to these societal, theoretical and methodological challenges. The best proxy for these challenges, and for the tensions between scientific fields and the new forms of social interaction, are the Social Network Sites (SNSs). They represent the extreme case of horizontal and fluid interactions while producing an incredible amount of accessible digital traces.
These digital traces are the essential bricks for all research using SNSs. Despite this importance, however, little research actually investigates and explains how these digital traces are produced and what the impact of their context of access, collection and aggregation is. This thesis focuses on these digital traces and on the gap left in the literature. This empty space is, however, conceived as a central point of tension between sociological positions on how to define new social interactions and the methodological principles imposed by the logic of Big Data. The work is articulated around one specific social network, Twitter. The reason for this choice lies in its openness, the ease of use of its APIs, and, in consequence, the fact that it is for now the most extensively studied SNS.
I begin with the definition of the new form of sociality, using the network as the key concept around which several notions, such as the social, the cultural and the technological, can be articulated. I conclude that none of these evolutions is independent; they need to be seen as co-integrated. In consequence, the change in social interaction needs to be seen as much as a factual change as a change in our way of interpreting it. From this conception of the network and of its importance in our understanding of social interactions, I retrace the evolution of the notion of Big Data, specifically with the example of Tesco and its Clubcard. This is the first step in locating the technological changes within a more comprehensive methodological framework. This framework, the transactional perspective, is decomposed in order to understand the consequences of such a position when applied to SNSs and specifically to Twitter. This provides a first explanation of why research focuses almost entirely on tweets, and of the consequences for our understanding of interaction on this social web service.
I then use this first iteration of the definition of a digital trace to build a new definition of what a Social Network Site is, centred around the concepts of activity and context. I operationalise these concepts on Twitter to develop a new method for capturing social interactions and digital traces that are often set aside because of the difficulty of accessing them. The method takes into account the limits imposed by the Twitter APIs and describes the consequences they have on the generation of a dataset. It is based on a constant screening of sampled profiles over time, which makes it possible to reconstruct information missing from the profile (the trace of the changes in the friends' and followers' lists). This information provides the measures of context (the user's network) and activity (tweeting and adding or removing links) defined earlier. The resulting dataset offers an opportunity to see the importance of the aggregation process and the flexibility afforded by digital traces. Following this, I develop three analyses with different levels of aggregation, for different purposes. The first analysis tests the hypothesis that the activity in users' context influences their own activity over time. The second analysis does not use time as the unit of aggregation but tests the same hypothesis at the individual level. Finally, the information about the activity itself is analysed in order to see to what extent the obtainable digital traces contain sufficient information about the change in activity itself.
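To make the screening-and-diffing idea concrete, the sketch below illustrates one possible way to log changes in a profile's follower list between successive screening passes. It is an illustration only, not the code used in the thesis: fetch_follower_ids is a hypothetical stand-in for whatever Twitter API client, pagination and rate-limit handling the actual collection relied on.

```python
# Minimal sketch (illustrative, not the thesis implementation): repeatedly
# snapshot the follower lists of sampled profiles and log what changed.
import time
from typing import Callable, Dict, List, Set, Tuple


def diff_snapshots(previous: Set[int], current: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Return (added, removed) follower ids between two screening passes."""
    return current - previous, previous - current


def screen_profiles(
    user_ids: List[int],
    fetch_follower_ids: Callable[[int], Set[int]],  # hypothetical API wrapper
    passes: int = 3,
    pause_seconds: float = 0.0,
) -> Dict[int, List[Tuple[Set[int], Set[int]]]]:
    """Screen sampled profiles over several passes and record link changes.

    The per-user log of (added, removed) sets is the raw material for the
    'context' measure (the user's network) and for the link-related part of
    the 'activity' measure (adding or removing links) described above.
    """
    snapshots: Dict[int, Set[int]] = {}
    changes: Dict[int, List[Tuple[Set[int], Set[int]]]] = {uid: [] for uid in user_ids}

    for _ in range(passes):
        for uid in user_ids:
            current = fetch_follower_ids(uid)
            if uid in snapshots:
                changes[uid].append(diff_snapshots(snapshots[uid], current))
            snapshots[uid] = current
        time.sleep(pause_seconds)  # leave room for API rate limits between passes

    return changes
```

With a real API call substituted for fetch_follower_ids, the accumulated (added, removed) pairs can then be aggregated either by time window or per individual user, mirroring the different levels of aggregation used in the three analyses described above.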

References
Article
We conceive social media platforms as sociotechnical entities that variously shape user platform involvement and participation. Such shaping develops along three fundamental data operations that we subsume under the terms of encoding, aggregation, and computation. Encoding entails the engineering of user platform participation along narrow and standardized activity types (e.g., tagging, liking, sharing, following). This heavily scripted platform participation serves as the basis for the procurement of discrete and calculable data tokens that are possible to aggregate and, subsequently, compute in a variety of ways. We expose these operations by investigating a social media platform for shopping. We contribute to the current debate on social media and digital platforms by describing social media as posttransactional spaces that are predominantly concerned with charting and profiling the online predispositions, habits, and opinions of their user base. Such an orientation sets social media platforms apart from other forms of mediating online interaction. In social media, we claim, platform participation is driven toward an endless online conversation that delivers the data footprint through which a computed sociality is made the source of value creation and monetization.
Article
Research typically focuses on one medium. But in today’s digital media environment, people use and are influenced by their experience with multiple systems. Building on media ecology research, we introduce the notion of integrated media effects. We draw on resource dependence and homophily theories to analyze the mechanisms that connect media systems. To test the integrated media effects, we examine the relationships between news media visibility and social media visibility and hyperlinking patterns among 410 nongovernmental organization (NGO) websites in China. NGOs with greater news media visibility and more social media followers receive significantly more hyperlinks. Further, NGOs with a similar number of social media followers prefer to hyperlink to each other. The results suggest that both news media and social media systems are related to the configuration of hyperlink networks, providing support for the integrated media effects described. Implications for the study of hyperlink networks, online behaviors of organizations, and public relations are drawn from the results.
Article
(abridged for arXiv) With the first direct detection of gravitational waves, the Advanced Laser Interferometer Gravitational-wave Observatory (LIGO) has initiated a new field of astronomy by providing an alternate means of sensing the universe. The extreme sensitivity required to make such detections is achieved through exquisite isolation of all sensitive components of LIGO from non-gravitational-wave disturbances. Nonetheless, LIGO is still susceptible to a variety of instrumental and environmental sources of noise that contaminate the data. Of particular concern are noise features known as glitches, which are transient and non-Gaussian in their nature, and occur at a high enough rate so that accidental coincidence between the two LIGO detectors is non-negligible. In this paper we describe an innovative project that combines crowdsourcing with machine learning to aid in the challenging task of categorizing all of the glitches recorded by the LIGO detectors. Through the Zooniverse platform, we engage and recruit volunteers from the public to categorize images of glitches into pre-identified morphological classes and to discover new classes that appear as the detectors evolve. In addition, machine learning algorithms are used to categorize images after being trained on human-classified examples of the morphological classes. Leveraging the strengths of both classification methods, we create a combined method with the aim of improving the efficiency and accuracy of each individual classifier. The resulting classification and characterization should help LIGO scientists to identify causes of glitches and subsequently eliminate them from the data or the detector entirely, thereby improving the rate and accuracy of gravitational-wave observations. We demonstrate these methods using a small subset of data from LIGO's first observing run.
Article
Social media can be viewed as a social system where the currency is attention. People post content and interact with others to attract attention and gain new followers. In this paper, we examine the distribution of attention across a large sample of users of a popular social media site, Twitter. Through empirical analysis of these data we conclude that attention is very unequally distributed: the top 20% of Twitter users own more than 96% of all followers, 93% of the retweets, and 93% of the mentions. We investigate the mechanisms that lead to attention inequality and find that it results from the "rich-get-richer" and "poor-get-poorer" dynamics of attention diffusion. Namely, users who are "rich" in attention, because they are often mentioned and retweeted, are more likely to gain new followers, while those who are "poor" in attention are likely to lose followers. We develop a phenomenological model that quantifies attention diffusion and network dynamics, and solve it to study how attention inequality grows over time in a dynamic environment of social media.
Article
Berger and Luckmann’s concept of “social construction” has been widely adopted in many fields of the humanities and social sciences in the half-century since they wrote The Social Construction of Reality. One field in which constructivism was especially provocative was in Science and Technology Studies (STS), where it was expanded beyond the social domain to encompass the practices and contents of contemporary natural science. This essay discusses the relationship between social construction in STS and Berger and Luckmann’s original conception of it, and identifies problems that arose from indiscriminate uses of constructivism.
Article
This essay concerns itself with a detailed presentation of the contents of two works on the construction of reality in society. One is Berger and Luckmann’s well-known book (The social construction of reality. Penguin Book, Harmondsworth, 1966); the other is the somewhat less known book on the same subject by Holzner (Reality construction in society. Schenkman Publishing Company, Cambridge, 1972; orig. 1968). These works deal with the social construction of shared symbolic and cognitive universes of meaning, and partake of the same theoretical sources, namely, Symbolic Interactionism and Schutz’s phenomenological sociology. They differ in their theoretical pursuits, however. Berger and Luckmann deal with what is necessary for the construction of a shared symbolic world, as society is endowed with both an objective and a subjective reality. Holzner is more interested in the social distribution and control of reality construction, and in the different types of reality constructs.
Article
Big social data have enabled new opportunities for evaluating the applicability of social science theories that were formulated decades ago and were often based on small- to medium-sized samples. Big Data coupled with powerful computing has the potential to replace the statistical practice of sampling and estimating effects by measuring phenomena based on full populations. Preparing these data for analysis and conducting analytics involves a plethora of decisions, some of which are already embedded in previously collected data and built tools. These decisions refer to the recording, indexing and representation of data and the settings for analysis methods. While these choices can have tremendous impact on research outcomes, they are not often obvious, not considered or not being made explicit. Consequently, our awareness and understanding of the impact of these decisions on analysis results and derived implications are highly underdeveloped. This might be attributable to occasional high levels of over-confidence in computational solutions as well as the possible yet questionable assumption that Big Data can wash out minor data quality issues, among other reasons. This article provides examples for how to address this issue. It argues that checking, ensuring and validating the quality of big social data and related auxiliary material is a key ingredient for empowering users to gain reliable insights from their work. Scrutinizing data for accuracy issues, systematically fixing them and diligently documenting these processes can have another positive side effect: Closely interacting with the data, thereby forcing ourselves to understand their idiosyncrasies and patterns, can help us to move from being able to precisely model and formally describe effects in society to also understand and explain them.
Article
The recent Facebook study about emotional contagion has generated a high-profile debate about the ethical and social issues in Big Data research. These issues are not unprecedented, but the debate highlighted that, in focusing on research ethics and the legal issues about this type of research, an important larger picture is overlooked about the extent to which free will is compatible with the growth of deterministic scientific knowledge, and how Big Data research has become central to this growth of knowledge. After discussing the ‘emotional contagion study’ as an illustration, these larger issues about Big Data and scientific knowledge are addressed by providing definitions of data, Big Data and of how scientific knowledge changes the human-made environment. Against this background, it will be possible to examine why the uses of data-driven analyses of human behaviour in particular have recently experienced rapid growth. The essay then goes on to discuss the distinction between basic scientific research as against applied research, a distinction which, it is argued, is necessary to understand the quite different implications in the context of scientific as opposed to applied research. Further, it is important to recognize that Big Data analyses are both enabled and constrained by the nature of data sources available. Big Data research is bound to become more widespread, and this will require more awareness on the part of data scientists, policymakers and a wider public about its contexts and often unintended consequences.
Article
Earth observation technology has provided highly useful information in global climate change research over the past few decades and greatly promoted its development, especially through providing biological, physical, and chemical parameters on a global scale. Earth observation data has the 4V features (volume, variety, veracity, and velocity) of big data that are suitable for climate change research. Moreover, the large amount of data available from scientific satellites plays an important role. This study reviews the advances of climate change studies based on Earth observation big data and provides examples of case studies that utilize Earth observation big data in climate change research, such as synchronous satellite-aerial-ground observation experiments, which provide extremely large and abundant datasets; Earth observational sensitive factors (e.g., glaciers, lakes, vegetation, radiation, and urbanization); and global environmental change information and simulation systems. With the era of global environment change dawning, Earth observation big data will underpin the Future Earth program with a huge volume of various types of data and will play an important role in academia and decision-making. Inevitably, Earth observation big data will encounter opportunities and challenges brought about by global climate change.
Article
In this paper we take advantage of recent developments in identifying the demographic characteristics of Twitter users to explore the demographic differences between those who do and do not enable location services and those who do and do not geotag their tweets. We discuss the collation and processing of two datasets: one focusing on enabling geoservices and the other on tweet geotagging. We then investigate how opting in to either of these behaviours is associated with gender, age, class, the language in which tweets are written and the language in which users interact with the Twitter user interface. We find statistically significant differences for both behaviours for all demographic characteristics, although the magnitude of association differs substantially by factor. We conclude that there are significant demographic variations between those who opt in to geoservices and those who geotag their tweets. Notwithstanding the limitations of the data, we suggest that Twitter users who publish geographical information are not representative of the wider Twitter population.
Article
In this paper we aim to understand the connectivity and communication characteristics of Twitter users who post content subsequently classified by human annotators as containing possible suicidal intent or thinking, commonly referred to as suicidal ideation. We achieve this understanding by analysing the characteristics of their social networks. Starting from a set of human annotated Tweets we retrieved the authors’ followers and friends lists, and identified users who retweeted the suicidal content. We subsequently built the social network graphs. Our results show a high degree of reciprocal connectivity between the authors of suicidal content when compared to other studies of Twitter users, suggesting a tightly-coupled virtual community. In addition, an analysis of the retweet graph has identified bridge nodes and hub nodes connecting users posting suicidal ideation with users who were not, thus suggesting a potential for information cascade and risk of a possible contagion effect. This is particularly emphasised by considering the combined graph merging friendship and retweeting links.
Article
Centrality is one of the most studied concepts in social network analysis. There is a huge literature on centrality measures as ways to identify the most relevant users in a social network. The challenge is to find measures that can be computed efficiently and that are able to classify users according to relevance criteria as close as possible to reality. We address this problem in the context of the Twitter network, an online social networking service with millions of users and an impressive flow of messages that are published and spread daily by interactions between users. Twitter has different types of users, but the greatest utility lies in finding the most influential ones. The purpose of this article is to collect and classify the different Twitter influence measures that exist so far in the literature. These measures are very diverse. Some are based on simple metrics provided by the Twitter API, while others are based on complex mathematical models. Several measures are based on the PageRank algorithm, traditionally used to rank websites on the Internet. Some consider the timeline of publication, others the content of the messages, some are focused on specific topics, and others try to make predictions. We consider all these aspects, and some additional ones. Furthermore, we include measures of activity and popularity, the traditional mechanisms used to correlate measures, and some important aspects of computational complexity for this particular context.
Article
This paper introduces a distinctive approach to methods development in digital social research called ‘interface methods’. We begin by discussing various methodological confluences between digital media, social studies of science and technology (STS) and sociology. Some authors have posited significant overlap between, on the one hand, sociological and STS concepts, and on the other hand, the ontologies of digital media. Others have emphasized the significant differences between prominent methods built into digital media and those of STS and sociology. This paper advocates a third approach, one that (a) highlights the dynamism and relative under-determinacy of digital methods, and (b) affirms that multiple methodological traditions intersect in digital devices and research. We argue that these two circumstances enable a distinctive approach to methodology in digital social research – thinking methods as ‘interface methods’ – and the paper contextualizes this approach in two different ways. First, we show how the proliferation of online data tools or ‘digital analytics’ opens up distinctive opportunities for critical and creative engagement with methods development at the intersection of sociology, STS and digital research. Second, we discuss a digital research project in which we investigated a specific ‘interface method’, namely co-occurrence analysis. In this digital pilot study we implemented this method in a critical and creative way to analyse and visualize ‘issue dynamics’ in the area of climate change on Twitter. We evaluate this project in the light of our principal objective, which was to test the possibilities for the modification of methods through experimental implementation and interfacing of various methodological traditions. To conclude, we discuss a major obstacle to the development of ‘interface methods’: digital media are marked by particular quantitative dynamics that seem adverse to some of the methodological commitments of sociology and STS. To address this, we argue in favour of a methodological approach in digital social research that affirms its maladjustment to the research methods that are prevalent in the medium.
Chapter
While the rise of social media has made activists much less dependent on television and mainstream newspapers, this certainly does not mean that activists have more control over the media environments in which they operate. Media power has neither been transferred to the public, nor to activists for that matter; instead, power has partly shifted to the technological mechanisms and algorithmic selections operated by large social media corporations (Facebook, Twitter, Google). Through such technological shaping, social media greatly enhance the news-oriented character of activist communication, shifting the focus away from protest issues towards the spectacular, newsworthy, and ‘conflictual’ aspects of protest. Simultaneously, social platforms not only allow users to engage in personal networks but also steer them towards such connections. While personal networks and viral processes of content dissemination can generate strong sentiments of togetherness, they are antithetical to community formation.
Article
Manuel Castells deals in his book Communication Power with the question where power lies in the network society. In this paper, I discuss important issues that this book addresses, and connect them, where possible, to my own works and reflections. The book is discussed along the following lines: the concept of power, web 2.0 and mass self-communication, media manipulation, social movements, novelty & network society.
Article
This article questions the meaning of the social in social media. It does this by revisiting boyd and Ellison’s seminal paper and definition of social network sites. The article argues that social media are not so much about articulating or making an existing network visible. Rather, being social in the context of social media simply means creating connections within the boundaries of adaptive algorithmic architectures. Every click, share, like, and post creates a connection, initiates a relation. The network dynamically grows, evolves, becomes. The network networks. The social in social media is not a fact but a doing.
Article
The term prosumer, first introduced by Toffler in the 1980s, has been developed by sociologists in response to Web 2.0 (the set of technologies that has transformed a predominantly static web into the collaborative medium initially envisaged by Tim Berners-Lee). The phenomenon is now understood as a process involving the creation of meanings on the part of the consumer, who re-appropriates spaces that were dominated by institutionalized production, and this extends to the exploitation of consumer creativity on the production side. Recent consumption literature can be re-interpreted through the prosumer lens in order to understand whether prosumers are more creative or alienated in their activities. The peculiar typology of prosumption introduced by Web 2.0 leads us to analyze social capital as a key element in value creation, and to investigate its different online and offline forms. Our analysis then discusses the digital divide and critical consumerism as forms of empowerment impairment.
Article
Google Trends reveals that at the time we were writing our article on 'The Coming Crisis of Empirical Sociology' in 2007, almost nobody was searching the internet for 'Big Data'. It was only towards the very end of 2010 that the term began to register, just ahead of an explosion of interest from 2011 onwards. In this commentary we take the opportunity to reflect back on the claims we made in that original paper in light of more recent discussions about the social scientific implications of the inundation of digital data. Did our paper, with its emphasis on the emergence of what we termed 'social transactional data' and 'digital byproduct data', prefigure contemporary debates that now form the basis and rationale for this excellent new journal? Or was the paper more concerned with broader methodological, theoretical and political debates that have somehow been lost in all of the loud babble that has come to surround Big Data? Using recent work on the BBC Great British Class Survey as an example, this brief paper offers a reflexive and critical reflection on what has become – much to the surprise of its authors – one of the most cited papers in the discipline of sociology in the last decade.
Article
Twitter is increasingly investigated as a means of detecting mental health status, including depression and suicidality, in the population. However, validated and reliable methods are not yet fully established. This study aimed to examine whether the level of concern for a suicide-related post on Twitter could be determined based solely on the content of the post, as judged by human coders and then replicated by machine learning. From the 18th February 2014 to the 23rd April 2014, Twitter was monitored for a series of suicide-related phrases and terms using the public Application Program Interface (API). Matching tweets were stored in a data annotation tool developed by the Commonwealth Scientific and Industrial Research Organisation (CSIRO). During this time, 14 701 suicide-related tweets were collected: 14% were randomly (n = 2000) selected and divided into two equal sets (Set A and B) for coding by human researchers. Overall, 14% of suicide-related tweets were classified as ‘strongly concerning’, with the majority coded as ‘possibly concerning’ (56%) and the remainder (29%) considered ‘safe to ignore’. The overall agreement rate among the human coders was 76% (average κ = 0.55). Machine learning processes were subsequently applied to assess whether a ‘strongly concerning’ tweet could be identified automatically. The computer classifier correctly identified 80% of ‘strongly concerning’ tweets and showed increasing gains in accuracy; however, future improvements are necessary as a plateau was not reached as the amount of data increased. The current study demonstrated that it is possible to distinguish the level of concern among suicide-related tweets, using both human coders and an automatic machine classifier. Importantly, the machine classifier replicated the accuracy of the human coders. The findings confirmed that Twitter is used by individuals to express suicidality and that such posts evoked a level of concern that warranted further investigation. However, the predictive power for actual suicidal behaviour is not yet known and the findings do not directly identify targets for intervention.
Article
This paper specifies, designs and critically evaluates two tools for the automated identification of demographic data (age, occupation and social class) from the profile descriptions of Twitter users in the United Kingdom (UK). Meta-data routinely collected through the Collaborative Social Media Observatory (COSMOS: http://www.cosmosproject.net/) relating to UK Twitter users is matched with the occupational lookup tables between job and social class provided by the Office for National Statistics (ONS) using SOC2010. Using expert human validation, the validity and reliability of the automated matching process is critically assessed and a prospective class distribution of UK Twitter users is offered with 2011 Census baseline comparisons. The pattern matching rules for identifying age are explained and enacted following a discussion on how to minimise false positives. The age distribution of Twitter users, as identified using the tool, is presented alongside the age distribution of the UK population from the 2011 Census. The automated occupation detection tool reliably identifies certain occupational groups, such as professionals, whose job titles cannot be confused with hobbies or used in common parlance within alternative contexts. An alternative explanation of the prevalence of hobbies is that the creative sector is overrepresented on Twitter compared to 2011 Census data. The age detection tool illustrates the youthfulness of Twitter users compared to the general UK population as of the 2011 Census according to proportions, but projections demonstrate that there is still potentially a large number of older platform users. It is possible to detect "signatures" of both occupation and age from Twitter meta-data with varying degrees of accuracy (particularly dependent on occupational groups) but further confirmatory work is needed.
Article
Recent protests have fuelled deliberations about the extent to which social media ignites popular uprisings. In this article, we use time-series data of Twitter, Facebook, and onsite protests to assess the Granger causality between social media streams and onsite developments at the Indignados, Occupy, and Brazilian Vinegar protests. After applying Gaussianization to the data, we found contentious communication on Twitter and Facebook forecasted onsite protest during the Indignados and Occupy protests, with bidirectional Granger causality between online and onsite protest in the Occupy series. Conversely, the Vinegar demonstrations presented Granger causality between Facebook and Twitter communication, and separately between protestors and injuries/arrests onsite. We conclude that the effective forecasting of protest activity likely varies across different instances of political unrest.
Book
The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. "Cluster analysis is the increasingly important and practical subject of finding groupings in data. The authors set out to write a book for the user who does not necessarily have an extensive background in mathematics. They succeed very well." (Mathematical Reviews) "Finding Groups in Data [is] a clear, readable, and interesting presentation of a small number of clustering methods. In addition, the book introduced some interesting innovations of applied value to clustering literature." (Journal of Classification) "This is a very good, easy-to-read, and practical book. It has many nice features and is highly recommended for students and practitioners in various fields of study." (Technometrics) An introduction to the practical application of cluster analysis, this text presents a selection of methods that together can deal with most applications. These methods are chosen for their robustness, consistency, and general applicability. This book discusses various types of data, including interval-scaled and binary variables as well as similarity data, and explains how these can be transformed prior to clustering.
Article
When evaluating the causes of one's popularity on Twitter, one thing is considered to be the main driver: many tweets. There is debate about the kind of tweet one should publish, but little beyond tweets. Of particular interest is the information provided by each Twitter user's profile page. One of these features is the given name on those profiles. Studies in psychology and economics have identified correlations between the first name and, e.g., one's school marks or chances of getting a job interview in the US. Therefore, we are interested in the influence of this profile information on the follower count. We addressed this question by analyzing the profiles of about 6 million Twitter users. All profiles are separated into three groups: users that have a first name, English words, or neither of the two in their name field. The assumption is that names and words influence the discoverability of a user and subsequently his/her follower count. We propose a classifier that labels users who will increase their follower count within a month by applying different models based on the user's group. The classifiers are evaluated with the area under the receiver operating characteristic curve and achieve a score above 0.800.
Article
This paper outlines a new approach to the study of power, that of the sociology of translation. Starting from three principles, those of agnosticism, generalised symmetry and free association, the paper describes a scientific and economic controversy about the causes for the decline in the population of scallops in St. Brieuc Bay and the attempts by three marine biologists to develop a conservation strategy for that population. Four "moments" of translation are discerned in the attempts by these researchers to impose themselves and their definition of the situation on others: 1) problematization: the researchers sought to become indispensable to other actors in the drama by defining the nature and the problems of the latter and then suggesting that these would be resolved if the actors negotiated the "obligatory passage point" of the researchers' program of investigation; 2) interessement: a series of processes by which the researchers sought to lock the other actors into the roles that had been proposed for them in that program; 3) enrolment: a set of strategies in which the researchers sought to define and interrelate the various roles they had allocated to others; 4) mobilization: a set of methods used by the researchers to ensure that supposed spokesmen for various relevant collectivities were properly able to represent those collectivities and not betrayed by the latter. In conclusion, it is noted that translation is a process, never a completed accomplishment, and it may (as in the empirical case considered) fail.
Conference Paper
Studies have identified scale-free networks – a real-world and man-made phenomenon – in networks such as the human brain, protein networks, market investment networks, journal co-citation networks and the World Wide Web. Common properties such as preferential attachment and growth enable these networks to be classified as scale-free, which belong to a family of networks known as "small-world" networks, characterized by a short network distance and a high clustering coefficient. These properties can clearly be identified in networks such as the World Wide Web, a complex man-made network of documents and links that grows in an uncontrollable manner; they produce the 'rich-get-richer' effect, where nodes increase their connectivity at the expense of younger, less well connected ones. By mapping complex real-world and man-made networks, these studies are helping to improve our knowledge of the "weblike" world we live in. However, as many of these scale-free networks are still yet to be discovered, generalizing a scale-free model is still problematic. In this paper we study a network which is both a product of man-made networks and a real-life phenomenon. Twitter, a micro-blogging social networking service, provides a simple service that enables users to broadcast messages and form networks of 'friends' and 'followers'. Studies have examined the structure of Twitter's static networks that form as a result of the friend and follower links between users. There has also been a growing interest in exploring how it can be used to solve real-world problems, and in finding ways to classify and identify influential users. As an alternative approach, we have examined the dynamic network structures of Twitter conversations – which form through the passing of messages between users – and found that they exhibit scale-free properties such as preferential attachment and growth. In this study, a number of Twitter datasets were collected, varying in size, region and topic, and their dynamic 'retweet' (shared message) structures were examined. The findings of the analysis show a power law with similar exponents across all datasets with regard to the decay of 'retweeted' (or shared) messages between users. The exponents found – d, which ranged from 1.2 to 1.5 – are lower than in similar scale-free networks such as the Web; typically such a low exponent would indicate a skewed and uncorrelated network as a result of the number of edges growing faster than the number of nodes. However, the Twitter networks examined exhibit the same scale-free properties, including preferential attachment and growth, as networks with a higher exponent. The findings of this study not only expand the current knowledge of documented scale-free networks, but also raise questions about the nature of communication in social networking sites.
Conference Paper
Retweets are an important mechanism for recognising the propagation of information on the Twitter social media platform. However, many retweets do not use the official retweet mechanism, or even community-established conventions, and these "dark retweets" are not accounted for in many existing analyses. In this paper, a comprehensive matrix of tweet propagation is presented to show the different nuances of retweeting, based on seven characteristics: whether it is proprietary, the mechanism used, whether it is directed to followers or non-followers, whether it mentions other users, if it is explicitly propagating another tweet, if it links to an original tweet, and what is the audience it is pushed to. Based on this matrix and two assumptions of retweetability, the degrees of a retweet's "darkness" can be determined. This matrix was evaluated over 2.3 million tweets and it was found that dark retweets amounted to 12.86% (for search results with fewer than 1500 tweets per URL) and 24.7% (for search results including more than 1500 tweets per URL) respectively. By extrapolating these results with those found in existing studies, potentially thousands of retweets may be hidden from existing studies on retweets.
Article
Hashtags offer exciting opportunities for professional development, teaching, and learning. However, their use reflects users’ needs and desires. To illustrate and problematize the ways hashtags are used in professional development settings, this study reports on users’ participation patterns, users’ roles, and content contributed to three unique hashtags. This mixed methods research employs data mining techniques to retrieve data. Using a collective case study methodology, the study compares and contrasts the use of three hashtags and offers insights into the use of hashtags as emerging learning and professional development environments. Results show that hashtags exhibit similarities, such as unequal user participation. Findings also reveal differences between hashtags. For instance some hashtags are used on an ongoing basis while others have well-defined start and end dates. Ultimately, these results question deterministic thinking with respect to emerging technologies and novel professional development environments.
Conference Paper
As a result of various industry regulations, service providers such as websites and app developers are required to explain the ways in which they process the personal data of service users. These "privacy disclosures", which aim to inform users and empower them to control their privacy, take several forms. Among these forms are the privacy policy, the cookie notice and, on smart phones, the app permission request. The interaction problems with these different types of disclosure are relatively well understood – habituation, inattention and cognitive biases undermine the extent to which user consent is truly informed. User understanding of the actual content of these disclosures, and their feelings toward it, are less well understood, though. In this paper we report on a mixed-methods study that explored these three types of privacy disclosure and compared their relative merits as a starting point for the development of more meaningful consent interactions. We identify four key findings – heterogeneity of user perceptions and attitudes to privacy disclosures, limited ability of users to infer data processing outputs and risks based on technical explanations of particular practices, suggestions of a naïve model of "cost justification" rather than cost-benefit analysis by users, and the possibility that consent interactions are valuable in themselves as a means to improve user perceptions of a service.
Article
Despite significant work on the problem of inferring a Twitter user's gender from her online content, no systematic investigation has been made into leveraging the most obvious signal of a user's gender: first name. In this paper, we perform a thorough investigation of the link between gender and first name in English tweets. Our work makes several important contributions. The first and most central contribution is two different strategies for incorporating the user's self-reported name into a gender classifier. We find that this yields a 20% increase in accuracy over a standard baseline classifier. These classifiers are the most accurate gender inference methods for Twitter data developed to date. In order to evaluate our classifiers, we developed a novel way of obtaining gender-labels for Twitter users that does not require analysis of the user's profile or textual content. This is our second contribution. Our approach eliminates the troubling issue of a label being somehow derived from the same text that a classifier will use to infer the label. Finally, we built a large dataset of gender-labeled Twitter users and, crucially, have published this dataset for community use. To our knowledge, this is the first gender-labeled Twitter dataset available for researchers. Our hope is that this will provide a basis for comparison of gender inference methods.
Article
In this paper we focus on the connection between age and language use, exploring age prediction of Twitter users based on their tweets. We discuss the construction of a fine-grained annotation effort to assign ages and life stages to Twitter users. Using this dataset, we explore age prediction in three different ways: classifying users into age categories, by life stages, and predicting their exact age. We find that an automatic system achieves better performance than humans on these tasks and that both humans and the automatic systems have difficulties predicting the age of older people. Moreover, we present a detailed analysis of variables that change with age. We find strong patterns of change, and that most changes occur at young ages.
Article
Social media practices and technologies are often part of how ethnographic research participants navigate their wider social, material and technological worlds, and are equally part of ethnographic practice. This creates the need to consider how emergent forms of social media-driven ethnographic practice might be understood theoretically and methodologically. In this article, we respond critically to existing literatures concerning the nature of the internet as an ethnographic site by suggesting how concepts of routine, movement and sociality enable us to understand the making of social media ethnography knowledge and places.
Article
First historical account of the development of social science research methods in Britain. Accessibly and engagingly written. Sheds new light on the huge social changes experienced in Britain over the last 70 years. Identities and Social Change in Britain since 1940 examines how, between 1940 and 1970, British society was marked by the imprint of the academic social sciences in profound ways that have an enduring legacy on how we see ourselves. It focuses on how interview methods and sample surveys eclipsed literature and the community study as a means of understanding ordinary life. The book shows that these methods were part of a wider remaking of British national identity in the aftermath of decolonisation, in which measures of the rational, managed nation eclipsed literary and romantic ones. It also links the emergence of social science methods to the strengthening of technocratic and scientific identities amongst the educated middle classes, and to the rise in masculine authority which challenged feminine expertise. This book is the first to draw extensively on archived qualitative social science data from the 1930s to the 1960s, which it uses to offer a unique, personal and challenging account of post-war social change in Britain. It also uses this data to conduct a new kind of historical sociology of the social sciences, one that emphasises the discontinuities in knowledge forms and which stresses how disciplines and institutions competed with each other for reputation. Its emphasis on how social scientific forms of knowing eclipsed those from the arts and humanities during this period offers a radical re-thinking of the role of expertise today which will provoke social scientists, scholars in the humanities, and the general reader alike.
Article
Ubiquitous computing, given a regulatory environment that seems to favor consent as a way to empower citizens, introduces the possibility of users being asked to make consent decisions in numerous everyday scenarios such as entering a supermarket or walking down the street. In this note we outline a model of semi-autonomous consent (SAC), in which preference elicitation is decoupled from the act of consenting itself, and explain how this could protect desirable properties of informed consent without overwhelming users. We also suggest some challenges that must be overcome to make SAC a reality.
Article
While researchers have studied negative professional consequences of medical trainee social media use, little is known about how medical students informally use social media for education and career development. This knowledge may help future and current physicians succeed in the digital age. We aimed to explore how and why medical students use Twitter for professional development. This was a digital ethnography in which medical student "superusers" of Twitter participated. The postings ("tweets") of 31 medical student superusers were observed for 8 months (May-December 2013), and structured field notes recorded. Through purposive sampling, individual key informant interviews were conducted to explore Twitter use and values until thematic saturation was reached (ten students). Three faculty key informant interviews were also conducted. Ego network and subnetwork analysis of student key informants was performed. Qualitative analysis included inductive coding of field notes and interviews, triangulation of data, and analytic memos in an iterative process. Twitter served as a professional tool that supplemented the traditional medical school experience. Superusers approached their use of Twitter with purpose and were mindful of online professionalism as well as of being good Twitter citizens. Their tweets reflected a mix of personal and professional content. Student key informants had a high number of followers. The subnetwork of key informants was well-connected, showing evidence of a social network versus an information network. Twitter provided value in two major domains: access and voice. Students gained access to information, to experts, to a variety of perspectives including patient and public perspectives, and to communities of support. They also gained a platform for advocacy, control of their digital footprint, and a sense of equalization within the medical hierarchy. Twitter can serve as a professional tool that supplements traditional education. Students' practices and guiding principles can serve as best practices for other students as well as faculty.
Article
Most electronic behavior traces available to social scientists offer a site-centric view of behavior. We argue that to understand patterns of interpersonal communication and media consumption, a more person-centric view is needed. The ideal research platform would capture reading as well as writing and friending, behavior across multiple sites, and demographic and psychographic variables. It would also offer opportunities for researchers to make interventions that make changes and additions to the information presented to people in social media interfaces. Any attempt to create such an ideal platform will have to make compromises because of engineering and privacy constraints. We describe one attempt to navigate those tensions: the MTogether project will recruit a panel of participants who will install a browser extension and mobile app that enable limited data collection and interventions.
Article
Over the past few years, we have seen the emergence of “big data”: disruptive technologies that have transformed commerce, science, and many aspects of society. Despite the tremendous enthusiasm for big data, there is no shortage of detractors. This article argues that many criticisms stem from a fundamental confusion over goals: whether the desired outcome of big data use is “better science” or “better engineering.” Critics point to the rejection of traditional data collection and analysis methods, confusion between correlation and causation, and an indifference to models with explanatory power. From the perspective of advancing social science, these are valid reservations. I contend, however, that if the end goal of big data use is to engineer computational artifacts that are more effective according to well-defined metrics, then whatever improves those metrics should be exploited without prejudice. Sound scientific reasoning, while helpful, is not necessary to improve engineering. Understanding the distinction between science and engineering resolves many of the apparent controversies surrounding big data and helps to clarify the criteria by which contributions should be assessed.