Science and Democracy Network / Annual Meeting, Paris 25-27 June 2012
Reconstruction of Socio-Semantic Dynamics in Sciences-Society Networks: Methodology and
Epistemology of large textual corpora analysis
Marc Barbier and Jean-Philippe Cointet (INRA SenS, CorTexT Digital Platform of IFRIS)
Abstract:
Until recently, the description, light modeling and interpretation of socio-cognitive
dynamics of science-society relations required a constructivist approach, in which scholars
collected, read, classified and interpreted sets of texts, archives, interviews, etc.
The growing mass of data produced in the so-called Knowledge Society owes a lot to the
acceleration and profusion of digital tools that are now widely used in different areas of
human activities: work, culture, leisure, political expression, etc. Social scientists now largely
acknowledge that the various modes of interaction brought by new information and
communication technologies are changing the very nature of micro-politics and the
expression of the self. In our view, the conditions for producing knowledge from a Science &
Technology Studies point of view have changed too, for at least three reasons:
• the deluge of electronic sources of data overloads our capacity of enquiry,
• S&TS dynamics now intertwine heterogeneous actors, and matters of fact and matters of
concern coming from different arenas call for an integrated understanding of
knowledge production and circulation.
• Nevertheless, new digital infrastructures specifically designed for social sciences and
humanities make it possible to equip scientists with tools that enable them to tackle
the complexity of heterogeneous textual corpora dynamics and to develop innovative
analytical methodologies that will bring new insights and renewed capacities to
investigate contemporary issues.
Many researchers are currently working to launch projects and build facilities that make
those digital infrastructures concrete, but the involvement and interest of the STS
communities remain timid, possibly bound by a foundational sense of skepticism towards
technological promises of any kind. The aim of this communication is (1) to
discuss some of the epistemic problems that arise from the use of digital platforms in STS
aimed at developing our capacities of enquiry into science and technology in society; and (2)
to present the main developments carried out within the CorTexT platform as well as
their driving principles.
1 Introduction
The growing mass of data produced in a so-called Knowledge Society owes a lot to the
acceleration and profusion of electronic data affecting most areas of human activities: work,
culture, leisure, political expression, etc. Social Sciences now widely acknowledge that the
various modes of interaction brought by new communication technology are changing the
very nature of micro-politics and the expression of the self. In our view the conditions for
producing knowledge from a Science & Technology Studies point of view have changed too,
for three reasons:
(i) The deluge of electronic sources of data overloads our capacity of enquiry;
(ii) S&TS dynamics now intertwine heterogeneous actors, matters of fact and matters
of concern coming from different arenas, calling for an integrated understanding of
knowledge production and circulation;
(iii) New digital infrastructures specifically designed for social sciences and humanities
make it possible to equip social scientists with platforms enabling the innovative
analysis of heterogeneous textual corpora.
But an abundance of information is certainly not equivalent to an abundance of knowledge.
Many critical points of view indeed reject such an equivalence, claiming that the growth of
knowledge is far from proportional to the growth of information. This could perhaps explain why
many researchers in the STS communities remain hesitant in the face of the challenge and the
possible use of such tools and instruments, and are possibly bound by a foundational sense of
skepticism towards technological promises of any kind. This is why the use of informetrics or
webometrics technologies has to be accompanied by a critical discussion of the epistemic
problems that may arise from the use of digital platforms in STS when their use aims at
developing our capacities of enquiry into science and technology in society. Serious questions
are undoubtedly raised: to what extent is the constructivist approach to data changed when
large corpora are mobilized, parsed and analyzed with machines that black-box
statistical inferences, term extraction algorithms or graph analysis metrics? What
new technical empowerments are needed to extend interpretative strategies in STS
empirical work? Are we simply gaining the benefit of exploring bigger sets of data, or does it
change the nature of our enquiry and the array of matters of fact? Do these new
methodologies open new perspectives for diachronic and multi-level analysis of the
production, use, circulation and contestation of the scientific enterprise in society?
This communication proposes a first framework that makes room for epistemological
questions as well as methodological and technological issues. In a first section we
try to provide a frame for questioning the epistemic problems that may arise from digital
platforms when they are used to develop our capacities of enquiry into science and technology
in society. We think that it is necessary to distinguish between epistemological and
technological issues; the latter are addressed in a second section: what is at stake, from a
technical and organizational point of view, in designing a data platform for STS. In the final
section we depict the various types of analysis offered by the CorTexT
platform as well as its driving principles, and thus invite our colleagues to use it.
2 Epistemological issues
2.1 A changing context of enquiry
The ICT revolution has brought about a large number of new patterns of communication and
collaborative tasks in various and rather segmented spheres of human activity: in the private
sphere of inter-individual exchanges; in the public sphere (even redefining the micro-politics
of using information systems); and in the economic sphere, creating a new business sector
and engaging all types of professional or organized activities in new ways of being at work.
The Web therefore nowadays constitutes a research topic in its own right for a range of emerging
research fields such as bibliometrics, scientometrics, webometrics or informetrics, which adopt
the idea of measuring the information contained in flows of data fed by
many types of resources, structures and technologies that also build the infrastructure of this
circulation (Thelwall et al., 2006; Bar-Ilan, 2008).
Beyond the classical use of bibliographical datasets provided by WOS-like1 databases to
perform citation and bibliometric studies, many social scientists consider the web a field
of enquiry in its own right and are adopting (or considering adopting) hybrid quanti-qualitative
methods. Counting hits, nodes and links is surely not considered sufficient to turn
information into knowledge, but it may help when it comes to answering questions grounded
in cultural studies, science studies or political science. This may even be a necessity,
considering that the digital information being produced echoes changes in human
activity, which goes online for so many reasons and purposes, and bearing in mind that the
relations established between entities are not virtual but are actually the result of free
associations technically operated through multiple communication techniques. Nevertheless, the
Web as a collective phenomenon does not translate into a unified humanity sharing cognitive
resources for the production of a common understanding. Political science rather
insists on the necessity of analyzing the web as a mosaic space (Rogers, 2008), a view that
opposes a balkanized model (Sunstein, 2008) of the politics of the web to the long-lasting
vision of one unified small world.
2.2 Empowering the idiom of the co-production
Jasanoff's landscape of the idiom of co-production and the specificity of a co-productionist
account of science-society relations (Jasanoff, 2006) offer a good starting point to position
the practical problem faced by many STS scholars dealing with the ever-growing mass and
availability of information when it comes to classifying and interpreting numerous sets of
texts, archives, interviews, etc. (see the previous section, 2.1).
2.2.1 Dualist or even symmetric description of science in society and of society in science
A first characteristic of the over-presence of data is that a large array of practical resources
for the public understanding of scientific activities and production is available. One might
make the “easy-going” hypothesis that the world of science, innovation and scientific expertise
is more accessible to “lay-thinkers”. But the web is far from being a large Wikipedia for
training purposes. A second characteristic of the web, known to many of us, is that it
represents a territory to be occupied with meaningful declarations, conceived as a
resource for action through communication by the many actors engaged in science-society
debates, on both sides. A very large mass of positions, claims and advocacies, made with
reference to science in civic debate or with reference to society in research policy, has become
particularly easy to access. This does not mean that expression on the web exhausts the
set of matters of fact and matters of concern that have to be empirically accounted for. Many
of the classical in-depth, situated, longitudinal works of enquiry are still required to go
beyond an account of communicational strategy, which is to some extent addressed by media
studies. Nevertheless, the nature of this communication is related to the ways of being of
contemporary research activities, or to those of civic and activist interference with
technoscience. The task has become harder, since the availability of such discourses, and their
profusion in the case of controversies, remains a sort of arena of political engagement, which
at the same time empowers a symmetric description and blurs the clarity of discourses.
As a result, it becomes difficult to stay away from the accountability of communicational
discourses online.
1 Web Of Science is provided by ISI (Thomson Reuters).
2.2.2 The non-linear explanation of the shaping of science and technology in context
The second idiom is related to the fact that explanations of the shaping of science and
technology have to be considered in context, meaning that the potential of any scientific or
technological script does not perform the world by itself, even when one has a laboratory.
Symmetrically, the performativity of any techno-scientific achievement in a given society
corresponds to a redistribution of power relations, calling on STS scholars to ask questions
about the “how” and the “why” rather than strictly producing heavy or thick accounts of
successes or failures. With this at stake, the extended traceability of many activities and
communication acts is itself a resource for keeping this ambition high. More than ever, the
mobilization of ICT in scientific practices and the relative openness of science to public
scrutiny represent a possible extension for non-linear explanations. Here too, one can argue
for the benefit of ad hoc techniques able to capture traces and make them
“become talkative” within the interpretative stance of co-constructionists.
2.3 The stakes of the politics of knowledge
The classical STS reflection on co-construction particularly aims at developing a
symmetric account of ordering Nature through knowledge and technology and of ordering
Society through power and culture. This separation has long been questioned by Actor-Network
Theory, which points out that the process of co-production should be considered
through the monad of translation and free association, considering the performativity of both
human and non-human actants, as well as of crystallized arrangements and settings of a hybrid
nature, recalling Stengers' cosmopolitan views.
For many contenders of ANT, the agnostic reading of performativity tends not to consider any
type of politics that would not be at work within the web of translations, meaning that only
the necessity of a full account of the co-production process should drive the interpretation of
the big or micro-politics of translations. While this position fits the purpose of symmetry, in
that it avoids introducing the critical views and political affordances of the one who analyzes
and interprets reality, this methodological prudence still raises a problem: that of the
closure of the politics of any process of co-production. To make the argument short: it is one
thing to adopt a symmetric attitude towards power relations within the co-production process,
and another to consider the asymmetry of the positions of actants in terms of their capacity and
force to act on the governance of those processes. ANT proponents would argue that there is no
outside and inside in the politics of knowledge, only long or short actor-networks; nevertheless,
not all actants have a laboratory to raise the world, particularly when those actants are
non-human or of a heterogeneous nature (hybrid arrangements and temporary settings). Moreover,
as pointed out by many scholars who study the governance of Science, Technology and Innovation
(Borras and Edler, 2012), co-production has become a matter of politics that are not necessarily
assembled in one political process dedicated to the particular co-production process
under study. This means that one might consider the existence of segmented areas of polities
dealing with co-production: the area of a specific organization working on its own identity
building, the politics of a particular institution or institutional framework at work in the
governance of knowledge, the politics of establishing research policies connecting epistemic
communities with stakeholders of innovation, and the politics of re-presentation in
representative institutions. It therefore seems particularly difficult to establish, empirically
and theoretically, a politics of hybrids that would simply pass through this fragmentation
of the politics of the state of knowledge.
Quoting Silvia Gherardi, we could support the idea that “if we are to determine the linkage
among the various connections in action along the spiral from the individual to the institution, we must
abandon the idea that the social order is aggregated or negotiated by a plurality of dissonant voices
which eventually blend together to resemble a musical canon” (Gherardi, 2006: 220). Another way
of phrasing this idea in our context is to say that the politics of knowledge does not set up an
isotropic field of representations of the co-production of nature and society. We should
therefore ask, in relation to the issues at stake in polities and in between them: where and how
is co-production produced and governed in a texture of practices that are, of course, situated
and enacted, but that at the same time take place in a political situation that has been
designed, or has emerged, to perform and support the realization of co-production? We could
thus argue that actors of the politics of knowledge are reflexively conscious of being engaged
in it, and that this greatly changes the type of investigation we should develop, as well as the
type of engagement of the STS community as a specific “bound of knowledge” that reflects on the
process under study.
From this perspective, it is as necessary to work on disciplines of power as on the
subjectivation of the apparatus and technology of power relations that are proposed or imposed
by a sovereign or legitimate centre. This is what M. Foucault confirmed when he later said that “the
‘dispositif’ is essentially of a strategic nature; it follows that it deals with a certain tampering with
power-struggles, with a rational and concerted intervention within those power-struggles, either to
develop them in a specific direction, or to block them, or else to stabilize them and to use them. The
‘dispositif’, thus, is always encapsulated in relations of power, but it is also always linked to one or more
knowledge bounds that sprang from it, but that also empower its creation” (Dits et écrits, Volume
III, p. 299 sq., our translation).
It follows from this particular epistemological debate about the politics of co-production that
we have to assume that the wide availability of arguments springing from the
fragmented arenas of the politics of knowledge is both a resource and a challenge: a resource,
since we draw on them for our professional engagement as researchers, but also a
challenge, since they are possibly active within the world under study. This complexity thus
calls for a more systematic and responsible attitude in STS to capture this state of availability
and fluidity of arguments that spring from the various “opinion wells” available on the web.
Seeking to capture and interpret the fragmentation of the politics of knowledge on any
specific issue or pool of issues is therefore a methodological response that echoes the
epistemological problems we have just tried to set out. This methodological response, if
considered legitimate and constructive for STS whatever its orientation may be,
interactional or constitutive (Jasanoff, 2006: 18-19), raises a technological challenge that
one could call a Platform for STS.
3 Designing a digital platform for STS
3.1 The challenges
We can distinguish between three types of challenges:
- Scientific, insofar as the modeling of socio-cognitive dynamics from massive
observational data remains a young research field. We assume that this field will not
reach maturity until we combine a fine-grained analysis of textual content (relying on
ad hoc computational linguistics rather than plain term statistics) with multi-level
models of complex, hybrid systems (in particular involving heterogeneous networks
at various scales) and a qualitative understanding of the appraisal of these models by
users and practitioners (featuring social studies of science).
- Instrumental, since a comprehensive study of the public space requires taking into
account heterogeneous sources. This implies a variety of methodological and
technical challenges, which are only partly solved, as of today, and remain the focus
of ongoing research efforts: massive textual corpora processing, knowledge
extraction (from the identification of named entities to the characterization of
utterance endorsement and, more broadly, “hedging”), visualization of multi-level
data in such a way that heterogeneous entities dynamics taking place at various time
scales are both accurately and ergonomically represented.
- Political: social sciences have clearly identified the phenomena of co-production of
knowledge and socio-political issues (see previous section), but without proposing a
systematic methodology of analysis of events, traces and discourses that ground the
science-society debates in the communication sphere. Besides, we need to bridge the
gap between qualitative small-scale studies and actual large-scale dynamics that can
be spread over different arenas. Therefore, a better understanding of the dynamics of
frames from the in-vivo observation of textual traces corpora in these arenas should
allow a better identification of opportunities/moments of mutation and modulation
of the scientific-technical trajectories. On the other hand, automatically drawn maps
can help practitioners to improve, for instance, the quality of deliberations.
In this context, we wish to target empirical domains, mostly defined by research
matters of enquiry related to matters of concern in society, where we expect
the most significant demand in the coming years, so as to be able to describe
the black-boxing of normative choices that accompanies any process of shaping the social and
the natural in scientific and technological productions. Those political embeddings have to be
brought to light and possibly accounted for, in order to point out the existence of
alternatives or the fact that previous orders are about to be erased. Moreover, we also wish to
equip the co-constructionist project, which consists in opening up the interpretation of
forthcoming reality in a predictive (but not ballistic) attempt, with the willingness to shed
light on the relation of communities or interest groups to possible common futures.
3.2 Working on free-associations or the come-back of the co-word analysis with a socio-semantic pulse
3.2.1 Background
Co-word analysis is a small branch of network analysis, largely grounded in Actor-Network
Theory (Callon et al., 1983) and in the implementation of specific algorithms for
mapping scientific knowledge. Born in relation to the evaluation and policy of science
(Callon et al., 1986; Law et al., 1988), co-word analysis is a critical extension of the early
co-citation approaches (Small, 1973) and largely depends on techniques for full-text
indexing. The relevance of co-word analysis for mapping large scientific domains has
been criticized with regard to the significance of the relationships between words and their
context of enunciation (see, more recently, Leydesdorff & Hellstein, 2006). It should thus be
noted that other types of characterization exist and that we propose only one possible way of
characterizing knowledge dynamics.
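To make the basic idea of co-word analysis concrete, the following minimal sketch counts how often pairs of terms appear in the same document and normalizes these counts with the equivalence index often used in the co-word literature, e_ij = c_ij^2 / (c_i * c_j). The toy documents, the term lists and the function name `coword_links` are assumptions made for illustration only; this is not presented as the procedure implemented in any particular platform.

```python
from collections import Counter
from itertools import combinations


def coword_links(documents):
    """Return {(term_a, term_b): equivalence index} for co-occurring terms.

    `documents` is an iterable of term lists (one list per document).
    """
    term_counts = Counter()   # c_i: number of documents containing term i
    pair_counts = Counter()   # c_ij: number of documents containing both i and j
    for terms in documents:
        unique_terms = sorted(set(terms))
        term_counts.update(unique_terms)
        pair_counts.update(combinations(unique_terms, 2))
    return {
        (a, b): c_ab ** 2 / (term_counts[a] * term_counts[b])
        for (a, b), c_ab in pair_counts.items()
    }


# Hypothetical toy corpus: each document is reduced to its indexed terms.
docs = [
    ["green algae", "pig farming", "tourism"],
    ["green algae", "pig farming"],
    ["green algae", "tourism"],
    ["pig farming", "wastewater"],
]
links = coword_links(docs)
for pair, strength in sorted(links.items()):
    print(pair, round(strength, 3))
```

The resulting weighted pairs can be read as the edges of a term network; how such a network is then clustered or mapped is a separate design choice that this sketch leaves open.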
At present, the evolution of the analysis of scientific networks is largely attached to the
question of characterizing the collaborative and cognitive dynamics of knowledge production
(Powell et al., 2005) and to the emergence of multi- or trans-disciplinary fields of
research (Lucio-Arias, Leydesdorff, 2007) or paradigmatic fields of research (Chavalarias,
Cointet, 2008). Tracing and mapping knowledge in scientific databases or in other electronic
sources represents a huge field of problems for many disciplines dealing with information. This
is also the case for co-word analysis (Mogoutov, Kahane, 2007). More locally, in relation to
specific areas of research, mapping heterogeneous networks appears to help the understanding
of the social dynamics of research activities (Cambrosio, Keating, Mogoutov, 2004; Cambrosio et
al., 2006; Bourret et al., 2006).
Beyond this precise inherited domain, the elaboration of new methods for modeling socio-cognitive
dynamics can impact a much wider, strategic field of economic activity: new tools
to explore digital libraries, support for computer-assisted scientific innovation and strategic
intelligence (Nederhof and Van Wijk, 1997; Valverde et al., 2007), knowledge extraction
(He, 1999; Sintchenko et al., 2010), and the tracking of debates and controversies in blogs,
media, online forums and, more broadly, the digital public space at large (Lazer et al., 2009;
Sunstein, 2007).
Using those tools in interactive work with members of a
scientific community is a significant way of realizing a kind of participatory sociology of
scientific knowledge, one that notably tries to avoid an evaluative perspective and instead to
co-design a situation in which tools are used in a comprehensive way, for a purpose of maieutic
intervention. This approach to network mapping through co-word analysis, in interaction
with a scientific community, shares many of the ideas behind shifting the use of such tools from
a scientific context to a science policy context (Noyons, 2001).
3.2.2 The Socio-semantic turn
Alongside semantic analysis, which has long been fostered in the sociology of science by the
pioneers of co-word analysis as well as by the Natural Language Processing research community,
Social Network Analysis (SNA) has been developing into a “normal science” (Freeman, 2004) for
decades, with its own tools, paradigms and conferences. Connecting the social dynamics
drawn from the observation of the various interactions between actors with the very nature of
their exchanges, encapsulated in their shared production, is a much more recent endeavor. The
socio-semantic turn tries to bridge the gap between purely structural accounts of social
dynamics and a more precise account of the actual practices of agents. Following Giddens
(1981), “the structure is both the medium and outcome of the social practices it recursively
organizes”. As a result, understanding social structure and, more importantly, its
dynamics is only possible if one analyzes both individual dynamics and
global structure. We claim that semantic analysis is a viable strategy for tracking human
practices, at least when it comes to studying knowledge communities. Analyzing both the social
interactions linking actors to one another and the production and exchange of
knowledge is key to providing a realistic description of the social dynamics at stake in a given
community.
The socio-semantic turn (Roth, 2006) then proposes to extend the scope of SNA by
integrating knowledge into the very dynamics of the system, in a way that tries to give back
some agency to actors, who are no longer reduced to interchangeable nodes in a social
network but who, through their practices, build and move within a larger semantic
landscape.
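One simple way to picture a socio-semantic network is as two coupled relations built from the same records: a social relation (which actors produced a text together) and a semantic relation (which concepts each actor uses). The sketch below is only an illustration under that assumption; the record format, the actor names and the helpers `build_socio_semantic_network` and `shared_concepts` are hypothetical.

```python
from itertools import combinations


def build_socio_semantic_network(records):
    """Build social and semantic links from (actors, concepts) records.

    Returns two dicts:
      social[actor]   -> set of actors who co-produced a record with `actor`
      semantic[actor] -> set of concepts used by `actor`
    """
    social = {}
    semantic = {}
    for actors, concepts in records:
        for actor in actors:
            social.setdefault(actor, set())  # keep actors without co-authors
            semantic.setdefault(actor, set()).update(concepts)
        for a, b in combinations(sorted(set(actors)), 2):
            social[a].add(b)
            social[b].add(a)
    return social, semantic


def shared_concepts(semantic, a, b):
    """Concepts that two actors have in common."""
    return semantic[a] & semantic[b]


# Hypothetical records: (actors who produced a text, concepts it mentions).
records = [
    (["alice", "bob"], ["green algae", "pig farming"]),
    (["bob", "carol"], ["pig farming", "wastewater"]),
    (["dave"], ["green algae", "tourism"]),
]
social, semantic = build_socio_semantic_network(records)
print(social["dave"])                              # no social tie at all
print(shared_concepts(semantic, "alice", "dave"))  # yet a shared concept
```

In this toy case two actors with no social tie still share a concept, which is the kind of information a purely structural account of the social network would miss.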
3.3 A multi-arena perspective
If the objective of an STS digital platform is to follow the dynamics of issues, then it should
first be noted that issues emerge and are transformed in various social places: there is no
unique public space, but rather a multiplicity of them. The “frame struggle” around those issues
should be systematically addressed in the different arenas pertinent to a given case study (and
consequently in corpora coming from different sources). Arenas then correspond to the
different public spaces where stakes are defined.
The platform thus aims to offer a common solution for the analysis of various types
of arenas and related data sources: scientific production (publications, patents), international
press articles, web content, legal production, etc. Socio-semantic modeling is sufficiently rich
and generic to provide a general framework for analyzing the dynamics pertaining to each arena.
Yet an issue cannot be described simply by summing the observations made in different arenas.
To get a realistic account of the issue, one should also consider how the arenas interact. The
global dynamics of the issue is certainly influenced by the overall circulation of entities
(human or non-human actors, pieces of knowledge, problems, promises, concerns...) in these
heterogeneous spaces. This point remains a challenge from both a technical standpoint (arenas
may be multilingual, linguistic genres may bias tool outputs depending on the arena considered,
etc.) and a methodological one (modeling the coupling between two dynamics observed in
different arenas).
4 Presentation of the Digital Platform “CorTexT”2
4.1 Principles of the Design of the Platform
Our objective is distinct from the purpose of scientific knowledge production and of
Research Evaluation. The main objective of a platform for STS is to design innovative
methods and tools to model empirical dynamics pertaining to various public issues: we aim
to apply advanced NLP techniques and complex network analysis to heterogeneous textual
corpora in order to track the dynamics of contemporary topics. Although one would find a lot
of continuity in terms of technological and algorithmic questions and design, our project rather
insists on developing a technology, that is, on enabling new capacities through a co-design
mode of development, which in turn may renew the precise modeling strategy we set up.
Frame, or framing, is a widespread notion used in sociology, media studies and
communication research. In our case, we define framing as the way actors try to make sense
of an issue by structuring its associated concepts. Actors who impose their frames will tend
to impose their own perception of the questions on the public. In this respect, our goal is
to define the different frames supported by actors and describe their dynamics: how are frames
built, set, bridged, aligned, extended and transformed? Frames evolve according to the
underlying landscape constraining and enacting socio-semantic dynamics.
4.2 A multi-level modeling of the empirical heterogeneous dynamics
We assume that with the help of heterogeneous networks analysis and fine-grained NLP,
frame analysis can be translated into realistic quantitative modeling which in turn could pave
the way to new empirical findings and theoretical breakthroughs. Indeed socio-semantic
network analysis offers the opportunity to operationalize the notion of frame dynamics.
Frames are structures emerging in a bottom-up fashion from the socio-semantic network.
The description and prediction of phylogenetic phenomena (i.e. the development of
clusters/patterns and their continuation or disappearance from one period to another) should
provide us with an operational dynamical model of frames. Moreover this framework will
make it possible to better understand how different heterogeneous networks are coupled and
co-evolve (here, between socio-semantic networks built from sources coming from various
arenas).
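The continuation or disappearance of clusters from one period to the next can be sketched as a matching problem. The example below links each cluster of a period to its most similar cluster in the following period using the Jaccard index, and marks clusters without a sufficiently similar successor. The matching rule, the 0.3 threshold, the cluster labels and the function names are assumptions chosen for illustration; the text above does not specify how such lineages are actually computed.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of terms."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def track_clusters(period_t, period_t1, threshold=0.3):
    """Link clusters of one period to their best successor in the next one.

    `period_t` and `period_t1` map a cluster label to a set of terms.
    Returns {label_t: label_t1 or None}; None marks a cluster with no
    successor reaching `threshold` (a candidate "disappearance").
    """
    lineage = {}
    for label, terms in period_t.items():
        best_label, best_score = None, 0.0
        for next_label, next_terms in period_t1.items():
            score = jaccard(terms, next_terms)
            if score > best_score:
                best_label, best_score = next_label, score
        lineage[label] = best_label if best_score >= threshold else None
    return lineage


# Hypothetical clusters of terms observed in two consecutive periods.
clusters_t1 = {
    "A": {"green algae", "pig farming", "wastewater"},
    "B": {"tourism", "economic development"},
}
clusters_t2 = {
    "C": {"green algae", "pig farming", "critical judgment"},
    "D": {"nitrates", "water quality"},
}
lineage = track_clusters(clusters_t1, clusters_t2)
print(lineage)
```

Clusters of the later period that are nobody's successor (here "D") would be candidates for newly emerging patterns; detecting splits and merges would require keeping more than the single best match.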
4.3 The attention to user interfaces and the use of visual mapping
Science mapping has always been one of the driving objectives of scientometrics (De Solla
Price, 1976). Mapping science obviously echoes the epistemological objective of portraying
science as a space of its own. Network analysis made enormous progress during
the 2000s, offering efficient and convenient methods through network spatialization
algorithms embedded in most libraries and network representation software.
2 The Digital Platform CorTexT is a project of IFRIS, with the support of the LABEX SITES. The Lab
INRA SenS has particularly dedicated forces and skills to develop this project.
Moreover, mapping provides an intermediary object that recalls familiar cognitive habits
associated with classical geographical representations. If bibliometric, scientometric or, more
generally, webometric approaches have bloomed over the last ten years, the immediate
attractiveness of maps is certainly one of the reasons. Yet network mapping has certainly not
reached the same maturity as geographical mapping: topological representation actually
requires some training for practitioners and may otherwise lead to misinterpretation, as
Euclidean distances on a spatialized network only try to optimally approximate the actual
topological distances.
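The gap between distances read on a map and topological distances in the network can be quantified. The sketch below compares, for every connected pair of nodes, the Euclidean distance between their positions on a layout with their shortest-path distance in hops, and returns the mean squared gap. It assumes an unweighted, undirected network and layout coordinates scaled so that one hop corresponds to unit length; the function names and the toy layouts are hypothetical, and this is one possible measure among others.

```python
import math
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first search: number of hops from `source` to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def layout_stress(adjacency, positions):
    """Mean squared gap between map distance and hop distance over node pairs."""
    nodes = sorted(adjacency)
    gaps = []
    for i, a in enumerate(nodes):
        hops_from_a = hop_distances(adjacency, a)
        for b in nodes[i + 1:]:
            if b not in hops_from_a:   # unreachable pair: skipped in this sketch
                continue
            map_distance = math.dist(positions[a], positions[b])
            gaps.append((map_distance - hops_from_a[b]) ** 2)
    return sum(gaps) / len(gaps) if gaps else 0.0


# Toy path network a - b - c, drawn in two different ways.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
straight = {"a": (0, 0), "b": (1, 0), "c": (2, 0)}   # faithful drawing
folded = {"a": (0, 0), "b": (1, 0), "c": (0, 1)}     # a and c look closer than they are
print(layout_stress(adjacency, straight), layout_stress(adjacency, folded))
```

A value of zero means that the map reproduces hop distances exactly; in the folded drawing the two end nodes appear closer than their two-hop distance, which is the kind of misreading the paragraph above warns about.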
The CorTexT platform clearly pursues the “tradition” of “knowledge mapping”, proposing not only
its own visualization strategy for spatializing networks3 but also alternative strategies
of information representation, better suited in particular to monitoring the dynamical
properties of the systems under study.
5 How do we practice?
5.1 The modularity of the Platform: Dataset, analysis and visualization
The methodological steps are threefold (see figure 1):
i) Back-Office: defining and collecting the corpora that shall be analyzed in each
domain of application,
ii) Middle Office: implementing linguistic and dynamical reconstruction models;
iii) Front Office: mapping framing dynamics in each application field through user-friendly
interfaces for sociologists or a larger audience (end-users at large), enabling them to
build, manipulate and navigate socio-cognitive reconstructions.
[Figure 1 (diagram): textual sources (scientific corpus, blog corpus, press corpus, web data
collection, other arenas) feed three stages: 1. Linguistic Processing (sources, targets and
hedges extraction); 2. Socio-cognitive Modeling (multi-arena socio-semantic networks, multi-level
reconstruction); 3. Issue Mapping (web services for the analysis of public issues, public
dissemination). Interactive tools are co-designed by sociologists and methodologists. Example map
labels: pig farming, green algae, wastewater, economic development, tourism, critical judgment.]
Figure 1: Processing chain of the project. Textual data are collected in every pertinent arena and
processed with advanced NLP tools (Task 1) to build socio-semantic networks, whose dynamics
are modeled (Task 2) so that sociologists or end-users can investigate frame dynamics in each domain
(Task 3).
integrate both dimensions within the same framework. Secondly, they often overlooked
the diversity of arenas where public issues emerge and are being transformed.
The FRAISE consortium gathers computational linguists, complex system scholars,
and social scientists in a unique interdisciplinary endeavor to tackle the challenges of
socio-cognitive dynamics modeling by connecting each required dimension. Our proposal
is innovative for four reasons:
1. A unifying conceptualization of issue dynamics analysis. Frame, or framing, is a
widespread notion used in sociology, media studies and communication research.
In our case, we define framing as the way actors try to make sense of an issue
by structuring its associated concepts. Actors who impose their frames tend
to impose their own perception of the questions on the public. In this respect, our
goal is to define the different frames supported by actors and to describe their
dynamics: how are frames built, set, bridged, aligned, extended and transformed?
3 Spatialization occurs at two levels: node positions are constrained both by their relations to other nodes
in the network and by the higher-level community they belong to.
These steps are not carried out without interactions between those who run the platform and
those who use it. The production of local or global representations of the socio-cognitive
dynamics observed in the various cases under study relies on a bridge
between the modeling effort previously described and the contribution of sociologists, whose
interpretation will enrich the empirical reconstructions.
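The three steps of this chain can be sketched on a toy corpus. This is a minimal illustration, not the platform's implementation: the documents, actor names and terms are invented, and plain co-occurrence counting stands in for the linguistic and dynamical models.

```python
from collections import Counter
from itertools import combinations

# Step i (back office): a hand-made toy corpus standing in for collected documents.
corpus = [
    {"arena": "press", "actors": ["farmers_union", "prefecture"],
     "terms": ["green algae", "pig farming"]},
    {"arena": "science", "actors": ["lab_a"],
     "terms": ["green algae", "nitrates"]},
    {"arena": "blog", "actors": ["ngo_x", "farmers_union"],
     "terms": ["pig farming", "nitrates", "tourism"]},
]


def build_socio_semantic_network(documents):
    """Step ii (middle office): count co-occurrence links within each document.

    Nodes are tagged ("actor", name) or ("term", label), so the same network
    holds actor-actor, actor-term and term-term links.
    """
    edges = Counter()
    for doc in documents:
        actors = [("actor", a) for a in sorted(set(doc["actors"]))]
        terms = [("term", t) for t in sorted(set(doc["terms"]))]
        for u, v in combinations(actors + terms, 2):
            edges[(u, v)] += 1
    return edges


def strongest_links(edges, k=3):
    """Step iii (front office): pick the k heaviest links to display."""
    return sorted(edges.items(), key=lambda item: (-item[1], item[0]))[:k]


edges = build_socio_semantic_network(corpus)
for (u, v), weight in strongest_links(edges):
    print(u, v, weight)
```

Keeping actors and terms in one weighted network is what makes the reconstruction socio-semantic rather than purely social or purely semantic; the interpretation of the resulting links remains the sociologists' task.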
5.2 The nature and meaning of Datasets: Arenas identification and corpus collection and normalization
We gather the datasets stemming from various arenas and set them up in a shared architecture. In
this first phase, sociologists define precisely which sources are pertinent to the public issue
they target. At the moment, we are able to offer analyses of various sources corresponding
to various arenas:
• the scientific arena, which we mainly track through scientific publications (WOS
(Thomson Web of Science), PubMed - Medline, CAB), project databases (Cordis,
NSF) and, if necessary, patent databases (Patstat);
• the media arena, which essentially corresponds to press articles (both online and offline
press). We essentially make use of dedicated databases like Factiva to collect French-
and English-language thematic corpora;
• the legal arena, which is to be investigated through the construction of domain-specific corpora
from legifrance, parlex, eurolex, etc., according to the scope of the issue under study;
• the public opinion arena, which is of course a crucial “space” for all types of domains; here we
perform systematic monitoring of the blogosphere as well as, when necessary, crawls
of specific forums and websites.
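Setting heterogeneous sources up in a shared architecture implies some common record format. The sketch below shows one possible normalization step; the `Record` schema and the input keys (`py`, `authors`, `outlet`, and so on) are assumptions made for illustration and do not reproduce the actual export formats of Web of Science, Factiva or any other database named above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    """Hypothetical shared schema for documents coming from any arena."""
    arena: str
    source: str
    year: int
    actors: List[str] = field(default_factory=list)
    text: str = ""


def from_publication(raw):
    """Map a bibliographic-style dict (assumed keys: db, py, authors, title, abstract)."""
    return Record(
        arena="scientific",
        source=raw.get("db", "unknown"),
        year=int(raw["py"]),
        actors=[a.strip() for a in raw["authors"].split(";") if a.strip()],
        text=(raw["title"] + ". " + raw.get("abstract", "")).strip(),
    )


def from_press_article(raw):
    """Map a press-style dict (assumed keys: outlet, date as YYYY-MM-DD, headline, body)."""
    return Record(
        arena="media",
        source=raw["outlet"],
        year=int(raw["date"][:4]),
        actors=[raw["outlet"]],  # assumption: the outlet is treated as the speaking actor
        text=(raw["headline"] + ". " + raw["body"]).strip(),
    )


records = [
    from_publication({"db": "WOS", "py": "2010", "authors": "Doe J; Roe R",
                      "title": "Nitrates and algae", "abstract": "Toy abstract"}),
    from_press_article({"outlet": "Toy Daily", "date": "2011-07-28",
                        "headline": "Algae on the beach", "body": "Toy body"}),
]
print([(r.arena, r.year, r.actors) for r in records])
```

Once every arena is mapped to the same fields, later steps can compare actors and texts across arenas without knowing where a record came from.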
Beyond the definition of pertinent sources, a strategy is needed for defining the appropriate
perimeter of these corpora. Delineation and extension with lexical or citation-based
strategies should be applied whenever necessary.
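One simple way to picture a lexical extension strategy is the following sketch, which is an assumption-laden illustration rather than the procedure used on the platform: start from seed terms, retrieve the documents that match them, then add to the query the terms that occur in most of the matched documents. The function name `extend_query` and the `min_share` threshold are hypothetical.

```python
from collections import Counter


def extend_query(documents, seed_terms, min_share=0.75):
    """Return (seeds plus frequent co-occurring terms, documents matched by the seeds).

    A term is added when it appears in at least min_share of the matched documents.
    """
    seeds = set(seed_terms)
    matched = [doc for doc in documents if seeds & doc]
    if not matched:
        return seeds, matched
    counts = Counter(term for doc in matched for term in doc - seeds)
    extra = {term for term, c in counts.items() if c / len(matched) >= min_share}
    return seeds | extra, matched


# Each toy document is reduced to its set of index terms.
docs = [
    {"green algae", "nitrates", "pig farming"},
    {"green algae", "nitrates", "tourism"},
    {"nitrates", "water quality"},
    {"wind power", "tourism"},
]

query, matched = extend_query(docs, {"green algae"})
extended = [doc for doc in docs if query & doc]
print(sorted(query), len(matched), len(extended))
```

On this toy data the seed query retrieves two documents, the extension adds a frequently co-occurring term, and the extended query brings in a third document while leaving the unrelated one out. In practice the threshold controls the trade-off between a perimeter that is too narrow and one that drifts away from the issue.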
5.3 The nature and meaning of maps: visualization and user interface
The platform aims at equipping researchers with tools for monitoring issue dynamics. This
aim calls for designing web 2.0 tools that rely as much as possible on open source
libraries. We benefit from previous consortium experience in designing online interfaces to
produce innovative and informative web services. These interactive representations enable
users to circulate easily between micro and macro levels (from specific documents to high-level
general trends), to switch between different arenas, or to focus on either actor or
semantic dynamics. Users are thus able to use a series of representation applications to
analyze the data collected in each arena. These modules, some of which are still under
development, help STS scholars to appraise the dynamics of co-construction from
different viewpoints and at various resolutions: actors/coalitions, terms/frames, static/dynamic,
mono-arena/multi-arena.
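Moving from the terms resolution to the frames resolution amounts to aggregating a fine-grained network into a coarser one. The sketch below assumes that each term has already been assigned to a frame; the edge weights, the frame labels and the function name `aggregate_to_communities` are invented for illustration.

```python
from collections import Counter


def aggregate_to_communities(edges, community_of):
    """Sum node-level edge weights into weights between (or within) communities."""
    macro = Counter()
    for (u, v), weight in edges.items():
        cu, cv = sorted((community_of[u], community_of[v]))
        macro[(cu, cv)] += weight
    return macro


# Micro level: weighted co-occurrence links between terms (toy values).
term_edges = {
    ("green algae", "nitrates"): 3,
    ("nitrates", "pig farming"): 2,
    ("tourism", "beaches"): 4,
    ("green algae", "beaches"): 1,
}
# Hypothetical assignment of each term to a frame.
frames = {
    "green algae": "pollution", "nitrates": "pollution",
    "pig farming": "agriculture",
    "tourism": "economy", "beaches": "economy",
}

macro = aggregate_to_communities(term_edges, frames)
for pair, weight in sorted(macro.items()):
    print(pair, weight)
```

An interface can then display the macro network by default and expand a frame into its terms on demand, which is one way of letting users circulate between levels.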
Equipped with such analytical capacities, researchers can produce and share visualizations
and analyses, and ultimately produce a collaborative interpretation of the socio-political
dynamics at stake. Sociologists, as prime end-users, are naturally deeply committed to the
conceptual design of these tools, and their feedback will help enhance them.
6 Conclusion
Classically, issue frame analysis gives birth to qualitative theoretical models that are well
known in Grounded Theory and situated action theorizing, which bring insightful intuitions
but are not designed to be systematically tested against empirical data. Those technologies
currently accompany many social scientists in the grounded interpretation of their
ethnographical work. At the same time, attempts to appraise scientific and technological
dynamics in society with quantitative analysis have largely remained an open challenge for
two reasons. Firstly, it has always been difficult to produce a sufficiently faithful
representation of the circulation and positioning of actors and their concerns, principally
because such attempts relied on too scarce an analysis of textual traces, or simply because they had to
choose between focusing on actors' interaction networks (SNA studies) or on their concerns
(purely semantic studies), thereby failing to integrate both dimensions within the same framework.
As a result, the sophisticated socio-semantic nature of public issues remains largely
under-exploited and, hence, under-observed. Public controversies during the last decades have
involved science and technology to a large extent: on the one hand, the development of
techno-sciences has raised increasing concerns in terms of collective risks; on the other hand,
innovation is connected to key social issues in domains such as energy, health, food security
and carbon management. Such topics have become essentially political inasmuch as they
deal with possible or disputed futures and are the focus of massive
public and private funding. While this could have remained a classical field of study for
social science, contemporary problems are accompanied by a tremendous amount of
dynamic data due to the proliferation of expertise, public inquiries, audits, think tanks and
web 2.0 discussion platforms: the introduction of modeling, algorithmic and text processing
methods is needed to capture the knowledge dynamics characterizing contemporary issues,
and for social scientists not to remain myopic. We aim to develop a new stream of modeling
techniques to appraise these issues in a heterogeneous way (considering actors and
concepts), with constructed topics (modeling issue frames rather than counting terms), and
able to capture the coupling dynamics between various arenas (who is early, who is influential,
who wins over whom, who is inspired by whom, and which earlier issues an issue derives from).
We expect this kind of integrated approach to bring substantial benefits over existing local
approaches.