Article · PDF Available

A Framework for Web Science

Authors: Tim Berners-Lee, Wendy Hall, James A. Hendler, Kieron O'Hara, Nigel Shadbolt, Daniel J. Weitzner

Abstract

This text sets out a series of approaches to the analysis and synthesis of the World Wide Web, and other web-like information structures. A comprehensive set of research questions is outlined, together with a sub-disciplinary breakdown, emphasising the multi-faceted nature of the Web, and the multi-disciplinary nature of its study and development. These questions and approaches together set out an agenda for Web Science, the science of decentralised information systems. Web Science is required both as a way to understand the Web, and as a way to focus its development on key communicational and representational requirements. The text surveys central engineering issues, such as the development of the Semantic Web, Web services and P2P. Analytic approaches to discover the Web’s topology, or its graph-like structures, are examined. Finally, the Web as a technology is essentially socially embedded; therefore various issues and requirements for Web use and governance are also reviewed.
... The concept of SW has been proposed for smoothly exchanging and reusing information among large information silos [52]. The main idea is to extend the potency of the Web with an analogous extension of the human cognitive process [52,53]. SW consists of four layers: the syntax layer (XML, URI, and Unicode), the semantic layer (ontology and RDF), the provenance layer (rule, logic, proof, and trust), and the application layer [53]. ...
... SW consists of four layers: the syntax layer (XML, URI, and Unicode), the semantic layer (ontology and RDF), the provenance layer (rule, logic, proof, and trust), and the application layer [53]. ... Note that other methods can also model chaotic data points such as surface roughness heights: Markov chain, DNA-based computing, non-stationary Gaussian process, semantic modeling, and the like [49]. ...
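The layer decomposition quoted above can be made concrete with a small example. Below is a minimal sketch, assuming Python and the rdflib library; the namespace, class, and instance names are illustrative assumptions, not taken from the cited works.

```python
# Minimal sketch of the SW stack: the semantic layer (RDF triples with a small
# RDFS vocabulary) sitting on the syntax layer (URIs plus a Turtle/XML
# serialisation). All names and URIs are illustrative assumptions.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/")  # hypothetical namespace

g = Graph()
g.bind("ex", EX)

# Semantic layer: describe a class and an instance as RDF triples.
g.add((EX.SurfaceRoughnessDataset, RDF.type, RDFS.Class))
g.add((EX.dataset1, RDF.type, EX.SurfaceRoughnessDataset))
g.add((EX.dataset1, RDFS.label, Literal("Milling roughness profile")))

# Syntax layer: the same graph serialised as Turtle; format="xml" would give
# the RDF/XML serialisation instead.
print(g.serialize(format="turtle"))
```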
Article
Full-text available
In smart manufacturing, human-cyber-physical systems host digital twins and IoT-based networks. The networks weave together manufacturing enablers such as CNC machine tools, robots, CAD/CAM systems, process planning systems, enterprise resource planning systems, and human resources. The twins work as the brains of the enablers; that is, the twins supply the required knowledge and help the enablers solve problems autonomously in real time. Since surface roughness is a major concern of all manufacturing processes, twins that solve surface roughness-relevant problems are needed. The twins must machine-learn the required knowledge from the relevant datasets available in big data. Therefore, preparing surface roughness-relevant datasets for inclusion in human-cyber-physical-system-friendly big data is a critical issue. However, preparing such datasets is a challenge due to the lack of a steadfast procedure. This study sheds some light on this issue. A state-of-the-art method is proposed to prepare such datasets for surface roughness, wherein each dataset consists of four segments: semantic annotation, roughness model, simulation algorithm, and simulation system. These segments provide input information for a digital twin's input, modeling, simulation, and validation modules. The semantic annotation segment boils down to a concept map. A human- and machine-readable concept map is thus developed, in which the information of the other segments (roughness model, simulation algorithm, and simulation system) is integrated. The delay map of surface roughness profile heights plays a pivotal role in the proposed dataset preparation method. The successful preparation of datasets for surface roughness underlying milling, turning, grinding, electric discharge machining, and polishing shows the efficacy of the proposed method. The method will be extended to other manufacturing processes in the next phase of this study.
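The delay map mentioned above is, in essence, a pairing of each profile height with its delayed successor. The following is a minimal sketch with synthetic data, assuming Python and numpy; the profile and the delay value are assumptions, since the paper's datasets are not reproduced here.

```python
# Minimal sketch of a delay map for surface roughness profile heights:
# each height z[i] is paired with its delayed successor z[i + d].
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(loc=0.0, scale=1.0, size=500)  # synthetic roughness heights

d = 1  # delay; the cited work's actual delay choice is not specified here
pairs = np.column_stack((z[:-d], z[d:]))  # the (z_i, z_{i+d}) points

print(pairs[:5])
```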
... This new evolution of the web is called the "Semantic Web." The Semantic Web is an extension of the current World Wide Web, adding semantic descriptions and ontologies (Berners-Lee et al., 2006). One benefit is that such characterization and modeling help provide additional meaning to web content, making it machine-understandable (Berners-Lee et al., 2001). ...
... The Web is often described using metaphors that compare it to an evolving ecosystem. For example, Berners-Lee et al. (2006b) suggest that "exploring the metaphor of 'evolution' may help us to envisage the Web as a populated ecology, and as a society with the usual social requirements of policies and rules". Throughout this thesis, I will try to show that evolution offers not only a metaphor, but also a meaningful theoretical and methodological framework for understanding the development of the Web. ...
Thesis
This thesis explores the evidence behind popular narratives regarding the development of the Web industry over time. Topics such as the rate of technological growth, generations in Web technology and the emergence of novel technological species are conceptualised here using theoretical perspectives from the fields of economic history, innovation studies and cultural evolution. These fields share an interest in applying the evolutionary principles of variation, selection and transmission to technological change. Each principle is investigated here by means of empirical investigations into the temporal patterns of Web innovation, the adoption of technical standards and the transmission of knowledge through time. The methodological approach is based on an original longitudinal dataset of 20,493 US patents related specifically to the Web, used to trace the history of this industry between 1990 and 2013. Quantitative analyses revealed that innovation in the Web industry in some ways conformed to, and in other ways deviated from, theoretical models of technology growth. Areas of consistency include an initial S-shaped trajectory of corporate innovation that aligned with stock market movements. Associations like this have previously been observed in other technological revolutions. The unique aspects of Web evolution relate mainly to its continued growth beyond the expected ceiling of the S-curve. It was found that this extension can be partly attributed to firms that adopted interactive Web 2.0 applications such as social networks, blogs, wikis and RSS feeds. Moreover, Web 2.0 firms were continuing to adopt core Web standards that had been established earlier. It appears that standardisation played a role in the long-term evolution of the Web industry by providing a means for knowledge to be conserved, transmitted and combined in new ways. The thesis concludes with implications for researchers, managers and policy makers with a view to understanding and fostering sustainable long-term innovation. Specific recommendations are also provided to support the future expansion of Web technology into the emerging fields of data science and AI.
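The S-shaped trajectory referred to above is commonly modelled with a logistic growth curve. As a hedged sketch of how such a fit might be run, the following Python code fits a logistic function to invented yearly cumulative patent counts; the thesis's actual data are not reproduced here.

```python
# Sketch: fit a logistic (S-shaped) curve to cumulative yearly patent counts.
# The synthetic data below are invented for illustration only.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    # Cumulative count with ceiling K, growth rate r, and midpoint year t0.
    return K / (1.0 + np.exp(-r * (t - t0)))

years = np.arange(1990, 2014, dtype=float)
rng = np.random.default_rng(1)
counts = logistic(years, 20000, 0.5, 2001) + rng.normal(0, 300, years.size)

params, _ = curve_fit(logistic, years, counts, p0=(20000.0, 0.5, 2000.0))
K, r, t0 = params
print(f"fitted ceiling K={K:.0f}, rate r={r:.2f}, midpoint year t0={t0:.1f}")
```

Growth that keeps rising past the fitted ceiling K, as the thesis reports for the Web industry, would show up here as systematic residuals above the curve in the later years.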
... As a first step towards adapting Regulatory Circuits to the analysis of our case study, we successfully integrated the data used to build TF-gene regulatory networks into a unique structured graph (28), thereby facilitating the re-use of this resource. More precisely, in this previous work we used Semantic Web technologies, a generic data and knowledge integration framework (29,30), to generate a unique RDF dataset that can be queried with dedicated SPARQL queries. This allowed us to recompute the TF-gene relations published by the Regulatory Circuits project. ...
Preprint
Full-text available
Motivation: Transcriptional regulation is performed by transcription factors (TF) binding to DNA in context-dependent regulatory regions and determines the activation or inhibition of gene expression. Current methods of transcriptional regulatory network inference, based on one or all of TF, region, and gene activity measurements, require a large number of samples for ranking the candidate TF-gene regulation relations, and rarely predict whether they are activations or inhibitions. We hypothesize that transcriptional regulatory networks can be inferred from fewer samples by (1) fully integrating information on TF binding, gene expression and regulatory region accessibility, (2) reducing data complexity and (3) using biology-based logical constraints to determine the global consistency of the candidate TF-gene relations and qualify them as activations or inhibitions. Results: We introduce Regulus, a method which computes TF-gene relations from gene expressions, regulatory region activities and TF binding sites data, together with the genomic locations of all entities. After aggregating gene expressions and region activities into patterns, the data are integrated into an RDF endpoint. A dedicated SPARQL query retrieves all potential relations between expressed TFs and genes involving active regulatory regions. These TF-region-gene relations are then filtered using a logical consistency check translated from biological knowledge, which also allows them to be qualified as activations or inhibitions. Regulus compares favorably to the closest network inference method, provides signed relations consistent with public databases and, when applied to biological data, identifies both known and potential new regulators. Altogether, Regulus is devoted to transcriptional network inference in settings where samples are scarce and cell populations are closely related. Regulus is available at https://gitlab.com/teamDyliss/regulus
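To illustrate the retrieval step described above, here is a minimal sketch of a SPARQL query over an RDF graph linking expressed TFs, active regions, and genes, using Python's rdflib. The vocabulary (ex:expressed, ex:binds, ex:active, ex:regulates) is hypothetical, not the actual Regulus schema.

```python
# Hedged sketch: retrieve candidate TF-gene relations that pass through an
# active regulatory region. The vocabulary is hypothetical, not Regulus's.
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:TF1     ex:expressed true ;
           ex:binds     ex:region1 .
ex:region1 ex:active    true ;
           ex:regulates ex:geneA .
""", format="turtle")

query = """
PREFIX ex: <http://example.org/>
SELECT ?tf ?gene WHERE {
    ?tf     ex:expressed true ;
            ex:binds     ?region .
    ?region ex:active    true ;
            ex:regulates ?gene .
}
"""
for tf, gene in g.query(query):
    print(tf, "->", gene)
```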
Thesis
This thesis makes an original contribution to knowledge by demonstrating why it is important to widen our understanding of contemporary political participation to incorporate digital activism and clicktivism, particularly with regard to access and inclusion of a wider range of voices and opinions outside of those who already have access to mainstream political platforms of communication. Existing debates within political science on alternative forms of political participation are limited by comparing them to traditional politics, organisations and processes and ranking them accordingly as legitimate or illegitimate forms of political participation. What is not considered in these debates is that women, particularly feminists, are marginalised from male-dominated political structures, which delimit participation within the bounds of traditional politics. In this thesis, I evidence the significance of feminist digital activism and clicktivism as a means of lowering the barriers to create an inclusive definition of political participation. By taking an interdisciplinary approach, this thesis draws on debates within literature from three fields: web science, political participation and feminist activism. The intersection of these literatures reveals a new perspective on the contested concept of political participation, and on the motivations for, and impact of, labelling digital activism as a form of contemporary political participation, unconstrained by borders, boundaries and citizenship. Accordingly, Twitter is the object of analysis for this qualitative investigation, and the specific characteristics and practices that are unique to this platform merit a study of its own, which is currently missing in the literature. Digital feminist activism is explored as a form of political participation through an ethnographic study of feminist activists' use of Twitter, which demonstrates that instances such as the #MeToo moment in 2017 can raise societal awareness about pertinent issues, which effects political and social change. Drawing concepts from the literature on digital activism, political participation and feminist activism creates the conceptual lens for analysing the empirical data gathered through undertaking a range of semi-structured interviews with feminist activists from Australia, Aotearoa New Zealand, Europe and the United States. The feminist Twitter community was observed as part of the ethnographic study during the year-long interview window, which allowed the researcher to examine feminist activists' communication, action and connection practices. Further, interview respondents were identified and recruited on Twitter during this observation process. Feminist activists are inherently political; the actions they take, and who they communicate with and connect to, are practices shaped by Twitter's distinct characteristics, which enable feminist activists to interact and connect with geographically dispersed feminists, broadening access to information, resources, and knowledge. A tweet can challenge and critique a sexist headline when it directly addresses the journalist who penned the article and mentions the mainstream media company that published it: I evidence that it is not merely easy, disposable and inconsequential. I argue that clicktivism is a form of digital activism, which enables an individual to be political and to participate.
Further, clicktivist practices, such as using a hashtag to contribute to large-scale action, are easily replicated, which essentially is what makes this form of digital activism so significant.
Chapter
Full-text available
This chapter explores the interactions of 20 individual minority-language users in social media. It asks how they are motivated to communicate in these languages, and to what extent their motivations are ideological. Drawing on theories of innovation and psychology, three categories of motivations are established: intrinsic, self-determined extrinsic, and externally determined extrinsic motivations. Practices driven by self-determined extrinsic motivations were most directly aimed at protecting or promoting minority languages. The other categories of motivations drove a range of relevant practices, although some participants explicitly denied language-ideological goals. The chapter contributes to a theoretical understanding of how motivations can interrelate with specific contexts to produce socially innovative outcomes.
Article
Abstract: Surveys on the role of archives, libraries, and museums for democracy in the digital age, conducted in the European ALMPUB project, show that these information and cultural heritage institutions enjoy an extraordinarily high level of public trust compared with other societal institutions. This article works out the consequences of this observation and concludes that information institutions in the broad sense bear a high degree of social responsibility. It is noted that the complementary topic of "trust and responsibility" still presents research and reflection desiderata in information science.
Article
Full-text available
The paper proposes a novel framework capable of establishing machine-to-machine (M2M) interactions between chemical and electrical systems in industry. The framework, termed ElChemo, addresses the challenges of M2M interaction between entities from different silos, such as differences in the domains' behaviour and the heterogeneities arising from different vocabularies and software. The OntoTwin ontology has been developed based on the OntoPowSys and OntoEIP ontologies, which are parts of an intelligent platform called the "J-Park Simulator (JPS)". The ElChemo framework uses Description Logic (DL) and SPIN reasoning techniques to establish the interaction between the chemical and electrical systems in a plant. This paper presents a depropaniser section of a chemical plant and its corresponding electrical system as a use-case scenario to demonstrate the interoperability between the two silos within the ElChemo framework. The results from the use case demonstrate, as a proof of concept, the potential of the proposed framework and can be considered a first step towards the development of a knowledge-graph-based framework capable of increasing interoperability in cross-domain interactions.
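As a rough illustration of the DL-based approach described above, the following sketch uses Python's owlready2 to define two cross-silo classes, link an instance of each, and invoke a DL reasoner. The class and property names are assumptions, not the actual OntoTwin vocabulary.

```python
# Hedged sketch of cross-domain (chemical/electrical) linking with DL
# reasoning, in the spirit of ElChemo/OntoTwin. Names are illustrative.
from owlready2 import get_ontology, Thing, ObjectProperty, sync_reasoner

onto = get_ontology("http://example.org/ontotwin-sketch.owl")  # hypothetical IRI

with onto:
    class ChemicalEquipment(Thing):
        pass

    class ElectricalLoad(Thing):
        pass

    class drawsPowerFrom(ObjectProperty):
        domain = [ChemicalEquipment]
        range = [ElectricalLoad]

# Instances bridging the two silos: a depropaniser reboiler drawing power
# from a bus in the electrical network.
reboiler = ChemicalEquipment("depropaniser_reboiler")
bus = ElectricalLoad("bus_7")
reboiler.drawsPowerFrom = [bus]

# Run a DL reasoner (HermiT, bundled with owlready2; requires a local Java
# runtime) to check consistency and infer property values.
sync_reasoner(infer_property_values=True)
```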
Article
Full-text available
Social machines are a prominent focus of attention for those who work in the field of Web and Internet science. Although a number of online systems have been described as social machines (examples include Facebook, Twitter, Wikipedia, Reddit, and Galaxy Zoo), there is, as yet, little consensus as to the precise meaning of the term "social machine." This presents a problem for the scientific study of social machines, especially when it comes to the provision of a theoretical framework that directs, informs, and explicates the scientific and engineering activities of the social machine community. The present paper outlines an approach to understanding social machines that draws on recent work in the philosophy of science, especially work in so-called mechanical philosophy. This is what might be called a mechanistic view of social machines. According to this view, social machines are systems whose phenomena (i.e., events, states, and processes) are explained via an appeal to (online) socio-technical mechanisms. We show how this account is able to accommodate a number of existing attempts to define the social machine concept, thereby yielding an important opportunity for theoretical integration.
Book
Efficient access to data, sharing data, extracting information from data, and making use of the information have become urgent needs for today's corporations. With so much data on the Web, managing it with conventional tools is becoming almost impossible. New tools and techniques are needed to provide interoperability and warehousing between multiple data sources and systems, and to extract information from the databases. XML Databases and the Semantic Web focuses on critical and new Web technologies needed for organizations to carry out transactions on the Web, to understand how to use the Web effectively, and to exchange complex documents on the Web. This reference for database administrators, database designers, and Web designers working in tandem with database technologists covers three emerging technologies of significant impact for electronic business: the Extensible Markup Language (XML), semi-structured databases, and the Semantic Web. The first two parts of the book explore these emerging technologies. The third part highlights the implications of these technologies for e-business. Overall, the book presents a comprehensive view of critical technologies for the Web in general and XML in particular. The Semantic Web, XML, and semi-structured databases are still relatively new technologies that integrate many other technologies. As these technologies mature and become better integrated, we can expect to see further progress towards the Semantic Web. The information contained in XML Databases and the Semantic Web is essential to the future success of effective e-business on the Web.
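As a small, concrete instance of the information-extraction theme above, the following sketch parses a semi-structured XML business document with Python's standard library; the document schema is an illustrative assumption, not one drawn from the book.

```python
# Minimal sketch: extract fields from a semi-structured XML order document.
import xml.etree.ElementTree as ET

doc = """
<order id="42">
  <customer>Acme Corp</customer>
  <item sku="X-100" qty="3"/>
  <item sku="Y-200" qty="1"/>
</order>
"""

root = ET.fromstring(doc)
print("order", root.get("id"), "for", root.findtext("customer"))
for item in root.findall("item"):
    print("item", item.get("sku"), "x", item.get("qty"))
```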