Webraries and Web Archives – The Web Between Public and Private

Since the mid-1990s the World Wide Web (or simply the Web) has become a cornerstone of our communicative infrastructure, and large portions of our individual and societal lives cannot be fully understood without adding the Web to the equation. However, the Web disappears at an unprecedented pace compared to other media types, which is why national Web archives have been established to collect and preserve this part of the cultural heritage. But the term 'Web archive' obscures the fact that, by and large, Web archives are not, strictly speaking, archives but rather libraries, since they preserve what has been made publicly available and not the private Web (personal email correspondence, personal social media profiles, companies' intranets). Therefore, a national Web archive with a remit to preserve the national Web should rather be called a 'Webrary', that is, 'a collection of Web publications'. This chapter argues that what is at stake here is not only a terminological issue. Rather, the advent of the Web is challenging and blurring the fundamental distinction between archives and libraries that has prevailed for centuries. © 2017 N. Brügger. Published by Elsevier Ltd. All rights reserved.

... For these, and for ethical and legal reasons, web archives really only reflect the publicly accessible, or 'open', web. Whilst those doing the web archiving have come up with an array of creative workarounds to potential problems, there are limits. The migration of content from sites and blogs to platform environments is a key challenge. ...
Contemporaneous collecting of the publicly available web has provided researchers with an invaluable source with which to interpret various aspects of the recent past. With millions of websites gathered, stored and made accessible in national web archives over the past 25 years, this paper argues for the need to reflect upon, and respond to, the biases, inequalities and silences that exist in these vast repositories. This article presents a research agenda for web archivists and web historians to together think broadly about the social, material and technical dimensions that shape what is included in web archives, and what is excluded. A key challenge impacting this effort is that various complexities and contingencies of archival formation are obscured. These include wider social inequalities, the entanglement of human and machine decision-making in the archiving process, changing dynamics of power over information online and the environmental impact of technical systems. Accounting for these social, material and technical factors that shape the formation of web archives provides opportunities to develop and use archives in ways that better acknowledge both the strengths and limitations of national web archives as a proxy for the web’s past.
... Because social media platforms are built from software developed by private companies, they are able to shape their content in whatever way their engineers decide (Brügger 2017; Rosenzweig 2001). This can result in extremely different representations of social media data and significant challenges for digital preservation professionals seeking to document and manage an ever-increasing number of proprietary data standards. ...
In this article, we explore the long-term preservation implications of application programming interfaces (APIs) which govern access to data extracted from social media platforms. We begin by introducing the preservation problems that arise when APIs are the primary way to extract data from platforms, and how tensions fit with existing models of archives and digital repository development. We then define a range of possible types of API users motivated to access social media data from platforms and consider how these users relate to principles of digital preservation. We discuss how platforms’ policies and terms of service govern the set of possibilities for access using these APIs and how the current access regime permits persistent problems for archivists who seek to provide access to collections of social media data. We conclude by surveying emerging models for access to social media data archives found in the USA, including community-driven not-for-profit community archives, university research repositories, and early industry–academic partnerships with platforms. Given the important role these platforms occupy in capturing and reflecting our digital culture, we argue that archivists and memory workers should apply a platform perspective when confronting the rich problem space that social platforms and their APIs present for the possibilities of social media data archives, asserting their role as “developer stewards” in preserving culturally significant data from social media platforms.
This textbook is the first in Russia intended to present the fundamentals of working with web archives in historical research. It is designed for in-depth study of web archives as historical sources and of the possibilities for using them in research. The chapters of the book show the specifics of web history as an interdisciplinary research field, describe the process of web archiving, demonstrate the influence of web archives on the formation of a new type of historical source, present a brief overview of historical studies carried out using web archive resources, and examine the tools and methods for conducting research in the field of web history. The textbook is intended for students of history and for researchers studying the social, cultural, economic and political history of the present day, as well as the history of information technology, the Internet and the World Wide Web. It will also be useful to students and specialists in the social sciences and humanities who use Internet and web archive resources in their professional work.
A second presidential social media transition in the United States occurred as Joe Biden took office on January 20, 2021. In the years since Barack Obama pioneered the use of platforms like Facebook and Twitter while President, Donald Trump shaped his Presidency around the use of Twitter, primarily through a personal account created before entering politics. In this paper, we examine Donald Trump's use of Twitter during his presidency as a lens through which to understand the ongoing archival preservation and data management challenges posed by social media platforms. The blurred lines between public and private records, deleting tweets, and the preservation issues that appeared after his suspension from Twitter and other platforms following the January 6, 2021 insurrection at the US Capitol all highlight an urgent, ongoing need by archivists, digital preservationists, and information scholars to consider how we might collect and manage social media records in an ever‐changing information landscape. This paper draws primarily on publicly available information from existing preservation initiatives to analyze the state of digital preservation for presidential records. Our findings highlight how both public and private entities manage and provide access to Donald Trump's tweets, pointing to broader implications for social media data preservation.
Web Archiving and Archiving Strategies; A Brief History of Web Archiving; The Archived Web Document; Web Philology and the Use of Archived Web Material; The Future of Web Archiving; References
Since its beginnings in 1995, the Internet Archive has pursued the long-term goal of providing universal access to all knowledge, within our lifetime.
Brügger, N. (2012). Web History and the Web as a Historical Source. Zeithistorische Forschungen, 9(2), 316-325.