Project

Digital History Explorations

Project log

Rik Hoekstra
added 2 research items
Digitization and digital methods have had a big impact on migration history and history in general. The dispersed and fragmented nature of migration heritage, which involves at least two countries and many cultural heritage institutions, makes it clear that migration history can be much improved by using digital means to connect collections. This makes it possible to overcome the biases that policies have introduced in private and public collections alike through selection and perspective. Digital methods are not immune to these biases and may even introduce new distortions because they often change heritage contextualizations. In this article, Van Faassen and Hoekstra argue that digital methods should therefore be embedded in source criticism methodology. They use the example of post-World War II Dutch-Australian emigration to show how a migrant registration system can be used as a structural device to connect migrant heritage. They use methods from computer vision to assess the information distribution of the registration system. Together, connecting collections and information assessments give an encompassing view of migrant visibility and invisibility in the heritage collections, and offer perspectives for scholars to become aware of heritage biases.
The past decades have changed the way we deal with archives and archival materials. Archives digitised their inventories and part of their collections, but they were joined by many other parties who published archival collections and archive-worthy materials on the web. The information world is in continuous flux, and the developments have made clear that archives and libraries have lost much of their position as the vested authorities of information access. They are challenged by technological parties and citizen science that have as yet not established themselves in definitive positions as information brokers. We propose to analyse the field in terms of information authority, a composite of many different aspects that all contribute to its importance, availability and use. In this article, we first explore the issue of (information) authority in the digital realm, and explain why we choose a conflict metaphor to analyse the different types of partners in the information ecosystem. Digital archives call for cooperation and openness as information is ‘everywhere’, but this is hard to realise as it requires translating intentions into technical means. To maintain their position of authority, archives adopt standards and regulations. We argue that openness is the key, but hard to organise with the existing standards because they are used in monolithic ways that make it hard to combine information. Combining information calls for methods from established scholarly and archival disciplines as well as from technology. Furthermore, sharing and cooperation require a harmonisation of contexts: the different contexts in which information is created and organised need to be aligned to understand how collections can be combined across different dimensions. This calls for providing (structured) metadata that define the scope of a collection, to allow one to determine whether combining information is useful. At the moment, the state of affairs is in flux and there is no fixed methodology.
In the last part, we explore ways to facilitate and evaluate interdisciplinary communication and collaboration on methodology to address the preceding challenges of producing quality, in all realms.
Rik Hoekstra
added 8 research items
Structure from Sources - Large collections of sources, such as serial archival fonds, contain a wealth of (historical) information. Digitizing them promises to make this information available for large-scale processing, making it much more accessible than when it existed only on paper. But digitizing is an involved process that goes much further than just scanning and text recognition. For twenty years or more, historians and other humanities scholars alike have struggled with large collections of imperfect texts in which many thousands of mostly unknown entities figure, such as persons, institutions and places. These collections, however, contain structure and structural information of their own that were consciously and unconsciously imposed on them when they were created. If we use these structures and make them explicitly available in our data, we can contextualize the dispersed information and thus make it available for further analysis. In this HuC lecture, examples from both the REPUBLIC and the Migrant, Mobilities and Connection projects will be used to show how this approach can be used.
In the NWO REPUBLIC project, we are creating digital access to the corpus of the Resolutions of the States General of the Dutch Republic (1576-1796). This corpus contains the decisions made in the States General each day over a 220-year period. The resolutions were recorded using a standard structure and contain many standard formulations for aspects of the decision-making process, including the source of the topic that was decided on (a formal request, a missive, etc.), whether a decision was reached and what that decision was. We discuss different techniques we use to identify formulaic expressions and how we iteratively build a corpus-specific phrase model with which we can identify 1) the dates and attendants of each meeting, which are followed by all the resolutions of that day, 2) resolution boundaries, i.e. where they start and stop in the running text, so we know which text belongs to which resolution, 3) different types of opening phrases that correspond to different types of sources (e.g. requests, missives, reports, etc.), and 4) the decision paragraphs that state what decision, if any, was reached. We discuss how we built ground truth to evaluate the phrase model and the fuzzy searching and extraction process. Finally, we discuss how this approach generalised to other corpora and text genres.
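The idea of matching formulaic opening phrases in imperfect text can be illustrated with a minimal sketch. This is not the REPUBLIC project's implementation: the phrase list, the labels, the threshold and the function names below are assumptions made for illustration, and the matching uses only `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher

# Hypothetical phrase model: label -> formulaic opening phrase.
# The phrases and labels are illustrative assumptions, not project data.
PHRASE_MODEL = {
    "request": "is ter vergaderinge gelesen de requeste van",
    "missive": "ontfangen een missive van",
}


def best_window(phrase, text):
    """Slide a window with the phrase's word count over the text and
    return (score, snippet) for the most similar window."""
    words = text.split()
    size = len(phrase.split())
    best = (0.0, "")
    for start in range(max(1, len(words) - size + 1)):
        snippet = " ".join(words[start:start + size])
        score = SequenceMatcher(None, phrase.lower(), snippet.lower()).ratio()
        if score > best[0]:
            best = (score, snippet)
    return best


def classify_opening(text, threshold=0.8):
    """Return (label, score, snippet) for the best-matching phrase,
    or None if no phrase reaches the (assumed) similarity threshold."""
    candidates = []
    for label, phrase in PHRASE_MODEL.items():
        score, snippet = best_window(phrase, text)
        candidates.append((score, label, snippet))
    score, label, snippet = max(candidates)
    if score >= threshold:
        return label, score, snippet
    return None


if __name__ == "__main__":
    # Simulated recognition errors: "q" for "g", "f" for long s.
    noisy = "Is ter Vergaderinqe gelefen de Requefte van Jan Jansz"
    print(classify_opening(noisy))
```

Character-level similarity tolerates a few recognition errors per phrase; in practice the threshold would have to be tuned against ground truth, as the abstract describes.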
Rik Hoekstra
added a research item
The term Macroscope has recently been introduced as an instrument to study historical big data using digital tools. In this paper we argue the need for a more elaborate set of concepts to describe and reason about the interactions to select, enrich, connect, analyse and evaluate historical data using digital tools. Interactions change the data and are essential in understanding any subsequent analysis. It makes them part of historical research methodology, but there is little consensus on how these steps can or should be performed. Moreover, they are rarely reported and discussed. We introduce the term data scope as an instrument encompassing these choices and interactions. Elaborating on these processes encourages deeper reflection on and discussion of the interactions and their consequences for research outcomes.
Rik Hoekstra
added a research item
Objective: To examine whether, over a period of centuries, the Dutch medical establishment enjoyed a survival advantage over a population group with a comparable social background and level of education. Design: Retrospective database research. Method: We used documents which provided data on the births and deaths of 15,649 male and 659 female medical professionals and of 15,304 male clergy. We calculated the remaining life expectancy at the age of 25 of those generations born between the middle of the 16th century and the beginning of the 20th century. We applied event history analysis to estimate remaining life expectancy, conditional on survival to the age of 25. In doing this we applied a Gompertz distribution and used maximum likelihood estimation. Results: From the middle of the 16th century onwards, the development of the life expectancy of medical professionals and clergy was comparable; it was characterised by a continuing increase in remaining life expectancy which was only interrupted in those generations who were confronted with a series of epidemics. The level of the remaining life expectancy was also comparable. Only in the generation born in the first decade of the 20th century did the life expectancy of medical professionals become on par with that of the total male population. The remaining life expectancy of female medical professionals born from 1850 onwards was higher than that of the total female population. Conclusion: For a long time, medical training conferred no advantage on survival.
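As a rough illustration of how a remaining life expectancy at age 25 follows from a Gompertz model, the sketch below integrates the conditional survival curve numerically. It assumes the common parameterisation with hazard h(x) = a·exp(b·x); the parameter values, the maximum age and the function names are illustrative assumptions and are not taken from the study, and the maximum likelihood fitting step itself is not shown.

```python
import math


def gompertz_survival(age, a, b):
    """Survival S(age) = exp(-(a / b) * (exp(b * age) - 1)),
    which follows from the hazard h(x) = a * exp(b * x)."""
    return math.exp(-(a / b) * (math.exp(b * age) - 1.0))


def remaining_life_expectancy(start_age, a, b, max_age=120.0, step=0.01):
    """Trapezoid-rule integral of S(x) / S(start_age) from start_age to
    max_age, i.e. expected further years given survival to start_age."""
    s0 = gompertz_survival(start_age, a, b)
    n = int(round((max_age - start_age) / step))
    total = 0.0
    for i in range(n + 1):
        x = start_age + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * gompertz_survival(x, a, b) / s0
    return total * step


if __name__ == "__main__":
    # Illustrative (assumed) parameters, not estimates from the study.
    a, b = 0.0005, 0.07
    print(round(remaining_life_expectancy(25, a, b), 1))
```

Dividing by the survival at the starting age is what makes the estimate conditional on having reached age 25, so mortality before that age does not affect the result.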
Rik Hoekstra
added a research item
For quite some time, humanities scholars have been using digital tools in addition to their established methodology to try and make sense of large and expanding data sources that cannot be handled with traditional methods alone. The digital methods have computer science aspects that may be combined with but do not readily fit into humanities methodology; an issue which is still too implicit in scholarly debate. This gives rise to a need for methodological consolidation to structure the combination of digital and established humanities methods. In this paper, we propose an approach to such consolidation that we call data scopes.
Rik Hoekstra
added a research item
Preprint of an article for the conference volume on the Registers of the Counts of Holland from the House of Hainaut (Henegouwen).
Rik Hoekstra
added a research item
Data scopes are a concept for dealing with composite data in a humanities context. With data scopes we want to contribute to methodological reflection on, and consolidation of, the collection of methods that many in the humanities already use (often in the form of tools) in addition to the existing methods. Data scopes arise from research questions and are formed during the research, in interaction between the researcher and their raw data. We distinguish four different activities in research with data scopes: modelling, normalisation, linking and categorising. With data scopes we aim to ensure that data processing is recognised as an essential and inseparable part of the research, rather than a necessary technical process after which the real research can begin, and to bring more transparency to the production of research that is based on the processing and analysis of large and complex quantities of data.
Rik Hoekstra
added 5 research items
From: Eef Dijkhof, Michel van Gent (eds.), Uit diverse bronnen gelicht. Opstellen aangeboden aan Hans Smit ter gelegenheid van zijn vijfenzestigste verjaardag (The Hague, 2007), 117-131.
Dataset with Spanish Habsburg viceroys between 1500 and 1700, consisting of three tables in CSV format with data about persons, functions and occupation of posts (with begin and end year). Sources: The lists of viceroyalties were compiled into a database from a large number of printed and online sources. The most important are: http://grandesp.org.uk (now only available from the Internet Archive, archive.org); http://fmg.ac/Projects/MedLands/SPANISH%20NOBILITY%20LATER%20MEDIEVAL%202.htm; Mendoza, Poderosas señores, http://www.uam.es/personal_pdi/ciencias/depaz/mendoza/monde2.htm; http://geneall.net/; http://remilitari.com/; http://tercios.org/, http://web.archive.org/web/20151022021933; http://www.fundacionmedinaceli.org/casaducal (all retrieved December 20, 2015). Printed sources include the lists in RUIZ RODRIGUEZ 2007, SCHAEFER 2003, HANKE AND RODRIGUEZ 1976-1978 and HANKE AND RODRIGUEZ 1978-1980.
Rik Hoekstra
added a research item
Digital publications are suited not only to looking something up quickly or to serving as a collection of material, but also to quite different kinds of argument. A digital source publication therefore has multiple levels. Using a number of examples from earlier research, I want to illustrate how several of the publications among the Huygens ING resources are usable beyond the immediate field for which they were made.