9 May 2022 - 11 May 2022 9:00am-5:00pm Alison Richards Building, Cambridge


May 9-11, 2022

A digital humanities hackathon organised by CDH in collaboration with Cambridge University Libraries.


Timings: 9am-5pm each day
Location: Cambridge (in-person)
Eligibility: University of Cambridge staff and students only

Places are limited. Deadline to apply extended to March 31, 2022.

Apply here:

(In)Accessible Archive(s) Hackathon is an exciting opportunity to explore novel visualisation and alternative experiences of digital collections. We’re inviting people from any discipline and with a wide range of skill sets to join us. We seek new ways to break open the inaccessible archive and bring a rich experience of inaccessible knowledge into the open.

Digital collections are situated within a complex and often contested realm of accessibility and inaccessibility. Whether they are digitised from physical collections or born digital, individuals and groups can struggle to access and understand them. The form in which they are presented, the language used to describe them and the circumstances in which they were collected are some of the issues that may play a role. What can we do to bring out the inaccessible worlds, places and spaces of privilege and history that are documented within?

We offer support from experts in collections and digital methods from Cambridge University Library and Cambridge Digital Humanities. We also make some suggestions for topics and datasets, but we encourage and welcome you to develop your own project.

(In)Accessible Archive(s) Hackathon will be an intensive, creative experience. The goal is to prototype a visualisation, or other form of expression, presented on the web, that explodes some aspect of the data for a new audience. The Hackathon will be based around three themes, which are detailed below, and which provide example questions and projects you could work on during the sessions.

Who are we looking for?

Cambridge University staff, students and researchers, from any discipline! We’re looking for developers, designers, creatives, archivists, museum professionals, historians, scientists, linguists, as well as anyone with an interest in historical texts to join us.

Further information

The Hackathon will be in-person and you will need to provide your own laptop. Lunch will be provided each day.

Please get in touch with any questions or access requirements:

Hackathon Themes

Theme 1: (Re)Gaining Access

In this theme you will explore accessibility in digital collections. While the digitisation of physical artefacts is one step in improving access to archival materials, much more is often needed to make an archive accessible. How can we improve access for a range of users with different requirements, or develop new approaches that support meaningful engagement with what is contained within? How does the metadata attached to archive content dictate access? Given the diverse range of collection materials and sources, there is a risk that the language used in describing the content in our data could prevent/challenge cross-discovery if there are pockets of nuance in data creation or different cultural approaches.

Example questions

  • How does the metadata attached to archive content dictate access?
  • How does the way we describe places/time in the metadata affect our ability to make connections?
  • How can we negotiate different languages and scripts in a dataset – can it be done automatically or do we need to record different versions in the metadata?

Example projects

  • Generating alt-text descriptions for visual content
  • Creating user views that are suitable for colour-blind users
  • Exploring limitations and omissions in existing content description data
  • Feeding IIIF manifests into an ImageGraph tool and analysing the colour information contained in the images, with a view to seeing whether images that might be challenging for colour-blind users could be flagged (red and green are often used with meaning in medieval manuscripts, for example)
  • Inputting a IIIF manifest to ImageGraph, creating a machine-generated description or caption, and seeing whether it could be added back into the IIIF manifest as “alt-text” that can be read by screen readers
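
As a starting point for the colour-analysis idea above, here is a minimal sketch in Python of one way to flag images that rely on both red and green. It is an illustration only, not the method used by ImageGraph or any Cambridge tool: the function names, the hue ranges and the 5% threshold are all assumptions, and the input is simply a list of (r, g, b) pixel values that you would need to obtain from an image yourself.

```python
import colorsys


def hue_class(rgb, min_saturation=0.4, min_value=0.3):
    """Label one (r, g, b) pixel (0-255 channels) as 'red', 'green' or None.

    The hue ranges and cut-offs are illustrative assumptions, not a standard.
    """
    r, g, b = (channel / 255 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    if s < min_saturation or v < min_value:
        return None  # too grey or too dark to count as a meaningful colour
    degrees = h * 360
    if degrees < 20 or degrees >= 340:
        return "red"
    if 80 <= degrees < 160:
        return "green"
    return None


def red_green_shares(pixels):
    """Return the fraction of pixels labelled red and the fraction labelled green."""
    counts = {"red": 0, "green": 0}
    total = 0
    for rgb in pixels:
        total += 1
        label = hue_class(rgb)
        if label is not None:
            counts[label] += 1
    if total == 0:
        return {"red": 0.0, "green": 0.0}
    return {label: count / total for label, count in counts.items()}


def needs_review(pixels, min_share=0.05):
    """Flag an image when both red and green cover at least min_share of it."""
    shares = red_green_shares(pixels)
    return shares["red"] >= min_share and shares["green"] >= min_share
```

An image flagged by `needs_review` is only a candidate for a human check: the sketch cannot tell whether the red and green actually carry meaning on the page.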

Example datasets

Theme 2: (Un)Travelled Places

In this theme you will explore digital collections that document places and spaces where access is not necessarily straightforward, either physically, politically or ethically. These collections document routes less travelled, places that might no longer exist, or sites that can no longer be visited. The knowledge they contain might be esoteric, exotic, ephemeral, and seen often through the eyes of outsiders.

Example questions

  • What has been preserved and what has been lost?
  • What points of view are privileged and what are ignored or missing?
  • What methods can we use for exploring spatial and temporal narratives within the archive?
  • How can the archive be made accessible to those who live (or used to live) in the places that are documented?
  • What value is there for current political or global challenges (e.g. war, climate change) in studying these archives?
  • What role does the travel writer or academic researcher play in writing histories from a particular point of view?

Example projects

  • An interactive map showing how and why a place transformed over time
  • An experience of forbidden, ‘wild’ or inaccessible places, or a critique of those concepts
  • Automated image classification, labelling or analysis of photographs with little metadata
  • Linked open data visualisation of species or archaeological sites and monuments
  • Topic modelling or sentiment analysis of language used to describe ‘foreign’ places

Example datasets

  • Landscapes and monuments: Iran to Spain 1977-2007

    • A photographic archive of monuments and landscapes, including documentation of Palmyra, Aleppo and Yemen, which have since been transformed through war, urbanisation and dam-building. Fowden is a historian of first millennium CE Eurasia.
  • Joseph Needham’s archive of wartime China 1943-46

    • A collection of travel journals and photographs documenting sites of historic interest, Buddhist grottoes and the history of Chinese science, technology and medicine. The famous biochemist travelled as a representative of the British Council.
  • Oliver Rackham’s notebooks 1960-2015

    • A collection of notebooks kept continuously from Rackham’s youth right up until his death, including notes, maps, sketches and photographs of plants, woodlands and landscapes from Britain and internationally. Rackham was a prolific writer and historical ecologist who had a special interest in historic woodlands.
  • Visions of Italy between XIX and XX century

    • A collection of travel writings (reports, diaries, letters and guidebooks) about Italy written by native English authors and published between the country’s unification and the beginning of the 1930s.

Theme 3: (Un)Privileged Access

In this theme you will work with datasets that document debates and decision-making processes that impact on public life, but where participation may require privileged access rights. The Hansard Speeches dataset records parliamentary debates, where elected MPs set the agenda for public priorities. But how have different topics been defined and ascribed value in political speeches? And how closely do the elected represent the people? For example, in 1980 only 3% of MPs in the UK were women. How did the increasing participation of women influence what items made it onto the agenda or how they were talked about?

Particularly when working with textual data, finding ways to discover, summarise and visualise content is important in supporting sensemaking processes. In this sub-theme you will look at patterns and trends in language use to help interpret the wider socio-political issues present in the data.

Example questions

  • What topics are most dominant, contested or divisive?
  • What narratives are constructed using “evidence” and expert opinions?
  • How are facts presented and utilised in arguments?
  • Did the participation of women MPs change as numbers increased, and what impact did this have on agenda setting and the topics discussed?
  • How is the role of expert performed and in what ways is this captured in the language used?

Example projects

  • A visualisation of sentiment and topic composition of the data
  • An interface to facilitate argument exploration, or for exploring the use of subjective versus objective language, in debates.
  • A visualisation of the changing participation of women in British politics (e.g. number of words spoken, topics, keywords), including points in time where the number of women MPs suddenly increased (e.g. 1997 election).
  • A geographical representation of the prominent concerns in different regions.
  • A visualisation of the coverage and narrative strategies used in relation to one or more topics (e.g. climate change or art vs science) or comparing different countries.
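
To give a flavour of the "number of words spoken" idea above, here is a minimal sketch in Python that totals words per year and speaker gender from a CSV file and works out each year's share. The column names (`year`, `gender`, `speech`) and the gender codes are assumptions made for illustration and may not match the actual datasets, and the sample rows are invented toy data, not real speeches.

```python
import csv
import io
from collections import defaultdict


def words_by_year_and_gender(csv_file, text_col="speech", year_col="year",
                             gender_col="gender"):
    """Sum whitespace-separated word counts per year and gender.

    Column names are hypothetical defaults; adjust them to the real file.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(csv_file):
        totals[row[year_col]][row[gender_col]] += len(row[text_col].split())
    return totals


def share_of_words(totals, gender):
    """Return, for each year, the fraction of all words spoken by one gender."""
    shares = {}
    for year, by_gender in sorted(totals.items()):
        all_words = sum(by_gender.values())
        shares[year] = by_gender.get(gender, 0) / all_words if all_words else 0.0
    return shares


# Invented toy rows, used only to show the shape of the input.
SAMPLE = """year,gender,speech
1996,M,one two three four five six
1996,F,one two
1997,M,one two three four
1997,F,one two three four
"""

totals = words_by_year_and_gender(io.StringIO(SAMPLE))
shares = share_of_words(totals, "F")
for year, share in shares.items():
    print(year, f"{share:.0%}")
```

The per-year shares can then be passed to whichever charting tool you prefer. Counting words by splitting on whitespace is a deliberate simplification; a real project might prefer a proper tokeniser.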

Example datasets

  • Hansard Speeches and Sentiment V3.0.1:

    • A dataset of speeches of ten words or more made in the House of Commons of the UK Parliament between 1980 and 2016, together with speaker age, gender and constituency. Speeches from 1936 to 1979 are also available, but speaker information is not included. Automatically generated sentiment labels are included. Available in CSV, JSON and R formats.
  • Strategic Advisory Group of Experts (SAGE) Meeting Minutes:

    • Minutes of approximately 90 SAGE meetings convened to establish scientific guidance for the COVID-19 pandemic. Each meeting details emerging issues and documents the development of scientific guidance and advice. The minutes also record attendees and key action points.
  • ParlaSpeech Corpus:

    • Parliamentary speeches from the key legislative chambers of Austria, the Czech Republic, Germany, Denmark, the Netherlands, New Zealand, Spain, Sweden, and the United Kingdom over the past 30 years. Metadata include the date, speaker, party and, in part, the agenda item under which a speech was held. The accompanying release note provides a more detailed guide to the data.

Booking is now closed

Cambridge Digital Humanities

Tel: +44 1223 766886
Email: enquiries@crassh.cam.ac.uk