Text-mining the archive 2

Tuesday, 1 May, 2018 - 11:00 to 13:00

Course leader: Dr Paul Nulty. This session will introduce word vectors and topic modelling, two of the most commonly used natural language processing methods in social science and humanities research. Topic models produce sets of words that may be interpreted as representing the underlying topics which give rise to a...


Text-mining the archive 1

Tuesday, 24 April, 2018 - 11:00 to 13:00

Course leader: Dr Paul Nulty. This session will introduce basic methods for reading and processing text files in Python. We will proceed slowly through an example that demonstrates reading in a large text corpus from structured or unstructured files, basic string processing, word frequency counting, and syntactic analysis...


Automatic Text Recognition: an introduction to Transkribus

Monday, 26 March, 2018 - 14:00 to 16:00

This workshop will show how the Transkribus transcription platform can be used to perform the automated transcription and searching of handwritten and printed documents. The workshop will be delivered by Louise Seaward (University College London) and Roger Labahn (University of Rostock). It will start with a presentation...


Beyond words (2): challenges in reading historical document collections at scale

Tuesday, 6 February, 2018 - 11:30 to 15:30

Speakers: John Sheridan, Digital Director, The National Archives; Daniel Bruder, Cambridge Computer Laboratory. Cambridge Digital Humanities is collaborating with the National Archives to run a series of workshops aimed at developing a funding proposal for a project exploring ways of extracting and visualising elements of...


Turn your PDFs into searchable text

Tuesday, 23 January, 2018 - 14:00 to 16:00

Simple OCR tools


How to get bulk data from websites

Tuesday, 16 January, 2018 - 11:00 to 12:30

Led by: Dr Paul Nulty. Many organisations provide access to data through Application Programming Interfaces (APIs), which define the form of requests that computer programs should make to communicate with other computer programs. Web APIs, such as those provided by many social media platforms and government agencies, allow...


Webscraping for beginners

Tuesday, 21 November, 2017 - 14:00 to 16:00

Led by: Dr Gabe Racchia. Digital research projects commonly require the researcher to collect a large number of documents from the Internet. Frequently, although the researcher can find the documents online, they are in a format that is impossible to use, and/or there are so many documents that obtaining a large number...


Digital research project design for beginners

Tuesday, 17 October, 2017 - 14:00 to 16:00

Qualitative methods and digital research


Curating your own digital archive

Thursday, 16 November, 2017 - 11:00 to 13:00

Principles of metadata creation, version control, and database construction
