
Text-mining the archive 2

When: May 01, 2018, from 11:00 AM to 01:00 PM
Where: S2, Alison Richard Building

Course leader: Dr Paul Nulty

This session will introduce word vectors and topic modelling, two of the most commonly used natural language processing methods in social science and humanities research. Topic models produce sets of words that can be interpreted as the underlying topics that give rise to a collection of texts, and they estimate the prevalence of each topic in each document. Word vectors allow quantitative comparison of word similarity and association. We will work through example code using the gensim Python library to illustrate how these methods are applied, and help attendees get working examples running on their own systems.
Pre-registration is essential: please book here
PhD students and staff from the University of Cambridge have priority for bookings on this course. If the course appears fully booked and you fall into this category, please contact the course organisers directly.

What is CDH?

Cambridge Digital Humanities is a creative and collaborative space where students, researchers and international visitors can come together to engage in dialogue, experiment with technology and advance scholarship.
