30 Apr 2020 11:00 - 12:00 IT Training Room, Cambridge University Library

Description

Introduction to Text-mining with Python [remote delivery]
This online session will introduce basic methods for reading and processing text files in Python with Jupyter Notebooks. We’ll discuss why you might wish to do text-mining, and whether coding with Python is the right choice for you. We’ll run through the five steps of text-mining, and start to walk through an example that reads in a text corpus, splits it into sentences and words (tokens), removes unwanted words (stopwords), counts the remaining tokens (frequency analysis), and visualises the results.
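As a rough illustration of the five steps described above, here is a minimal sketch that uses only the Python standard library. It is not taken from the course notebooks: the sample text, the stopword list, and all function names are assumptions made for illustration, and the sentence splitting and tokenising are deliberately naive.

```python
import re
from collections import Counter

# Step 1: read in a corpus. A hypothetical sample string stands in for
# a text file here; in practice you would open and read a file instead.
corpus = "The cat sat on the mat. The dog sat on the log."

# Illustrative stopword list (an assumption, not a standard list).
stopwords = {"the", "on", "a", "an", "and", "of"}


def split_sentences(text):
    # Naive split after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Step 2: lower-case the text and pull out runs of letters/apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def remove_stopwords(tokens, stopwords):
    # Step 3: drop any token that appears in the stopword set.
    return [t for t in tokens if t not in stopwords]


def count_tokens(tokens):
    # Step 4: frequency analysis with a Counter.
    return Counter(tokens)


def text_bar_chart(counts, top_n=5):
    # Step 5: a plain-text bar chart of the most frequent tokens.
    lines = []
    for word, n in counts.most_common(top_n):
        lines.append(f"{word:<10} {'#' * n} ({n})")
    return "\n".join(lines)


sentences = split_sentences(corpus)
tokens = tokenize(corpus)
filtered = remove_stopwords(tokens, stopwords)
counts = count_tokens(filtered)
print(text_bar_chart(counts))
```

A plain-text chart keeps the sketch free of extra installations; a plotting library could be swapped in for the final step.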
This initial session is one hour long and will be delivered remotely by video conferencing. During the session we will cover the essentials of working with the Jupyter Notebooks provided so that you can carry on working through the materials in your own time. The first session will be followed by a second, optional Q&A session for troubleshooting issues and recapping essentials.
Pre-requisites: No prior knowledge of Python is required, and no installations will be needed. We will use web services available in your browser to follow along.
Required preparation: A short online exercise in working with variables and text in Python will be sent out one week before the session. You will also receive instructions on how to find the materials we will be using and how to log on to the video conferencing platform. Please set aside some time to prepare properly so that we can concentrate on teaching during the remote session.
Audience: PhD students and staff
Please register here
 

Cambridge Digital Humanities

Tel: +44 1223 766886
Email: enquiries@crassh.cam.ac.uk