This event spans multiple dates:
- 19 May 2022, 11:00–12:00, Online
- 26 May 2022, 11:00–12:00, Online
Mary Chester-Kadwell – Research Software Engineering Coordinator, Cambridge Digital Humanities
Please note that this workshop has limited spaces and an application process is in place. Application forms should be completed by noon on Sunday 8 May 2022. Successful applicants will be notified by the end of the day on Tuesday 10 May 2022.
This course introduces best practices and techniques to help you better manage your code and data and develop your project into a usable, sustainable, and reproducible workflow for research.
Developing your coding practice is an ongoing process throughout your career. This intermediate course is aimed at students and staff who use coding in research or plan on starting such a project soon. We introduce a range of best practices and techniques to help you better manage your code and data, and develop your project into a usable, sustainable, and reproducible workflow. All the examples and exercises will be in Python.
If you are interested in attending this course, please fill in the application form. We will prioritise places for students and staff in the schools of Arts & Humanities, Humanities & Social Sciences, libraries and museums. However, all are welcome to apply.
By the end of this course, you should be able to:
- Understand the role code plays in research, and discuss whether it could or should be open, reproducible and sustainable.
- Set up a new project, or re-organise an existing project, using best practices to structure, version, run and document code and data.
- Do a basic re-organisation of existing code to make it easier to automate or more pleasant to use.
We will cover:
- Code as a research artefact: experimental vs reusable code; questions of publishing code and licensing; principles of openness, reproducibility, sustainability.
- Project management:
  - Project structure: use a widely-used template to organise your project;
  - Version control: track the history of code and data to document progress, roll back and never lose your work;
  - Dependencies: record the libraries your code relies on so you can reliably run your code;
  - Virtual environments: isolate your code from the rest of your system to avoid conflicts with different versions of dependencies;
  - Documentation and self-documenting code: various ways to document your code and data for your future self and others.
- Refactoring (re-organising) code for re-use and automation:
  - Transform long functions and Jupyter notebooks into small functions;
  - Turn scripts into command-line programs you can automate;
  - Add a GUI to programs (i.e. windows with buttons) to make your code easier and more pleasant to use.
- Next steps: resources and directions.
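As a rough illustration of the refactoring topics listed above, the sketch below shows what a notebook-style script might look like once it is split into small functions and given a command-line interface with the standard-library `argparse` module. The word-counting task, the function names and the `--top` option are hypothetical, chosen only for illustration; they are not taken from the course materials.

```python
import argparse
from collections import Counter


def tokenise(text):
    """Split text into lowercase words, dropping surrounding punctuation."""
    words = (word.strip(".,;:!?\"'()").lower() for word in text.split())
    return [word for word in words if word]


def most_common_words(text, top=3):
    """Return the `top` most frequent words as (word, count) pairs."""
    return Counter(tokenise(text)).most_common(top)


def build_parser():
    """Describe the command-line interface (hypothetical options)."""
    parser = argparse.ArgumentParser(
        description="Count the most frequent words in a text file."
    )
    parser.add_argument("path", nargs="?", help="path to a plain-text file")
    parser.add_argument("--top", type=int, default=3,
                        help="number of words to report")
    return parser


def main(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.path is None:
        # No file given: show the usage message instead of failing.
        parser.print_help()
        return
    with open(args.path, encoding="utf-8") as handle:
        text = handle.read()
    for word, count in most_common_words(text, args.top):
        print("{}\t{}".format(word, count))


if __name__ == "__main__":
    main()
```

Saved under a name of your choice, for example `wordcount.py`, this could be run as `python wordcount.py notes.txt --top 5`. Keeping the counting logic in small functions, separate from `main`, means it can also be imported and reused from a notebook or another script.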
This course takes a ‘flipped classroom’ approach, in which much of the learning is self-paced and done in your own time. Preparatory material is released the week before the course takes place. The course starts with a 1-hour remote video session to introduce the topics and materials, and ends with another 1-hour remote video session to discuss progress and next steps. Self-paced materials are provided to work through between the two sessions. A chat forum on Moodle will be available for asking and answering questions during the week.
Please make sure you can plan time in your schedule to complete the preparatory and self-paced materials in order to get the most out of the course. Time estimates for working through these materials are as follows:
- Preparatory materials (total: 0-3 hours):
  - Optional: Required installations/set-up: 1 hour
  - Optional: Basic introduction to the command-line: 1-2 hours
- Self-paced materials (total: 3-6 hours):
  - Version control exercise: 1-2 hours
  - Refactoring exercise: 1-2 hours
  - Choice of:
    - Command-line exercise: 1-2 hours
    - GUI exercise: 1-2 hours
The amount of time you may spend on the self-paced materials depends on your pre-existing experience and own personal goals.
Existing knowledge of Python is required, including a working knowledge of Python syntax, variables, conditions, loops, functions and imports. You should be using Python in research in some way and either have some existing code to work with or plan to start such a project soon. Basic knowledge of the command line would be beneficial, but preparatory material will be available on this topic.
If you are unsure whether your coding experience is sufficient, please apply anyway, and we can talk about it together.
You will need a laptop or desktop computer to join the sessions and follow the self-paced materials. You will need to install Python 3, a text editor or IDE of your choice, and Git for version control. You will also need to sign up for a GitHub account. Instructions for these preparations will be provided.
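If you would like to check your set-up once the installations are done, a small script along the following lines may help. It is only a sketch and not part of the official instructions: the `check_setup` function is a hypothetical helper, and it assumes that the Git executable is called `git` and is available on your PATH.

```python
import shutil
import sys


def check_setup(required_tools=("git",)):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    # The course requires Python 3.
    if sys.version_info < (3,):
        problems.append(
            "Python 3 is required, found {}".format(sys.version.split()[0])
        )
    # Assumption: each required tool is an executable reachable via PATH.
    for tool in required_tools:
        if shutil.which(tool) is None:
            problems.append("'{}' was not found on your PATH".format(tool))
    return problems


if __name__ == "__main__":
    issues = check_setup()
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("Python 3 and git look ready.")
```

The script does not check your text editor or GitHub account, which need to be set up separately.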