About this workshop
Learning Objectives
By the end of this module, you will be able to:
- List the common methodological approaches used in text preparation and identify when and how to use them based on source materials and analysis objectives.
- Use OpenRefine to define subsets of a dataset for further processing and to normalize textual data and/or metadata.
- Explain the benefits and challenges of applying a scripted or semi-scripted approach to text preparation and analysis; identify situations where scripting your work will be beneficial.
- Apply prepared computational techniques to perform common text preparation steps and basic analyses.
Schedule
Activity | Time Allotted | Key Topics / Activities |
---|---|---|
Pre-workshop activities (Self-guided) | 60 - 90 minutes (completed before workshop) | Install OpenRefine; Introduction to OpenRefine; Introduction to Python and Jupyter Notebooks |
Introductory lecture + discussion | 15 minutes | Introduction to text preparation and analysis: Why prepare your text?; Overview of concepts and methods; Key considerations for different source materials and analyses |
OpenRefine Part 1 (hands-on) | 20 minutes | Introduction to OpenRefine; Manual cleanup (e.g. find and replace); Faceting |
Break | 10 minutes | Break |
OpenRefine Part 2 (hands-on) | 20 minutes | Stemming and clustering; GREL/regular expressions |
Programming & Python (lecture + hands-on) | 20 minutes | Overview of programmatic approaches; The ‘what’ and ‘when’ to program; Using Python for text preparation |
Final Thoughts (lecture + discussion) | 10 minutes | Final thoughts & key considerations; Where to learn more |
Head to the Preparation page to get started with your pre-workshop activities.