Tools for Annotation & Corpus Linguistics
Date: July 2, 2025
Speaker: Luke Günther, Project S, SFB 1252
Time: 14:00 - 15:30
Overview
This session introduces essential tools and techniques for corpus linguistics and text annotation, covering both manual and automated approaches to linguistic data analysis.
Learning Objectives
- Tools for corpus creation and management
- Annotation schemes and best practices
- Automated vs. manual annotation approaches
- Practice with linguistic data and annotation tools
Materials
- 📊 Annotation & Corpus Tools - Presentation slides by Job Schepens
Note: This presentation was generated almost completely with Claude Sonnet 4.
- 🤖 Automatic Annotation Tools - Brown bag lunch presentation by Job Schepens and Fahime Same (formerly INF)
Note: This presentation was given during a brown bag lunch session last year.
Tools Covered
- ELAN - Multimedia annotation tool
- WebAnno - Web-based annotation platform
- CATMA - Computer Assisted Text Markup and Analysis
- AntConc - Corpus analysis toolkit
- R packages - quanteda, tidytext, and others
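As a rough illustration of the kind of analysis a corpus toolkit such as AntConc performs, the following Python sketch builds a word frequency list and a keyword-in-context (KWIC) concordance using only the standard library. The function names `tokenize` and `kwic`, the sample sentence, and the simple regex tokenizer are assumptions made for this example; they are not taken from the session materials or from any of the tools listed above.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical helper)."""
    return re.findall(r"\w+", text.lower())


def kwic(tokens, keyword, window=3):
    """Return (left context, keyword, right context) for every hit of keyword.

    `window` is the number of tokens shown on each side of the hit.
    """
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


# Made-up sample text, standing in for a real corpus file.
sample = "The cat sat on the mat. The dog chased the cat."
tokens = tokenize(sample)

# Word frequency list, most frequent first.
freq = Counter(tokens)

# Concordance lines for "cat" with two tokens of context on each side.
lines = kwic(tokens, "cat", window=2)

if __name__ == "__main__":
    for word, count in freq.most_common(3):
        print(f"{word}\t{count}")
    for left, key, right in lines:
        print(f"{left:>15} [{key}] {right}")
```

For real corpora, a dedicated tokenizer and the concordance functions of the listed tools are usually a better choice than this regex-based sketch, which ignores punctuation, casing, and multi-word units.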
Additional Resources
- CLARIN-D - Digital research infrastructure for the humanities
- ELAN Documentation - Multimedia annotation tool
- WebAnno Project - Collaborative annotation platform
- Corpus Linguistics Resources - General corpus linguistics information
- Natural Language Toolkit (NLTK) - Python library for NLP