# Workshop MEDAL summer school 2023 Tartu
## Digitized newspapers and how to use them
- [Slides](https://docs.google.com/presentation/d/1S212VO0bhFNAu17H11FhH6ToFaG7X90Y3m5LhkIejgo/edit?usp=sharing)
- [Files](https://owncloud.ut.ee/owncloud/s/8YXCDW2A9Pk738g) [short-url](http://tiny.cc/estnewspapers)
- [Only code](https://owncloud.ut.ee/owncloud/s/Gj9GYzRMaPKdZzD) [short-url](http://tiny.cc/medalws3bcode)
## Access codes
Use your code to log in at jupyter.hpc.ut.ee/ and choose the default 1-core option there. The codes are meant for use during the workshop and will expire soon. To get a permanent code, e-mail digilab@rara.ee.
## Tools and overviews
- **National Library DigiLab website** https://digilab.rara.ee/
- **DEA overview** https://peetertinits.github.io/reports/nlib/dea_info.html
- **DEA search interface** https://dea.digar.ee/
- **DEA access via JupyterLab** https://digilab.rara.ee/en/tools/access-to-dea-texts/ / https://digilab.rara.ee/tooriistad/ligipaas-dea-tekstidele/
- **Newspaper digitization overview in Estonia** https://peetertinits.shinyapps.io/digitized_newspapers/
- **DEA metadata explorer** https://digilab.rara.ee/tooriistad/ajalehtede-metaandmete-sirvija/
- **DEA ngram explorer (Postimees 1880-1940)** https://digilab.rara.ee/tooriistad/dea-ngram-uurija/#uagb-tabs__tab2
## Extra
- OCR errors, confidence measures and corrections.
- Similar words and improving your query step by step.
- Making balanced samples for studies.
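
Balanced sampling can be sketched as follows. This is a minimal illustration in plain Python, not the workshop's own code: the `balanced_sample` helper, the `year` and `id` fields, and the toy records are all hypothetical. The idea is to draw the same number of items from each group (here, each year) so that years with many digitized pages do not dominate the sample.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(records, key, per_group, seed=0):
    """Draw up to `per_group` records from each group defined by `key`.

    Hypothetical helper for illustration; groups smaller than
    `per_group` contribute all of their records.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)

    sample = []
    for group_key in sorted(groups):
        members = groups[group_key]
        k = min(per_group, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Toy records, invented for illustration: 1921 is heavily over-represented.
toy_records = (
    [{"year": 1921, "id": f"a{i}"} for i in range(50)]
    + [{"year": 1925, "id": f"b{i}"} for i in range(5)]
    + [{"year": 1929, "id": f"c{i}"} for i in range(2)]
)

sample = balanced_sample(toy_records, key="year", per_group=3)
counts = Counter(rec["year"] for rec in sample)
print(dict(counts))
```

With these toy records the sample holds three items each for 1921 and 1925, and both available items for 1929. Groups that fall short of the target are worth reporting alongside the results.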
## Tasks
Run a search across the dataset and use the tools at your disposal to build a good set of results. For example:
- ice cream in the 1920s
- lemonade in the 1920s
- swimming in the 1920s
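
A task like these could be approached roughly as in the sketch below. It assumes the articles are already loaded as a list of dictionaries with `year` and `text` fields. That layout, the `search` helper, and the toy sentences are assumptions made for illustration, and the actual DEA data will look different. The pattern matches the stem of the Estonian word for ice cream (jäätis) and also accepts "a" in place of "ä", one way to catch OCR output that has lost its diacritics.

```python
import re
from collections import Counter

# Toy articles, invented for illustration; real OCR text is much noisier.
toy_articles = [
    {"year": 1924, "text": "Suvel müüdi turul jäätist ja limonaadi."},
    {"year": 1927, "text": "Uus jaatise kohvik avati."},  # diacritics lost
    {"year": 1931, "text": "Jäätis oli odav."},           # outside the 1920s
    {"year": 1922, "text": "Ujumine jões on keelatud."},  # no keyword match
]


def search(articles, pattern, start_year, end_year):
    """Return articles in [start_year, end_year] whose text matches `pattern`."""
    rx = re.compile(pattern, flags=re.IGNORECASE)
    return [
        a for a in articles
        if start_year <= a["year"] <= end_year and rx.search(a["text"])
    ]


# "j", then two characters that are each "ä" or "a", then "ti":
# covers forms such as jäätis, jäätist, jäätise and their "aa" variants.
hits = search(toy_articles, r"j[äa]{2}ti", 1920, 1929)
hits_per_year = Counter(a["year"] for a in hits)
print(dict(hits_per_year))
```

A loose pattern like this can also match unrelated words, so it helps to read a handful of hits and tighten or widen the query step by step.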