# Workshop MEDAL summer school 2023 Tartu

## Digitized newspapers and how to use them

- [Slides](https://docs.google.com/presentation/d/1S212VO0bhFNAu17H11FhH6ToFaG7X90Y3m5LhkIejgo/edit?usp=sharing)
- [Files](https://owncloud.ut.ee/owncloud/s/8YXCDW2A9Pk738g) [short-url](http://tiny.cc/estnewspapers)
- [Only code](https://owncloud.ut.ee/owncloud/s/Gj9GYzRMaPKdZzD) [short-url](http://tiny.cc/medalws3bcode)

## Access codes

Use your code to log in at jupyter.hpc.ut.ee/ and choose the default 1-core access there. These codes are meant for use during the workshop and will expire soon. To get a permanent code, e-mail digilab@rara.ee.

## Tools and overviews

- **National Library DigiLab website** https://digilab.rara.ee/
- **DEA overview** https://peetertinits.github.io/reports/nlib/dea_info.html
- **DEA search interface** https://dea.digar.ee/
- **DEA access via JupyterLab** https://digilab.rara.ee/en/tools/access-to-dea-texts/ / https://digilab.rara.ee/tooriistad/ligipaas-dea-tekstidele/
- **Newspaper digitization overview in Estonia** https://peetertinits.shinyapps.io/digitized_newspapers/
- **DEA metadata explorer** https://digilab.rara.ee/tooriistad/ajalehtede-metaandmete-sirvija/
- **DEA ngram explorer (Postimees 1880-1940)** https://digilab.rara.ee/tooriistad/dea-ngram-uurija/#uagb-tabs__tab2

## Extra

- OCR errors, confidence measures and corrections.
- Similar words and improving your query step by step.
- Making balanced samples for studies.

## Tasks

Search across the dataset and use the tools at your disposal to build a good set of results. For example:

- ice cream in the 1920s
- lemonade in the 1920s
- swimming in the 1920s
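
As a rough illustration of the "Extra" topics applied to the ice cream task, here is a minimal Python sketch. It is not the DEA access code from the workshop files: the `ARTICLES` records, their `id`/`year`/`text` fields, and the spelling variants in the pattern are all made up for the example. It shows one way to make a query tolerant of OCR-damaged diacritics, restrict hits to the 1920s, and draw the same number of articles per year so that well-digitized years do not dominate.

```python
import random
import re
from collections import defaultdict

# Hypothetical toy records; real DEA exports will have different fields.
ARTICLES = [
    {"id": 1, "year": 1921, "text": "Müügil värske jäätis ja limonaad."},
    {"id": 2, "year": 1921, "text": "Turul müüdi jaätist."},  # simulated OCR slip
    {"id": 3, "year": 1925, "text": "Jäätise hind tõusis."},
    {"id": 4, "year": 1925, "text": "Uudised linnast."},
    {"id": 5, "year": 1931, "text": "Jäätis on kallis."},
    {"id": 6, "year": 1921, "text": "Jäätis rannas."},
]


def build_pattern(variants):
    """Compile one case-insensitive regex that matches any of the variants."""
    return re.compile("|".join(variants), re.IGNORECASE)


def search(articles, pattern, first_year, last_year):
    """Keep articles inside the year range whose text matches the pattern."""
    return [
        a for a in articles
        if first_year <= a["year"] <= last_year and pattern.search(a["text"])
    ]


def balanced_sample(hits, per_year, seed=0):
    """Draw at most `per_year` hits from each year, reproducibly."""
    rng = random.Random(seed)
    by_year = defaultdict(list)
    for article in hits:
        by_year[article["year"]].append(article)
    sample = []
    for year in sorted(by_year):
        group = by_year[year]
        sample.extend(rng.sample(group, min(per_year, len(group))))
    return sample


# Assumed variant: OCR may drop the dots on either "ä" in "jäätis" (ice cream).
pattern = build_pattern([r"j[äa][äa]tis"])
hits = search(ARTICLES, pattern, 1920, 1929)
sample = balanced_sample(hits, per_year=1)

print([a["id"] for a in hits])
print([(a["year"], a["id"]) for a in sample])
```

Improving the query step by step then amounts to inspecting the hits, adding or removing entries in the list passed to `build_pattern`, and re-running the search. The fixed `seed` keeps the sample stable between runs.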