<style>
/* reduce from default 48px, center: */
.reveal {
  font-size: 24px;
  text-align: left;
}
/* remove black border from images: */
.reveal section img {
  border: 0;
  box-shadow: none;
}
</style>

# Integration of Kitodo and OCR-D for Productive Mass-Digitisation

## OCR-D Phase 3 Kick-Off

#### Robert Sachunsky
#### July 29, 2021

---

## Implementation Project Kitodo / OCR-D

* Participants
  * Sächsische Landesbibliothek – Staats- und Universitätsbibliothek Dresden (SLUB)
  * Universitätsbibliothek der TU Braunschweig (UBBS)
  * Universitätsbibliothek Mannheim (UBMA)
* Volume: 8 man-years
* Duration: 2 years
* Start: October 2021

<div>
<span style="float:left;">

![](https://www.kitodo.org/typo3conf/ext/kitodo_website/Resources/Public/Images/kitdoHomeLogo.svg =400x)

</span>
<span style="float:right;">

![](https://pro.europeana.eu/files/Europeana_Professional/EuropeanaTech/EuropeanaTech%20Insight/Images/OCR/c2.png =550x)

</span>
</div>

---

## Prior Work

* SLUB staff: OCR-D coordination / module project (2018/19)
* UBMA: OCR-D module project (2018/19)
* SLUB, UBMA: OCR-D pilot libraries (2019/20)
* SLUB: Kitodo development (since 2012)
* UBBS: Kitodo testing, documentation, migration
* all: experienced OCR-D and Kitodo users

---

## Premises

* Kitodo: Workflow Management System for libraries
  * Open-source, community-driven
  * Modules:
    * Kitodo.Production (digitisation workflows)
    * Kitodo.Presentation (DFG viewer etc.)
  * OCR: only via commercial plugins (black box, license costs)
* OCR-D: operative single-workstation command-line prototype
  * no network interfaces for distribution/scaling yet
  * no error recovery and dynamic workflow execution yet
  * no result quality estimation and runtime evaluation yet
  * no assisted/automatic workflow configuration yet

---

## Goals (OCR-D)

1. Implement OCR-D as a Web-based distributed system
   - easily scalable <!-- (by adding computing resources / servers) -->
   - easily deployable <!-- (via container virtualization) -->
2. Develop quality-based workflow optimisation for OCR-D
   - use **heuristics and models** for quality estimation of interim results during preprocessing, segmentation and recognition
   - **weight** interim result quality relative to contribution to overall result (follow-up steps, other pages)
   - when insufficient, **switch** to alternative configuration for segment/page/document automatically, or **abort** computation
   - offer a set of empirically **optimised**, dynamically quality-controlled workflow **configurations** for various materials

---

## Goals (Kitodo)

3. Implement OCR-D as OCR module in Kitodo.Production
   - import images, metadata and structure data
   - track and visualise result progress/quality
   - error handling, versioning
   - export, validate and ingest results
   - edit and manage workflow configurations
4. Extend Kitodo.Presentation and DFG Viewer
   - user evaluation of results, versioning
   - user prioritisation of (re-)OCR tasks (On-Demand-OCR)

---

## Goals (further)

- close collaboration with OCR-D coordination project
- cooperation with other OCR-D implementation/module projects
- Kitodo community workshops (disseminate and query requirements)
- Kitodo community OCR-D service (test operation)

---

## System Architecture

* Kitodo with OCR-D "backend" as distributed system
* strong integration of data and process management
* generic/agnostic on both sides

![architecture](https://i.imgur.com/KUWoEKl.png =750x)

---

## Project Plan

- AP1 (SLUB): coordination and communication
- AP2 (UBBS): management of Kitodo community
- AP3 (SLUB): detailed technical specification
- AP4 (SLUB): OCR-D server implementation
- AP5 (SLUB): concept for automatic process control
- AP6 (SLUB): implement automatic process control
- AP7 (SLUB): develop quality estimation metrics & models
- AP8 (UBBS): set up OCR-D service for Kitodo community
- AP9 (SLUB): integrate OCR-D into Kitodo.Production
- AP10 (UBMA): integrate OCR-D into Kitodo.Presentation
- AP11 (UBMA): data storage, ingest and versioning
- AP12 (SLUB): run and test DFG Viewer with OCR on Demand
- AP13 (UBBS): evaluation and documentation with Kitodo community

---

## Synergies and Interfaces

- extending the processor CLI for error handling/signalling and parallelism
- molding the final OCR-D Web API
- providing reference implementations for server components
- providing reference implementations for module containers
- defining an evaluator CLI (analogous to processor CLI)
- generalising the OCR-D workflow format for evaluators and switches
- running & evaluating workflow experiments systematically
- compiling optimised workflow configurations
- defining quality metrics for workflow steps

---

## Q & A

Thank you!
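
---

## Backup: Quality-Controlled Switching (Sketch)

The "switch or abort" idea from the OCR-D goals could look roughly like the following. This is only an illustrative sketch, not project code: `WorkflowConfig`, `process_page`, the `run`/`estimate` callables and the `0.8` threshold are all hypothetical names and values, and no real OCR-D API is used.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class WorkflowConfig:
    """Hypothetical workflow configuration (not an OCR-D type)."""
    name: str
    run: Callable[[str], str]          # assumed: page -> interim result
    estimate: Callable[[str], float]   # assumed: result -> quality in [0, 1]


def process_page(
    page: str,
    configs: Sequence[WorkflowConfig],
    threshold: float = 0.8,            # assumed acceptance threshold
) -> Optional[Tuple[str, str]]:
    """Try configurations in order; switch when quality is insufficient.

    Returns (config name, result) for the first acceptable result,
    or None to signal that computation was aborted.
    """
    for cfg in configs:
        result = cfg.run(page)
        if cfg.estimate(result) >= threshold:
            return cfg.name, result
    return None  # no configuration was good enough: abort


# Toy stand-ins for real processors and quality estimators
fast = WorkflowConfig("fast", run=str.lower, estimate=lambda r: 0.5)
careful = WorkflowConfig("careful", run=str.upper, estimate=lambda r: 0.9)

print(process_page("Page", [fast, careful]))
print(process_page("Page", [fast]))
```

Here the first configuration is rejected by its (dummy) quality estimate, so the second one is tried; with only the first configuration available, the function aborts by returning `None`.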