
<style>
/* reduce from default 48px: */
.reveal {
  font-size: 24px;
  text-align: left;
}
.reveal .slides {
  text-align: left;
}
/* change from default gray-on-black: */
.hljs {
  color: #005;
  background: #fff;
}
/* prevent invisible fragments from occupying space: */
.fragment.visible:not(.current-fragment) {
  display: none;
  height: 0px;
  line-height: 0px;
  font-size: 0px;
}
/* increase font size in diagrams: */
.label {
  font-size: 24px;
  font-weight: bold;
}
/* increase maximum width of code blocks: */
.reveal pre code {
  max-width: 1000px;
  max-height: 1000px;
}
/* remove black border from images: */
.reveal section img {
  border: 0;
}
.reveal pre.mermaid {
  width: 100% !important;
}
.reveal svg {
  max-height: 600px;
}
.reveal .scaled-flowchart-td pre.mermaid {
  width: 100% !important;
  /* why? float: left; */
}
.reveal .scaled-flowchart-td svg {
  max-width: 100% !important;
}
.reveal .scaled-flowchart-td svg g.node,
.reveal .scaled-flowchart-td svg g.label,
.reveal .scaled-flowchart-td svg foreignObject {
  width: 100% !important;
}
.reveal .scaled-flowchart-td p {
  clear: both;
}
.reveal .centered {
  text-align: center;
}
.reveal .width75 {
  max-width: 75%;
}
/* remove black border from images: */
.reveal section img {
  border: 0;
  box-shadow: none;
}
</style>

# Implementation Project<br>OCR‑D / Kitodo

![slub-logo](https://www.slub-dresden.de/typo3conf/ext/slub_template/Resources/Public/Images/slublogo.svg =200x) ![tub-logo](https://www.tu-braunschweig.de/typo3conf/ext/tu_braunschweig/Resources/Public/Images/Logos/tu_braunschweig_logo.svg =200x) ![mannheim-logo](https://www.bib.uni-mannheim.de/typo3conf/ext/uma_site/Resources/Public/Images/Icons/logo-ub-de.svg =250x)

1. Current Status
2. Next Steps
3. Challenges

---

## 1 Current Status

**SLUB Dresden:**
* Dockerized components for development
* integrated pipeline from script task in .Production to OCR-D and back
  * **but** very simple, built on very basic technologies (SSH, shared data, hard-coded configurations)
* many processor improvements

**UB Mannheim:**
* updated ocrd_all (GH Action for Docker images, fixes and optimizations)
* working on a public DFG Viewer test instance with an OCR-on-demand proof of concept

**UB Braunschweig:**
* testing SLUB components in a live .Production setting
* reporting and processing of Kitodo user survey results
* working paper to gather information for the project

----

## Architecture

![](https://i.imgur.com/UMiVd3Y.png)

----

## Components

* Starting point: https://github.com/markusweigelt/kitodo_production_ocrd
* **Kitodo.Production / Kitodo.Presentation / …**
  * any system that wants to use OCR-D processing
  * our goal is a flexible integration of systems
* **OCR-D Manager**
  * mediator between systems (flexible integration)
  * delegates OCR to dedicated processing infrastructure (Controller)
  * signaling / reporting
  * validation
  * extraction (processed images, fulltext, annotations, METS, …)
  * -> https://github.com/markusweigelt/ocrd_manager
* **OCR-D Controller**
  * web service entry point
  * orchestrates processing infrastructure
  * -> https://github.com/bertsky/ocrd_controller

----

## Modes of operation

* **Easy:**
  * images
* **Standard:**
  * images + METS (+ default workflow selection)
* **Expert:**
  * images + METS + custom workflow configuration

----

## Integration

![](https://i.imgur.com/UBu5zVg.png)

----

## Demo

---

## 2 Next Steps

**SLUB Dresden:**
* improve components to achieve more flexibility
* job queue, or delegate to workflow/processing server in Controller
* more preconfigured integration scenarios (e.g. .Presentation)
* options for configuration in .Production
* de-coupling of data shares (explicit transfer)
* workflow selection/editing
* adapt to upcoming changes in OCR-D (networking, error handling, ...)

**UB Mannheim:**
* integration of SLUB components for OCR-D on demand

**UB Braunschweig:**
* survey for users in the Digital Humanities space
* apply current components to production instances as example deployments

---

## 3 Challenges

* must advance CLI/API changes in OCR-D (dynamic workflows, evaluators, error handling, page-parallel processing, networking)
* network implementation for OCR-D
  * getting workflow/processing server adopted in core
  * roadmap for implementation of full Web API
* workflow engine for OCR-D
  * off-the-shelf implementation vs. workflow server in core vs. makefile-based
* managing/monitoring jobs with UI
* fail-over, re-entry and data lifetime
* efficiency, scalability, processing resources
* quality estimation and smart workflows

---

## Thank you!
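
---

## Backup: Modes of operation as code

The three modes of operation (Easy, Standard, Expert) differ only in which inputs the calling system supplies. A minimal sketch of that distinction, assuming a hypothetical `OcrRequest` structure and `mode_of` helper that are not part of the OCR-D Manager or Controller code; rejecting a custom workflow without METS is also an assumption, made because no mode on the slide covers that combination:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OcrRequest:
    """Hypothetical request from a client system (illustration only)."""
    images: List[str]                 # paths or URLs of the page images
    mets: Optional[str] = None        # METS file, if the client provides one
    workflow: Optional[str] = None    # custom workflow configuration, if any


def mode_of(request: OcrRequest) -> str:
    """Return 'easy', 'standard' or 'expert' depending on the supplied inputs."""
    if not request.images:
        raise ValueError("every mode needs at least one image")
    if request.mets is None:
        # Assumption: a custom workflow without METS matches none of the modes.
        if request.workflow is not None:
            raise ValueError("a custom workflow requires METS")
        return "easy"
    # With METS: a custom workflow means expert, otherwise the default is used.
    return "expert" if request.workflow is not None else "standard"
```

For example, `mode_of(OcrRequest(images=["0001.tif"], mets="mets.xml"))` would classify the request as Standard, where a default workflow is selected for the client.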