
OCR(-D) and Kitodo

Michael Lütgen, Zeutschel GmbH
Elisabeth Engl, HAB Wolfenbüttel
Konstantin Baierer, Staatsbibliothek zu Berlin

https://hackmd.io/@kba/S1peIVxhH

Kitodo User Meeting (Kitodo-Anwendertreffen), 2019-11-19


Zeutschel OCR Cloud


OCR-D in a nutshell

  • DFG-funded project focused on full-text digitization of VD library material
    => Development of software for the whole OCR process
  • Coordination project (HAB, KIT, BBAW, SBB)
  • 8 module projects implementing specific functionality

Pilot libraries

  • Functionality tests until January 2020
  • Recognition rate and performance are secondary at the moment
  • Partners:
    • SLUB Dresden
    • UB Mannheim
    • SUB Göttingen
    • UB Heidelberg
    • ULB Halle
    • ULB Darmstadt

Specs and Documentation

https://ocr-d.github.io

  • Usage of file formats: METS, PAGE, OCRD-ZIP
  • Command line interface
  • Actionable tool description
  • Docker conventions
  • Ground Truth Transcription Guidelines
  • Installation and implementation instructions



Module projects

https://kba.cloud/ocrd-kwalitee


OCR-D Deployment


Other software of the OCR-D ecosphere

Training:

Evaluation: qurator-spk/dinglehopper

Deployment:


Why ABBYY

  • Okay segmentation
  • Fast
  • Good recognition of modern types

Why NOT ABBYY

  • Closed Source
  • Volume-based licensing
  • Little support for historical types
  • No training possible
  • Fraktur: expensive and very little development
  • Black Box: Few possibilities to influence recognition

Why OCR-D?

  • Free Software
  • Massive progress in neural networks in recent years
    => Excellent quality, esp. with fine-tuned models
  • Modular: OCR as a process of configurable steps
    => alternatives for all steps
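
A minimal sketch of the "process of configurable steps" idea, in Python. This is not OCR-D code: the step names (`binarize_a`, `binarize_b`, `segment`, `recognize`), the `REGISTRY` table and `run_pipeline` are hypothetical and exist only for illustration. Every step has the same signature, so a step can be replaced by an alternative just by changing the configuration list.

```python
from typing import Any, Callable, Dict, List

# Hypothetical page representation: a plain dict that records which steps ran.
Page = Dict[str, Any]
Step = Callable[[Page], Page]


def _mark(page: Page, step_name: str) -> Page:
    """Return a copy of the page with step_name appended to its history."""
    history = list(page.get("history", []))
    history.append(step_name)
    return {**page, "history": history}


# Placeholder steps (assumptions, not real OCR-D processors).
# Real steps would transform images or annotations instead of just logging.
def binarize_a(page: Page) -> Page:
    return _mark(page, "binarize_a")


def binarize_b(page: Page) -> Page:
    return _mark(page, "binarize_b")


def segment(page: Page) -> Page:
    return _mark(page, "segment")


def recognize(page: Page) -> Page:
    return _mark(page, "recognize")


# Lookup table from configured step name to implementation.
REGISTRY: Dict[str, Step] = {
    "binarize_a": binarize_a,
    "binarize_b": binarize_b,
    "segment": segment,
    "recognize": recognize,
}


def run_pipeline(page: Page, step_names: List[str]) -> Page:
    """Apply the configured steps to the page, in order."""
    for name in step_names:
        page = REGISTRY[name](page)
    return page


# The workflow is plain configuration: an ordered list of step names.
config = ["binarize_b", "segment", "recognize"]
result = run_pipeline({"id": "page_0001"}, config)
print(result["history"])  # ['binarize_b', 'segment', 'recognize']
```

Replacing `binarize_b` with `binarize_a` in `config` changes the workflow without touching any other step.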

OCR in Kitodo: Status Quo

According to https://github.com/kitodo/kitodo-production/wiki/OCR

  • OCR web service of GBV/VZG (?)
  • Intranda Taskmanager
  • Zeutschel zedOCR

OR

  • Offline OCR and reimport

Considerations

  • Configurability: In Kitodo.Production UI? Config files? Black box?
  • Technical Integration: Long-running and potentially resource-intensive processes
  • Deployment: Locally? Remote but in-house? Commercial service provider? Verbund-level service?
  • Tradeoff: Quality of recognition vs speed and manual effort

Discussion I

  • How much of a priority is OCR for your institution?
  • What use cases do you have for OCR beyond search and providing research data?
  • How important is OCR for older types or less-supported languages at your institution?

Discussion II

  • How could OCR improve Kitodo.Production?
    • Generate table of contents
    • Article/Issue separation
  • OCR training as part of Kitodo?
  • How do you integrate OCR into Kitodo now?
  • Once OCR-D is ready, what is the added value of having OCR done by an external provider?

Discussion III

  • Which file formats do you want OCR results in?
    • PAGE (currently supported by OCR-D)
    • ALTO
    • hOCR
    • TEI
    • WebAnnotation (for IIIF)

