FOLIO Hands-on Lab at the 7th Bibliothekskongress 2019

# FOLIO Hands-on Lab at the 7th Bibliothekskongress 2019

Welcome to the hands-on lab "FOLIO - Build Together" at the Bibliothekskongress 2019 :-)

This is a Markdown pad in which we share notes, links, questions, etc. from group 2 (metadata management). The Markdown syntax is briefly explained behind the `(?)` button at the top.

**In this session:** Introduction to the concepts of metadata management in FOLIO with regard to bibliographic data and authority data, and presentation of the GBV's implementation idea.

## Motivation

(Round of introductions, motivation, and library systems in use)

### Motivation

- Alternative for a publication database

### ILS

- LBS(3+4)/CBS
- external circulation system from Bibdia
- Libero
- Sisis Sunrise
- Aleph
- Koha / CBS
- Bibliotheca / Frontend OCLC Open
- B3Kat
- adis/BMS
- Alma
- Amicus
- ACQ3

## Problem

- Shift from print to online
- The lifespan of the ILSs currently in use has been exceeded
    - Introduced in the 1980s
    - Software partly in use for more than 20 years (albeit with adaptations)

## Framework

### Explanation of data storage in FOLIO, with a focus on bibliographic data, holdings data, and authority data (Inventory)

#### Bibliographic data in FOLIO

![Bibliographic data in FOLIO](https://i.imgur.com/J7QcI96.png)

> Codex is a normalization and virtualization layer that allows Folio to integrate metadata about various resources regardless of format, encoding, or storage location. It is the piece that allows disparate resources to be surfaced using a common vocabulary and description.
>
> **Normalization:** Codex removes differences in encoding and format to provide a single representation of all participating resources, regardless of how they are managed.
>
> **Metadata:** Codex implements a light-weight (simplified) metadata model to describe resources. This common denominator can be mapped to most existing metadata models, thus providing a common vocabulary.
>
> **Virtualization:** Codex spans storage locations. It does not matter whether a resource is managed locally or in a remote system. Furthermore, some systems may not be directly responsible for the management of resources, but may be aware of them (e.g. Orders). These systems can also participate in presenting normalized metadata to Codex - essentially providing “pseudo resources” for Folio.
>
> **Layer:** In a layered representation of Folio, Codex sits at the top. It can be the starting point for all inquiries on resources. From this layer, one may drill down further into the lower, richer layers for any given selected resource.

> Inventory is the FOLIO app in which bibliographic information from various sources can be presented in a uniform, abstracted form for managing an institution's collection, independent of the format or the content rules used to describe a resource.
>
> Inventory is the place in FOLIO where Instance, Holdings, and Item records for resources are managed.
>
> Instance records can be derived from full bibliographic descriptions (in MARC or other formats) and are intended to give library staff enough information to identify and select records in order to work on the associated Holdings and Items. Instances can also exist in native FOLIO format when a full bibliographic description is not required.
>
> Holdings records provide the information staff need to search and manage the collection, such as location and call number. Holdings can describe library holdings in physical, electronic, or other form. Holdings records are created and edited in Inventory, but in the future it should also be possible to generate them from MFHD records.
>
> An Item record provides the information needed to identify and track a single copy, such as barcode, availability, and material type. Item records are created and edited only in Inventory.
>
> Container records contain multiple Instance, Holdings, and/or Item records and serve as virtual containers. Any object can be contained in any number of containers. The Inventory container should not be confused with a package in the context of electronic resources. The latter is a content provider's mechanism for bundling electronic resources. The container in Inventory is an object in its own right and holds, in addition to the information about which objects it contains, its own set of descriptive metadata. The container is meant to support all kinds of groupings. For example:
>
> - eJournal package
> - Collection of books received as a gift
> - Maps provided on loan for an exhibition
> - Resources in a slipcase without individual barcodes
>
> Instance records are not full descriptive catalog records. Editing Instances is not equivalent to full editing of MARC (or any other metadata format); source records are edited outside of Inventory. The FOLIO Inventory interface is designed for internal use and searching, not as a discovery system for library patrons.
Source: [FOLIO Inventory: a working definition](https://docs.google.com/document/d/16C83Yy61GVm9dYs7aRKj9Z0on-vEuBcfXiCjuo_jALo/edit)

#### Inventory - Source Record Storage - MarcCat data model

![Inventory - Source Record Storage - MarcCat data model](https://i.imgur.com/y6PU3D8.png)

#### The database behind FOLIO

![FOLIO DB](https://i.imgur.com/Wnk2XFy.png)

from: [Ladisch, Julian: Bibliothekssystem Reloaded: Die Architektur unter FOLIO](https://www.gbv.de/Verbundzentrale/Publikationen/publikationen-der-vzg-2018/pdf/Ladisch_180830_VK_folio_architektur.pdf)

#### [Inventory Beta metadata schema](https://docs.google.com/spreadsheets/d/1RCZyXUA5rK47wZqfFPbiRM0xnw8WnMCcmlttT7B3VlI/edit#gid=278827110)

#### [Test instance Snapshot Stable](http://folio-snapshot-stable.aws.indexdata.com/)

#### Inventory access

- Via browser

  Chrome and Firefox have developer tools that can be opened with Ctrl+Shift+I. In the "Network" tab, the "Response" sub-tab shows the responses of the Inventory backend:

  ![Inventory Backend Response](https://i.imgur.com/h8QCdDr.png)

- Via API with `curl`

  If there is interest

### Presentation of the concept of metadata management and data flows in the GBV

Cataloging of resources in CBS 1.1 using OCLC's WinIBW software

![A journey through Europe CBS](https://i.imgur.com/JCXiLPX.png)

[Bibliographic record via unAPI](http://unapi.gbv.de/?id=econis:ppn:643935371&format=marcxml)

Via Online Update Fetch (OUF) / Online Update Mechanism (OUM), live update into the FOLIO instance (create & update)

![A journey through Europe FOLIO](https://i.imgur.com/TyRte4p.png)

**It follows that:**

- Cataloging stays in the familiar systems
- No retraining of catalogers is necessary
- Scripts (data entry templates, item templates, in-house scripts) can continue to be used
- Convenience functions (excelTabelle, ppnListe, rechercheStapel) are retained
- ...
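The unAPI lookup linked in this section can be sketched in a few lines of Python. This is only a minimal sketch, not GBV tooling: the base URL, the identifier `econis:ppn:643935371`, and the format name `marcxml` are taken from the link above, while the helper names `unapi_url`, `marc_title`, and `fetch_record`, the hand-written `SAMPLE` record, and the assumption that the response uses the standard MARCXML namespace are illustrative.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from typing import Optional

UNAPI_BASE = "http://unapi.gbv.de/"  # base URL taken from the link in this pad
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}  # standard MARCXML namespace


def unapi_url(record_id: str, fmt: str = "marcxml") -> str:
    """Build an unAPI request URL from a record identifier and a format name."""
    # safe=":" keeps the colons in the identifier readable, as in the pad's link.
    query = urllib.parse.urlencode({"id": record_id, "format": fmt}, safe=":")
    return f"{UNAPI_BASE}?{query}"


def marc_title(marcxml: str) -> Optional[str]:
    """Return subfield $a of the first MARC 245 field, or None if it is absent."""
    root = ET.fromstring(marcxml)
    node = root.find(
        ".//marc:datafield[@tag='245']/marc:subfield[@code='a']", MARC_NS
    )
    return node.text if node is not None else None


def fetch_record(record_id: str) -> str:
    """Fetch a record over HTTP (needs network access; never called on import)."""
    with urllib.request.urlopen(unapi_url(record_id), timeout=30) as response:
        return response.read().decode("utf-8")


# Hypothetical, hand-written sample so the parsing step can be tried offline.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Sample title</subfield>
  </datafield>
</record>"""

if __name__ == "__main__":
    print(unapi_url("econis:ppn:643935371"))
    print(marc_title(SAMPLE))
```

Calling `marc_title(fetch_record("econis:ppn:643935371"))` would run the same extraction against the live service, assuming it is reachable and returns MARCXML for that identifier.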
#### German Library Network ERM Data Flow

![German Library Network Data Flow](https://i.imgur.com/QUT1rwf.png)

### Migrating data to FOLIO

#### Data Migration Sub Group

[Sub Group in the FOLIO wiki](https://wiki.folio.org/display/SYSOPS/Data+Migration+Subgroup)

_Charge of the Data Migration Subgroup_

Areas of concern:

- Adequacy and best practices for data migration
- Migration of data from various currently used Integrated Library Systems (taken from SIG charge)
- acceptable data formats, record types people want to migrate, record layout for a migration record, data formats that the APIs accept
- define where migration is possible and where it is necessary to start over
- test performance of the APIs

Deliverables:

- Documentation for migration
- Data migration plan, working with various SIGs for SME input. (from SIG charge)
- Specifications for migration tools that have not already been identified for development. (from SIG charge)
- Requirements and priorities for module / API creation; set deadlines for readiness of module APIs; find gaps in what the APIs can accommodate and determine what needs to be filled

#### Data Loader

Requirements:

- Loading data in JSON and MARC(21)XML into all modules via the module APIs
- Performance
- Processing of large files (e.g. 10 million title records)
- Logging of all processes and errors
- Deleting created data to allow new uploads
- Populating the reference tables
- Publishing the loader's JSON schema as a specification for transforming data from "legacy systems"

#### Data Import App

The data import process:

![The data import process:](https://i.imgur.com/mj66DAC.png)

[Data Import Update](https://drive.google.com/file/d/1k9eHYaKFQ1nJ-X22KuMhm8vZtBAvErSl/view?usp=sharing), presentation of 26 October 2018

[Sources of Batch Files](https://wiki.folio.org/display/MM/Sources+of+Batch+Files) in the FOLIO wiki

#### [Mapping Pica3/Pica+ to Inventory & Inventory to Pica3/Pica+](https://docs.google.com/spreadsheets/d/1HqOIDeyUmHGwB4d8dcdyxn6CdueUvLWOiHvAyZf7YrI/edit#gid=665804497)

---

## Materials

1. [Acquisition & Inventory integration - MM-SIG 2/21/2019](https://docs.google.com/presentation/d/16M_PVJ2rS4dglfASv1KADt9gn4kuucb1ZbqVSJhBH7k/edit#slide=id.g4cdf76c5ce_0_10)
2. [data flow FOLIO apps_RM](https://docs.google.com/presentation/d/1Ms7WWkHG7AuONygM3hvYdsjEY1xhA6FqRRGTsM8fNUw/edit#slide=id.g47c49dc113_0_3)
3. [NEW Inventory - Storage - MARCCat](https://docs.google.com/drawings/d/1Vgx5wyPIFlBfckuazSybOgkQgoDDby5y4JDKzJVb24Y/edit)
4. [Inventory Beta - Metadata Elements](https://docs.google.com/spreadsheets/d/1RCZyXUA5rK47wZqfFPbiRM0xnw8WnMCcmlttT7B3VlI/edit#gid=278827110)
5. [Mapping Inventory Metadata Elements: PICA3/PICA+ to Inventory Beta Format](https://docs.google.com/spreadsheets/d/1HqOIDeyUmHGwB4d8dcdyxn6CdueUvLWOiHvAyZf7YrI/edit#gid=665804497)
6. [Inventory-Zugriff](https://info.gbv.de/display/FOLIO/Inventory-Zugriff)
7. [The Codex Vision - TC Review COPY](https://docs.google.com/document/d/1NQamK4fSi0WRXonIgBBbpCTauac8BMdcAkY5mGXlIL0/edit)
8. [VuFind & Folio](https://vufind-folio.scanbit.net/Search/Results?lookfor=&type=AllFields&limit=20)
9. [German library networks - data flows](https://docs.google.com/presentation/d/104-5B8Ip2cyxPIdcMCqy1OYu0XtCaoy-Te57w-5UykI/edit#slide=id.p1)
10. [DataLoaderRequirements](https://docs.google.com/document/d/1sULgEXYw_uGMf5bFwy_95LZ5KvqL8nGWSkEwpjaEzWY/edit#heading=h.wszopzyeosmv)
11. [FOLIO Inventory: Working definition](https://docs.google.com/document/d/16C83Yy61GVm9dYs7aRKj9Z0on-vEuBcfXiCjuo_jALo/edit)
12. [The Codex Metadata Model](https://wiki.folio.org/pages/viewpage.action?pageId=1415393)

## Ask your questions here

IIRC in the hands-on lab it was said that according to current plans not all data fields from the classical ILSs will be considered in a migration scenario (approx. 60 out of 200) -- exactly which fields would those be, and is there a public list available? (Anna Kasprzik, ZBW)

> Please find all Inventory Beta Metadata Elements in this [Google Spreadsheet](https://docs.google.com/spreadsheets/d/1RCZyXUA5rK47wZqfFPbiRM0xnw8WnMCcmlttT7B3VlI/edit#gid=278827110). The elements are listed in four worksheets. "Instance" corresponds to title data (Level 0); Holdings and Items correspond to _Exemplardaten_ in CBS terminology (Levels 1+2).
> [name=hemmefelix]

Also, what exactly is meant by "the format of FOLIO is document-based", i.e., what are the specific consequences of that? (Anna Kasprzik, ZBW)

> Since development at FOLIO takes place in two-week sprints, it would probably be very time-consuming if changes to the internal format always had to be mapped into a relational database. If you are interested in more detailed information about FOLIO's architecture, you could take a look at [Julian's slides](https://www.gbv.de/Verbundzentrale/Publikationen/publikationen-der-vzg-2018/pdf/Ladisch_180830_VK_folio_architektur.pdf) from the GBV conference 2018. There is also detailed [information for developers](https://dev.folio.org/), or you can get in touch with Julian directly (for the email address, see the slides).
> [name=hemmefelix]