# Technical Assessment of Omeka’s XML Data Format

## Introduction

Omeka is a web publishing platform for digital collections, offered in two main versions: **Omeka Classic** (the original single-site application) and **Omeka S** (a newer multi-site, linked data-oriented system). Both versions have historically relied on XML in various parts of their architecture – from metadata encoding to data interchange and alternate output formats. This report provides an honest technical assessment of Omeka’s XML-based formats for data interchange, metadata representation, theme customization, and API integration. We compare the XML approach to modern JSON-based alternatives (e.g. JSON-LD and schema.org JSON) in terms of usability, flexibility, and maintainability. Key use cases (metadata import/export, theme and template customization, and external API integration) are examined to understand the strengths and limitations of Omeka’s continued use of XML. We also explore why XML persists in Omeka’s design, what pain points it introduces for developers and curators, and what a transition to JSON-based formats would entail. Recommendations are provided on whether moving away from XML is advisable and how such a transition could be managed.

## Omeka’s XML Format in Context

Omeka Classic was built in the late 2000s, an era when XML was a dominant format for data exchange, and it implemented custom XML schemas and outputs for content interchange. For example, Omeka Classic defines an **“omeka-xml”** response format – an XML serialization of an item’s metadata according to Omeka’s own schema ([Output Formats - Omeka Classic User Manual](https://omeka.org/classic/docs/Technical/Output_Formats/#:~:text=Omeka%20Classic%20uses%20Zend%20Framework%27s,information%20about%20these%20bundled%20formats)) ([Output Formats - Omeka Classic User Manual](https://omeka.org/classic/docs/Technical/Output_Formats/#:~:text=omeka)). Classic also offers a **“dcmes-xml”** format, which is an RDF/XML representation of Dublin Core metadata ([Output Formats - Omeka Classic User Manual](https://omeka.org/classic/docs/Technical/Output_Formats/#:~:text=dcmes)). In early versions, JSON output was minimal (used mainly for AJAX in the interface) and incomplete. The Classic manual notes that the default JSON output was *“streamlined... primarily used for Ajax requests”* with plans to deprecate it once a more complete **“omeka-json”** format matured ([Output Formats - Omeka Classic User Manual](https://omeka.org/classic/docs/Technical/Output_Formats/#:~:text=json%EF%83%81)). Indeed, Omeka Classic eventually introduced an “omeka-json” output that mirrors the omeka-xml content (generated by applying a JsonML XSLT to the XML) ([Output_OmekaJson — Omeka Classic 3.1 documentation](https://omeka.readthedocs.io/en/latest/Reference/models/Output/OmekaJson.html#:~:text=Generates%20JSON%20version%20of%20the,dictated%20by%20the%20JsonML%20XSLT)).
This approach reveals that Classic’s JSON support was essentially bolted on top of the XML pipeline, rather than a native JSON serialization of the data.

Omeka S, released in 2017, marked a significant shift toward JSON and Linked Data. It adopted JSON-LD (JSON for Linked Data) as the native exchange format and data model. The Omeka S REST API defaults to JSON-LD for all responses ([REST API - Omeka S Developer Documentation](https://omeka.org/s/docs/developer/api/rest_api/#:~:text=HTTP%20responses%20will%20be%20formatted,transporting%20Linked%20Data%20using%20JSON)), reflecting its core design around RDF (Resource Description Framework) and multi-vocabulary metadata. Other serializations of the same data (like RDF/XML, Turtle, N-Triples) are supported for interoperability, but JSON-LD is the primary representation ([REST API - Omeka S Developer Documentation](https://omeka.org/s/docs/developer/api/rest_api/#:~:text=Since%20Omeka%20S%20version%204,The%20core%20supported%20formats%20are)). Internally, Omeka S represents item metadata as linked data properties (e.g. using Dublin Core, FOAF, etc.) rather than the fixed Dublin Core fields of Classic. This means Omeka S can readily output data in modern formats and integrate with systems via JSON-based protocols. As an Omeka team presentation summarized, *“Omeka S… provides a REST API [with] integration with other systems like Fedora and DSpace via JSON-LD and RDF”* ([Next Gen Omeka | PPT](https://www.slideshare.net/slideshow/next-gen-omeka/56958514#:~:text=Omeka,Read%20less)). In practice, Omeka S still uses XML for certain modules and legacy compatibility (for example, OAI-PMH harvesting and some metadata standards), but its architecture has largely pivoted away from proprietary XML formats in favor of JSON-LD. The continued presence of XML in Omeka S is mostly confined to standards that inherently require XML (such as OAI-PMH’s required Dublin Core XML) or optional add-ons for specific XML-based schemas.

## Use Case Analysis

### 1. Importing and Exporting Metadata

**Omeka Classic:** Import/export in Classic often relied on XML-based protocols or plugins, but with notable limitations. By default, Omeka Classic’s web UI allows exporting item data via the custom XML outputs (omeka-xml or Dublin Core XML) by appending `?output=omeka-xml` to an item or collection URL ([Output Formats - Omeka Classic User Manual](https://omeka.org/classic/docs/Technical/Output_Formats/#:~:text=To%20access%20the%20response%20formats%2C,For%20example)). This yields an XML document containing the item’s metadata in Omeka’s schema. The proprietary nature of **omeka-xml** means external systems don’t natively understand it – a custom XSD (XML Schema) is provided, but harvesters or tools must specifically support Omeka’s schema to ingest it. In fact, Omeka’s developers eventually found that *“Omeka-specific XML output... is not well-supported by harvesters or other tools”* ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=Omeka%20XML%20%28prefix%20%60omeka)). Classic also implements OAI-PMH (via a plugin) to expose items in standard Dublin Core XML, which aggregators like DPLA can harvest. However, there was no unified, built-in way to **import** arbitrary XML metadata into Classic.
A forum response from an Omeka developer notes that *“XML data that does exist tends to be in a wide variety of schemas, so there isn’t really an ‘official’ XML import pathway”* ([Importing a XML (Dublin Core) dataset - Import/Export - Omeka Forum](https://forum.omeka.org/t/importing-a-xml-dublin-core-dataset/10639#:~:text=We%20have%20a%20relatively%20small,an%20%E2%80%9Cofficial%E2%80%9D%20XML%20import%20pathway)). Users with XML records (MARCXML, custom Dublin Core XML, etc.) often had to convert them to CSV or use community plugins to import into Omeka ([Importing a XML (Dublin Core) dataset - Import/Export - Omeka Forum](https://forum.omeka.org/t/importing-a-xml-dublin-core-dataset/10639#:~:text=There%20are%20other%20options%20like,use%20XML%20as%20your%20source)). This reveals a workflow constraint: while Omeka could output XML, it couldn’t flexibly ingest arbitrary XML without custom mapping, making **CSV** a more common interchange format for Classic. In summary, exporting data from Classic via XML was possible but sometimes incomplete (earlier versions had bugs that omitted certain fields), and importing XML was not straightforward.
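
To make the parsing burden concrete, here is a minimal, hedged sketch of consuming the omeka-xml output with standard Python tooling. The site URL is a placeholder, and the tag names (`element`, `name`, `text`) follow the omeka-xml schema described above; because nesting details vary across schema versions, the sketch matches on local names rather than assuming a fixed structure:

```python
import xml.etree.ElementTree as ET

import requests

# Hypothetical Omeka Classic site: omeka-xml is requested by appending
# ?output=omeka-xml to an item or browse URL.
URL = "https://example.org/items/show/42?output=omeka-xml"

root = ET.fromstring(requests.get(URL, timeout=30).content)

def local_name(tag: str) -> str:
    """Strip the versioned omeka-xml namespace, e.g. '{...v5}element' -> 'element'."""
    return tag.rsplit("}", 1)[-1]

# Walk every <element>, pairing its <name> with its nested <text> values.
for elem in root.iter():
    if local_name(elem.tag) != "element":
        continue
    name = next((c.text for c in elem if local_name(c.tag) == "name"), None)
    texts = [t.text for t in elem.iter() if local_name(t.tag) == "text"]
    if name and texts:
        print(f"{name}: {'; '.join(t for t in texts if t)}")
```

Even this deliberately tolerant approach illustrates the point: the client must know Omeka’s tag vocabulary before the data becomes usable anywhere else.
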
**Omeka S:** With Omeka S, data interchange has moved toward JSON and standard linked data formats. Omeka S provides a robust REST API to **export** and **import** items in JSON-LD. For instance, the **Omeka S Item Importer** module can pull items from one Omeka S installation into another via the API ([Importing and Exporting - Omeka S User Manual](https://omeka.org/s/docs/user-manual/importexport/#:~:text=You%20can%20use%20the%20Omeka,import%20sites%20and%20their%20pages)), leveraging JSON as the data interchange format under the hood. Bulk data export in Omeka S is handled by modules like **Bulk Export**, which by default support CSV, JSON, and other text formats – interestingly, generic XML export is not provided out of the box ([Export Data and LIDO - Development - Omeka Forum](https://forum.omeka.org/t/export-data-and-lido/18271#:~:text=Daniel_KM%20%20July%201%2C%202023%2C,3%3A51pm%20%203)). (This is telling: the developers prioritized modern formats and omitted a catch-all XML export, likely because few users demand a raw Omeka XML dump when JSON or CSV is available.) If needed, one can extend Bulk Export with a plugin to output a specific XML schema. For example, a developer notes adding GeoJSON output via a plugin and states *“there is no output for XML currently, but this is only a format – the hard point is to do the mapping”* ([Export Data and LIDO - Development - Omeka Forum](https://forum.omeka.org/t/export-data-and-lido/18271#:~:text=Yes%2C%20there%20is%20the%20module,the%20mapping%2C%20not%20the%20code)). In other words, Omeka S is capable of exporting XML if a specific standard like LIDO or MODS is needed, but one must define the mapping from Omeka’s data to that XML schema (either via custom code or XSL transformation). For **importing** data, Omeka S benefits from its linked data approach: if metadata is available in RDF (whether JSON-LD, Turtle, or RDF/XML), Omeka S can often ingest it using its API or vocabularies. It uses the EasyRDF library, which can parse multiple serialization formats ([Omeka-S ontologies - TEI? - #4 by patrickmj - Development - Omeka Forum](https://forum.omeka.org/t/omeka-s-ontologies-tei/3108/4#:~:text=Omeka%20S%20uses%20the%20EasyRDF,for%20building%20and%20exporting%20vocabularies)). For example, a user can import an ontology or controlled vocabulary expressed in RDF/XML or JSON-LD, as long as EasyRDF supports it ([Omeka-S ontologies - TEI? - #4 by patrickmj - Development - Omeka Forum](https://forum.omeka.org/t/omeka-s-ontologies-tei/3108/4#:~:text=There%E2%80%99s%20a%20variety%20of%20other,if%20it%E2%80%99s%20done%20in%20practice)) ([Omeka-S ontologies - TEI? - #4 by patrickmj - Development - Omeka Forum](https://forum.omeka.org/t/omeka-s-ontologies-tei/3108/4#:~:text=Omeka%20S%20uses%20the%20EasyRDF,for%20building%20and%20exporting%20vocabularies)). This flexibility is greater than Classic’s, since Classic was essentially hard-wired to Dublin Core and required plugins for any other schema. Omeka S’s use of JSON-LD also aligns with emerging standards like IIIF (International Image Interoperability Framework), which uses JSON-LD manifests for image collections. In fact, Omeka S includes a IIIF module to export item sets as IIIF manifests (JSON-LD) for use in viewers, a task that would have been difficult in an XML-centric Classic without additional tools.

**Summary:** For data import/export, Omeka’s past reliance on XML introduced friction. Classic’s proprietary XML export was of limited use outside Omeka, and importing arbitrary XML required conversion. JSON-based alternatives in Omeka S (such as the JSON-LD API and standardized exports) greatly improve usability and interoperability. With JSON, the same data can be readily consumed by web applications or transformed into other formats. For example, a JSON export from Omeka S can be converted to CSV (or even to XML if needed) using common programming libraries, as sketched below, whereas an Omeka-XML export from Classic often needed an XSLT or custom parser to convert into a useful form. The move to JSON-LD in Omeka S yields **more flexibility** in exchanging data with external systems and **better support for multiple metadata standards**, since JSON-LD can incorporate any ontology (Dublin Core, Schema.org, etc.) via its context, whereas Classic’s XML was tied to a fixed schema. The one area where XML remains indispensable is **legacy protocols** (like OAI-PMH) – Omeka S continues to support OAI-PMH, outputting Dublin Core XML for harvesters. But notably, Omeka S deliberately chose not to carry forward the old “omeka-xml” format due to its lack of support, instead offering a simpler “flat” XML for those who need an XML dump of all metadata ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=This%20output%20format%20uses%20an,schema%20from%20the%20last%20site)). This indicates a strategic shift: Omeka recognizes that a custom XML format is more burden than benefit when modern, widely used alternatives exist.
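
To make the conversion point concrete, here is a minimal sketch that flattens a saved Omeka S JSON export into a spreadsheet-friendly CSV using only the standard library. The file name is a placeholder, and the `dcterms:` keys assume the default vocabularies:

```python
import csv
import json

def joined_values(item: dict, term: str) -> str:
    """Join all @value literals for a JSON-LD property key like 'dcterms:title'."""
    return "; ".join(v.get("@value", "") for v in item.get(term, []))

# Hypothetical export file: a JSON array of item objects, as returned by /api/items.
with open("omeka_items.json", encoding="utf-8") as f:
    items = json.load(f)

with open("items.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "title", "creator", "date"])
    for item in items:
        writer.writerow([
            item.get("o:id"),                      # Omeka S internal id
            joined_values(item, "dcterms:title"),
            joined_values(item, "dcterms:creator"),
            joined_values(item, "dcterms:date"),
        ])
```
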
### 2. Theme Customization and Template Use

**Omeka Classic:** In Classic, themes are primarily customized via PHP templates and an `ini` configuration file for basic settings (e.g. theme name, author) ([Introduction to Themes - Omeka S Developer Documentation](https://omeka.org/s/docs/developer/themes/#:~:text=config%2Ftheme,support%20information%2C%20theme%20version)). There is no heavy use of XML in the theme layer itself – layout and design are done in HTML/CSS/PHP. However, XML comes into play if the theme or site builder wants to output data in non-HTML formats or incorporate external data. Classic leverages Zend Framework’s “context switching” to serve alternate representations like RSS or XML for certain pages ([Output Formats - Omeka Classic User Manual](https://omeka.org/classic/docs/Technical/Output_Formats/#:~:text=Omeka%20Classic%20uses%20Zend%20Framework%27s,information%20about%20these%20bundled%20formats)). For example, an items browse page can output RSS 2.0 or Atom feeds, which are XML formats ([Output Formats - Omeka Classic User Manual](https://omeka.org/classic/docs/Technical/Output_Formats/#:~:text=atom%EF%83%81)). These feeds are generated by special PHP view templates (e.g., `items/browse.atom.php`) which structure the output as XML. If a theme developer wanted to **add a new custom output format** (say, KML for a map or a different XML schema), they would have to create a plugin or modify the theme to define the new context and provide an XML template. This is non-trivial – one has to hook into Omeka’s `response_contexts` and `action_contexts` filters to register the format and then produce valid XML manually ([Add a new XML output format - #4 by patrickmj - Import/Export - Omeka Forum](https://forum.omeka.org/t/add-a-new-xml-output-format/296/4#:~:text=,response_contexts)). An Omeka developer acknowledged that documentation for this was once incorrect, and pointed to the Geolocation plugin as an example (it added a KML output) ([Add a new XML output format - #4 by patrickmj - Import/Export - Omeka Forum](https://forum.omeka.org/t/add-a-new-xml-output-format/296/4#:~:text=)). The need to touch low-level filters and craft XML templates makes such customizations brittle: a small mistake in XML structure can break the output, and maintaining these outputs means tracking schema changes. Thus, while theme *design* in Classic doesn’t involve XML, theme *functional customization* (like exposing data to other systems or in other formats) often did involve working with XML and XSL. For instance, some Classic users built custom XSL transformations on the Omeka-XML output to generate specialized displays or feeds. One user described taking the single XML output of an item and applying XSLT to format it for radio producers ([Incomplete XML outputs · Legacy Forums · Omeka](https://omeka.org/forums-legacy/topic/incomplete-xml-outputs/#:~:text=Hi%20Omeka%20community%21)) – a creative workaround, but one that underscores the extra layer of complexity XML introduced.

Another aspect is embedding structured metadata for SEO or rich snippets. In Classic, if one wanted to embed **schema.org metadata** on item pages (for better search engine indexing), it was up to the theme developer to add microdata in the HTML or to insert a JSON-LD `<script>` block manually. There was no native support for outputting JSON-LD in Classic themes. This again could be seen as a limitation of the XML-era design: Classic could output Dublin Core XML for OAI-PMH, but it didn’t automatically output the schema.org JSON that modern web crawlers prefer.

**Omeka S:** Theme customization in Omeka S remains primarily a server-side templating affair (Omeka S uses the Laminas framework’s PHP-based templating). However, because Omeka S is “API-first” and JSON-LD–centric, theme developers have new options. For one, Omeka S **automatically embeds JSON-LD** structured data in item pages by default. In early versions of Omeka S, each item’s public page included an embedded JSON-LD script tag containing the item’s metadata as JSON-LD (contextualized to the vocabularies in use).
This was intended to facilitate linked data consumption and SEO. It did have a trade-off: embedding a full JSON-LD record for each item can make pages heavier. In one forum discussion about performance, a developer noted *“the page size… is rather high… probably the embedded JSON-LD that’s driving up the size here,”* and mentioned that Omeka S introduced a setting to disable or limit this if needed ([Significant slowness relative to Omeka Classic? - #4 by jflatnes - Omeka S - Omeka Forum](https://forum.omeka.org/t/significant-slowness-relative-to-omeka-classic/8089/4#:~:text=The%20page%20size%20there%20is,the%20latest%20version%20of%20S)). The key point is that Omeka S directly supports outputting machine-readable JSON within the theme, whereas Classic required manual effort to do anything similar. If a theme builder in Omeka S wants to customize or extend this, they can adjust the JSON-LD context or use the API. For example, a theme could use JavaScript to call Omeka S’s JSON API in the background to populate interactive components (maps, timelines, etc.), which is much easier than parsing XML. Modern front-end libraries (React, Vue, etc.) can consume Omeka S’s JSON directly to create dynamic user interfaces, effectively treating Omeka as a headless CMS if desired.

For delivering alternate formats to end users, Omeka S encourages using modules rather than theme hacks. The **Output Formats** module in Omeka S allows administrators or visitors to export item lists in various serializations (JSON-LD, N-Triples, RDF/XML, etc.) ([Output Formats - Omeka S User Manual](https://omeka.org/s/docs/user-manual/modules/outputformats/#:~:text=%2A%20JSON,Triples%20%2A%20RDF%2FXML)) ([Output Formats - Omeka S User Manual](https://omeka.org/s/docs/user-manual/modules/outputformats/#:~:text=%2A%20JSON,triples%29%20%2A%20RDF%2FXML%20%28application%2Frdf%2Bxml)). This modular approach means a theme doesn’t need custom logic for, say, a JSON download button – the module can provide it. If a new format is needed, a developer can add it via a module (similar in concept to Classic’s plugins, but generally cleaner in Omeka S). As noted earlier, adding XML output is possible but not a priority: for instance, an Omeka S user looking to export in LIDO (an XML standard for museum data) discovered that Bulk Export didn’t list an XML option by default ([Export Data and LIDO - Development - Omeka Forum](https://forum.omeka.org/t/export-data-and-lido/18271#:~:text=Hello%20again%2C%20i%20am%20still,is%20appreciated%20Image%3A%20%3Aslight_smile%3A%20Best)) ([Export Data and LIDO - Development - Omeka Forum](https://forum.omeka.org/t/export-data-and-lido/18271#:~:text=Yes%2C%20there%20is%20the%20module,the%20mapping%2C%20not%20the%20code)). The solution was to implement a mapping for LIDO, either through code or by exporting JSON and converting externally. In practice, many Omeka S users will export as JSON or CSV and then use separate tools to get XML if absolutely required.

In summary, **theme and template customization** in Omeka Classic could be limited by the system’s XML-centric data access (only certain hard-coded XML outputs were available without custom coding). The Classic theme had to work around or with these outputs if it wanted to provide data to other contexts. This often resulted in brittle solutions using XSLT or custom plugin development. Omeka S, by shifting to JSON-LD and an extensible API, has made it simpler to integrate content into templates and external widgets.
Themes can still be fully customized in HTML/CSS, but the data feeding those themes is more readily accessible in developer-friendly formats. A side-by-side comparison of how data can be exposed for theme or external use illustrates the difference:

- *Classic:* To provide an interactive map of item locations on a page, one might either generate a KML (XML) feed via a plugin and use a mapping library to fetch and parse it, or embed coordinates in data attributes and write custom JS.
- *S:* One can directly fetch item data as JSON from the API (or even use the embedded JSON-LD) and feed it to a mapping JavaScript library (many of which accept JSON), as sketched below. The amount of format conversion is reduced, and no XML parsing on the client is needed.
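
Here is a hedged sketch of the *S* workflow from the list above: pull items from a placeholder API URL and emit GeoJSON for a client-side map library such as Leaflet. Where coordinates actually live is site-specific; purely for illustration, this assumes items carry a `lat,long` literal in `dcterms:spatial` (the Mapping module, for instance, stores markers its own way):

```python
import json

import requests

# Placeholder installation URL; /api/items returns a JSON array of item objects.
items = requests.get("https://example.org/omeka-s/api/items", timeout=30).json()

features = []
for item in items:
    titles = item.get("dcterms:title", [])
    title = titles[0].get("@value", "") if titles else ""
    # Illustrative assumption: a "lat,long" literal in dcterms:spatial.
    for value in item.get("dcterms:spatial", []):
        try:
            lat, lon = (float(part) for part in value.get("@value", "").split(","))
        except ValueError:
            continue  # not a "lat,long" pair; skip it
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"title": title},
        })

# A map library can consume this FeatureCollection directly.
print(json.dumps({"type": "FeatureCollection", "features": features}, indent=2))
```
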
### 3. API Integration and External Systems

**Omeka Classic:** A REST API was introduced in Omeka Classic 2.1 (2013) to expose data to external systems ([Omeka REST API — Omeka Classic 3.1 documentation](https://omeka.readthedocs.io/en/latest/Reference/api/index.html#:~:text=Added%20in%20version%202)). This API was a welcome addition, as previously developers had to rely on the `?output=` mechanism or OAI-PMH for programmatic access. The Classic REST API serves and accepts JSON representations of resources. For example, a request to `/api/items/123` would return a JSON object with the item’s metadata. This made integration with external applications (mobile apps, other websites, etc.) much easier than dealing with XML feeds. However, the Classic API has some quirks: it mirrors the underlying data model, which is still Dublin Core centric, and modifying data via the API required authentication and careful use of JSON payloads. It was a step forward, but not as fully featured or semantically rich as Omeka S’s API. Notably, the Classic API’s JSON structure is somewhat arbitrary – it’s essentially Omeka’s data converted to JSON keys and values, without the linked data context that Omeka S provides. This means if you pull data from Omeka Classic’s API, you get raw strings for element texts and some internal IDs, but not URIs or semantic types for those fields. Developers often had to consult Omeka’s documentation to interpret the JSON. Additionally, because Classic’s JSON was **derived from XML via XSLT** ([Output_OmekaJson — Omeka Classic 3.1 documentation](https://omeka.readthedocs.io/en/latest/Reference/models/Output/OmekaJson.html#:~:text=Generates%20JSON%20version%20of%20the,dictated%20by%20the%20JsonML%20XSLT)), it wasn’t as clean or logical as a hand-crafted JSON format might be. For example, empty elements might simply be absent, and multiple values for a field might be given as arrays or repeated keys depending on how the XSLT produced them. Any change in the XML schema or XSL could alter the JSON output (hence “brittle”). In fact, an issue was discovered in one Classic release where certain Dublin Core fields (Title, Creator, etc.) were missing from the omeka-xml (and thus the JSON) output ([Incomplete XML outputs · Legacy Forums · Omeka](https://omeka.org/forums-legacy/topic/incomplete-xml-outputs/#:~:text=OmekaXML%20has%20all%20the%20fields,it%20into%20the%20OmekaXML%20output)) ([Incomplete XML outputs · Legacy Forums · Omeka](https://omeka.org/forums-legacy/topic/incomplete-xml-outputs/#:~:text=annewootton%20March%2010%2C%202012)) – an omission caused by a bug. Such inconsistencies could break API consumers expecting those fields. The fix required patching Omeka’s code and schema ([Incomplete XML outputs · Legacy Forums · Omeka](https://omeka.org/forums-legacy/topic/incomplete-xml-outputs/#:~:text=Jim%20Safley%20March%2010%2C%202012)). This example highlights the maintainability challenge: when your API relies on an intermediate XML schema and transformation, a small oversight can propagate to the API output and confuse developers.

Integration with external systems in Classic was also commonly achieved through **OAI-PMH harvesting** (for library/archives systems) or via export/import plugins. OAI-PMH is XML-based and was well supported by Classic (the OAI-PMH Repository plugin exposes items, and the OAI-PMH Harvester plugin pulls from other repositories). While reliable, OAI-PMH has its own limitations (only Dublin Core by default, being a one-way sync mechanism, etc.), and its XML nature meant that any system outside the library world might prefer a more web-friendly API.

**Omeka S:** From the outset, Omeka S was designed for integration. Its core API is a full CRUD (Create-Read-Update-Delete) RESTful service using JSON-LD. Responses from `/api` include not just data but also context to interpret that data as linked properties ([REST API - Omeka S Developer Documentation](https://omeka.org/s/docs/developer/api/rest_api/#:~:text=Responses%EF%83%81)). For example, an Omeka S item JSON might include `dcterms:title` as a key (or within an array of values) with an associated URI in the `@context`, making it clear that it’s the Dublin Core Title. This semantic richness means external systems can understand the data more easily or even treat the Omeka S API as a Linked Data endpoint. Clients interacting with the Omeka S API are expected to send JSON (specifically JSON-LD) when creating or updating resources. In some cases, Omeka S enforces a particular structure for JSON input – the documentation notes that *“payloads should also be JSON-LD, but Omeka S sometimes requires clients to follow a particular structure, even if alternate valid JSON-LD would also represent the same data”* ([REST API - Omeka S Developer Documentation](https://omeka.org/s/docs/developer/api/rest_api/#:~:text=Requests%EF%83%81)). This indicates that while JSON-LD is flexible, the Omeka S implementation may be a bit strict about format (likely for simplicity in server-side parsing). Nonetheless, it is far more flexible than Classic because it can handle any vocabulary. An Omeka S item with custom ontology fields (say, Schema.org `schema:creator` alongside Dublin Core) will return those in the JSON-LD, whereas Classic’s API could only ever return the Dublin Core and Item Type fields it knew about. Integration with external services like Fedora Commons or DSpace is also improved. Omeka S has connector modules that can push or pull data from Fedora and DSpace using their APIs (which are often XML-based, like AtomPub or SOAP, with Omeka S acting as a mediator that converts to JSON-LD internally) ([Next Gen Omeka | PPT](https://www.slideshare.net/slideshow/next-gen-omeka/56958514#:~:text=Omeka,Read%20less)). The existence of these connectors underscores that Omeka S’s architecture (with JSON-LD at its core) makes it feasible to integrate via multiple protocols. For instance, Omeka S can consume a DSpace OAI-PMH feed (XML) using a connector, map it to an Omeka resource template (as JSON internally), and then provide it via its JSON-LD API to others. It can essentially straddle both worlds where needed.
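
To ground this, a minimal read-only client for the Omeka S REST API might look like the following sketch. The base URL is a placeholder; the commented-out `key_identity`/`key_credential` query parameters are Omeka S’s documented API-key mechanism, needed only for non-public data or write operations:

```python
import requests

BASE = "https://example.org/omeka-s"  # placeholder installation URL

params = {
    "per_page": 25,            # standard Omeka S pagination parameter
    # "key_identity": "...",   # uncomment both keys for authenticated requests
    # "key_credential": "...",
}
resp = requests.get(f"{BASE}/api/items", params=params, timeout=30)
resp.raise_for_status()

for item in resp.json():
    # Each item is a JSON-LD object: '@id' is its URI, 'o:id' the internal id,
    # and properties like 'dcterms:title' are arrays of value objects.
    titles = item.get("dcterms:title", [])
    title = titles[0].get("@value", "(untitled)") if titles else "(untitled)"
    print(item["o:id"], title)
```
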
Another example is the IIIF integration: Omeka S can generate IIIF Presentation API manifests (which are JSON-LD) for each item or item set through the IIIF module, allowing deep integration with image viewers and annotation tools. This would have been an add-on in Classic (and indeed, Classic had some plugins for IIIF, but none as seamless).

**Summary:** API integration is an area where the benefits of moving from XML to JSON become very clear. In Classic, the API (once introduced) gave developers JSON, but it was built on an XML foundation that was **brittle** – described by some as arbitrary because it was unique to Omeka and not always consistent. Any external system had to write a custom adapter to parse Omeka’s XML or its JSON derivative. In contrast, Omeka S’s JSON-LD API speaks the language of the web and Linked Data out of the box. A developer can fetch data and immediately know (from the context) that, for example, `dcterms:creator` refers to the Dublin Core Creator property, or see that a value is a URI pointing to an external resource. This self-describing quality of JSON-LD improves maintainability: as long as Omeka S adheres to JSON-LD standards, clients don’t break with version changes as easily (the context and vocabulary terms remain consistent). Moreover, modern APIs expect JSON – *“in APIs based on REST architecture, [JSON] is actually a standard”* ([JSON And XML. API’s Data Interchange Formats | by Eugene Bartenev | Medium](https://medium.com/@bartenev/json-and-xml-apis-data-interchange-formats-96e58d0626a6#:~:text=JSON%20is%20one%20of%20the,the%20same%20amount%20of%20data)) – and Omeka S aligns with this expectation, making it much easier to use Omeka as part of a larger ecosystem of web applications. In practical terms, a developer integrating Omeka S with, say, a JavaScript front end or a Python data analysis tool can do so quickly using built-in JSON parsing libraries (all major programming languages have JSON support). Trying the same with Omeka Classic’s XML outputs means bringing in an XML parser and potentially dealing with XML namespaces, schema validation, and so on, adding overhead in development and execution.

## Why Does Omeka Still Use XML?

Given all the modern advantages of JSON, one may ask why XML is still present at all in Omeka’s architecture. There are several reasons rooted in historical context and the needs of Omeka’s primary users (libraries, archives, museums):

- **Legacy and Standards Compliance:** Omeka was born in a time when the cultural heritage sector was deeply invested in XML standards. Formats like **Dublin Core XML**, **MARCXML**, **METS**, **MODS**, **EAD** (archival finding aids in XML), and the **OAI-PMH protocol** (XML-based) were the lingua franca of digital collections. To integrate with these systems, Omeka had to speak XML. This is why Omeka Classic’s export formats included Dublin Core XML and why an **OAI-PMH module** was one of the first plugins – institutions needed to share Omeka content with aggregators via XML feeds. These standards change slowly; even today, OAI-PMH (XML) is more commonly supported among repositories than any JSON-based equivalent. So Omeka continues to provide XML in those areas to remain compatible with the wider ecosystem of repositories and catalogs.
- **Initial Architecture and Technical Momentum:** Omeka Classic was built on PHP and the Zend Framework, leveraging features like `Zend_View` and `ContextSwitch`, which made serving XML or JSON relatively straightforward by writing alternate view scripts. At the time, XML was considered a safe, well-understood choice for data interchange. A custom XML schema (omeka-xml) was created to represent Omeka items in a structured way that could encompass Dublin Core, Item Type Metadata, and file information in one document. Once this schema and approach were in place, the core development and many plugins revolved around it. Re-engineering Classic to eliminate XML would have been a massive undertaking, so it persisted. Instead, the strategy was to develop Omeka S as a new product to embrace JSON and linked data, while maintaining Classic (and its XML outputs) for those who had it in production. This is a common scenario in software projects: *backwards compatibility* concerns lead to old formats sticking around. Classic’s user base had built workflows (some with XSLT transformations, etc.) around Omeka’s XML, and removing or radically changing it would break those workflows. Thus, XML lives on in Classic, and Omeka S still includes some XML facilities (like an **RDF/XML** option in the API ([REST API - Omeka S Developer Documentation](https://omeka.org/s/docs/developer/api/rest_api/#:~:text=%2A%20JSON,Triples%20%28%60ntriples)), or the ability to import XML vocabularies ([Omeka-S ontologies - TEI? - #4 by patrickmj - Development - Omeka Forum](https://forum.omeka.org/t/omeka-s-ontologies-tei/3108/4#:~:text=Omeka%20S%20uses%20the%20EasyRDF,for%20building%20and%20exporting%20vocabularies))) so as not to alienate users who rely on XML data.
- **Tooling and Workflow Habits:** Many curators and developers in the GLAM (Galleries, Libraries, Archives, Museums) field have established tools for XML. For example, an archives specialist might be very comfortable taking an XML output and writing an XSLT to turn it into a nicely formatted report or to crosswalk it into another schema. These users might find JSON-LD foreign or lacking the mature transformation tools they have for XML. As an Omeka S module developer noted in a to-do comment, there was thinking about bringing back or normalizing an Omeka XML format for those who still want it ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=TODO)), precisely because some users find a simple XML dump useful for their workflows (perhaps to archive the data or transform it offline). Furthermore, certain data (like rich text with markup) sometimes feels more naturally represented in XML (e.g., TEI for transcribed texts). There are modules like **XML Viewer** for Omeka S that allow storing and displaying XML files (TEI, ALTO OCR, etc.) ([Daniel-KM / Omeka-S-module-XmlViewer - GitLab](https://gitlab.com/Daniel-KM/Omeka-S-module-XmlViewer#:~:text=Daniel,TEI%20are%20available%20by%20default)) – showing that XML as content (not just as a data format) is still important in digital humanities. Omeka accommodates this by letting XML files be attached as media and then providing stylesheets to view them.
- **Complete Metadata Dump:** One interesting reason XML hung around is the idea of a **single, self-contained record** dump. The Omeka-XML format was designed to package all metadata about an item (all element texts, all files, collection info, etc.)
in one XML document ([Incomplete XML outputs · Legacy Forums · Omeka](https://omeka.org/forums-legacy/topic/incomplete-xml-outputs/#:~:text=We%20want%20every%20metadata%20field,are%20building%20for%20radio%20producers)). This is useful as a **transfer format** or backup for a single item. Before JSON became prevalent, XML was the way to do that. Even though Omeka S can output JSON for an item, if you wanted all items in one file, XML (with its ability to nest records) was often used. Classic’s `items/browse?output=omeka-xml` essentially gives an `<itemContainer>` XML with all items listed. Many collection managers appreciated that they could get one big XML file and save it as an export of their entire repository. While JSON can do this (an array of item objects), older habits die hard – some still trust XML as a durable, archivable format. There is also a perception that XML with a schema is “safer” for long-term storage (since the schema can be used to validate the data’s integrity over time), whereas JSON might be seen as more transient. This mindset contributes to XML’s persistence.
- **Conservative Domain:** The cultural heritage domain can be conservative in technology adoption. JSON-LD and schema.org are developments of the last decade, whereas XML has been in use since the late 1990s. Some institutions simply have policies or IT requirements that prefer XML for data interchange. As a result, Omeka can’t completely drop XML support without cutting off part of its user base. For example, a museum aggregator might only accept LIDO XML; if Omeka couldn’t produce that, those museums might choose a different system. (Currently, Omeka S doesn’t natively export LIDO, but as seen in the forum discussion, one can extend it to do so ([Export Data and LIDO - Development - Omeka Forum](https://forum.omeka.org/t/export-data-and-lido/18271#:~:text=Hello%20again%2C%20i%20am%20still,is%20appreciated%20Image%3A%20%3Aslight_smile%3A%20Best)) ([Export Data and LIDO - Development - Omeka Forum](https://forum.omeka.org/t/export-data-and-lido/18271#:~:text=Yes%2C%20there%20is%20the%20module,the%20mapping%2C%20not%20the%20code)). Classic likely had a plugin at some point to export LIDO or other schemas as well, via XSLT from Omeka-XML or similar.)

In summary, XML continues to be used in Omeka’s architecture mostly for **compatibility and inertia** reasons, not because it is superior for new development. The core Omeka team has even expressed some regret about XML in certain contexts – for instance, regarding RDF data, one developer noted there is *“lots of regret about the XML serialization”* (referring to RDF/XML) ([Omeka-S ontologies - TEI? - #4 by patrickmj - Development - Omeka Forum](https://forum.omeka.org/t/omeka-s-ontologies-tei/3108/4#:~:text=There%E2%80%99s%20a%20variety%20of%20other,in%20theory)). That sentiment aligns with the broader shift toward JSON-LD in Omeka S. Nonetheless, Omeka must straddle two worlds: keeping XML where it’s needed for the community and protocols, while pushing forward with JSON where possible.

## Limitations and Pain Points of Omeka’s XML Approach

From a developer and curator perspective, Omeka’s XML format has several **limitations and pain points** that have been documented over the years:

- **Brittleness of Custom XML:** The proprietary **omeka-xml schema** in Classic was a double-edged sword. It captured all Omeka data in one package, but any change in Omeka’s data model required updating the schema and associated transformations.
If those fell out of sync, the output could break. We saw an example where certain Dublin Core fields didn’t appear in the XML due to a bug ([Incomplete XML outputs · Legacy Forums · Omeka](https://omeka.org/forums-legacy/topic/incomplete-xml-outputs/#:~:text=annewootton%20March%2010%2C%202012)) – the format was brittle enough that a small bug blocked a substantial portion of metadata from exporting. Additionally, clients had to hard-code knowledge of this schema to use the data. If Omeka changed the schema version (and it did, through versions v4, v5, etc.), clients had to update their code or XSLTs accordingly. This tight coupling is seen as arbitrary by some because Omeka’s schema wasn’t a standard format like METS or RDF, but something the project invented. External developers might ask, *“Why do I have to parse `<elementTextContainer>` and `<element>` tags for what is essentially key-value data that could be JSON or CSV?”* The lack of adoption beyond Omeka means the effort spent handling omeka-xml has little reuse value elsewhere.
- **Complexity of Parsing:** Consuming XML generally requires an XML parser, dealing with namespaces, and more verbose code than consuming JSON. For web integrations (e.g., an interactive map on a separate site pulling Omeka data), working with XML is a hurdle. As one modern comparison states, *“JSON... takes up less space than XML for the same amount of data... and is supported by all popular programming languages”*, whereas XML, though widely supported, is heavier to handle ([JSON And XML. API’s Data Interchange Formats | by Eugene Bartenev | Medium](https://medium.com/@bartenev/json-and-xml-apis-data-interchange-formats-96e58d0626a6#:~:text=JSON%20is%20one%20of%20the,the%20same%20amount%20of%20data)). In practice, this meant some developers avoided Omeka’s XML and instead used CSV exports or wrote custom scripts to hit the Classic API and gather JSON. The extra steps (and often lower performance – JSON can be parsed much faster than XML in browsers ([FINALCAINE2009](https://www.cs.montana.edu/izurieta/pubs/IzurietaCAINE2009.pdf#:~:text=resource%20utilization%20and%20the%20relative,describes%20the%20structural%20attributes%20of)) ([FINALCAINE2009](https://www.cs.montana.edu/izurieta/pubs/IzurietaCAINE2009.pdf#:~:text=applications%3B%20thus%20providing%20significant%20performance,6%5D%20addresses%20such))) made XML a less attractive option unless absolutely required.
- **Rigid Metadata Structure:** In Omeka Classic, because the data model was tied to Dublin Core and a single item could only have those fields plus one Item Type schema, the XML output mirrored that rigidity. If a project wanted to use a different metadata schema (say, VRA Core or a custom set of fields), they had to either shoehorn it into Dublin Core fields or create a new element set via a plugin. This data would then appear in the omeka-xml under a different element set name, but the overall structure was still an Omeka-specific interpretation. For developers and curators, this felt arbitrary – why only one additional element set (Item Type) per item? Why are some fields under `<elementSet name="Dublin Core">` and others under `<elementSet name="Item Type Metadata">`? Such questions arose especially when crosswalking data from Omeka to another schema. The answers lay in Omeka’s internal design, not in a broader standard, which could be frustrating.
Omeka S alleviated this by flattening everything into properties, but Classic’s XML carries those structural idiosyncrasies, which can be seen as a pain point when trying to repurpose the data.
- **Lack of Native Support in Modern Tools:** As the tech landscape shifted, tools for data visualization, web development, and analysis have favored JSON. For example, building a timeline or gallery using a JavaScript library nowadays typically expects data in JSON (perhaps through a REST API). Omeka Classic’s out-of-the-box outputs didn’t cater to that – a developer had to either transform the XML to JSON on their own or use the Classic API (which, as mentioned, initially didn’t provide all data). The community did create solutions (the API, plugins for specific JSON outputs, etc.), but these were workarounds for the fact that XML was the primary citizen. A common refrain among developers integrating with Omeka Classic was that the XML output was *“too verbose”* or *“not what we need.”* Many ended up using the CSV Import/Export plugin or direct database access to get data out, which bypasses the XML entirely.
- **Maintenance Burden:** For the Omeka core team, maintaining XML-related code (schemas, XSLTs, etc.) is additional overhead. Every new feature or element potentially needs representation in omeka-xml and testing across different versions. The forum post where a developer says *“Bad on us... the documentation was wrong”* regarding adding a new output format ([Add a new XML output format - #4 by patrickmj - Import/Export - Omeka Forum](https://forum.omeka.org/t/add-a-new-xml-output-format/296/4#:~:text=Bad%20on%20us%E2%80%A6the%20documentation%20was,update%20on%20the%20readthedocs%20site)) hints that even the docs and code around XML outputs can be confusing. In Omeka S’s OAI-PMH module documentation, they mention that RRCHNM (the team behind Omeka) removed the old omeka-xml schema from their site, and that the format isn’t implemented in S because of limited support ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=This%20output%20format%20uses%20an,schema%20from%20the%20last%20site)). This implies a conscious decision to drop something that was a headache to maintain (if it’s not widely used, it’s not worth the trouble). Instead, S provides a “simple XML” output that doesn’t rely on a schema or complex structure, just to satisfy the need for an XML dump ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=NOTE%3A%20Because%20of%20its%20limited,that%20provides%20all%20metadata%20too)). The simpler that output, the less maintenance required. It’s telling that *“normalize simple_xml (or omeka xml v5/v6?)”* is listed as a TODO in the module ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=TODO)) – perhaps considering reintroducing a standardized Omeka XML, but clearly with reservations. The presence of this note and its warnings suggests the maintainers know the current state (no official Omeka XML in S) is not ideal for some users, but also that investing heavily in XML again is not a priority.
- **Partial Tooling for Migration:** When moving from Classic to S (or any other system), there’s no one-click conversion of the entire site’s data format. Omeka S provides an **Omeka Classic Importer** module, which likely uses the Classic API or database to bring in content. If Classic’s data were purely in a clean interchange format like JSON-LD, one could imagine a simpler migration path.
Instead, the migration module had to account for Classic’s structure, mapping DC fields, Item Types, tags, etc., to Omeka S resources. This kind of mapping is inherently complex, but XML adds an extra layer (if the tool had to parse XML exports). It’s likely the importer bypasses XML and queries the Classic database or API directly to avoid that. Still, the reliance on XML historically might have slowed down integration efforts with other systems. For example, to get Omeka Classic data into a repository that accepts only RDF, one would have to transform the XML to RDF (which could be done by writing an XSLT to output RDF/XML, or by a script to produce JSON-LD). Each extra transformation step is a point of failure, or at least of friction.

In essence, Omeka’s XML approach served its purpose in a certain time and context but is seen as **cumbersome and outdated** by many current users and developers. The words “brittle” and “arbitrary” capture the sentiment: brittle, because the custom format can break easily with changes and doesn’t adapt gracefully; arbitrary, because it doesn’t align with external standards or modern practices, leaving one to wonder why it exists in that form at all. The community’s gradual move away from it – evidenced by the emphasis on JSON outputs, the deprecation of omeka-xml in S, and the advice to use APIs – highlights these pain points.

## JSON-Based Alternatives: Benefits and Trade-offs

Modern JSON-based data formats, including **JSON-LD** and **schema.org JSON**, offer compelling solutions to the issues above. Omeka S’s adoption of JSON-LD demonstrates many of these benefits in practice.

**Usability and Developer-Friendliness:** JSON is lightweight and easy to parse. Developers can fetch a JSON representation of Omeka data and use it immediately in web applications, without needing specialized XML parsing code. As one article noted, JSON is generally *“easier to read and takes up less space than XML for the same amount of data”*, and it’s essentially the standard for REST APIs ([JSON And XML. API’s Data Interchange Formats | by Eugene Bartenev | Medium](https://medium.com/@bartenev/json-and-xml-apis-data-interchange-formats-96e58d0626a6#:~:text=JSON%20is%20one%20of%20the,the%20same%20amount%20of%20data)). This is reflected in Omeka S’s design: the default API response being JSON-LD makes Omeka’s data accessible to a wide range of developers (not just those familiar with library science or XML). JSON-LD specifically is JSON with an added semantic layer; it strikes a balance between being human-readable and machine-processable. It uses idiomatic JSON (objects, arrays) with special keys like `@context` and `@id` to convey meaning. A developer can choose to ignore the Linked Data aspect (treating it like plain JSON) or leverage it fully. This flexibility is a big win over a rigid XML schema.

**Flexibility and Extensibility:** JSON is schema-less by nature (unless you impose a schema). This means adding a new field or property in JSON is as simple as inserting a new key in an object. In JSON-LD, you additionally provide a context definition for that key (linking it to a URI). Omeka S can thus accommodate new metadata standards without code changes to the format itself – you just load a new vocabulary (e.g., BIBFRAME or Schema.org terms) into Omeka S, and you can start using those properties on items. They will automatically appear in the JSON-LD output because they’re part of the item’s data. There’s no need to update an XSD or write a new output plugin; the system is data-driven.
This is a huge improvement in maintainability and future-proofing.

**Schema.org JSON** refers to expressing data in terms of the Schema.org vocabulary, typically for SEO. This is essentially a subset of JSON-LD usage (Schema.org provides a context, and the JSON structure follows Schema.org types like `Article`, `Person`, etc.). If Omeka were to output JSON specifically for Schema.org, it could do so by choosing Schema.org terms in the context. This could yield data that search engines can directly index and understand (e.g., marking an Omeka item as a `CreativeWork` with certain properties). The advantage of JSON-LD is that it can include multiple contexts – an Omeka S item could potentially include both its original Dublin Core terms and equivalent Schema.org terms for the same data, to serve different audiences. Doing such dual encoding in XML would be far messier (it would likely require duplicating elements or separate outputs entirely).

**Interoperability (Linked Data ready):** JSON-LD by design integrates with the Linked Open Data cloud. Each entity and property can have a URI. Omeka S items have URIs and can reference other resources (e.g., items linking to authority records by URI). When serialized as JSON-LD, these become native links that external systems can crawl or resolve. This was far less straightforward with Omeka’s XML. While one could embed URLs in XML text, the XML didn’t inherently convey that “this text is a URI for a resource of type X”. JSON-LD can do that with `@id` and `@type`. This opens Omeka content to be easily ingested by knowledge graphs and Semantic Web tools, or integrated via frameworks like Apache Jena or rdflib. It also means Omeka can better play in the world of **microservices** and **web APIs**, where JSON is the lowest common denominator.

**Better Tooling and Ecosystem:** There is a rich ecosystem of tools for JSON and JSON-LD. For example, Google has provided structured data testing tools for JSON-LD (largely for Schema.org use). There are JSON-LD libraries in Python, Java, JavaScript, and other languages that can flatten or frame JSON-LD, convert it to RDF triples, and so on. Developers can use these to transform Omeka S JSON-LD into other formats (like converting to XML-based RDF if needed, or directly querying it). We see this leveraged in Omeka S modules as well – the fact that Omeka S stores data as triples (internally or conceptually) allows modules like Value Suggest (for linking to external authorities) or Resource Sharing to function more generically. In contrast, the Omeka Classic approach often required one-off scripts or XSLTs tailored to Omeka’s XML. Those do exist (for instance, someone might have written an XSLT to convert Omeka XML to MODS XML), but each was a bespoke solution. JSON and JSON-LD benefit from modern developer attention – for instance, one could build a quick Node.js script to fetch Omeka S items in JSON and index them into Elasticsearch or another system with minimal fuss. Doing the same from Classic’s XML would require either an XML parsing step or using the API (which, as noted, was originally built on the XML pipeline anyway).
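
To illustrate that tooling, here is a small sketch that runs a generic JSON-LD processor (the third-party PyLD library) over an item fetched from a placeholder Omeka S URL. Expansion resolves the `@context`, turning compact keys like `dcterms:title` into full property URIs, and `to_rdf` yields N-Quads that a triple store or rdflib can ingest:

```python
import json

import requests
from pyld import jsonld  # pip install PyLD

# Placeholder item URL; the response is a JSON-LD object with an @context.
item = requests.get("https://example.org/omeka-s/api/items/123", timeout=30).json()

# Expand: compact keys such as "dcterms:title" become full URIs
# like "http://purl.org/dc/terms/title".
expanded = jsonld.expand(item)
print(json.dumps(expanded, indent=2)[:400])

# Serialize to RDF (N-Quads) for consumption by triple stores or rdflib.
nquads = jsonld.to_rdf(item, options={"format": "application/n-quads"})
print(nquads)
```

Nothing here is Omeka-specific: the same two calls work on any conformant JSON-LD, which is exactly the reuse value the custom omeka-xml format never had.
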
**Performance:** In general, JSON has performance advantages in web contexts – smaller payload size and faster parsing in browsers. A case study (Nurseitov et al. 2009) found JSON significantly faster to parse and less memory-intensive than XML in various scenarios ([FINALCAINE2009](https://www.cs.montana.edu/izurieta/pubs/IzurietaCAINE2009.pdf#:~:text=resource%20utilization%20and%20the%20relative,describes%20the%20structural%20attributes%20of)) ([FINALCAINE2009](https://www.cs.montana.edu/izurieta/pubs/IzurietaCAINE2009.pdf#:~:text=applications%3B%20thus%20providing%20significant%20performance,6%5D%20addresses%20such)). For Omeka, this means that an application exchanging a lot of data will benefit from JSON. For example, imagine a mobile app pulling 100 items from Omeka: getting JSON will likely use less bandwidth and parse faster on the device than XML would. On the server side, Omeka S saving data via JSON also avoids building large DOM documents; it can map JSON directly to its database models.

However, it’s important to note some **trade-offs and challenges** that come with JSON adoption:

- **Learning Curve for Linked Data:** Traditional Omeka Classic users used to simple key-value metadata might find JSON-LD a bit bewildering at first (all the `@` keys and nested structures). There is a need to educate users and developers on interpreting and using JSON-LD. Omeka S documentation and community discussions often involve clarifying how to format JSON-LD payloads or why the API returns data in a certain nested way. This is arguably easier to handle than XML schemas, but it’s a shift in thinking from the old straightforward XML trees.
- **Validation and Constraints:** XML has XSD (schema) or DTD to validate documents. JSON lacks a universal validation mechanism unless one uses JSON Schema (which wasn’t widely used until recently). Omeka’s XML schema could enforce, for example, that each `<element>` had a `<name>` and one or more `<elementText>` children. In JSON, such structure isn’t automatically enforced. Omeka S handles this at the application level (ensuring required fields, etc.), but external systems can’t simply validate a JSON-LD document against a schema as easily. This could be a concern for data exchange when guarantees of structure are needed. That said, in practice JSON Schema can be used if really necessary, or one trusts the API to deliver well-formed data.
- **Context and Verbosity in JSON-LD:** While JSON is less verbose than XML, JSON-LD adds some verbosity back via context and repeated keys. In a large Omeka S JSON-LD output, you might see `{ "@context": "...", "@id": "...", "@type": "o:Item", "o:id": 123, "o:media": [ ... ], "dcterms:title": [ { "type": "literal", "@value": "Title text" } ], ... }` repeated for each item. This is very expressive but not as slim as a custom JSON might be (e.g., each item includes both `o:id` as an internal ID and `@id` as a URI). In fact, as mentioned earlier, embedding JSON-LD in pages increased page size enough that Omeka S made it optional ([Significant slowness relative to Omeka Classic? - #4 by jflatnes - Omeka S - Omeka Forum](https://forum.omeka.org/t/significant-slowness-relative-to-omeka-classic/8089/4#:~:text=The%20page%20size%20there%20is,the%20latest%20version%20of%20S)). So one trade-off is slightly heavier responses compared to a minimal JSON, in exchange for clarity and linkability. If needed, developers can reduce overhead by requesting only certain fields via the API (where supported) or by referencing the context by URL rather than embedding it in full each time (a usage JSON-LD explicitly allows).
- **Ongoing XML Needs:** Even with JSON as the primary format, Omeka cannot fully escape XML, because of protocols like OAI-PMH (which has no widely adopted JSON equivalent) and other niche requirements. (IIIF is not such a case: its Presentation API manifests are JSON-LD in both v2 and v3.) So Omeka S still has to maintain some XML code. For example, the OAI-PMH module in S can output Dublin Core and Dublin Core Terms in XML, and it even includes an option for METS (Metadata Encoding and Transmission Standard), which is an XML standard ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=Metadata%20for%20mets%20,media)). Thus, Omeka’s adoption of JSON-LD doesn’t entirely remove XML from the picture; rather, it scopes XML to where it’s absolutely needed. The trade-off is that developers might have to work with both JSON and XML depending on the integration (e.g., using JSON for direct API calls, but still using OAI-PMH XML for harvesting into an older system, as sketched below). That said, if both ends support JSON, one can avoid XML altogether – a freedom that didn’t exist before.
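
As a concrete instance of that last bullet, harvesting an Omeka site over OAI-PMH still means consuming XML, although client libraries hide most of the parsing. A minimal sketch using the third-party Sickle library (the endpoint URL is a placeholder; the actual path depends on how the OAI-PMH module is configured):

```python
from sickle import Sickle  # pip install Sickle, a generic OAI-PMH client

# Placeholder endpoint exposed by the OAI-PMH Repository module.
oai = Sickle("https://example.org/oai")

# oai_dc (simple Dublin Core) is the one prefix every OAI-PMH repository
# must support; Sickle follows resumption tokens behind the scenes.
for record in oai.ListRecords(metadataPrefix="oai_dc"):
    md = record.metadata  # the Dublin Core XML parsed into a dict of lists
    title = md.get("title", ["(untitled)"])[0]
    date = md.get("date", [""])[0]
    print(title, date)
```
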
**Comparison Table – XML vs JSON in Omeka:** To synthesize the differences, below is a comparison of Omeka’s XML-based approach (as in Classic) and the JSON-based approach (as in Omeka S and modern formats) across key criteria:

| Aspect | **Omeka’s XML Approach** (Classic & legacy) | **JSON-Based Approach** (JSON-LD in Omeka S, Schema.org JSON) |
|--------|---------------------------------------------|---------------------------------------------------------------|
| **Standardization** | Proprietary *omeka-xml* schema (Omeka-specific tags). Partially standardized outputs (Dublin Core XML via OAI-PMH) for interoperability ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=Omeka%20XML%20%28prefix%20%60omeka)). Generally not a format other systems natively consume without customization. | Leverages web standards. JSON-LD uses standard vocabularies (Dublin Core, FOAF, Schema.org, etc.) with a context, making the data self-descriptive ([REST API - Omeka S Developer Documentation](https://omeka.org/s/docs/developer/api/rest_api/#:~:text=HTTP%20responses%20will%20be%20formatted,transporting%20Linked%20Data%20using%20JSON)). Schema.org JSON is understood by search engines. Data aligns with Linked Data conventions. |
| **Data Model** | Hierarchical XML with separate sections for each element set (e.g., DC, Item Type). Closely tied to Omeka’s internal DB schema (e.g., an `<elementSet>` grouping for each metadata set). Rigid structure defined by XSD – adding new fields outside the defined sets is not straightforward. | Graph-oriented JSON-LD structure. Items are represented as objects with properties that can come from any loaded vocabulary. New properties can be added without structural changes (just include the term and context). The data model is flexible – essentially flat key-value pairs with possible references to other entities. |
| **Completeness** | Omeka-XML was designed to include all metadata (except empty fields) ([Incomplete XML outputs · Legacy Forums · Omeka](https://omeka.org/forums-legacy/topic/incomplete-xml-outputs/#:~:text=Jim%20Safley%20March%2010%2C%202012)), but required maintenance (bugs could omit parts). OAI-PMH DC output covers only the 15 Dublin Core elements required by the spec (the extended `oai_dcterms` output covers the 55 DCMI terms) ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=Dublin%20Core%20%28prefix%20)). Some information (like item files or relations) might not be represented in standard outputs. | Omeka S JSON-LD includes all resource data by default (metadata, links to media, item set memberships, etc.). It is the same data the application uses, so it tends to be complete. If something is missing, it is usually an oversight that can be fixed at the API level. JSON-LD can express relationships (e.g., an item belonging to a collection) in-line with identifiers, which in XML might have needed separate handling. |
| **Ease of Parsing & Use** | Requires XML parsing. Developers must handle XML namespaces and tree traversal. XSLT can transform data, but XSLT 1.0 (the commonly deployed version) is poorly suited to producing JSON, often requiring workarounds. Many modern web environments (browser JavaScript, etc.) find XML cumbersome compared to JSON. | Directly consumable via ubiquitous JSON parsers. In web browsers, `fetch()` and `response.json()` provide native JSON handling. No special libraries are needed in most environments. Errors are easier to pinpoint (malformed JSON is usually a simple syntax problem; malformed XML can be more complex). The structure is more intuitive to developers familiar with JSON from web APIs. |
| **Extensibility** | Adding a new metadata element in Classic might require a plugin to define a new element set, or repurposing an existing field ([DC/XML How create an instance of an element - Development - Omeka Forum](https://forum.omeka.org/t/dc-xml-how-create-an-instance-of-an-element/18493#:~:text=The%20Item%20Type%20Metadata%20system,is%20done%20by%20a%20plugin)) ([DC/XML How create an instance of an element - Development - Omeka Forum](https://forum.omeka.org/t/dc-xml-how-create-an-instance-of-an-element/18493#:~:text=jflatnes%20%20July%207%2C%202023%2C,6%3A03pm%20%204)). Reflecting it in XML outputs may require schema or output-code updates. New output formats (like a new XML schema) require custom coding (as seen with the KML and LIDO attempts) ([Export Data and LIDO - Development - Omeka Forum](https://forum.omeka.org/t/export-data-and-lido/18271#:~:text=Yes%2C%20there%20is%20the%20module,the%20mapping%2C%20not%20the%20code)). High friction to extend. | Very extensible. Omeka S can incorporate new vocabularies on the fly, and they immediately appear in JSON-LD outputs. The JSON format itself needs no structural change to accommodate new data. Adding a new export format is often just a matter of mapping JSON to that format (in code or via generic JSON-LD-to-RDF tools). Lower effort to extend to new use cases (e.g., output to a new standard, or integrating a new data field). |
| **Tool Support** | XML is well-supported in traditional data-management tools (ample libraries in Java, .NET, etc., plus XSLT/XQuery for transformation). However, Omeka’s specific format has no off-the-shelf support – one must custom-code transformations or use Omeka’s own tools. Many newer tools (data visualization, indexing pipelines) do not ingest arbitrary XML easily; they prefer JSON or CSV. | JSON/JSON-LD is supported across virtually all programming languages and many web services. JSON-LD processors can convert the data to RDF and other forms. Web APIs, JavaScript frameworks, and data-analysis tools (like Python’s pandas) all speak JSON natively. Developers can integrate Omeka data with fewer intermediate steps. For SEO, Google and others directly consume JSON-LD embedded in pages. Overall, JSON tooling is more aligned with current development practices. |
| **Interoperability** | XML outputs like Dublin Core XML via OAI-PMH are interoperable in the library/archive domain (harvestable by aggregators). Custom Omeka XML is *not* interoperable without a custom adapter (other systems do not understand it out of the box) ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=Omeka%20XML%20%28prefix%20%60omeka)). Some export to standard XML (METS, MODS, LIDO) can be done via plugins or external transforms, but is not built in. | JSON-LD improves interoperability in the web/linked-data domain. Omeka S can directly interface with Linked Data platforms, and its data can easily be transformed into other linked-data formats (RDF/XML, Turtle) when needed ([REST API - Omeka S Developer Documentation](https://omeka.org/s/docs/developer/api/rest_api/#:~:text=%2A%20JSON,Triples%20%28%60ntriples)). For general web integration, JSON is a de facto standard – Omeka S can communicate with other web services (e.g., a visualization service, or a CMS like WordPress via its REST API) using JSON as common ground. JSON-LD also means Omeka’s data can be part of the Semantic Web, enabling new discovery and reuse scenarios (such as being indexed by knowledge graphs or connected via federated queries). |
| **Performance** | XML is verbose (tags around every value). Transferring and parsing XML typically takes more time and memory than JSON for the same content ([FINALCAINE2009](https://www.cs.montana.edu/izurieta/pubs/IzurietaCAINE2009.pdf#:~:text=resource%20utilization%20and%20the%20relative,describes%20the%20structural%20attributes%20of)) ([JSON And XML. API’s Data Interchange Formats | by Eugene Bartenev | Medium](https://medium.com/@bartenev/json-and-xml-apis-data-interchange-formats-96e58d0626a6#:~:text=JSON%20is%20one%20of%20the,the%20same%20amount%20of%20data)). Classic’s reliance on XSLT for JSON conversion adds overhead. On the client side, large XML (like a long list of items) can be slow to handle in browsers. | JSON is compact and fast. The same dataset serialized as JSON can be significantly smaller than as XML (no closing tags, simpler syntax). Parsing JSON is very fast in modern engines – reportedly up to 100x faster than XML in browsers in some benchmarks ([FINALCAINE2009](https://www.cs.montana.edu/izurieta/pubs/IzurietaCAINE2009.pdf#:~:text=applications%3B%20thus%20providing%20significant%20performance,6%5D%20addresses%20such)). This means snappier API responses and the ability to handle more data in one request. A caveat: the JSON-LD context adds some size, but contexts can be cached or referenced by URL to mitigate repetition. Overall, JSON-based APIs like Omeka S’s perform well and scale with common web optimizations (caching, GZIP compression, etc.). |
| **Human Readability** | XML is arguably human-readable, but large XML records are hard to read and edit manually. The structure is explicit but lengthy. For example, an item’s XML with dozens of `<element>` blocks can be overwhelming to read. | JSON is also human-readable, and many find its concise structure easier on the eyes. Key names and values are clearly delineated with punctuation. JSON-LD’s `@context` may be non-intuitive at first, but once understood, the JSON reads logically (especially when pretty-printed). Many developers prefer scanning JSON to scanning XML when debugging. Both formats can be indented for readability, but JSON’s lack of closing tags reduces clutter. |
| **Longevity** | XML is text-based and self-descriptive (through tags and optional schema references), which has made it a preferred format for long-term data storage in archives. There is confidence that XML files will remain interpretable with standard tools. The Omeka-specific schema, however, might not remain easy to find or supported (indeed, the removal of the schema file from Omeka.org is a concern for long-term validation ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=Omeka%20XML%20%28prefix%20%60omeka))). | JSON is also text-based and is now considered stable for data archiving in many domains (e.g., JSON exports of databases). JSON-LD in particular has the advantage of linking to definitions (contexts), which provide a way to understand the data decades later, as long as the context URI or documentation survives. One trade-off is the lack of a fixed schema; however, because Omeka S data is fundamentally RDF, the “schema” is the set of vocabularies used (Dublin Core, etc.), which are themselves standardized and versioned by the community. In terms of future accessibility, JSON and JSON-LD are likely to be supported for as long as the web exists, given their adoption. |

This comparison highlights that moving to JSON-based formats tends to improve **usability, flexibility, and maintainability**, aligning Omeka with contemporary data practices. But it also shows that XML is not made entirely obsolete – certain strengths of XML (deep tooling in specific domains, perceptions of stability) mean it will not disappear overnight.

## Considerations for Transitioning to JSON

If Omeka (especially Omeka Classic or parts of Omeka S) were to transition further toward JSON-based formats, there are several factors to consider:

- **Backward Compatibility:** An abrupt removal of XML outputs in favor of JSON could disrupt users who depend on them. For example, some institutions have automated scripts that periodically harvest Omeka Classic’s Dublin Core XML output or OAI-PMH feed to update a catalog. Any transition plan must keep those use cases working, perhaps by providing equivalent JSON outputs in parallel and giving ample time (and tools) to switch. In Omeka S, this is partially addressed by offering both JSON-LD and RDF/XML, etc., via the API ([REST API - Omeka S Developer Documentation](https://omeka.org/s/docs/developer/api/rest_api/#:~:text=%2A%20JSON,Triples%20%28%60ntriples)). A transitional approach could be to **encourage use of JSON** by making it the default and most feature-complete output (already the case in S), while **maintaining XML for legacy integration** until it naturally falls out of use. Over time, documentation and community examples should steer users toward JSON and API usage rather than the older XML-based methods.

- **Support Bridges:** To ease the move away from XML, Omeka could invest in or endorse tools that convert between formats. For instance, a command-line utility to export an Omeka Classic site’s content as JSON (using the API under the hood) would help those who want to archive or migrate data in a modern format; a minimal sketch of such a bridge appears after this list. Similarly, providing XSLTs or scripts to turn Omeka XML exports into JSON could assist those mid-transition. In the context of Omeka S, modules that output common XML standards (for those who need them) could be offered but implemented by transforming the JSON internally. That way, the core stays JSON-centric, and XML becomes just a view layer for specific needs.
- **Documentation and Community Training:** The success of a transition depends on user acceptance. Clear documentation comparing the old XML methods and the new JSON methods is vital. As noted at the outset, some in the community already perceive the XML as “brittle” and favor moving on. Providing guides (for example, “How to use the Omeka S API instead of OAI-PMH for harvesting data”) can gradually change habits. The Omeka team might consider formally deprecating the omeka-xml schema – i.e., announcing that it will not be enhanced and that JSON (or JSON-LD) is the recommended interchange format going forward. In fact, the absence of omeka-xml in Omeka S is a de facto deprecation. Users should be pointed to alternatives like the **Bulk Export** module or direct API calls for getting all data out.

- **Potential Gains:** Moving fully to JSON-based workflows brings simpler integration with web services, the ability to use Omeka within JAMstack or single-page applications, and easier maintenance. For instance, an Omeka S site can serve purely as a back end, with a static front end that fetches JSON data and renders the site. This was harder with Classic’s XML (developers sometimes resorted to hacks like generating JSON from XML or using PHP to output JSON). Another gain is efficiency under heavy use: JSON’s lower verbosity means smaller stored payloads for systems that retain exported data (some aggregators store harvested records; JSON saves space over XML).

- **Trade-offs and Cautions:** One trade-off is that JSON does not inherently carry comments or structural metadata the way an XML schema does. If Omeka were used as an archival repository (for the data itself, not just the objects), some archivists might trust XML with an XSD more for long-term preservation. A balanced approach might be to provide both: export JSON-LD and include the JSON-LD context definitions or a small JSON Schema describing the resource template, which could serve as documentation for future users. Additionally, while JSON-LD is powerful, there are edge cases it does not cover by default – for example, the ordering of values is not guaranteed unless specified, whereas XML elements are inherently ordered. Metadata requiring an ordered list of values would need JSON-LD’s `@list` keyword (e.g., `"dcterms:creator": { "@list": [ ... ] }`). These details need attention to ensure nothing is lost in translation.
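To make the bridge idea concrete, here is a minimal sketch. It assumes the target site exposes the standard Omeka Classic REST API at `/api/items` with a `page` parameter and that an out-of-range page returns an empty list; the base URL and output filename are placeholders.

```python
# A minimal sketch of a Classic-to-JSON export bridge: walk the Omeka
# Classic REST API page by page and save everything to one JSON file.
# The base URL and output filename are placeholders.
import json
import requests

BASE_URL = "https://example.org"  # hypothetical Omeka Classic site

def fetch_all_items(base_url: str) -> list:
    """Collect every item, assuming an empty page marks the end."""
    items, page = [], 1
    while True:
        resp = requests.get(f"{base_url}/api/items",
                            params={"page": page}, timeout=30)
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            break
        items.extend(batch)
        page += 1
    return items

if __name__ == "__main__":
    all_items = fetch_all_items(BASE_URL)
    with open("omeka_classic_items.json", "w", encoding="utf-8") as fh:
        json.dump(all_items, fh, ensure_ascii=False, indent=2)
    print(f"Exported {len(all_items)} items")
```

Because the Classic API already returns JSON, a bridge like this sidesteps the XML layer entirely; sites requiring authentication would also pass the API key as a query parameter.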
In practice, **Omeka S has already done 90% of the transition** for new users and projects. The usual recommendation for Omeka Classic installations that feel hampered by XML is to migrate to Omeka S rather than retrofit Classic. Classic will likely continue to receive minimal maintenance (security fixes, minor tweaks) but will not be fundamentally reworked to favor JSON over XML. The “transition to JSON” is therefore mainly relevant for Omeka S and the ecosystem around it.

## Recommendations and Conclusion

**Should Omeka move away from XML?** From a technical standpoint, yes – the benefits in usability, flexibility, and maintainability strongly favor JSON-based formats for almost all use cases examined. Omeka’s own evolution reflects this: Omeka S embraces JSON-LD for data interchange, which has dramatically improved integration capabilities (native REST API, multi-vocabulary support, easier theming for linked data, etc.). For new development, focusing on JSON and de-emphasizing XML is advisable. The areas where XML should remain are those dictated by external requirements (protocols like OAI-PMH, or domain standards that are XML-only). Even there, Omeka should act as a translator at the edges, converting its internal JSON/RDF data to XML only at the point of output. This isolates the complexity and keeps the core clean.

**Recommended Adoption Path:**

1. **Maintain XML for Legacy Interop but Phase Out Omeka-Specific XML:** Omeka should continue supporting XML outputs required by standards (DC XML for OAI-PMH, perhaps METS in the future if needed), but it does not need to maintain its own Omeka-XML format. As the Omeka S OAI-PMH module has done, rely on “simple XML” or standard schemas instead ([ Omeka S - OAI-PMH Repository ](https://omeka.org/s/modules/OaiPmhRepository/#:~:text=NOTE%3A%20Because%20of%20its%20limited,that%20provides%20all%20metadata%20too)). For Omeka Classic, it is reasonable to freeze the omeka-xml format at its current state (no new features) and encourage users to use the API or migrate. Official documentation can gently flag omeka-xml as legacy – for instance, by noting that the Classic API (JSON) or Omeka S should be used for new projects requiring data export.

2. **Invest in JSON-LD and Schema.org Support:** Omeka S should continue improving its JSON-LD output and offer easier ways to emit Schema.org-flavored JSON-LD for those primarily concerned with SEO and discovery. This could be a configuration option that selects a context focused on Schema.org types and properties, so that items are indexed by Google as, say, “Dataset” or “ImageObject” (a sketch of this idea follows this list). The underlying data is the same; it is just a matter of context and some property aliasing. Catering to that use case would further reduce the need for any XML-based microformats.

3. **Enhance API and Tools:** Keep the JSON API comprehensive and well documented. Any gaps (data accessible only via XML and not via the API) should be closed. For example, if some Classic users relied on the XML output for information the API lacked, ensure the Omeka S API covers it. Build or endorse API clients in various languages (a Python library, etc.) to encourage use. The easier it is to get data as JSON, the less anyone will think about XML.

4. **Migration Utilities:** Provide utilities to help users extract their data as JSON/CSV from Classic if they plan to move or archive. Even a simple script that iterates over all items via the Classic API and produces a JSON dump could be invaluable (the export sketch in the previous section is one starting point). This mitigates the concern of being “stuck” with XML exports. Since the Classic API returns JSON (transformed from XML, but still JSON), one could in principle retrieve everything that way and avoid the XML layer altogether.

5. **Community Plugins/Modules:** Encourage the community to develop modules that output specific needed XML standards *from* the JSON data – for example, a module that takes an Omeka S item and outputs LIDO XML by mapping the relevant properties. Institutions that must submit data in those formats can then do so, with the logic living in a module rather than the core (keeping the core lean). The forum discussion shows community interest in LIDO ([Export Data and LIDO - Development - Omeka Forum](https://forum.omeka.org/t/export-data-and-lido/18271#:~:text=Hello%20again%2C%20i%20am%20still,is%20appreciated%20Image%3A%20%3Aslight_smile%3A%20Best)) – an opportunity for collaboration. Over time, as more standards themselves pivot to JSON or Linked Data (some are starting to – e.g., there is talk of JSON versions of some library standards), Omeka will be ready.

6. **Performance Testing:** As more data flows through JSON, Omeka should monitor performance. The earlier note about JSON-LD increasing page size ([Significant slowness relative to Omeka Classic? - #4 by jflatnes - Omeka S - Omeka Forum](https://forum.omeka.org/t/significant-slowness-relative-to-omeka-classic/8089/4#:~:text=The%20page%20size%20there%20is,the%20latest%20version%20of%20S)) suggests optimizing the embedding – perhaps using compact context references, or embedding only essential metadata in the HTML (with the rest retrievable via the API). The goal is not to reintroduce a new kind of “brittleness”: if the embedded JSON-LD is too slow, people will disable it and lose the benefit. Performance tuning and configuration options (like the toggle already added) are therefore important.
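To illustrate recommendation 2, the sketch below (in Python, for consistency with the earlier sketches) maps a trimmed-down Omeka S item to Schema.org JSON-LD and renders the `<script type="application/ld+json">` tag a theme would embed. The mapping is hypothetical – Omeka S does not ship this exact feature – but `ImageObject`, `name`, and `description` are real Schema.org terms.

```python
# A minimal sketch: mapping an Omeka S item to Schema.org JSON-LD for
# SEO. The mapping below is illustrative, not a shipped Omeka feature.
import json

def first_value(item: dict, term: str):
    """Return the first @value for a property, if present."""
    values = item.get(term, [])
    return values[0].get("@value") if values else None

def to_schema_org(item: dict) -> dict:
    return {
        "@context": "https://schema.org",
        "@type": "ImageObject",  # would be chosen per resource class
        "@id": item.get("@id"),
        "name": first_value(item, "dcterms:title"),
        "description": first_value(item, "dcterms:description"),
    }

def json_ld_script_tag(item: dict) -> str:
    """Render the tag a theme would embed in an item page's <head>."""
    payload = json.dumps(to_schema_org(item), indent=2)
    return f'<script type="application/ld+json">\n{payload}\n</script>'

# Example with a trimmed-down item payload:
sample = {
    "@id": "https://example.org/api/items/123",
    "dcterms:title": [{"@value": "A photograph"}],
    "dcterms:description": [{"@value": "Silver gelatin print, 1923."}],
}
print(json_ld_script_tag(sample))
```

In a real module, the `@type` would be derived from the item’s resource class or template, and the property mapping would be configurable rather than hard-coded.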
In conclusion, **a move away from XML for Omeka is technically advisable for most interactions**. It aligns the platform with modern web standards and developer expectations, reduces duplication of effort (maintaining two parallel formats), and opens up richer integration possibilities. Omeka S effectively demonstrates the success of this approach; its use of JSON-LD and Linked Data principles makes it a more powerful and flexible platform than its XML-bound predecessor. The remaining uses of XML are either for backward compatibility or for niche needs, and those should gradually become modular or external. Over the coming years, JSON-LD (or successor technologies such as RDF-star serializations) can be expected to dominate further in Omeka’s context, while XML support becomes purely an interoperability layer. Ultimately, this transition will make Omeka easier to use, extend, and maintain. As one Medium article succinctly put it, *“APIs can use JSON, XML, YAML, or any other… format… Developers often include support for several… JSON is one of the most popular… easier to read and takes up less space… supported by all popular programming languages”* ([JSON And XML. API’s Data Interchange Formats | by Eugene Bartenev | Medium](https://medium.com/@bartenev/json-and-xml-apis-data-interchange-formats-96e58d0626a6#:~:text=structuring,exchange%20formats%20in%20their%20APIs)). Omeka’s architecture should reflect this reality by centering on JSON. The recommendation is to continue on the path Omeka S has set: **prioritize JSON-based interchange, provide XML only when required, and help users migrate their workflows accordingly.** This will keep Omeka relevant and robust alongside contemporary platforms that have long embraced JSON and Schema.org for data sharing.