---
title: EZID Resolver Design
subtitle: ARK and DOI identifier resolution for the EZID service
---
# EZID Resolver Design
Identifier resolution is the process of determining the location of an identified resource. EZID provides its resolution service for ARK identifiers over HTTP, so both the ARK and HTTP specifications apply.
```
PREFIX
SCHEME | SUFFIX
/--\ /---\ /--------------------\
http://example.org/ark:/12025/654xz321/s3/f8.05v.tiff
\________________/ \__/ \___/ \______/ \____________/
(replaceable) | | | Qualifier
| ARK Label | | (NMA-supported)
| | |
Name Mapping Authority | Name (NAA-assigned)
Hostport (NMAH) |
Name Assigning Authority Number (NAAN)
```
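The anatomy above can be illustrated with a small parser. This is a minimal sketch, not EZID code: `ArkParts` and `split_ark` are hypothetical names, and the rule that the Name ends at the first `/` or `.` is a simplifying assumption, since in practice the boundary between Name and Qualifier is decided by the assigning authority.

```python
import re
from typing import NamedTuple, Optional


class ArkParts(NamedTuple):
    nmah: Optional[str]  # Name Mapping Authority Hostport, e.g. "example.org"
    naan: str            # Name Assigning Authority Number
    name: str            # NAA-assigned Name
    qualifier: str       # NMA-supported Qualifier, may be empty


# Optional "http(s)://host/", the "ark:" label with an optional slash,
# then NAAN, Name, and whatever remains as the Qualifier.
_ARK_RE = re.compile(
    r"^(?:https?://(?P<nmah>[^/]+)/)?"
    r"ark:/?(?P<naan>[^/]+)/"
    r"(?P<name>[^/.]+)"
    r"(?P<qualifier>.*)$",
    re.IGNORECASE,
)


def split_ark(value: str) -> ArkParts:
    """Split an ARK, with or without a leading resolver URL, into its parts."""
    match = _ARK_RE.match(value)
    if match is None:
        raise ValueError(f"not an ARK: {value!r}")
    return ArkParts(match["nmah"], match["naan"], match["name"], match["qualifier"])


parts = split_ark("http://example.org/ark:/12025/654xz321/s3/f8.05v.tiff")
```

For the URL in the diagram, `parts.naan` is `12025`, `parts.name` is `654xz321`, and `parts.qualifier` is `/s3/f8.05v.tiff`.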
## Operations
There are two basic operations to be supported:
1. Resolution, redirection to the location registered with the identifier.
2. Introspection, providing information about the identifier. Called "inflection" in the ARK spec.
Operation requirements include:
* Respond to HTTP GET and HEAD requests.
* Respect HTTP protocol semantics.
### Resolution
The basic process of resolution is straightforward.
Current situation:
```plantuml
actor User as U
actor Owner as C
participant EZID as E
participant N2T as N
participant Target as T
== minting ==
C -> E: mint ark:123/a to target/foo
activate E
note right of E
pid = ark:123/a
url = https://n2t.net/ark:123/a
loc = https://target/foo
end note
E --> C: ok
E -> N: create ark:123/a to target/foo
N --> E: ok
deactivate E
== resolution ==
U -> N: ark:123/a
activate N
N --> U: 302, target/foo
deactivate N
U -> T: /foo
```
After transition of resolution to EZID:
```plantuml
actor User as U
actor Owner as C
participant EZID as E
participant Target as T
== minting ==
C -> E: mint ark:123/a to target/foo
activate E
note right of E
pid = ark:123/a
url = https://ezid/ark:123/a
loc = https://target/foo
end note
E --> C: ok
deactivate E
== resolution ==
U -> E: ark:123/a
activate E
E --> U: 302, target/foo
deactivate E
U -> T: /foo
```
A possible workflow for identifier resolution in EZID:
```plantuml
start
->IDENTIFIER;
:split identifier;
->t=SCHEME\np=PREFIX\ns=SUFFIX;
if (scheme is ARK?) then (no)
:response = {
status: 302,
location: doi.org/10.PREFIX/SUFFIX
message: "doi.org"
};
else (yes)
if (EZID prefix?) then (yes)
:find longest suffix match;
if (match?) then (yes)
:response = {
status: 302,
location: URL
message: found
};
else (no)
:response = {
status: 404,
message: "Not Found"
};
endif
else (no)
:response = {
status: 404,
message: "Not Found"
};
endif
endif
:return response;
stop
```
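The workflow above can be sketched in Python. This is an illustration under stated assumptions rather than the EZID implementation: `EZID_PREFIXES`, `TARGETS`, `longest_match`, and `resolve` are hypothetical names, the registry is an in-memory dict standing in for the database, and appending the unmatched remainder to the target (suffix passthrough) is an assumption that the diagram does not spell out.

```python
from typing import Dict, Optional

# Hypothetical registry; a real service would query the EZID database.
EZID_PREFIXES = {"ark:12345"}
TARGETS: Dict[str, str] = {
    "ark:12345/x5": "https://target.example/foo",
}


def longest_match(pid: str) -> Optional[str]:
    """Return the longest registered identifier that is a prefix of pid."""
    candidate = pid
    while candidate:
        if candidate in TARGETS:
            return candidate
        candidate = candidate[:-1]  # back off one character at a time
    return None


def resolve(identifier: str) -> dict:
    """Follow the resolution workflow and return a response description."""
    scheme, _, value = identifier.partition(":")
    value = value.lstrip("/")
    if scheme.lower() != "ark":
        # Non-ARK identifiers (DOIs) are handed off to doi.org.
        return {"status": 302, "location": f"https://doi.org/{value}", "message": "doi.org"}
    prefix, _, suffix = value.partition("/")
    if f"ark:{prefix}" not in EZID_PREFIXES:
        return {"status": 404, "message": "Not Found"}
    pid = f"ark:{prefix}/{suffix}"
    match = longest_match(pid)
    if match is None:
        return {"status": 404, "message": "Not Found"}
    # Assumed suffix passthrough: keep whatever followed the matched identifier.
    location = TARGETS[match] + pid[len(match):]
    return {"status": 302, "location": location, "message": "found"}
```

With this registry, `resolve("ark:/12345/x5/page2")` redirects to `https://target.example/foo/page2`, while an ARK under an unregistered prefix yields a 404.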
Questions:
1. Under what conditions should an existing identifier not be resolvable?
* Withdrawn?
* Deleted?
* Target known to be unavailable?
* Privacy?
* Reserved?
Generally no change in policy, but edge cases need to be handled correctly.
2. If the prefix is not registered with EZID, should the response be 404 or redirect to another service (N2T)?
* Perhaps redirect to N2T, need to ensure redirect loops don't happen
* Better - provide information in the 404 response pointing to N2T for the place to go
4. What endpoint should be used for resolution?
* **Suggest `ezid.cdlib.org/{PID}`**
* ~~ark.ezid.cdlib.org/ark:/123/lkdfjg~~
6. OK to present the alternate link to metadata for the identifier (i.e. inflection URL) in the response?
This would provide the client with a hint that details about the identifier can be obtained through the inflection URL. It would be in a response header like:
```
Link: <https://ezid.net/info/ark:/12345/xyz>; rel="alternate";
type="application/ld+json"; profile="https://w3id.org/ark/metadata"
```
7. Consider supporting shoulder listing, e.g. https://n2t-stg.n2t.net/ark:/99999
8. Verify suffix passthrough is working as expected (e.g. Smithsonian)
9. Existing ARKs will be updated to resolve to the EZID resolver location instead of the target.
### Inflection
The ARK spec indicates that an inflection request can be made by including one or more question ("?") characters at the end of a request URL.
The basic workflow for inflection is much the same as for resolution. The main difference is the final action taken once the corresponding identifier record has been located.
```plantuml
start
->IDENTIFIER;
:split identifier;
->t=SCHEME\np=PREFIX\ns=SUFFIX;
if (EZID prefix?) then (yes)
:find longest shoulder match;
if (match?) then (yes)
:Get record;
:response = {
status: 200,
ID META
};
else (no)
-> no;
:Get prefix;
:response = {
status: 200,
PREFIX META
};
endif
else (no)
-> no;
:response = {
status: 404,
message: "Not Found"
};
endif
:return response;
stop
```
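A minimal sketch of the inflection workflow follows. The names `PREFIX_META`, `RECORDS`, and `inflect` are hypothetical, the metadata values are placeholders, and an exact dictionary lookup stands in for the longest shoulder match shown in the diagram.

```python
from typing import Dict

# Hypothetical stores; a real service would read these from the EZID database.
PREFIX_META: Dict[str, dict] = {"ark:12345": {"type": "naan", "manager": "ezid"}}
RECORDS: Dict[str, dict] = {"ark:12345/x5": {"what": "example record"}}


def inflect(identifier: str) -> dict:
    """Return identifier metadata, prefix metadata, or a 404 description."""
    pid = identifier.rstrip("?")  # drop any trailing inflection marks
    if pid.startswith("ark:/"):
        pid = "ark:" + pid[len("ark:/"):]
    prefix = pid.split("/", 1)[0]
    if prefix not in PREFIX_META:
        return {"status": 404, "message": "Not Found"}
    record = RECORDS.get(pid)
    if record is not None:
        return {"status": 200, "metadata": record}
    # Known prefix but no matching record: describe the prefix instead.
    return {"status": 200, "metadata": PREFIX_META[prefix]}
```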
Questions:
1. Can the "?" char be reliably passed through the load balancer, Apache, Django stack?
* No, only a double "??" can pass through, appearing as a single "?" in the query parameters.
* May be possible to intercept at the load balancer and set a custom header forwarded to Apache / Django
3. What metadata should be presented in the response?
* Privacy? Only reserved identifiers have no response
* Different metadata for authenticated user? Owner? Nope.
5. Can we use "profile" recognition to implement inflection at the resolve end point?
* Seems sensible to support, especially since DataCite (and DOIs in general) fall short in their support for content negotiation.
8. Content negotiation should support ANVL and probably JSON.
9. What is the media type of ANVL? (check with Kunze)
## Deployment
1. Implement and deploy resolution and inflection functionality for EZID. New PIDs still have `asURL` set to N2T
2. Notify users of impending change
3. New PIDs start using EZID for `asURL`
4. Update the target URLs for EZID ARKs on N2T to point to the EZID `asURL`
5. No further updates from EZID to N2T.
## Notes on conflicts with HTTP and URI specification
These notes are general to ARKs, not specific to EZID.
> The Name and Qualifier parts are strings of visible ASCII characters and should be less than 128 bytes in length. The length restriction keeps the ARK short enough to append ordinary ARK request strings without running into transport restrictions (e.g., within HTTP GET requests). Characters may be letters, digits, or any of these six characters:
```
= # * + @ _ $
```
> The following characters may also be used, but their meanings are reserved:
```
% - . /
```
### Inflection
The ARK "inflection" is meant to "change the meaning" of the identifier, to reference the metadata associated with the ARK instead of the object identified by the ARK. The specification uses a "?" to do this. Also:
> "When the ARK is inflected by appending dual question marks ('??'), the returned metadata contains a commitment statement from the current provider."
The question mark is a reserved character in the URI specification[^uri], which creates a behavior conflict when requesting ARK inflection from an ARK resolver over HTTP by embedding the ARK in the URL.
Since these "?" chars are not to be interpreted as URL query delimiters according to RFC 3986, they should be escaped as `%3F`.
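The conflict can be seen with Python's standard `urllib.parse`, which follows the RFC 3986 component rules. The sketch below is only an illustration of URL parsing, not a statement about any particular proxy or server stack; `received_ark` is a hypothetical helper that rebuilds the ARK from the request path alone.

```python
from urllib.parse import unquote, urlsplit


def received_ark(url: str) -> str:
    """Return the ARK string recoverable from the URL path component only."""
    return unquote(urlsplit(url).path).lstrip("/")


bare = urlsplit("https://n2t.net/ark:/86084/b4057cw7z")
single = urlsplit("https://n2t.net/ark:/86084/b4057cw7z?")
double = urlsplit("https://n2t.net/ark:/86084/b4057cw7z??")

# A single trailing "?" parses to an empty query, the same as no "?" at all.
same_as_bare = (single.path, single.query) == (bare.path, bare.query)

# With "??", the first "?" is the delimiter and the second becomes the query.
double_query = double.query

# Percent-encoding keeps the question marks inside the path.
escaped = received_ark("https://n2t.net/ark:/86084/b4057cw7z%3F%3F")
```

Here `same_as_bare` is `True`, `double_query` is `"?"`, and `escaped` is `ark:/86084/b4057cw7z??`, so only the escaped form carries the inflection unambiguously in the path.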
Here are some examples of inflection with n2t:
`https://n2t.net/ark:/86084/b4057cw7z%3F`:
```
erc:
who: Tevel Gitlin. Award booklet, 1946
what: IS030_GITL_003
when: (:unav)
where: ark:/86084/b4057cw7z (currently
https://blavatnikarchive.org/item/2964)
how: (:unav)
# inflections under construction
# reference https://n2t.net/e/n2t_apidoc.html
```
`https://n2t.net/ark:/86084/b4057cw7z%3F%3F`:
```
erc:
who: Tevel Gitlin. Award booklet, 1946
what: IS030_GITL_003
when: (:unav)
where: ark:/86084/b4057cw7z (currently
https://blavatnikarchive.org/item/2964)
how: (:unav)
id created: 2021.08.02_09:31:42
id updated: 2021.08.02_09:31:33
persistence: (:unav)
# inflections under construction
# reference https://n2t.net/e/n2t_apidoc.html
```
`https://n2t.net/doi:10.21239/V9F61N?`:
```
erc:
who: (:unav)
what: (:unav)
when: (:unav)
where: doi:10.21239/V9F61N (currently
https://siam.invemar.org.co/download-alfresco-file/241512)
how: Text
# inflections under construction
# reference https://n2t.net/e/n2t_apidoc.html
```
`https://n2t.net/doi:10.21239/V9F61N??`:
```
erc:
who: (:unav)
what: (:unav)
when: (:unav)
where: doi:10.21239/V9F61N (currently
https://siam.invemar.org.co/download-alfresco-file/241512)
how: Text
datacite: <?xml version="1.0"?>
<resource
xmlns="http://datacite.org/schema/kernel-4"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://datacite.org/schema/kernel-4
http://schema.datacite.org/meta/kernel-4/metadata.xsd"><identifier
identifierType="DOI">10.21239/V9F61N</identifier><creators><creator><creatorName>Elías
Alberto Blanco
Mota</creatorName></creator></creators><titles><title
xml:lang="spa">Informe técnico de levantamiento batimétrico
Bahía de Buenaventura – Pacifico
Colombiano</title></titles><publisher>INVEMAR</publisher><publicationYear>2015</publicationYear><resourceType
resourceTypeGeneral="Text">Text</resourceType><subjects><subject>batimetria</subject></subjects><dates><date
dateType="Created">2015</date></dates><language>spa</language><sizes><size>3
MB</size></sizes><formats><format>PDF</format></formats><version>1</version><rightsList><rights>CC
BY 4.0</rights></rightsList><descriptions><description
descriptionType="Abstract" xml:lang="spa">Levantamiento
batimétrico de precisión (líneas cada 50m) de La Bahía de
Buenaventura sector Cascajal y esteros aledaños, en el
departamento del Valle del Cauca.
La información fue
tomada en el campo, en el periodo comprendido entre el 05 y 13
de mayo de 2015, incluyendo ademas perfiles de velocidad del
sonido y sus ubicaciones geográficas en el área de estudio,
así mismo la ubicación de ayudas a la navegación flotantes
en la zona. abarcando un total 2509 Ha aproximadamente, desde
una profundidad mínima promedio de -2.75 msnm hasta una
máxima promedio de 19
msnm.</description></descriptions><geoLocations><geoLocation><geoLocationPlace>Buenaventura,
Valle,
Colombia</geoLocationPlace></geoLocation></geoLocations></resource>
datacite.resourcetype: Text
id created: 2018.03.07_13:12:02
id updated: 2021.05.10_12:12:36
persistence: (:unav)
# inflections under construction
# reference https://n2t.net/e/n2t_apidoc.html
```
`https://n2t.net/ark:/53355/cl010066723?`
```
-> https://collections.louvre.fr/ark:/53355/cl010066723
```
`https://n2t.net/ark:/53355/cl010066723%3F%3F`
```
https://collections.louvre.fr/ark:/53355/cl010066723
```
`http://n2t.net/ark:/65665??`
```
ark:/65665:
date: 2014.08.18
manager: n2t
na_policy: NP | (:unkn) unknown | 2014 |
name: The Smithsonian Institution (=) TSI
redirect: http://collections.nmnh.si.edu/ark:$id
type: naan
ark:/65665/:
date: 2015.03.31
is_supershoulder: true
manager: ezid
minter:
name: National Museum of Natural History, Smithsonian Institution - empty shoulder
type: shoulder
ark:/65665/n6:
date: 2014.08.18
manager: ezid
minter: https://n2t.net/a/ezid/m/ark/65665/n6
name: National Museum of Natural History, Smithsonian Institution
type: shoulder
```
`http://n2t.net/ark:/65665/300008335-8d74-4c3f-873c-a9d8b4b3d6a8??`:
```
-> http://collections.nmnh.si.edu/id/ark:/65665/3000083358d744c3f873ca9d8b4b3d6a8??
```
Given the ambiguities of the ARK specification and conflict with HTTP URI structure, resolver support of inflection would be better provided through an alternate request and advertised via HTTP link headers [^link-headers].
Candidate [link header relations](https://www.iana.org/assignments/link-relations/link-relations.xhtml):
about
: Not applicable since the current URI is about the URI provided in the header. This needs to be inverted for ARK resolvers.
alternate
: Viable, with URI, type, profile, and optional title, lang. **This is the preferred option.**
describes, describedBy
: Encumbered by the POWDER spec, which appears to be essentially unused.
related
: Burdened by ATOM spec, not specific enough.
Recommendation: Use the HTTP Link Header response to advertise availability of ARK metadata.
For example:
```
GET https://ezid.net/resolve/ark:/12345/xyz
HTTP/1.1 302 Found
Location: https://example.net/data/12345/xyz
Link: <https://ezid.net/info/ark:/12345/xyz>; rel="alternate";
type="application/ld+json"; profile="https://w3id.org/ark/metadata"
```
This response advertises that an alternate representation, serialized as JSON-LD according to the profile identified by `https://w3id.org/ark/metadata`, is available from `https://ezid.net/info/ark:/12345/xyz`.
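A header of this shape could be assembled with a small helper. This is a sketch only: `link_header` is a hypothetical function, and it assumes parameter values need no escaping beyond double quotes.

```python
def link_header(target: str, rel: str = "alternate", **params: str) -> str:
    """Build a Link header value with a quoted rel and extra parameters."""
    parts = [f"<{target}>", f'rel="{rel}"']
    parts += [f'{key}="{value}"' for key, value in params.items()]
    return "; ".join(parts)


# Headers for the redirect response shown in the example above.
headers = {
    "Location": "https://example.net/data/12345/xyz",
    "Link": link_header(
        "https://ezid.net/info/ark:/12345/xyz",
        type="application/ld+json",
        profile="https://w3id.org/ark/metadata",
    ),
}
```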
[^uri]: https://datatracker.ietf.org/doc/html/rfc3986#section-2.2
### Forward Slashes and Periods
ARKs use the forward slash character "/" as a delimiter. Slashes are not as problematic as the question chars specified for inflection. Basically everything starting at the beginning of `ark:` should be handled by the resolver.
Note however:
> The characters `/' and `.' are ignored if either appears as the last character of an ARK.
> ARK Spec § 2.6
### Hash Characters
The hash character "`#`" is allowed as a character in an ARK. When appearing in a URI, the `#` denotes a URI fragment, and URI fragments are not transmitted as part of a URL request sent to the server by a client. This means that the portion of the ARK starting from the `#` char will never be received by the resolver service unless it is percent encoded, i.e. `%23`.
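The effect can be illustrated with `urllib.parse`. The ARK `ark:/12345/ab#cd` is a made-up example; the sketch shows only how a URL is split into components, which is what determines the part of the ARK that ends up in the request path.

```python
from urllib.parse import quote, unquote, urlsplit

ark = "ark:/12345/ab#cd"  # hypothetical ARK containing a hash character

# Unescaped, everything from "#" onward is a fragment, not part of the path.
raw = urlsplit("https://example.org/" + ark)

# Escaping "#" as %23 keeps it in the path; ":" and "/" are left as-is.
encoded = quote(ark, safe=":/")
safe = urlsplit("https://example.org/" + encoded)
recovered = unquote(safe.path).lstrip("/")
```

In this sketch `raw.path` is `/ark:/12345/ab` with fragment `cd`, while `recovered` equals the full original ARK.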
### Hyphens
Hyphens are completely arbitrary and meaningless in ARK identifiers:
> Hyphens are considered to be insignificant and are always ignored in ARKs. A '-' (hyphen) may appear in an ARK for readability, or it may have crept in during the formatting and wrapping of text, but it must be ignored in lexical comparisons. As in a telephone number, hyphens have no meaning in an ARK. It is always safe for an NMA that receives an ARK to remove any hyphens found in it.
> ARK Spec § 2.6
## Implementation
The EZID resolver will be implemented as an HTTP service.
### HTTP Methods
The principal HTTP method to be supported for resolution is GET. HEAD and POST should also be supported.
#### GET
The `GET` method requests transfer of a current selected representation for the target resource.
- https://datatracker.ietf.org/doc/html/rfc7231#section-4.3.1
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/GET
> "Request strings too long for GET may be sent using HTTP's POST command."
> ARK Spec § 5.2.
#### HEAD
The `HEAD` method is identical to `GET` except that the server MUST NOT send a message body in the response.
- https://datatracker.ietf.org/doc/html/rfc7231#section-4.3.2
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/HEAD
#### POST
See GET. For ARKs, the POST is considered the same as GET, just for longer request strings. It is OK for a POST response to return a 302 status and redirect information.
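The method handling described above might look like the following sketch. `respond` is a hypothetical function that returns a status, headers, and body for an already-resolved location. Answering unsupported methods with 405 and an `Allow` header is an assumption based on general HTTP practice, not something this design specifies.

```python
from http import HTTPStatus
from typing import Dict, Tuple

ALLOWED_METHODS = ("GET", "HEAD", "POST")


def respond(method: str, location: str) -> Tuple[int, Dict[str, str], bytes]:
    """Build a redirect response for GET, HEAD, or POST."""
    if method not in ALLOWED_METHODS:
        # Assumed behavior for anything outside the supported methods.
        return HTTPStatus.METHOD_NOT_ALLOWED, {"Allow": ", ".join(ALLOWED_METHODS)}, b""
    headers = {"Location": location}
    # HEAD carries the same headers as GET but never a message body.
    body = b"" if method == "HEAD" else f"Redirecting to {location}\n".encode()
    return HTTPStatus.FOUND, headers, body
```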
### Caching
Caching can significantly improve performance and reduce server load. At a minimum, the resolver should provide time stamps (e.g. `Last-Modified` response header) indicating when a resource was last modified.
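One way to emit and honor such time stamps is sketched below using only the standard library. `metadata_response` is a hypothetical function, and the sketch assumes the time stamp is attached to a 200 metadata (inflection) response, since HTTP conditional requests are defined in terms of successful responses rather than redirects.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def metadata_response(
    body: bytes, updated: datetime, if_modified_since: Optional[str] = None
) -> Tuple[int, Dict[str, str], bytes]:
    """Return 200 with Last-Modified, or 304 when the client copy is current."""
    updated = updated.replace(microsecond=0)  # HTTP dates have one-second resolution
    headers = {"Last-Modified": format_datetime(updated, usegmt=True)}
    since = None
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # an unparseable date is ignored
    if since is not None:
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        if updated <= since:
            return 304, headers, b""
    return 200, headers, body


updated = datetime(2021, 8, 2, 9, 31, 42, tzinfo=timezone.utc)
first = metadata_response(b"erc:", updated)
second = metadata_response(b"erc:", updated, first[1]["Last-Modified"])
```

The `updated` value must be a timezone-aware UTC datetime, because `format_datetime(..., usegmt=True)` rejects anything else.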
## ARK Normalization
ARK Spec 2.7:
Normalization of an ARK for the purpose of octet-by-octet equality comparison with another ARK consists of four steps. First, any upper case letters in the "ark:" label and the two characters following a '%' are converted to lower case. The case of all other letters in the ARK string must be preserved. Second, any NMAH part is removed (everything from an initial "http://" up to the next slash) and all hyphens are removed.
Third, structural characters (slash and period) are normalized. Initial and final occurrences are removed, and two structural characters in a row (e.g., // or ./) are replaced by the first character, iterating until each occurrence has at least one non-structural character on either side. Finally, if there are any components with a period on the left and a slash on the right, either the component and the preceding period must be moved to the end of the Name part or the ARK must be thrown out as malformed.
The fourth and final step is to arrange the suffixes in ASCII collating sequence (that is, to sort them) and to remove duplicate suffixes, if any. It is also permissible to throw out ARKs for which the suffixes are not sorted.
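A partial sketch of these rules in Python follows. It covers lower-casing the label and percent-escapes, removing the NMAH and hyphens, and collapsing structural characters. It deliberately omits the reordering of period-delimited components and the sorting of suffixes. `normalize_ark` is a hypothetical name, and emitting the result with an `ark:/` label is an assumption about the preferred output form.

```python
import re


def normalize_ark(value: str) -> str:
    """Partially normalize an ARK for comparison (steps 1 to 3 only)."""
    # Step 2: remove any NMAH, i.e. "http://host/" (https accepted as well).
    value = re.sub(r"^https?://[^/]*/", "", value, flags=re.IGNORECASE)
    label = re.match(r"ark:/?", value, flags=re.IGNORECASE)
    if label is None:
        raise ValueError(f"not an ARK: {value!r}")
    body = value[label.end():]
    # Step 1: lower-case the two characters after each '%'; the label is
    # re-emitted in lower case below. All other letters keep their case.
    body = re.sub(r"%[0-9A-Fa-f]{2}", lambda m: m.group(0).lower(), body)
    # Step 2: hyphens are insignificant and are removed.
    body = body.replace("-", "")
    # Step 3: a run of structural characters collapses to its first
    # character, and leading or trailing ones are dropped.
    body = re.sub(r"([/.])[/.]+", r"\1", body)
    body = body.strip("/.")
    return "ark:/" + body
```

Applied to the Smithsonian example earlier in this document, `normalize_ark("http://n2t.net/ark:/65665/300008335-8d74-4c3f-873c-a9d8b4b3d6a8")` gives `ark:/65665/3000083358d744c3f873ca9d8b4b3d6a8`.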
[^uri-parts]: https://datatracker.ietf.org/doc/html/rfc3986#section-3
[^link-headers]: https://datatracker.ietf.org/doc/html/rfc8288
## References
- [ARK General Info](https://arks.org/about/)
- [ARK Specification v.18](https://www.ietf.org/archive/id/draft-kunze-ark-18.txt)
- [HTTP Specification](https://datatracker.ietf.org/doc/html/rfc7231)
- [Link Headers](https://datatracker.ietf.org/doc/html/rfc8288)