---
title: Identifier Generation
tags: identifier, minting, isamples
---
# Identifier Generation
This note outlines patterns for assigning identifiers to physical samples collected in the field.
## General Pattern
Field work typically follows a sequence like this:
```plantuml
@startuml
actor "Field\nResearcher" as field
entity "Local\nIdentifier\nAuthority" as auth1
participant "The Natural\nEnvironment" as env
database "Sample\nCollection" as coll
entity "Global\nIdentifier\nAuthority" as auth2
participant iSamples
activate field
field --> env: collect sample
env --> field: unidentified sample
field --> auth1: get identifier
auth1 --> field: id-01
field --> field: Create record\nwith id-01
deactivate field
... Back to office ...
field --> coll: store sample id-01
activate coll
coll --> auth2: get identifier
auth2 --> coll: id-02
coll --> coll: preserveRecord(id-02)
coll --> field: ok
deactivate coll
iSamples --> coll: Harvest Records
field --> iSamples: get id-01
iSamples --> field: Huh?
@enduml
```
Samples are collected, documented, and assigned a "field identifier" issued by some local authority (e.g. a sequential list). The samples are eventually accessioned to a collection, where the collection management system assigns a "real" identifier. When the researcher later tries to find their records in iSamples using the field identifier, the lookup fails because only the collection-assigned identifier was recorded.
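
The failure can be illustrated with a minimal sketch. The `SampleCollection` class, its method names, and the `global-0001` identifier format are hypothetical and only stand in for a real collection management system; the point is that the field identifier is dropped at accession time.

```python
class SampleCollection:
    """Toy collection that assigns its own identifier and discards the field one."""

    def __init__(self):
        self._records = {}
        self._serial = 0

    def accession(self, field_id, description):
        # A new identifier is minted by the collection (hypothetical format).
        self._serial += 1
        global_id = f"global-{self._serial:04d}"
        # Note: field_id is NOT stored anywhere, which causes the later miss.
        self._records[global_id] = {"id": global_id, "description": description}
        return global_id

    def get(self, identifier):
        # Lookup only knows about identifiers the collection assigned.
        return self._records.get(identifier)


collection = SampleCollection()
global_id = collection.accession("id-01", "rock sample from site 3")

found_by_global = collection.get(global_id)   # the stored record
found_by_field = collection.get("id-01")      # None: the "Huh?" case
```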
## Preserving local identifiers
```plantuml
@startuml
actor "Field\nResearcher" as field
entity "Local\nIdentifier\nAuthority" as auth1
participant "The Natural\nEnvironment" as env
database "Sample\nCollection" as coll
entity "Global\nIdentifier\nAuthority" as auth2
participant iSamples
activate field
field --> env: collect sample
env --> field: unidentified sample
field --> auth1: get identifier
auth1 --> field: id-01
field --> field: Create record\nwith id-01
deactivate field
... Back to office ...
field --> coll: store sample id-01
activate coll
coll --> auth2: get identifier
auth2 --> coll: id-02
coll --> coll: preserveRecord(id-02, alt=id-01)
coll --> field: ok
deactivate coll
iSamples --> coll: Harvest Records
field --> iSamples: get id-01
iSamples --> iSamples: search any\nidentifier = id-01
iSamples --> field: OK, but there's lots of results
@enduml
```
Preserving the original field identifier in the collection, and making records discoverable by any of their assigned identifiers, improves recall at the expense of precision: many unrelated samples may carry the same field identifier. This approach is [used by GEOME](https://fims.readthedocs.io/en/latest/fims/identifiers.html).
GEOME example:
- Field id = `CMU_161`
- ARK root = `ark:/21547/mx2`
- Full identifier = `ark:/21547/mx2CMU_161`
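
The recall versus precision trade-off can be sketched as follows. The `Record` type, the `search_any_identifier` function, and the sample data are assumptions made for illustration, not part of iSamples or GEOME; the search simply matches a value against the primary identifier and any preserved alternates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    identifier: str                                       # assigned by the collection
    alternates: List[str] = field(default_factory=list)   # preserved field identifiers
    source: str = ""


def search_any_identifier(records, value):
    """Return every record whose primary or alternate identifier equals value."""
    return [r for r in records if value == r.identifier or value in r.alternates]


records = [
    Record("id-02", ["id-01"], "collection A"),
    Record("id-57", ["id-01"], "collection B"),  # another team also used id-01
    Record("id-58", ["id-09"], "collection B"),
]

matches = search_any_identifier(records, "id-01")
for r in matches:
    print(r.identifier, r.source)
```

Searching for `id-01` now finds the researcher's record, but also an unrelated record that happens to share the same field identifier, so some additional context (such as the source collection) is needed to disambiguate.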
## Delegation to local authority
```plantuml
@startuml
actor "Field\nResearcher" as field
entity "Local\nIdentifier\nAuthority" as auth1
participant "The Natural\nEnvironment" as env
database "Sample\nCollection" as coll
entity "Global\nIdentifier\nAuthority" as auth2
participant iSamples
field --> auth2: get identifier delegates
activate field
auth2 --> field: identifiers[]
field --> auth1: identifiers[]
auth1 --> field: Ready
deactivate field
... Start field work ...
field --> env: collect sample
activate field
env --> field: unidentified sample
field --> auth1: get identifier
auth1 --> field: id-01
field --> field: Create record\nwith id-01
deactivate field
... Back to office ...
field --> coll: store sample id-01
activate coll
coll --> coll: Existing identifier is good
coll --> coll: preserveRecord(id-01)
coll --> field: ok
deactivate coll
iSamples --> coll: Harvest Records
field --> iSamples: get id-01
iSamples --> field: OK!
@enduml
```
One option is to generate a global identifier early and keep using it for the life of the object. Supporting this requires changes to the sample collection process, which must use the (possibly verbose) pre-generated identifier, and to the collection management system, which must recognize and retain the existing identifier.
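
A rough sketch of this delegation follows. All class and method names are hypothetical, the `ark:/99999/fk4` prefix is a placeholder, and random UUIDs merely stand in for whatever scheme a real global authority would use.

```python
import uuid


class GlobalAuthority:
    """Mints blocks of identifiers ahead of field work (hypothetical API)."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.issued = set()

    def mint_block(self, n):
        block = [f"{self.prefix}{uuid.uuid4().hex}" for _ in range(n)]
        self.issued.update(block)
        return block

    def is_issued(self, identifier):
        return identifier in self.issued


class LocalAuthority:
    """Hands out pre-minted identifiers in the field, with no network needed."""

    def __init__(self, identifiers):
        self._pool = list(identifiers)

    def get_identifier(self):
        if not self._pool:
            raise RuntimeError("pre-minted identifiers exhausted")
        return self._pool.pop(0)


class SampleCollection:
    """Keeps the identifier it is given if the global authority issued it."""

    def __init__(self, authority):
        self.authority = authority
        self.records = {}

    def accession(self, identifier, description):
        if not self.authority.is_issued(identifier):
            raise ValueError(f"unknown identifier: {identifier}")
        self.records[identifier] = description
        return identifier


global_auth = GlobalAuthority("ark:/99999/fk4")            # placeholder prefix
local_auth = LocalAuthority(global_auth.mint_block(100))   # before field work

sample_id = local_auth.get_identifier()                    # in the field
collection = SampleCollection(global_auth)
stored_id = collection.accession(sample_id, "core sample") # back at the office
```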
A variation of this pattern would provide a prefix to which field-generated identifiers could be appended. The shorter field-generated ids could be used for identifying the samples in the field, and expanded to the full form when loaded into the collection.
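
The prefix variation amounts to simple string composition, sketched below with the GEOME values quoted earlier in this document. The `expand` and `shorten` helpers are hypothetical names, and the sketch assumes plain concatenation with no separator, as in the GEOME example.

```python
def expand(prefix, field_id):
    """Expand a short field identifier to its full form (idempotent)."""
    if field_id.startswith(prefix):
        return field_id
    return prefix + field_id


def shorten(prefix, full_id):
    """Recover the short field identifier from a full identifier."""
    if not full_id.startswith(prefix):
        raise ValueError(f"{full_id!r} does not start with {prefix!r}")
    return full_id[len(prefix):]


ARK_ROOT = "ark:/21547/mx2"          # prefix delegated before field work
full_id = expand(ARK_ROOT, "CMU_161")
short_id = shorten(ARK_ROOT, full_id)
print(full_id)    # ark:/21547/mx2CMU_161
print(short_id)   # CMU_161
```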