# OCA REPO Search mechanism
An OCA Bundle is a blend of a capture base (the head) and overlays (the heavy tail). Its metadata varies in how valuable it is for search, especially for search performed by humans.
The metadata useful for search purposes is:
- capture base attrs
- meta overlay
- label overlay
- information overlay
An individual expresses search intent by entering words, which the search mechanism uses for full text search (FTS). The search is therefore based on words that are part of a given OCA Bundle.
Not all overlays describe the metadata they add in words. For example, the unit overlay adds units, and the format overlay can specify an arbitrary RegExp. Their value for FTS is negligible.
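To make the selection above concrete, here is a minimal sketch of how the searchable words of a bundle could be collected before indexing. The bundle layout (`capture_base`, `overlays`, `type`, `content`) is a simplified assumption made for illustration and is not the actual OCA Bundle schema; `searchable_text` is a hypothetical helper.

```python
from typing import Iterator

# Overlay types whose content is mostly human-readable words (per the list above).
FTS_OVERLAYS = {"meta", "label", "information"}


def _strings(value) -> Iterator[str]:
    """Yield every string found in a nested dict/list structure."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from _strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from _strings(item)


def searchable_text(bundle: dict) -> str:
    """Collect the words of a bundle that are worth feeding into FTS."""
    parts = []
    # Capture base: attribute names only (assumed layout, not the real schema).
    parts.extend(bundle.get("capture_base", {}).get("attributes", {}).keys())
    for overlay in bundle.get("overlays", []):
        # Unit, format and similar overlays are skipped on purpose.
        if overlay.get("type") in FTS_OVERLAYS:
            parts.extend(_strings(overlay.get("content", {})))
    return " ".join(parts)


bundle = {
    "capture_base": {"attributes": {"first_name": "Text", "height": "Numeric"}},
    "overlays": [
        {"type": "meta", "content": {"name": "Passport", "description": "Travel document"}},
        {"type": "label", "content": {"first_name": "First name", "height": "Height"}},
        {"type": "unit", "content": {"height": "cm"}},
    ],
}
text = searchable_text(bundle)
```

The resulting string can then be handed to whatever FTS engine the repo uses; the unit overlay contributes nothing to it.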
Our repo promotes immutability: there is no state marked as latest, and changes constitute a DAG. FTS therefore cannot rely on "the newest" value, because no such value exists. Let's consider the example of a DAG given below:

It demonstrates the evolution of an OCAFile over time (via a DAG). The diagram shows that ocafile 1 is ingested into the repo first, followed some time later by ocafile 2 and finally ocafile 3. The content of ocafile 2 and 3 relies on what was created via ocafile 1.
All three ocafiles are indexed by the search mechanism even though they look very similar. The differences among them are very small, yet they are distinct objects.
For example, all these ocafiles have the same meta overlay, the same attributes, labels, etc. For a human, the tiny additions made by ocafile 2 and 3 are therefore negligible as search results, yet the same human would probably like to know how ocafile 1 evolved over time. Such a search mechanism is exploratory: it doesn't serve an end result (due to the nature of DAGs, such a result doesn't exist), but rather anchors to some meaningful information that was created in the past. On the diagram, the common denominator for ocafile 2 and 3 is ocafile 1 (DAG third node).
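One possible way to present such near-duplicate hits is to group them under their oldest matching ancestor, so the user sees one anchor plus its evolution. The sketch below assumes, for simplicity, that each node records a single parent and that the history is linear (1, then 2, then 3); `group_by_anchor` and the node names are hypothetical.

```python
from collections import defaultdict


def group_by_anchor(hits: set[str], parent: dict[str, str]) -> dict[str, list[str]]:
    """Group matching nodes under their oldest matching ancestor."""
    groups = defaultdict(list)
    for node in sorted(hits):
        anchor = node
        # Walk towards the root while the parent also matched the query.
        while parent.get(anchor) in hits:
            anchor = parent[anchor]
        groups[anchor].append(node)
    return dict(groups)


# Assumed linear history: ocafile1 <- ocafile2 <- ocafile3.
parent = {"ocafile2": "ocafile1", "ocafile3": "ocafile2"}
hits = {"ocafile1", "ocafile2", "ocafile3"}
grouped = group_by_anchor(hits, parent)
# grouped == {"ocafile1": ["ocafile1", "ocafile2", "ocafile3"]}
```

A real DAG may have nodes with several parents; in that case the walk would have to pick or merge anchors, which this sketch does not attempt.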
## Search use cases
The search use cases aim to visualize the functions and methods a user would potentially be interested in while searching for specific objects in the OCA repository.
The reputation of the objects is very relevant for such a search, whether it is community reputation or authority reputation.
A user would like to search:
- full text over all human readable items
  - capture base attrs (classification only, PIIs)
  - meta overlay
  - label overlay
  - information overlay
- any item which touches a specific attribute (an attribute in the context of a specific capture base, or its counterparts by similarity/proximity).
- operations over a specific attribute, e.g. when a given attribute was added or removed.
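The last use case, finding when an attribute was added or removed, can be sketched as a diff of attribute sets along a bundle's history. This is only an illustration: it assumes a linear history given oldest first, and the SAIDs (`E1`, `E2`, `E3`) and the helper `attribute_operations` are placeholders, not part of the repo.

```python
def attribute_operations(
    history: list[tuple[str, set[str]]],
) -> list[tuple[str, str, str]]:
    """Return (said, operation, attribute) for each add/remove along a linear history."""
    ops = []
    previous: set[str] = set()
    for said, attributes in history:
        for name in sorted(attributes - previous):
            ops.append((said, "added", name))
        for name in sorted(previous - attributes):
            ops.append((said, "removed", name))
        previous = attributes
    return ops


# Placeholder SAIDs and attribute names, oldest version first.
history = [
    ("E1", {"name", "age"}),
    ("E2", {"name", "age", "email"}),
    ("E3", {"name", "email"}),
]
ops = attribute_operations(history)
age_ops = [op for op in ops if op[2] == "age"]
# age_ops == [("E1", "added", "age"), ("E3", "removed", "age")]
```

Indexing these tuples alongside the FTS data would let a query such as "when was `age` removed" resolve to a specific node of the DAG.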
The capture base as an object is not very interesting for the user, since its attributes are often not very descriptive. The user should not need to search through the `attribute-name`, since those names could simply be `UUIDs`.
Since a result is more of a starting point for exploring objects, we would suggest exposing this information via some "graph viewer" where the user can easily navigate through nodes and find more. In addition, showing just the chunk of the bundle containing the found phrase is not very useful for the user; rendering the bundle in a human readable way, e.g. as a form or a clear data structure, would allow the user to quickly evaluate whether this is what they are looking for.
Adding some statistics to the search result could be helpful as well, e.g. the count of all translations (label overlays) done for such an object, or references (how many objects depend on it, or how many it references).
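As a rough illustration of such statistics, the sketch below counts label overlays and references for one object. The data layout (`overlays`, `type`, `refs`) and the function `bundle_stats` are assumptions made for this example only.

```python
def bundle_stats(said: str, bundles: dict[str, dict]) -> dict[str, int]:
    """Compute simple statistics to show next to a search result."""
    bundle = bundles[said]
    return {
        # One label overlay per language is assumed to equal one translation.
        "translations": sum(
            1 for o in bundle.get("overlays", []) if o.get("type") == "label"
        ),
        # Objects this bundle depends on.
        "references": len(bundle.get("refs", [])),
        # Objects that depend on this bundle.
        "referenced_by": sum(
            1 for other in bundles.values() if said in other.get("refs", [])
        ),
    }


# Placeholder identifiers "A" and "B" stand in for SAIDs.
bundles = {
    "A": {
        "overlays": [
            {"type": "label", "language": "en"},
            {"type": "label", "language": "pl"},
            {"type": "meta"},
        ],
        "refs": [],
    },
    "B": {"overlays": [], "refs": ["A"]},
}
stats = bundle_stats("A", bundles)
# stats == {"translations": 2, "references": 0, "referenced_by": 1}
```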
Practical API:
- Fetch the list of all bundle SAIDs, so the client can traverse them
- Fetch the list of all capture bases
- Fetch all objects referencing a given capture base (list all overlays)
- List all references of the object (who is using me, what I am using)
- Fetch the list of all bundles using a given capture base ("latest", do not show the steps in between)
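A minimal in-memory sketch of some of these calls is given below. It is not the repo's actual API: the `Bundle` and `Repo` types, their fields and the method names are hypothetical, and "latest" is interpreted here as "bundles that no other bundle builds on", which is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Bundle:
    said: str
    capture_base: str                                   # SAID of the capture base
    overlays: list[str] = field(default_factory=list)   # SAIDs of overlays
    parent: Optional[str] = None                        # previous bundle in the DAG


class Repo:
    def __init__(self, bundles: list[Bundle]):
        self.bundles = {b.said: b for b in bundles}

    def bundle_saids(self) -> list[str]:
        """All bundle SAIDs, so a client can traverse them."""
        return list(self.bundles)

    def capture_bases(self) -> list[str]:
        """All distinct capture base SAIDs."""
        return sorted({b.capture_base for b in self.bundles.values()})

    def overlays_for(self, capture_base: str) -> list[str]:
        """All overlays attached to a given capture base."""
        return sorted(
            {o for b in self.bundles.values()
             if b.capture_base == capture_base for o in b.overlays}
        )

    def latest_bundles(self, capture_base: str) -> list[str]:
        """Bundles using the capture base that have no successor (steps in between hidden)."""
        superseded = {b.parent for b in self.bundles.values() if b.parent}
        return [
            b.said for b in self.bundles.values()
            if b.capture_base == capture_base and b.said not in superseded
        ]


# Placeholder SAIDs for illustration.
repo = Repo([
    Bundle("B1", "CB1", ["O1"]),
    Bundle("B2", "CB1", ["O1", "O2"], parent="B1"),
    Bundle("B3", "CB1", ["O1", "O2", "O3"], parent="B2"),
])
latest = repo.latest_bundles("CB1")   # ["B3"]
overlays = repo.overlays_for("CB1")   # ["O1", "O2", "O3"]
```

Because the DAG can branch, `latest_bundles` may return more than one SAID for the same capture base.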
## TDA Spec
TBD