# Modularize Agent PR Notes
This [PR](https://github.com/deislabs/akri/pull/252) allows discovery handlers to run outside the Agent or be embedded in the Agent. It is a dev level feature. In order to get it to the quality of being in in a official release, improvements and decisions need to be made.
## What the PR contains
1. A newly designed Configuration CRD that supports generic protocols
2. A gRPC interface for a Discovery service hosted by DiscoveryHandlers and a Registration service hosted by all Agents.
3. All the functionality of each of Akri's supported protocols was put in their own library (in the `/discovery-handlers` directory), so as to enable the same code to be used by the Agent (when Discovery Handlers are embedded -- see `embedded-discovery-handlers.rs`) and the Discovery Handler Pods (when they are not included in the Agent binary -- see the `/discovery-handler-modules` directory).
4. Akri's currently supported discovery handlers can be run inside the Agent by setting the feature `agent-all-in-one`. Or if the Agent is run with the feature `agent-slim`, they will be excluded. They can be deployed separately using Helm. Setting `agent.slim.enabled=true` and `onvif.discovery.enabled=true` will deploy the ONVIF discovery handler pod on all nodes. However, the Agent will not be built in a `slim` format until a later PR, so this functionality will not be documented and not should be used until then, as all the Akri supported discovery handlers will already be in the Agent binary. This does not exclude new discovery handlers, such as for CoAP or Zeroconf, from being deployed.
5. A template for kick-starting discovery handler development was created and currently hosted in my [personal repository](https://github.com/kate-goldenring/akri-discovery-handler-template). It is referenced from extensibility.md.
## To improve
- Increase test coverage, particularly in the discovery handler libraries and programs.
- The `DiscoveryOperator` should be testable without performance costs. See [this document for more details](https://hackmd.io/@akri/SJeTlzJbd).
- `DiscoveryResponseTime` metric was removed due to `DiscoveryHandlers` not always being internal. A new metric should be added in its place, such as a `DiscoveryHandler` count.
- All mutexes should use `std::sync::Mutex`. The `tokio::sync::Mutex` that is used in the `DevicePluginService` adds unneccessary overhead. The [`tokio::sync::Mutex` documentation](https://tokio-rs.github.io/tokio/doc/tokio/sync/struct.Mutex.html) even recommends using the standard library in most cases.
- Handle improper Configuration `DiscoveryDetails` more explicitly. Currently, the Agent just closes the connection and logs the error. The hope is that the operator would check the logs, see the error and re-apply the configuration. More thoughts on Configuration validation can be found in the [spec associated with this PR].
- Create an akri hosted GitHub for the discovery handler template, which kickstarts discovery handler developement via `cargo generate`.
## Next steps
- A build should be made for a `slim` Agent that does not contain any embedded discovery handlers. A decision needs to be made on how to name the images (ie does `ghcr.io/deislabs/akri/agent` become `ghcr.io/deislabs/akri/agent` and `ghcr.io/deislabs/akri/agent/slim` or `ghcr.io/deislabs/akri/agent-slim`).
- Workflows should be created for building Akri's supported Discovery Handler Pods.
- Documentation should be added on how to deploy each of the discovery handler pods.
- Redo Akri diagrams to represent new architecture
## Considerations
- Where should discovery handlers live?
- Should our akri-discovery-utils library be a published crate?
- How should we version our `discovery.proto` file.
- How should discovery handlers handler a connection with the Agent being lost. This isn't often an error, as the Agent will drop its end of the channel when a configuration is deleted. Currently, Akri's supported discovery handlers try to re-register with the Agent. The Agent handles registration of already registered DHs by doing nothing. If the Agent has crashed, the discovery handler will continue to try, sleeping between tries.
- Should the debugEcho discovery handler always be included, even when the Agent is slim/modular? Currently it is not, so instead it can be used for testing the slim Agent by being deployed separately.
- Should debugEcho still be configurable to be shared? It may be good for testing; however, it can be confusing because each Agent independently checks if the device is "offline" by checking a path in their container. Therefore, the same shared device could be seen as offline by one Agent and online by the other.
## Fixes that were tacked onto this PR
This PR covered A LOT of the Akri code. Therefore, some necessary patches were added along with it.
- Removed `units` from Configuration CRD. This PR already makes a breaking change to the CRD so why not remove this dinosaur.
- Moved the `mockall` crate to dev dependencies.
- Moved ONVIF utilities out of shared and into it's discovery handler library
- Agent [reporting udev devices as DeviceSpecs instead of Mounts](https://github.com/deislabs/akri/issues/145).