# Audit & Metrics Specs
## Audit feature
:::info
* DataNodes or Nodes (Tech): Abstract representation of a MetaModel.
* Flux (Job), Snapshot (Tech): Set of DataNodes produced by a pipeline run, usually with a target stock.
* Stock (Job/Tech): aggregation of the data nodes of the committed snapshots, keeping the most up-to-date data node when several share the same name.
* An Audit: Task that reports the contents of a new flux (set of data nodes produced by a pipeline run) that may be added to a stock.
* The audit will identify the intersection between the data nodes in a Stock and the data nodes in the audited snapshot (particularly useful for the PTAB flux, for instance); a sketch of this split follows this block.
* The audit will provide a report in table format (CSV or XLSX) that will be downloadable once the audit has finished.
* Here is the format of the [table](https://docs.google.com/spreadsheets/d/1ea8BKcNlqKQjdtZR-38IRkAiVqKd5kxhEMZ-fKvhiMQ/edit#gid=1206604242)
* The table will contain new and modified cases, separated
* either by sheets, if in XLSX,
* or by two separate tables in a single file.
* The audit could take some time; you should be able to view the progress of an audit when viewing a snapshot.
* A notification should be provided on the front end showing that an audit has been completed.
* The terms currently related to `delta computations` should be removed and replaced with terms related to the audit.
* The delta computation feature is to be removed and replaced by the audit feature.
* A snapshot has to be audited before it can be committed.
:::
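To make the new/modified split concrete, here is a minimal TypeScript sketch. It assumes that data nodes are identified by name and that any snapshot node already present in the stock is reported as modified; all type and function names are illustrative, not the actual MetaModel.

```typescript
// Hypothetical shape of a data node; only the name is used to detect the intersection.
interface DataNode {
  name: string;
  // ...other MetaModel fields
}

interface AuditSplit {
  newNodes: DataNode[];      // not present in the stock yet -> "new" cases
  modifiedNodes: DataNode[]; // intersection with the stock -> "modified" cases
}

// Partition the snapshot's data nodes against the names already aggregated in the stock.
function splitSnapshotAgainstStock(
  stockNodeNames: Set<string>,
  snapshotNodes: DataNode[],
): AuditSplit {
  const result: AuditSplit = { newNodes: [], modifiedNodes: [] };
  for (const node of snapshotNodes) {
    if (stockNodeNames.has(node.name)) {
      result.modifiedNodes.push(node);
    } else {
      result.newNodes.push(node);
    }
  }
  return result;
}
```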
:::warning
Comparison in iteration 2 (keep in mind):
* An SQL table would permit fast comparison.
* Might be interesting for PTAB.
* INPI has updates, so it would be beneficial.
* Do we really need it?
* NO
The audit has to be fast.
:::
:::success
* `POST /api/v1/stocks/{stockId}/diff` will be deprecated.
* New Endpoint `POST /api/v1/stocks/{stockId}/snapshots/{snapshotId}/audit`
* Also used to rerun an audit (a new audit is created instead).
* Generates the difference between the stock and the snapshot:
* Gets the lists of new nodes and modified nodes.
* Creates a new Audit Task entity -> similar behavior to the Zip Task entity
```json
{
"id" : GUID,
"snapshotId": GUID,
"triggerBlobPath": string,
"outputBlobPath": string,
"status": ENUM[Created, Running, Finished, Error_Trigger, Error_Runtime]
}
```
* Creates a new blob with the data nodes to audit, sorted into modified and new.
* This will launch the Audit Service
* New Service: Audit Function
* Two ways to implement it:
* A single function that proceeds iteratively -> fast development, but longer audit times and less fun to develop.
* Open each blob, retrieve its info, create a row, add the row to the table, repeat. Once done, save the table to a blob (a sketch of this approach follows this block).
* A durable orchestrated function, a "serverless MapReducer" -> more fun to develop and faster execution, but longer development time.
* Orchestrate multiple "Mappers" to retrieve each audit for a single data node
* Reduce mapper outputs into single table which will be saved as a blob.
* Only if we can
:::
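As a rough illustration of the first option (the single iterative function), here is a TypeScript sketch assuming Azure Blob Storage and the `@azure/storage-blob` SDK; the container names, the trigger payload, the row columns and the assumption that data nodes are stored as JSON are all illustrative, not the final design.

```typescript
import { BlobServiceClient } from "@azure/storage-blob";

// Assumed content of the trigger blob written by the audit endpoint.
interface AuditTrigger {
  snapshotId: string;
  newNodePaths: string[];      // blobs of data nodes not present in the stock
  modifiedNodePaths: string[]; // blobs of data nodes intersecting the stock
  outputBlobPath: string;      // where the resulting table is written
}

async function runAudit(trigger: AuditTrigger, connectionString: string): Promise<void> {
  const service = BlobServiceClient.fromConnectionString(connectionString);
  const nodes = service.getContainerClient("datanodes"); // assumed container name

  // Open each blob, retrieve its info, build one row, repeat.
  const buildRows = async (paths: string[], caseType: "new" | "modified") => {
    const rows: string[] = [];
    for (const path of paths) {
      const buffer = await nodes.getBlobClient(path).downloadToBuffer();
      const node = JSON.parse(buffer.toString("utf-8")); // data nodes assumed to be JSON
      rows.push([caseType, node.name, path].join(";"));  // illustrative columns only
    }
    return rows;
  };

  const table = [
    "case;name;blobPath",
    ...(await buildRows(trigger.newNodePaths, "new")),
    ...(await buildRows(trigger.modifiedNodePaths, "modified")),
  ].join("\n");

  // Once done, save the table where the Audit Task expects it.
  const output = service
    .getContainerClient("audits") // assumed container name
    .getBlockBlobClient(trigger.outputBlobPath);
  await output.upload(table, Buffer.byteLength(table));
}
```

The durable "serverless MapReducer" variant would replace the inner loop with one activity per data node, orchestrated by a durable function, plus a final activity reducing the rows into the same table.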
:::success
* New Endpoint `GET /api/v1/stocks/{stockId}/snapshots/{snapshotId}/audit`
* Downloads the table of the audited snapshot.
:::
:::success
* Add the status of the audit to the snapshot response. If there is no audit, the status will be null (see the sketch below).
:::
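For illustration, the snapshot response could expose the audit status as a nullable field, roughly as below; only `status` and its values come from the Audit Task entity above, the rest of the shape is an assumption.

```typescript
type AuditStatus = "Created" | "Running" | "Finished" | "Error_Trigger" | "Error_Runtime";

// Hypothetical snapshot response once the audit status is added.
interface SnapshotResponse {
  id: string;
  stockId: string;
  audit: { status: AuditStatus } | null; // null when the snapshot has not been audited yet
}
```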
## Metric feature
* The Metrics feature corresponds only to the evolution of the stock metrics (no competitor comparison / README).
:::info
* Delivery: Process representing the contents and the state of the data to be delivered to a client.
* Delivery States (a sketch of the state encoding follows this block):
* Created: new Delivery with the stocks of the previous delivery already bound.
* Validated awaiting Metrics (Tech: Started): the Delivery has been checked and can now proceed to the next states (packaging and sending).
* System check: all fluxes have been committed or aborted.
* Validated with generated Metrics (Tech: Started and metrics entities):
* The Metrics report has been generated.
* A README has been generated / uploaded*
* Zip in progress: Delivery packaging is in progress (delivery can be restarted).
* Zip finished: Delivery packaging is finished (delivery can be restarted).
* Transfer in progress: Delivery is being transferred to the client by FTP* (delivery can be restarted).
* Transfer finished: Delivery is in the hands of the client (delivery can be restarted)
* Finished: Delivery has been closed by the User (No changes possible to the delivery). A new Delivery is created.
* Once the user is ready to validate a Delivery, they will be able to press the `Validate and generate Metrics` button.
* A Metrics table will be generated.
* This is how [it would look](https://docs.google.com/spreadsheets/d/1ea8BKcNlqKQjdtZR-38IRkAiVqKd5kxhEMZ-fKvhiMQ/edit#gid=1332823129)
* The Metrics table will be downloadable from the front end once it has been generated (the Delivery has to be validated).
* You should be able to follow the generation of the metrics on the front end.
* The Metrics table will be packaged alongside the data at the root of the zip.
:::
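One possible encoding of this lifecycle in TypeScript, kept as a sketch: the state names follow the list above, but the string values and the restartable set are assumptions.

```typescript
type DeliveryState =
  | "Created"
  | "Validated_Awaiting_Metrics" // Tech: Started
  | "Validated_With_Metrics"     // Tech: Started + metrics entities
  | "Zip_In_Progress"
  | "Zip_Finished"
  | "Transfer_In_Progress"
  | "Transfer_Finished"
  | "Finished";                  // closed by the user, a new Delivery is created

// States in which the delivery can be restarted, per the notes above.
const RESTARTABLE_STATES: ReadonlySet<DeliveryState> = new Set<DeliveryState>([
  "Zip_In_Progress",
  "Zip_Finished",
  "Transfer_In_Progress",
  "Transfer_Finished",
]);
```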
:::warning
The user should be able to cancel:
* Metrics generation
* Zip generation
* Transfer
* in other words, any long-running processing.
:::
:::success
* Metrics Entity
```json
{
  "id": GUID,
  "stockId": GUID,
  "deliveryId": GUID,
  "triggerBlobPath": string,
  "outputBlobPath": string,
  "status": ENUM[Created, Running, Finished, Trigger_Error, Runtime_Error]
}
```
* Start Delivery endpoint to be changed.
* Gets the snapshots committed in this delivery and the audit task related to each snapshot.
* Snapshots are to be grouped by their stocks.
* Gets the previous metrics (the Metrics from the most recent finished delivery containing the stock).
* Creates a Metrics task (Zip-task-like) per stock to be delivered; the metrics tasks are triggered by a blob trigger.
* The blob will contain (see the sketch after this block):
* The Previous Metrics Blob path
* The Paths to the audits of the commited snapshots
* A Metrics task will:
* Compute metric values from the given audits.
* Copy the content of the previous Metrics and complete it with the metrics computed from the audits.
* Endpoint to rerun metrics generation for delivery X and stock Y.
* If the metrics generation failed, the user will be able to restart the metrics generation of a stock.
:::
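As a sketch of the metrics generation step, assuming the trigger blob is JSON and that the tracked metrics are per-delivery counts of new and modified cases; all names and fields below are illustrative.

```typescript
// Assumed content of the blob that triggers a Metrics task for one stock.
interface MetricsTrigger {
  stockId: string;
  deliveryId: string;
  previousMetricsBlobPath: string | null; // null for the first delivery of a stock
  auditBlobPaths: string[];               // audits of the snapshots committed in this delivery
  outputBlobPath: string;
}

interface MetricsRow {
  deliveryId: string;
  newCases: number;
  modifiedCases: number;
}

// Copy the previous metrics and append the row computed from the current audits.
function buildMetrics(
  previous: MetricsRow[],
  audits: { newCases: number; modifiedCases: number }[],
  deliveryId: string,
): MetricsRow[] {
  const current: MetricsRow = {
    deliveryId,
    newCases: audits.reduce((sum, a) => sum + a.newCases, 0),
    modifiedCases: audits.reduce((sum, a) => sum + a.modifiedCases, 0),
  };
  return [...previous, current];
}
```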
## Readme feature
:::info
* The README has a fixed format defined in the backend.
* The user will add a changelog to each delivery before validating it.
:::