# AI/ML workflow and use cases in O-RAN
###### tags: `Spec`
## General Principle
* GP 1:
> In O-RAN, there will always be offline training (even in reinforcement learning).
> Online training is where the model continues to learn while it is executing in the network.
> For example, reinforcement learning can be used in the online training case.
* GP 2:
> A model needs to be trained and tested before deployment.
* GP 3:
> The ML applications are designed in a modular manner:
> they are decoupled from one another,
> and they can share data without knowing each other's data needs.
* GP 4:
> Criteria are given for determining the deployment scenarios.
* GP 5:
> Strike a balance between efficiency and accuracy, since these two are trade-off factors.
>
> An acceptable loss needs to be set,
> and the optimization parameters should be obtained based on this threshold.
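
A minimal sketch of GP 5, assuming a hypothetical acceptable-loss threshold and candidate models with measured loss and inference latency (the names and numbers below are illustrative, not from the spec): pick the most efficient model that still stays within the acceptable loss.

```python
from dataclasses import dataclass

# Hypothetical candidate descriptors; names and fields are illustrative only.
@dataclass
class Candidate:
    name: str
    loss: float        # validation loss (accuracy proxy)
    latency_ms: float  # inference latency (efficiency proxy)

def select_model(candidates: list[Candidate], acceptable_loss: float) -> Candidate:
    """Pick the most efficient candidate whose loss stays within the acceptable threshold."""
    eligible = [c for c in candidates if c.loss <= acceptable_loss]
    if not eligible:
        raise ValueError("no candidate satisfies the acceptable-loss threshold")
    return min(eligible, key=lambda c: c.latency_ms)

pool = [Candidate("full", 0.02, 40.0),
        Candidate("compressed", 0.04, 12.0),
        Candidate("truncated", 0.09, 6.0)]
print(select_model(pool, acceptable_loss=0.05).name)  # -> compressed
```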
## Control loops

## AI/ML Procedure

* Procedure
```mermaid
graph LR
A(host & inference capability query/ discovery) -->B(training & generation)-->C(selection)
C-->D(deployment & inference)-->E(performance monitoring)
E-->F(retraining update) -->G(reselection) -->H(termination)
```
1. Host and Inference capability query/ discovery
> The SMO/Non-RT RIC will discover the various capabilities and properties of the MF (ML inference host).
> This can be executed at start-up or at runtime.
>
> a.) Check whether a model can be executed in the target MF.
> b.) Check which models can be executed in the MFs.
> (A sketch of these checks follows the procedure list.)
2. Model Training and Generation
> Design-time selection and training.
> The designer or SMO/Non-RT RIC selects the relevant metadata and provides it to the ML training host.
> Once the model is trained and validated, it is published back into the SMO/Non-RT RIC catalogue.
> A new model can also be generated by compression, truncation, etc., and this new model is likewise published into the SMO/Non-RT RIC catalogue.
3. ML Model Selection
> The designer can check whether the trained model from the SMO/Non-RT RIC catalogue can be deployed in the ML inference host.
> Once the model is successfully validated, the designer informs the SMO/Non-RT RIC to initiate the model deployment.
4. ML model Deployment and Inference
> The ML model can be deployed as a containerized image to the MF (ML inference host).
> After the ML model is deployed and activated, ML online data shall be used for inference in the ML-assisted solutions.
> Based on the output of the model, the ML-assisted solutions will inform the Actor to take the necessary actions towards the Subject.
5. ML model performance monitoring
> The MF (ML inference host) is expected to feed back or report the performance of the model to the SMO/Non-RT RIC,
> so that the SMO/Non-RT RIC can monitor the performance (e.g. accuracy, running time) of the ML model
> and then potentially update the model or reselect the model to be executed.
6. ML model retraining update
7. ML model reselection
>Based on the feedback and data received from the MFs:
>
> Retraining:
> the SMO/Non-RT RIC can inform the ML designer that an update of the current model is required.
> Reselection:
> the SMO/Non-RT RIC can inform the ML designer or the respective module that the running ML model does not comply with the requirements,
> and thus a different model is necessary.
> (A sketch of this decision logic follows the procedure list.)
8. ML model termination
> The Non-RT RIC needs to have access to the ML model's termination conditions to determine whether an ML model is working properly or not.
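
As a minimal sketch of steps 1 and 3, assuming hypothetical capability descriptors for the MFs and hypothetical model requirements (none of these names come from the O-RAN specification), the SMO/Non-RT RIC could check which catalogue models a given MF can execute:

```python
from dataclasses import dataclass

# Hypothetical descriptors; field names are illustrative, not from the O-RAN spec.
@dataclass
class MFCapability:
    mf_id: str
    engines: set[str]   # inference engines/runtimes the MF supports
    memory_mb: int      # memory available for model execution

@dataclass
class ModelRequirement:
    model_id: str
    engine: str         # inference engine/runtime the model needs
    memory_mb: int      # memory footprint of the model

def can_execute(mf: MFCapability, model: ModelRequirement) -> bool:
    """Step 1 a.) / step 3: check whether a model can be executed in the target MF."""
    return model.engine in mf.engines and model.memory_mb <= mf.memory_mb

def executable_models(mf: MFCapability, catalogue: list[ModelRequirement]) -> list[str]:
    """Step 1 b.): check which catalogue models can be executed in a given MF."""
    return [m.model_id for m in catalogue if can_execute(mf, m)]

mf = MFCapability("near-rt-ric-1", {"onnxruntime"}, memory_mb=512)
catalogue = [ModelRequirement("traffic-predictor", "onnxruntime", 128),
             ModelRequirement("beam-optimizer", "tensorflow", 256)]
print(executable_models(mf, catalogue))  # -> ['traffic-predictor']
```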
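Similarly, a minimal sketch of steps 5 to 8, with hypothetical performance-report fields and thresholds, illustrating how the monitored performance could drive the retrain / reselect / terminate decision:

```python
# Hypothetical performance-report fields and thresholds; purely illustrative.
def lifecycle_decision(accuracy: float, running_time_ms: float,
                       min_accuracy: float = 0.90,
                       max_running_time_ms: float = 50.0,
                       terminate_accuracy: float = 0.50) -> str:
    """Map an MF performance report to a lifecycle action (steps 5-8)."""
    if accuracy < terminate_accuracy:
        return "terminate"   # step 8: the model is no longer working properly
    if accuracy < min_accuracy:
        return "retrain"     # step 6: inform the designer that an update is required
    if running_time_ms > max_running_time_ms:
        return "reselect"    # step 7: a different model is necessary
    return "keep"            # continue monitoring (step 5)

print(lifecycle_decision(accuracy=0.87, running_time_ms=20.0))  # -> retrain
```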
### Brief Summary
* Step 1: The SMO/Non-RT RIC discovers the capabilities and properties of the MFs.
* Step 2: The designer or SMO/Non-RT RIC generates and trains the model in the ML training host, and publishes it to the SMO/Non-RT RIC catalogue.
* Step 3: The designer selects the model in the catalogue to deploy.
* Step 4: The selected models are deployed via a containerized image.
* Step 5: The MF feeds back and reports the model performance to the SMO/Non-RT RIC.
* Step 6: Based on the feedback, the SMO/Non-RT RIC informs the designer to update and retrain the model, and Step 3 is repeated.
* Step 7: Based on the feedback, the SMO/Non-RT RIC informs the designer to reselect the model, and Step 4 is repeated.
* Step 8: If the model degrades or falls outside the threshold, the Non-RT RIC can terminate the model via the ML termination function.
## ML model lifecycle

## AI/ML Model Deployment (Scenario 1.2)

* Image Based deployment
```
The AI/ML model will be deployed as an xApp or within an xApp instance.
In this case, the ML xApps are treated the same as other xApps.
Pros:
Faster and more flexible deployment.
Fewer requirements on the MFs.
Cons:
The MF's efficiency depends on the container's capability.
```
* File Based deployment
```
The AI/ML model will be deployed based on the AI/ML model file.
Decoupled from the xApp software version.
Can be enabled and updated via xApp file configuration.
Usually requires an "ML model catalogue" and an "inference engine/platform" in the MF.
Pros:
Better customization and efficiency by exploiting on-device model optimization and update capabilities.
Potential use of standard file formats for ML models.
Cons:
Additional functional requirements on the MF.
Requires matching between the ML model format and the inference engine.
```
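
As an illustration of file-based deployment with a standard model format, here is a minimal sketch using ONNX Runtime as the inference engine; the model path, input shape, and data are hypothetical placeholders, not part of any O-RAN interface.

```python
import numpy as np
import onnxruntime as ort

# Hypothetical model file delivered to the MF via xApp file configuration.
MODEL_PATH = "ml_model_catalogue/traffic_predictor.onnx"

# The inference engine loads the model file, decoupled from the xApp software version.
session = ort.InferenceSession(MODEL_PATH)

# Input name and shape depend on the actual model file; the data below is a placeholder.
input_name = session.get_inputs()[0].name
features = np.random.rand(1, 16).astype(np.float32)

outputs = session.run(None, {input_name: features})
print(outputs[0])
```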
## Deployment Scenarios
### Scenario 1.1

### Scenario 1.2

### Scenario 1.3

### Scenario 1.4

1. O-CU/O-DU data for offline training is collected over O1 interface and the initial offline model is trained in the SMO/Non-RT RIC.
2. The offline trained model or the backup model is moved to the Near-RT RIC.
3. The AI/ML model is deployed to the ML inference host in the Near-RT RIC.
4. O-CU/O-DU data for inference in the Near-RT RIC is collected over E2 interface.
5. The ML inference host performs inference using the deployed model and collected O-CU/O-DU data.
6. The ML inference host enforces control action/guidance via E2 interface.
7. O-CU/O-DU data for online learning in the Near-RT RIC is collected over E2 interface.
8. The ML inference host provides performance feedback to the ML training host in the Near-RT RIC for monitoring and training data for online learning.
9. The ML training host in the Near-RT RIC updates the model.
10. A well-performing model can be added to the model repository.
* The AI/ML model is offline trained by the training host in the SMO/Non-RT RIC.
* The training host in the Near-RT RIC is used for continuous online learning.
* The MFs are located within the Near-RT RIC.
* The ML model repository in the SMO/Non-RT RIC is used to store the backup model.
> When the training host in the Near-RT RIC observes severe performance degradation, it can request a stored model that performed well previously.
* The training host in the Near-RT RIC updates the online model using input data received over E2 and from the MFs (see the sketch below).
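
A minimal sketch of one iteration of the Scenario 1.4 loop in the Near-RT RIC, with hypothetical helper objects standing in for the E2 data collection, inference, online update, and backup-model fallback described above (none of these interfaces are defined by O-RAN):

```python
# Hypothetical interfaces; all names below are illustrative, not defined by O-RAN.
def near_rt_ric_iteration(inference_host, training_host, model_repository,
                          e2_interface,
                          degradation_threshold: float = 0.5,
                          well_performing_threshold: float = 0.9):
    """One iteration of the Scenario 1.4 inference / online-learning loop."""
    # Steps 4-6: collect O-CU/O-DU data over E2, run inference, enforce the action.
    data = e2_interface.collect()
    action = inference_host.infer(data)
    e2_interface.enforce(action)

    # Steps 7-9: feed back performance and online-learning data, update the model.
    feedback = inference_host.performance_feedback()
    training_host.update_model(data, feedback)

    # Fallback: on severe degradation, request a previously well-performing model
    # from the repository in the SMO/Non-RT RIC.
    if feedback.accuracy < degradation_threshold:
        inference_host.load_model(model_repository.fetch_backup())
    elif feedback.accuracy >= well_performing_threshold:
        # Step 10: a well-performing model can be added to the model repository.
        model_repository.store(inference_host.current_model(), feedback.accuracy)
```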