# Unity and ML-Agents
[TOC]
## Install
### Install **Unity**
* [Ref](https://docs.unity3d.com/Manual/GettingStartedInstallingHub.html)
* Download and install [Unity Hub](https://unity3d.com/get-unity/download?_ga=2.126136909.1714281345.1617063030-1632127043.1615762880) and open it
* Go to the Installs tab and install Unity (preferably an LTS version)
### Install **ML Agents** :male-detective:
* Go to the ML-Agents [GitHub repository](https://github.com/Unity-Technologies/ml-agents)
* Download the "Verified Package 1.0.7" from the releases table there.
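* Training also needs the `mlagents` Python package, which provides the `mlagents-learn` command; installing it with `pip3 install mlagents` (choosing a version that matches the Unity package) is assumed in the Train sections below.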
## Make your Environment in **Unity**
:::spoiler Show details
Open Unity Hub and click "NEW".

Select 3D, enter a project name and location, then click the CREATE button.

:::
## Train
### Config
[Ref](https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Training-Configuration-File.md)
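A minimal sketch of the old-style `trainer_config.yaml` format used throughout this note; the hyperparameter values below are illustrative assumptions, and the top-level key must match the behavior name:
```yaml
3DBall:                 # behavior name, as set in Unity
  trainer: ppo          # trainer type
  batch_size: 64        # illustrative values only; see the Ref above
  buffer_size: 12000
  max_steps: 5.0e5
  summary_freq: 12000
```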
### Training command
After opening the Unity scene, check that the behavior name is the same as the name inside the config `.yaml` file.
Check that Behavior Type is set to Default.
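Training is then started from the console with `mlagents-learn PATH_TO_CONFIG --run-id RUN_ID --train`, as shown in the examples below.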
## Test Examples - 3D Ball :8ball:
### Setup
* Open Unity and install MLAgents from `Window -> Package Manager -> MLAgents -> Install`

* Open the 3DBall scene from the Assets folder.

* The screen should look like this

### Test the pretrained model
* Click the Play button to see the pretrained model's inference result: balancing the ball on the cube.
### Train
* Train the 3D cube by entering the following command in the console.
`mlagents-learn config/trainer_config.yaml --run-id ANY_NAME --train`
* This creates a folder named `ANY_NAME` and stores the training results there, such as the TensorBoard logs and the trained weight file (`.nn`).
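* Training progress can be monitored with TensorBoard, e.g. `tensorboard --logdir summaries` (the output directory name varies across ML-Agents versions; newer releases use `results`).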
### Test your trained model
* To test the training output, drag the `.nn` file onto the Model field in the Agent's Behavior Parameters component in Unity.

* Finally, click the Play button to check the inference result.
## BallJump :basketball:
### Objective
Move the ball to the target box in 3D space.
### Setup
Install the ML Agents and ProBuilder packages into Unity from `Window -> Package Manager`

### Unity Environment
Your environment should include an agent with the following components:
- Collider (box, sphere, or mesh)
- Rigidbody: so the agent is affected by the forces of the environment
- Decision Requester
- Agent logic script (a Behaviour Parameters component is added automatically with it)

A screenshot of the BallJump environment I created is shown below.

### Agent Logic Script
- The complete agent logic script is below.
:::spoiler BallAgentLogic.cs
```csharp=
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

public class BallAgentLogic : Agent
{
    Rigidbody rBody;
    public Transform target;
    public float speed = 20;

    // Start is called before the first frame update
    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    public override void OnEpisodeBegin()
    {
        // Reset the agent's motion and position, and respawn the target randomly
        this.rBody.angularVelocity = Vector3.zero;
        this.rBody.velocity = Vector3.zero;
        this.transform.localPosition = new Vector3(-9, 0.5f, 0);
        target.localPosition = new Vector3(12 + Random.value * 8, Random.value * 3, Random.value * 10 - 5);
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // 9 observations: target position (3), agent position (3), agent velocity (3)
        sensor.AddObservation(target.localPosition);
        sensor.AddObservation(this.transform.localPosition);
        sensor.AddObservation(rBody.velocity);
    }

    public override void OnActionReceived(float[] vectorAction)
    {
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = vectorAction[0];
        // Discrete branch values 0 / 1 / 2 map to z = 0 / -1 / +1
        if (vectorAction[1] == 2)
        {
            controlSignal.z = 1;
        }
        else
        {
            controlSignal.z = -vectorAction[1];
        }
        // Only apply force while the agent is left of the target zone
        if (this.transform.localPosition.x < 8.5)
        {
            rBody.AddForce(controlSignal * speed);
        }

        // Reward and end the episode when the agent reaches the target
        float distanceToTarget = Vector3.Distance(this.transform.localPosition, target.localPosition);
        if (distanceToTarget < 1.42f)
        {
            SetReward(1.0f);
            EndEpisode();
        }
        // End the episode (no reward) if the agent falls off the platform
        if (this.transform.localPosition.y < 0)
        {
            EndEpisode();
        }
    }

    // Optional: manual control with the keyboard instead of the neural network
    public override void Heuristic(float[] actionsOut)
    {
        actionsOut[0] = Input.GetAxis("Vertical");
        actionsOut[1] = Input.GetAxis("Horizontal");
    }
}
```
:::
- We need to import these three namespaces,
```csharp=
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
```
- Make a custom class that inherits from the `Agent` class of ML-Agents.
- Override the following methods of the `Agent` class:
    - OnEpisodeBegin
    - CollectObservations
    - OnActionReceived
    - Heuristic (optional): lets us control the agent using the keyboard or any input other than the neural network.
### Observations (9)
1. Target's location (3 observations)
2. Agent's location (3 observations)
3. Agent's velocity (3 observations)
### Actions (2; discrete)
1. move vector x (+1 = right, -1 = left)
2. move vector z (+1 = forward, -1 = backward)
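
In `OnActionReceived` above, the second action's discrete values 0, 1, and 2 map to z = 0, -1, and +1, while the first action's value is used directly as the x control signal.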
### Rewards
1. Agent touches the target: +1
### Training
:::spoiler Config parameters
```yaml
BallAgent:
  trainer: ppo
  batch_size: 128
  beta: 1.0e-3
  buffer_size: 5000
  epsilon: 0.15
  hidden_units: 128
  lambd: 0.92
  learning_rate: 3.0e-4
  learning_rate_schedule: linear
  max_steps: 2.0e6
  normalize: true
  num_epoch: 3
  num_layers: 3
  time_horizon: 128
  summary_freq: 50000
  use_recurrent: false
  reward_signals:
    extrinsic:
      strength: 1.0
      gamma: 0.99
```
:::
Shell command
`mlagents-learn PATH_TO_CONFIG --run-id RUN_ID --train`
### Tensorboard result
:::spoiler See Graphs



:::
## Hummingbird :bird:
### Project Setup
* [Download the course assets](https://learn.unity.com/tutorial/initial-setup-asset-import-default-contact-offset?uv=2019.3&courseId=5e470160edbc2a15578b13d7&projectId=5ec82fa7edbc2a0d36845ae2#)
* Create a new project in Unity Hub, selecting the "Universal Render Pipeline" template
* Drag the downloaded Hummingbird scene into the Assets folder inside Unity
* Double-click the Training file inside the Hummingbird/Scenes folder
* (Optional) Delete the folders "Example", "Scenes", "Tutorial" and the file named "Readme" inside the Assets folder. (Don't delete anything inside the Hummingbird folder)
* Go to "Edit/Project Settings/Physics" and change Default Contact Offset from 0.01 to 0.001 so that collisions work better in this project.
* Fix the lighting:
    * Go to "Window/Rendering/Lighting Settings" and click the "Generate Lighting" button.
    * From Assets/Materials, drag Skybox_Mat to "Skybox Material" inside Lighting/Environment.
### C# scripts
* Reset the flower
* Set the nectar-extraction speed
* Turn the flower from red to purple after its nectar is extracted
* Search for the flower_plant tag
* Get the flower nectar
* Find the nearest flower
* Calculate the distance between the bird's beak tip and the nearest flower
* Move the bird to a safe random position when an episode starts
* Freeze the bird
* Trigger when the bird enters the nectar area (see the sketch below)
* Heuristic: keyboard controls
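
A minimal sketch of the nectar trigger, assuming a hypothetical `BeakTipTrigger` class; the "nectar" tag is the one created in Scene Setup below:
```csharp
using UnityEngine;

// Hypothetical sketch: reacting when the bird's beak tip enters a nectar collider.
// Class and method names here are assumptions, not the course's exact code.
public class BeakTipTrigger : MonoBehaviour
{
    private void OnTriggerEnter(Collider other)
    {
        // The nectar collider is tagged "nectar" (see Scene Setup below)
        if (other.CompareTag("nectar"))
        {
            // The agent script would start drinking from this flower here
            Debug.Log("Beak tip entered a nectar collider");
        }
    }
}
```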
### RL
#### Observations (10)
1. Bird's rotation (4 observations)
2. Vector from the bird's beak tip to the nearest flower (3 observations)
3. Whether the bird's beak tip is in front of the flower (-1 to 1)
4. Whether the bird's beak tip is pointing toward the flower (-1 to 1)
5. Distance from the bird's beak tip to the nearest flower, relative to the flower area diameter (0 to 1)
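
A minimal sketch of how these ten observations might be collected, using the old `float[]`-era API seen elsewhere in this note; the fields (`beakTip`, `nearestFlower`) and the area diameter constant are assumptions, not the course's exact code:
```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

public class HummingbirdAgentSketch : Agent
{
    public Transform beakTip;        // tip of the bird's beak (assumed field)
    public Transform nearestFlower;  // closest flower, updated elsewhere (assumed field)
    const float AreaDiameter = 20f;  // assumed diameter of the flower area

    public override void CollectObservations(VectorSensor sensor)
    {
        // 4 observations: the bird's rotation as a normalized quaternion
        sensor.AddObservation(transform.localRotation.normalized);

        // 3 observations: normalized vector from the beak tip to the nearest flower
        Vector3 toFlower = nearestFlower.position - beakTip.position;
        sensor.AddObservation(toFlower.normalized);

        // 1 observation: is the beak tip in front of the flower? (-1 to 1)
        sensor.AddObservation(Vector3.Dot(toFlower.normalized, -nearestFlower.up));

        // 1 observation: is the beak pointing toward the flower? (-1 to 1)
        sensor.AddObservation(Vector3.Dot(beakTip.forward, -nearestFlower.up));

        // 1 observation: distance to the flower relative to the area diameter (0 to 1)
        sensor.AddObservation(toFlower.magnitude / AreaDiameter);
    }
}
```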
#### Actions (5; continuous)
1. move vector x (+1 = right, -1 = left)
2. move vector y (+1 = up, -1 = down)
3. move vector z (+1 = forward, -1 = backward)
4. pitch angle (+1 = pitch up, -1 = pitch down)
5. yaw angle (+1 = turn right, -1 = turn left)
#### Rewards
1. Inside nectar area: 0.1
2. Inside nectar area and drinking it: 0.2
3. Colliding boundary: -0.5
### Scene Setup
Load FloatingIsland from Prefabs
- Create 3 tags in Project Settings, then
    - Add the boundary tag to the IslandBoundaries
    - Add the nectar tag to the FlowerNectarCollider
    - Add the flower_plant tag to the FlowerPlant
- Add script components
    - Add the FlowerArea script as a component to FloatingIsland
    - Add the Flower script as a component to the Flower inside FlowerBud
    - Add the Hummingbird script as a component to the Hummingbird agent
- Add a DecisionRequester component; decision every 5 steps
- HummingbirdAgent script and Behaviour Parameters component
- Ray Perception sensor
- Heuristic (keyboard control; see the sketch below)
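
For the Heuristic, a minimal keyboard-control sketch for the five continuous actions listed above, using the old `float[]` API from this note; all key bindings here are assumptions:
```csharp
public override void Heuristic(float[] actionsOut)
{
    // Indices follow the Actions list above: 0-2 movement, 3 pitch, 4 yaw
    actionsOut[0] = Input.GetAxis("Horizontal");                 // x: left/right
    actionsOut[1] = (Input.GetKey(KeyCode.E) ? 1f : 0f)
                  - (Input.GetKey(KeyCode.C) ? 1f : 0f);         // y: up/down (assumed keys)
    actionsOut[2] = Input.GetAxis("Vertical");                   // z: forward/back
    actionsOut[3] = (Input.GetKey(KeyCode.UpArrow) ? 1f : 0f)
                  - (Input.GetKey(KeyCode.DownArrow) ? 1f : 0f); // pitch
    actionsOut[4] = (Input.GetKey(KeyCode.RightArrow) ? 1f : 0f)
                  - (Input.GetKey(KeyCode.LeftArrow) ? 1f : 0f); // yaw
}
```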
### Tensorboard result
## 3D Ball

### Observations (8)
1. rotation.z
2. rotation.x
3. ball.transform.position - gameObject.transform.position
4. ball.velocity
### Actions (2; continuous)
1. rotate along z axis
2. rotate along x axis
### Rewards
1. -1
* |ball.position.y - agent.position.y|
* |ball.position.y - agent.position.y|
* |ball.position.y - agent.position.y|
2. 0.1
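
For reference, a sketch of reward logic matching the list above, paraphrased from the ML-Agents 3DBall sample agent (exact thresholds may differ between versions):
```csharp
// Inside the 3DBall agent's OnActionReceived, after applying the rotation:
if ((ball.transform.position.y - gameObject.transform.position.y) < -2f ||
    Mathf.Abs(ball.transform.position.x - gameObject.transform.position.x) > 3f ||
    Mathf.Abs(ball.transform.position.z - gameObject.transform.position.z) > 3f)
{
    SetReward(-1f);   // the ball fell off
    EndEpisode();
}
else
{
    SetReward(0.1f);  // small reward for every step the ball stays up
}
```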
## Basic

### Observations (3)
1. Agent position
2. small goal position
3. big goal position
### Actions (1; discrete)
1. 1 axis direction
### Rewards
1. -0.01: no goal
2. 0.1: small goal
3. 1: big goal
## Bouncer

### Observations (6)
1. Agent position (3 observations)
2. Target position (3 observations)
### Actions (3; continuous)
1. x-axis force
2. y-axis force (scaled from 0 to 1)
3. z-axis force
### Rewards
1. -0.05 x (mean square of actions)
2. -1: Agent.position.y < -1
3. -1: outside the floor
4. +1: Touch the target
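
A one-line sketch of how reward 1 could be computed inside `OnActionReceived` (the mean square of the three continuous actions, scaled by -0.05; the variable names are assumptions):
```csharp
// Penalize large actions: -0.05 * mean square of the 3 continuous actions
float x = vectorAction[0], y = vectorAction[1], z = vectorAction[2];
AddReward(-0.05f * (x * x + y * y + z * z) / 3f);
```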
## Crawler
<p float="center">
<img src="https://i.imgur.com/mrlkE88.gif" width="360" />
<img src="https://i.imgur.com/wQZG3bF.gif" width="360" />
</p>
### Observations (126)
1. For each body part (bp):
    1. whether it is touching the ground
    2. velocityRelativeToLookRotationToTarget
    3. angularVelocityRelativeToLookRotationToTarget
2. For each body part except the center body:
    1. localPosRelToBody
    2. bp.currentXNormalizedRot
    3. bp.currentYNormalizedRot
    4. bp.currentZNormalizedRot
    5. strength ratio (current strength / max strength)
### Actions (20; continuous)
### Rewards
## ToDo
<p float="center">
<img src="https://i.imgur.com/xSr2F1E.gif" width="360" />
<img src="https://i.imgur.com/QG8vh9p.gif" width="360" />
</p>





### Observations (6)
### Actions (3; continuous)
### Rewards
{%hackmd theme-dark %}