# Unity and ML-Agents

[Toc]

## Install

### Install **Unity**

* [Ref](https://docs.unity3d.com/Manual/GettingStartedInstallingHub.html)
* Download and install [Unity Hub](https://unity3d.com/get-unity/download?_ga=2.126136909.1714281345.1617063030-1632127043.1615762880) and open it
* Go to the Installs tab and install Unity (preferably an LTS version)

### Install **ML-Agents** :male-detective:

* Go to the GitHub [link](https://hackmd.io/@jitesh/mlagents)
* :::spoiler Download the "Verified Package 1.0.7" as shown below
  ![](https://i.imgur.com/JlnpecJ.png)
  :::

## Make your Environment in **Unity**

:::spoiler Show details
Open Unity Hub and click "NEW".
![](https://i.imgur.com/IY562Db.png)
Select 3D, enter the project name and location, then click the CREATE button.
![](https://i.imgur.com/J5kHsGO.png)
:::

## Train

### Config

[Ref](https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Training-Configuration-File.md)

### Training command

After opening the Unity scene, check that the behavior name is the same as the name inside the config ".yaml" file, and that the Behavior Type is set to Default.

## Test Examples - 3D Ball :8ball:

### Setup

* Open Unity and install ML-Agents from `Window -> Package Manager -> MLAgents -> Install`
![](https://i.imgur.com/ZxBIeU3.png)
* Open the 3DBall scene from the Assets folder.
![](https://i.imgur.com/mvpMUyE.png)
* The screen should look like this:
![](https://i.imgur.com/3ZHrtFe.png)

### Test the pretrained model

* Click the play button to watch the pretrained model's inference balance the ball on the cube.

### Train

* Train the 3D cube by entering the following command in the console:
`mlagents-learn config/trainer_config.yaml --run-id ANY_NAME --train`
* This creates a folder named 'ANY_NAME' that stores the training results, such as the TensorBoard log info and the weight file (.nn).

### Test your trained model

* To test the training output, drag the .nn file onto the Model field in the Behavior Parameters component of the Agent in Unity.
![](https://i.imgur.com/VnhV7F6.png)
* Finally, click the play button to check the inference result.

## BallJump :basketball:

### Objective

Move the ball to the target box in 3D space.

### Setup

Install the ML-Agents and ProBuilder packages in Unity from `Window -> Package Manager`.
![](https://i.imgur.com/oTMJCSW.png)

### Unity Environment

Your environment should include an agent with the following components:
- Collider component (box, sphere or mesh)
- Rigidbody: so the agent is affected by forces in the environment
- Decision Requester component
- Agent Logic Script component (the Behavior Parameters component is added automatically with this component)

A screenshot of the BallJump environment I created is shown below.
![](https://i.imgur.com/MK5A4dg.png)

### Agent Logic Script

- The complete Agent Logic script is here.

:::spoiler BallAgentLogic.cs
```csharp=
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

public class BallAgentLogic : Agent
{
    Rigidbody rBody;
    public Transform target;
    public float speed = 20;

    // Start is called before the first frame update
    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    public override void OnEpisodeBegin()
    {
        // Reset the agent's momentum and position,
        // then move the target to a random spot
        this.rBody.angularVelocity = Vector3.zero;
        this.rBody.velocity = Vector3.zero;
        this.transform.localPosition = new Vector3(-9, 0.5f, 0);
        target.localPosition = new Vector3(12 + Random.value * 8,
                                           Random.value * 3,
                                           Random.value * 10 - 5);
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // 9 observations: target position, agent position, agent velocity
        sensor.AddObservation(target.localPosition);
        sensor.AddObservation(this.transform.localPosition);
        sensor.AddObservation(rBody.velocity);
    }

    public override void OnActionReceived(float[] vectorAction)
    {
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = vectorAction[0];
        if (vectorAction[1] == 2)
        {
            controlSignal.z = 1;
        }
        else
        {
            controlSignal.z = -vectorAction[1];
        }
        if (this.transform.localPosition.x < 8.5)
        {
            rBody.AddForce(controlSignal * speed);
        }

        float distanceToTarget = Vector3.Distance(this.transform.localPosition,
                                                  target.localPosition);
        // Reached the target
        if (distanceToTarget < 1.42f)
        {
            SetReward(1.0f);
            EndEpisode();
        }
        // Fell off the platform
        if (this.transform.localPosition.y < 0)
        {
            EndEpisode();
        }
    }

    public override void Heuristic(float[] actionsOut)
    {
        // Keyboard control for manual testing
        actionsOut[0] = Input.GetAxis("Vertical");
        actionsOut[1] = Input.GetAxis("Horizontal");
    }
}
```
:::

- We need to import these 3 namespaces:
```csharp=
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
```
- Make a custom class by inheriting from the `Agent` class of ML-Agents.
- Override the following methods of the `Agent` class:
    - OnEpisodeBegin
    - CollectObservations
    - OnActionReceived
    - Heuristic (optional): this method lets us control the agent with the keyboard, or any input other than the neural network.

### Observations (9)

1. Target's location (3 observations)
2. Agent's location (3 observations)
3. Agent's velocity (3 observations)

### Actions (2; discrete)

1. move vector x (+1 = right, -1 = left)
2. move vector z (+1 = forward, -1 = backward)

### Rewards

1. Agent touches the target: +1

### Training

:::spoiler Config parameters
```yaml
BallAgent:
  trainer: ppo
  batch_size: 128
  beta: 1.0e-3
  buffer_size: 5000
  epsilon: 0.15
  hidden_units: 128
  lambd: 0.92
  learning_rate: 3.0e-4
  learning_rate_schedule: linear
  max_steps: 2.0e6
  normalize: true
  num_epoch: 3
  num_layers: 3
  time_horizon: 128
  summary_freq: 50000
  use_recurrent: false
  reward_signals:
    extrinsic:
      strength: 1.0
      gamma: 0.99
```
:::

Shell command:
`mlagents-learn PATH_TO_CONFIG --run-id RUN_ID --train`

### Tensorboard result

:::spoiler See Graphs
![](https://i.imgur.com/BlfxxDy.png)
![](https://i.imgur.com/VHBZfhA.png)
![](https://i.imgur.com/h4bsOAN.png)
:::

## Hummingbird :bird:

<!-- ### Coming soon :soon: :bird: :::spoiler -->

### Project Setup

* [Download](https://learn.unity.com/tutorial/initial-setup-asset-import-default-contact-offset?uv=2019.3&courseId=5e470160edbc2a15578b13d7&projectId=5ec82fa7edbc2a0d36845ae2#) the course assets
* Create a new project in Unity Hub by selecting "Universal Render Pipeline"
* Drag the downloaded Hummingbird assets into the Assets folder inside Unity
* Double-click the Training scene inside the folder Hummingbird/Scenes
* (Optional) Delete the folders "Example", "Scenes", "Tutorial" and the "Readme" file inside the Assets folder. (Don't delete anything inside the Hummingbird folder)
* Go to "Edit/Project Settings/Physics" and change Default Contact Offset from 0.01 to 0.001 so that collisions work better in this project
* Fix the lighting:
    * Go to "Window/Rendering/Lighting Settings" and click the "Generate Lighting" button
    * From Assets/Materials, drag Skybox_Mat onto "Skybox Material" inside Lighting/Environment
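Training this scene later needs a trainer configuration whose top-level key matches the behavior name set in the agent's Behavior Parameters. The fragment below is a hypothetical sketch modeled on the BallAgent config from the BallJump section; the behavior name `Hummingbird` and all hyperparameter values are assumptions, not taken from the course.

```yaml
Hummingbird:
  trainer: ppo
  batch_size: 2048
  buffer_size: 20480
  hidden_units: 256
  num_layers: 2
  max_steps: 5.0e6
  time_horizon: 128
  summary_freq: 100000
  reward_signals:
    extrinsic:
      strength: 1.0
      gamma: 0.99
```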
### C# scripts

* Reset the flower
* Set the nectar-extraction speed
* Turn the flower from red to purple after the nectar is extracted
* Search for the flower_plant tag
* Get flowerNectar
* Find the nearest flower
* Calculate the distance between the bird's beak tip and the nearest flower
* Move the bird to a safe random position when an episode starts
* Freeze the bird
* Trigger when the bird enters the nectar area
* Heuristic: keyboard controls

### RL

#### Observations (10)

1. Bird's rotation (4 observations)
2. Vector from the bird's beak tip to the nearest flower (3 observations)
3. Whether the bird's beak tip is in front of the flower (-1 to 1)
4. Whether the bird's beak tip is pointing towards the flower (-1 to 1)
5. Distance from the bird's beak tip to the nearest flower, relative to the flower-area diameter (0 to 1)

#### Actions (5; continuous)

1. move vector x (+1 = right, -1 = left)
2. move vector y (+1 = up, -1 = down)
3. move vector z (+1 = forward, -1 = backward)
4. pitch angle (+1 = pitch up, -1 = pitch down)
5. yaw angle (+1 = turn right, -1 = turn left)

#### Rewards

1. Inside the nectar area: 0.1
2. Inside the nectar area and drinking: 0.2
3. Colliding with the boundary: -0.5

### Scene Setup

Load FloatingIsland from Prefabs
- Create 3 tags in Project Settings, then
    - Add the boundary tag to the IslandBoundaries
    - Add the nectar tag to the FlowerNectarCollider
    - Add the flower_plant tag to the FlowerPlant
- Add script components
    - Add the FlowerArea script as a component to FloatingIsland
    - Add the Flower script as a component to the Flower inside FlowerBud
    - Add the Hummingbird script as a component to the Flower inside FlowerBud
- Add a DecisionRequester component; 5 steps
- HummingbirdAgent script and Behaviour Parameters component
- Ray perception
- Heuristic

<!-- ::: -->

### Tensorboard result

## 3D Ball

![](https://i.imgur.com/m6wwfd2.gif)

### Observations (8)

1. rotation.z
2. rotation.x
3. ball.transform.position - gameObject.transform.position (3 observations)
4. ball.velocity (3 observations)

### Actions (2; continuous)

1. rotate along the z axis
2. rotate along the x axis

### Rewards

1. -1: the ball falls off the platform (|ball.position - agent.position| exceeds a threshold)
2. +0.1: every step the ball stays balanced

## Basic

![](https://i.imgur.com/h54NdX8.gif)

### Observations (3)

1. Agent position
2. small goal position
3. big goal position

### Actions (1; discrete)

1. move along one axis

### Rewards

1. -0.01: each step without reaching a goal
2. +0.1: small goal
3. +1: big goal

## Bouncer

![](https://i.imgur.com/oCh64XV.gif)

### Observations (6)

1. Agent position (3 observations)
2. Target position (3 observations)

### Actions (3; continuous)

1. x-axis force
2. y-axis force (scaled from 0 to 1)
3. z-axis force

### Rewards

1. -0.05 × (mean square of the actions)
2. -1: Agent.position.y < -1
3. -1: agent lands outside the floor
4. +1: agent touches the target

## Crawler

<p float="center">
  <img src="https://i.imgur.com/mrlkE88.gif" width="360" />
  <img src="https://i.imgur.com/wQZG3bF.gif" width="360" />
</p>

### Observations (126)

1. For each body part (bp):
    1. is it touching the ground?
    2. velocityRelativeToLookRotationToTarget
    3. angularVelocityRelativeToLookRotationToTarget
2. For each body part except the center body:
    1. localPosRelToBody
    2. bp.currentXNormalizedRot
    3. bp.currentYNormalizedRot
    4. bp.currentZNormalizedRot
    5. strength ratio (current strength / max strength)

### Actions (20; continuous)

### Rewards

## ToDo

![Grid-Unity](https://i.imgur.com/WOAAMi8.gif)
![Hallway](https://i.imgur.com/dbG8ByL.gif)
![PushBlock](https://i.imgur.com/hOAb44H.gif)
![Food collector Unity](https://i.imgur.com/xSr2F1E.gif)
![](https://i.imgur.com/QG8vh9p.gif)

### Observations (6)

### Actions (3; continuous)

### Rewards

{%hackmd theme-dark %}
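The reward definitions listed above are all simple scalar functions of state and action, so they can be sanity-checked outside Unity. As an example, here is the Bouncer action penalty (-0.05 × mean square of the actions) in plain Python; `bouncer_action_penalty` is a hypothetical helper written for illustration, not part of the ml-agents API.

```python
def bouncer_action_penalty(actions):
    """Bouncer-style step penalty: -0.05 times the mean
    of the squared action components."""
    return -0.05 * sum(a * a for a in actions) / len(actions)

# Stronger actions are penalized more, nudging the agent
# towards reaching the target with minimal force.
no_force = bouncer_action_penalty([0.0, 0.0, 0.0])
full_force = bouncer_action_penalty([1.0, 1.0, 1.0])
```

This mirrors reward item 1 of the Bouncer list: a zero action costs nothing, while larger forces on the three axes cost proportionally more each step.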