# Implementation
[Discussion Here](https://hackmd.io/uun9tRn9TFq2TfbwNkj9dA)
## Mirage Mazes
### RL Algorithm: TD-lambda
- Action Space:
  - Traverse the maze
    - Up
    - Down
    - Left
    - Right
  - Check a wall for mirage nature
- State Space:
  - Every cell in the maze is a state
- Reward Function:
  - Traversing the maze = $-1$
  - Reaching the goal = $0$
  - Finding a mirage wall = $1 / d_{\text{Manhattan}}(\text{current}, \text{end})$
  - Checking a normal (non-mirage) wall = $-1 / d_{\text{Manhattan}}(\text{current}, \text{end})$
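The reward scheme above can be sketched as a small function. This is a minimal sketch under stated assumptions: cells are `(row, col)` tuples, and the event names (`"move"`, `"goal"`, `"mirage_found"`, `"normal_wall_checked"`) are hypothetical labels, since the document does not fix an interface.

```python
def manhattan(a, b):
    # Manhattan distance between two (row, col) cells
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def reward(current, end, event):
    # event names are placeholders; the doc leaves them unspecified
    if event == "goal":
        return 0.0
    if event == "move":
        return -1.0
    # Wall-check rewards scale inversely with distance to the end cell.
    # (At the end cell itself d would be 0; wall checks are assumed not
    # to happen there, matching the reward table above.)
    d = manhattan(current, end)
    if event == "mirage_found":
        return 1.0 / d
    if event == "normal_wall_checked":
        return -1.0 / d
    raise ValueError(f"unknown event: {event}")
```

Note how checks made far from the end state yield smaller-magnitude rewards, so the agent is pushed to investigate walls near the goal.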
(The rewards, and the functions used to deliver them, are subject to experimentation.)
To be explored: changing the traversal cost to 0 and the reward for reaching the end state to 1. Expectation: finding mirage walls becomes relatively more rewarding, since there is no longer a per-step traversal penalty overshadowing the mirage-wall reward.
**Update**: the reward for finding a mirage wall will have to be delivered after 'k' traversals, so that the effect of the identification can be observed.
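The TD-lambda learner itself can be sketched as a tabular TD(λ) step with accumulating eligibility traces. This is a generic sketch, not the project's actual code; the hyperparameters (`alpha`, `gamma`, `lam`) are placeholder values the document does not specify.

```python
from collections import defaultdict

def td_lambda_step(V, trace, state, next_state, r,
                   alpha=0.1, gamma=0.9, lam=0.8):
    # One tabular TD(lambda) update with accumulating traces.
    # V and trace are defaultdict(float) keyed by state (maze cell).
    delta = r + gamma * V[next_state] - V[state]  # TD error
    trace[state] += 1.0                            # bump trace for current state
    for s in list(trace):
        V[s] += alpha * delta * trace[s]           # credit recent states
        trace[s] *= gamma * lam                    # decay all traces
    return V, trace

# Hypothetical usage over one transition:
V, trace = defaultdict(float), defaultdict(float)
V, trace = td_lambda_step(V, trace, state=(0, 0), next_state=(0, 1), r=-1.0)
```

Because eligibility traces already spread credit backwards over recent states, a mirage-wall reward delivered 'k' steps late would still update the cell where the identification was made, which is one reason TD(λ) fits the delayed-reward scheme described above.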