# Week 1b - Intelligent Agents
###### [Video Link](https://www.dropbox.com/s/tgv7miz51jp0iqz/Week%201b%20-%20Agents.mp4?dl=0)
## Outline
- Basic idea behind rational agents and the agent model (PEAS description):
  - Performance measure
  - Environment
  - Actuators
  - Sensors
- Understand the environment type in terms of the six characteristics
## Recap: Definitions of AI
Previously, several definitions of AI as a discipline were mentioned. Intelligent agents correspond to the definition concerned with creating machines that **act rationally**.
Acting rationally means doing the right thing: always making the best decision given the available resources. The resources and optimization criteria are user-defined and vary across applications.
## Agent
An **agent** is anything that can be viewed as **perceiving** its **environment** through **sensors** and **acting** upon that environment through **actuators**.
e.g. Human agent:
- Sensors: eyes, ears, nose
- Actuators: hands, legs, mouth

e.g. Robotic agent:
- Sensors: cameras, IR range finders
- Actuators: motors, limbs
### The Agent Model

Agent problem specification/formulation:
- **P**erformance Measure of the desirability of environment states.
- **E**nvironment in which the agent exists.
- **A**ctions which may affect the environment, made by actuators.
- Percepts/observations of the environment, made by **S**ensors.
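As a concrete sketch, a PEAS specification can be written down as a simple record. The Python dataclass below and the vacuum-world instance are illustrative only (names and field values are assumptions, not from the lecture):

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    """PEAS problem specification for an agent (illustrative sketch)."""
    performance_measure: str
    environment: str
    actuators: list[str]   # actions the agent can take
    sensors: list[str]     # percepts the agent can obtain

# Hypothetical PEAS description for the vacuum-cleaner world used below.
vacuum_peas = PEAS(
    performance_measure="+1 for each clean location per time step",
    environment="two locations, A and B, each either clean or dirty",
    actuators=["move Left", "move Right", "Suck"],
    sensors=["current location", "dirt status of current location"],
)
```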
### Rational Agents
A rational agent selects actions that maximise its expected utility.
The characteristics of the percepts, environment, and action space dictate techniques for selecting rational actions.
More formally, a rational agent is defined as a function:
$f: P^* \rightarrow A$ that maps a sequence of percept vectors $[p_0, p_1, \ldots, p_t]$ (with each $p_i$ drawn from the percept set $P$) to an action $a$ from the set $A = \{a_0, a_1, \ldots, a_k\}$
The agent iteratively obtains percepts from the environment and, based on its evaluation of them, carries out the action that is expected to increase the performance measure.
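A minimal sketch of this perceive-evaluate-act loop, assuming hypothetical `environment.percept()` and `environment.apply()` interfaces (not defined in the lecture):

```python
def run_agent(agent_fn, environment, steps):
    """Run an agent function f: percept sequence -> action for a number of steps."""
    percepts = []                                  # the percept sequence [p_0, ..., p_t]
    for _ in range(steps):
        percepts.append(environment.percept())     # sense the environment
        action = agent_fn(percepts)                # f maps the sequence to an action
        environment.apply(action)                  # act on the environment
```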
#### Example: Vacuum-cleaner World
The world consists of two locations, A and B, each of which may be clean or dirty. The agent perceives its current location and whether that location is dirty, and can move *Left*, move *Right*, or *Suck* up dirt.
A list of possible percept sequences and the corresponding actions is as follows:

| Percept sequence | Action |
| --- | --- |
| [A, Clean] | Right |
| [A, Dirty] | Suck |
| [B, Clean] | Left |
| [B, Dirty] | Suck |
| [A, Clean], [A, Clean] | Right |
| [A, Clean], [A, Dirty] | Suck |
| ... | ... |
These sequences could be summarized by the following pseudocode:
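A minimal Python rendering of this reflex rule (the standard two-location formulation; only the most recent percept is needed):

```python
def reflex_vacuum_agent(percept):
    """Choose an action from the latest percept, e.g. ("A", "Dirty")."""
    location, status = percept
    if status == "Dirty":
        return "Suck"        # clean the current location
    elif location == "A":
        return "Right"       # move to the other location
    elif location == "B":
        return "Left"
```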

A rational agent always does the right thing, meaning every entry in the table specifies the correct action for its percept sequence. However, what if the table is infinitely large?
## Rationality
The right thing can instead be determined through approximation, together with the concept of a **performance measure**.
A performance measure should be objective and quantifiable. In the example above, possible (alternative) performance measures could be:
- +1 for each cleaned location
- +5 for each cleaned location, -1 per move
A performance measure should be based on a desired state of the environment rather than on how the agent should behave.
e.g. total time spent cleaning (agent behavior) vs. total number of clean tiles (environment state). Using total time spent cleaning would encourage the agent to keep performing cleaning operations regardless of dirt conditions.
A rational agent chooses whichever action maximizes the expected value of the performance measure, given the percept sequence to date and prior knowledge of the environment.
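In code form, this choice is an argmax over expected performance. The sketch below assumes a hypothetical outcome model `outcome_distribution(state, action)` returning `(next_state, probability)` pairs and a `performance(state)` function; neither is specified in the lecture:

```python
def rational_action(state, actions, outcome_distribution, performance):
    """Pick the action with the highest expected performance measure."""
    def expected_value(action):
        # Sum over possible next states, weighted by their probabilities.
        return sum(prob * performance(next_state)
                   for next_state, prob in outcome_distribution(state, action))
    return max(actions, key=expected_value)
```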
## Limits of Rationality
Ideally, an agent would maximize its actual performance. However, this is almost impossible:
- Rationality ≠ omniscience: percepts may not provide all the required information.
- Rationality ≠ clairvoyance: the actual outcome of actions may not be as expected.
Hence, we aim for a "bounded" rationality based on expected performance, not actual performance.
### Examples: PEAS
Classic PEAS descriptions include:

| Agent | Performance Measure | Environment | Actuators | Sensors |
| --- | --- | --- | --- | --- |
| Automated taxi driver | Safe, fast, legal, comfortable trip | Roads, traffic, pedestrians, customers | Steering, accelerator, brake, signal, horn | Cameras, GPS, speedometer, odometer |
| Medical diagnosis system | Healthy patient, minimized costs | Patient, hospital, staff | Screen display of questions, tests, diagnoses, treatments | Keyboard entry of symptoms and findings |
## Environment
Environments can be classified along six characteristics:
- Whether the agent has access to the **complete state** of the environment (**Fully observable** vs **Partially observable**).
- Whether the **next state** of the environment is **completely determined** by the current state and the agent's action (**Deterministic** vs **Stochastic**). If the environment is deterministic except for the actions of other agents, the environment is **Strategic**. The difference is a matter of probability: in a deterministic environment, state S1 + action A -> state S2 with 100% probability, whereas in a stochastic environment, state S1 + action A might lead to state S2 with 50% probability and to state S3 with 50% probability (see the sketch after this list). A skill-based game like basketball is stochastic: the outcome depends not only on the current state and the player's action but also on the player's skill level, so two players shooting from the same position may get different results, with the more skilled one more likely to score.
- Whether the agent's **choice** of action in the current "episode" is **based on previous** "episodes" (**Episodic** vs **Sequential**). Each episode consists of the agent perceiving and then performing a single action; in an episodic environment, the choice of action in each episode depends only on that episode itself.
- Whether the environment **changes** while the agent is **considering actions** (**Static** vs **Dynamic**). An environment is **semidynamic** if the environment itself does not change with the passage of time but the agent's performance score does (e.g. the agent is penalized after a certain amount of time passes).
- The **distinctness** of percepts and actions: whether they form a limited set of distinct values (**Discrete**) or range over continuous values (**Continuous**).
- The **number** of agents acting/operating in the same environment (**Single agent** vs **Multi-agent**).
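The deterministic vs stochastic distinction above can be made concrete as two kinds of transition function (a sketch; the transition tables are illustrative assumptions):

```python
import random

# Deterministic: (state, action) always yields exactly one successor.
def deterministic_step(state, action, transitions):
    # e.g. transitions = {("S1", "A"): "S2"}
    return transitions[(state, action)]

# Stochastic: (state, action) yields a probability distribution over successors.
def stochastic_step(state, action, transitions):
    # e.g. transitions = {("S1", "A"): [("S2", 0.5), ("S3", 0.5)]}
    successors, probs = zip(*transitions[(state, action)])
    return random.choices(successors, weights=probs)[0]
```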
The real world is mostly:
- Partially observable
- Stochastic
- Sequential
- Dynamic
- Continuous
- Multi-Agent
making it the hardest kind of environment to design a solution for.
In contrast, the simplest environment is:
- Fully observable
- Deterministic
- Episodic
- Static
- Discrete
- Single-Agent
which will be assumed for the next lecture.
### Exercise: Environment Types
