The following are used to monitor and mitigate risks involved with automated trading systems:
## Performance
#### Monitoring
- Portfolio net leverage: (total long notional - total short notional) / net equity, where net equity = margin + unrealized PnL
- Portfolio gross leverage: (total long notional + total short notional) / net equity
- Portfolio PnL
- Live vs theoretical PnL
- Instrument/asset PnL (identify which assets or instruments lose money because of idiosyncratic properties such as volatility, mean reversion, rug pulls, tick size, ...)
- Return after fill at different time horizons (identify bad quoting)
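The two leverage definitions above can be sketched as follows. This is a minimal illustration, not a reference implementation: the `Position` class, the signed-notional convention (positive = long, negative = short), and the function name `leverage` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Position:
    instrument: str
    notional: float  # assumed convention: positive = long, negative = short


def leverage(positions, margin, unrealized_pnl):
    """Return (net_leverage, gross_leverage) for a list of positions."""
    net_equity = margin + unrealized_pnl
    if net_equity <= 0:
        raise ValueError("net equity must be positive to compute leverage")
    long_notional = sum(p.notional for p in positions if p.notional > 0)
    short_notional = sum(-p.notional for p in positions if p.notional < 0)
    net = (long_notional - short_notional) / net_equity
    gross = (long_notional + short_notional) / net_equity
    return net, gross


book = [Position("BTC", 30_000.0), Position("ETH", -10_000.0)]
net, gross = leverage(book, margin=9_000.0, unrealized_pnl=1_000.0)
```

With 30,000 long, 10,000 short, and 10,000 of net equity, net leverage is 2 and gross leverage is 4. A market-neutral book can therefore show low net leverage while gross leverage is still high, which is why both are tracked.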
#### Alerting
- Alert when net leverage, gross leverage, or PnL breaches a configured limit
- Alert when a given asset underperforms
## Infrastructure
#### Monitoring
- Trading instance health (memory, CPU usage, syscall usage, GC time)
#### Alerting
- Trading instance heartbeat (ping)
- Trading instance running out of memory
- Trading instance high on CPU usage
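A heartbeat alert can be as simple as a timestamp check. The sketch below assumes the trading instance calls `beat()` periodically and that a separate watchdog polls `is_stale()`; the class name `HeartbeatMonitor` and the injectable `clock` parameter are hypothetical, chosen so the logic can be tested without waiting.

```python
import time


class HeartbeatMonitor:
    """Flags a trading instance as stale if no heartbeat arrives in time."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock  # injectable so tests can use a fake clock
        self.last_beat = clock()

    def beat(self):
        # Called by (or on behalf of) the trading instance on every ping.
        self.last_beat = self.clock()

    def is_stale(self):
        # True when the last heartbeat is older than the allowed timeout.
        return self.clock() - self.last_beat > self.timeout_s
```

Using a monotonic clock rather than wall-clock time avoids false alerts when the system clock is adjusted.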
#### Safeguards
- Backup instance with whitelisted IPs for emergency position closing
## Execution
#### Monitoring
- Orders per minute per instrument
- Volume per minute per instrument
- Order expiration rate
- Live vs theoretical portfolio allocation (portfolio distance metric)
- Order slippage: difference between the expected price of a trade and the price at which the order is actually executed
- Trade fill rate: percentage of orders that are successfully filled
- Time to execution: time between order placement and execution
- Trade size distribution: distribution of trade sizes
- Trade error rate: ratio of failed order placement requests to successful ones
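Several of the execution metrics above can be derived from a simple order log. The following is a rough sketch under stated assumptions: the `OrderRecord` fields are hypothetical, slippage is expressed in basis points and signed so that a positive value is always unfavorable, fill rate is taken over accepted orders, and error rate is taken as rejected requests over all requests (one of several reasonable conventions).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrderRecord:
    side: str                    # "buy" or "sell"
    expected_price: float
    placed_at: float             # seconds
    fill_price: Optional[float] = None   # None if the order never filled
    filled_at: Optional[float] = None
    rejected: bool = False       # placement request returned an error


def slippage_bps(order):
    """Signed slippage in basis points; positive means a worse price."""
    sign = 1.0 if order.side == "buy" else -1.0
    diff = order.fill_price - order.expected_price
    return sign * diff / order.expected_price * 1e4


def execution_metrics(orders):
    accepted = [o for o in orders if not o.rejected]
    filled = [o for o in accepted if o.fill_price is not None]
    nan = float("nan")
    return {
        "error_rate": (len(orders) - len(accepted)) / len(orders) if orders else nan,
        "fill_rate": len(filled) / len(accepted) if accepted else nan,
        "avg_slippage_bps": (
            sum(slippage_bps(o) for o in filled) / len(filled) if filled else nan
        ),
        "avg_time_to_execution_s": (
            sum(o.filled_at - o.placed_at for o in filled) / len(filled)
            if filled else nan
        ),
    }


log = [
    OrderRecord("buy", 100.0, 0.0, fill_price=100.5, filled_at=2.0),
    OrderRecord("sell", 200.0, 0.0, fill_price=199.0, filled_at=4.0),
    OrderRecord("buy", 100.0, 0.0),                 # expired unfilled
    OrderRecord("buy", 100.0, 0.0, rejected=True),  # placement error
]
metrics = execution_metrics(log)
```

In practice these would be computed over rolling windows and per instrument, so that the alerts in the next list can compare current values against recent history.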
#### Alerting
- Abnormally high or low traded volume on an instrument
- Abnormally high or low order count on an instrument
- Biased order placement (order flow skewed to one side)
- High order expiration rate
- High distance from theoretical portfolio
- Out of distribution trade size
- High trade error rate
#### Safeguards
- Execution layer enters reduce-only mode if an individual position breaches a net/gross exposure soft limit
- Execution layer actively reduces the position if it breaches a net/gross exposure hard limit
- Soft/hard limits depend on each coin's liquidity and volatility
- Soft/hard limits on margin utilization: enter global reduce-only mode (soft) or global position-reduction mode (hard)
- Reduce-only kill switch
- Open-order kill switch
## Market
#### Monitoring
- Market volatility
- Asset/instrument volatility
- Asset correlations
#### Safeguards
- Block execution or switch to reduce-only on abnormal market activity
- Block execution or switch to reduce-only on abnormal instrument activity
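One possible way to define "abnormal" is a threshold on realized volatility. The sketch below assumes volatility is measured as the sample standard deviation of log returns over a recent price window, and that two thresholds (hypothetical parameters `reduce_only_vol` and `block_vol`) select the safeguard; other detectors, such as volume spikes or correlation breaks, would plug in the same way.

```python
import math
import statistics


def realized_volatility(prices):
    """Sample standard deviation of log returns over the given prices."""
    if len(prices) < 3:
        raise ValueError("need at least 3 prices (2 returns)")
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return statistics.stdev(log_returns)


def market_action(prices, reduce_only_vol, block_vol):
    """Return 'normal', 'reduce_only', or 'block' for a price window."""
    vol = realized_volatility(prices)
    if vol >= block_vol:
        return "block"
    if vol >= reduce_only_vol:
        return "reduce_only"
    return "normal"
```

The same function can be applied to a market-wide index for the first safeguard and to a single instrument's prices for the second.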
## Exchange
#### Monitoring
- Exchange API disconnections or unresponsiveness
- Exchange API latency
- Instrument state (post only, ...)
#### Alerting
- Alert on high latency
- Alert on high failure rates
#### Safeguards
- Rebuild positions live using fills and compare periodically with position snapshot
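The reconciliation safeguard above can be sketched as two small functions. This assumes fills arrive as `(instrument, side, quantity)` tuples and that the exchange snapshot is a mapping from instrument to signed position; both shapes, the function names, and the instrument names are hypothetical and would depend on the exchange API in use.

```python
from collections import defaultdict


def rebuild_positions(fills):
    """Accumulate signed positions from (instrument, side, qty) fills."""
    positions = defaultdict(float)
    for instrument, side, qty in fills:
        positions[instrument] += qty if side == "buy" else -qty
    return dict(positions)


def reconcile(rebuilt, snapshot, tolerance=1e-9):
    """Return {instrument: (rebuilt, snapshot)} for every mismatch."""
    mismatches = {}
    for instrument in set(rebuilt) | set(snapshot):
        local = rebuilt.get(instrument, 0.0)
        remote = snapshot.get(instrument, 0.0)
        if abs(local - remote) > tolerance:
            mismatches[instrument] = (local, remote)
    return mismatches


fills = [
    ("BTC-PERP", "buy", 1.5),
    ("BTC-PERP", "sell", 0.5),
    ("ETH-PERP", "sell", 2.0),
]
rebuilt = rebuild_positions(fills)
diff = reconcile(rebuilt, {"BTC-PERP": 1.0, "ETH-PERP": -1.0})
```

Here the rebuilt ETH-PERP position (-2.0) disagrees with the snapshot (-1.0), which would indicate a missed fill or a stale snapshot and should trigger an alert or a resync.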