# **meeting 10/03**
**Advisor: Prof. Chih-Yu Wang \
Presenter: Shao-Heng Chen \
Date: Oct 03, 2023**
## **Environment**
- Action space and observation space (a space-definition sketch follows at the end of this list)



- MSE reward design (an illustrative reward sketch follows at the end of this list)


  - The upper limit of the MSE reward grows with the number of RIS elements, but I haven't figured out the exact relationship yet
    - For example, with ```num_RIS_elements=16```, the upper bound of the MSE reward appears to be around ```1000-1200```
    - With ```num_RIS_elements=256```, it increases to around ```2500-2600```
- Downlink rate reward (a sum-rate sketch follows at the end of this list)
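
Since the exact space definitions are not spelled out above, here is a minimal sketch of how a continuous phase-shift action and a channel-based observation could be declared with ```Gymnasium```. The class name, ```num_antennas```/```num_users```, and the choice of flattening real/imaginary channel parts into a vector are all assumptions, not the actual implementation.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class RISEnvSketch(gym.Env):
    """Illustrative only: continuous RIS phase shifts as the action,
    flattened real/imag channel coefficients as the observation."""

    def __init__(self, num_RIS_elements=16, num_antennas=4, num_users=4):
        super().__init__()
        # One phase shift per RIS element, kept in [0, 2*pi).
        self.action_space = spaces.Box(
            low=0.0, high=2 * np.pi, shape=(num_RIS_elements,), dtype=np.float32
        )
        # Example observation: real/imag parts of the BS-RIS and RIS-user channels.
        obs_dim = 2 * (num_antennas * num_RIS_elements + num_RIS_elements * num_users)
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(obs_dim,), dtype=np.float32
        )
```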
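
The MSE reward itself is not written down in these notes, so the block below is only a guess at one common shape: an inverse-MSE reward for a single-user link with MRT at the BS. Every name and value in it is a placeholder; its only purpose is to illustrate why the reward ceiling could grow with ```num_RIS_elements```, since more elements let the cascaded channel gain grow, which pushes the achievable MSE down.

```python
import numpy as np


def mse_reward_sketch(H_br, h_ru, theta, noise_power=1e-2):
    """Illustrative inverse-MSE reward for a single-user RIS link (not the real design).

    H_br : (N, M) BS-to-RIS channel, h_ru : (N,) RIS-to-user channel,
    theta : (N,) RIS phase shifts in radians.
    """
    Phi = np.diag(np.exp(1j * theta))         # unit-modulus RIS reflection matrix
    h_eff = h_ru.conj() @ Phi @ H_br          # (M,) cascaded effective channel
    gain = np.linalg.norm(h_eff) ** 2         # effective gain with MRT; grows with N
    mse = noise_power / (gain + noise_power)  # LMMSE error of a unit-power symbol
    return 1.0 / mse                          # so the reward ceiling rises with N
```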
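
For the downlink rate reward, a common choice is the sum of per-user achievable rates ```log2(1 + SINR)```; the sketch below assumes exactly that, with placeholder shapes for the effective channel and precoder.

```python
import numpy as np


def sum_rate_reward_sketch(H_eff, W, noise_power=1e-2):
    """Illustrative sum-rate reward: sum_k log2(1 + SINR_k).

    H_eff : (K, M) effective channel of K users, W : (M, K) precoding matrix.
    """
    S = np.abs(H_eff @ W) ** 2                 # S[k, j] = |h_k^H w_j|^2
    signal = np.diag(S)                        # intended-link power per user
    interference = S.sum(axis=1) - signal      # inter-user interference per user
    sinr = signal / (interference + noise_power)
    return float(np.sum(np.log2(1.0 + sinr)))
```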

## **Current Progress**
- Hardware (my PCs)
  - ```i7-8700``` + ```RTX 2060 (6GB)```: nearly ```11 hours``` to complete ```1M steps```
  - ```i7-12700``` + ```RTX 3060Ti (8GB)```: only ```8-9 hours``` for the same run
- I have tried several common algorithms supported by ```Stable-Baselines3``` with their default hyper-parameter settings, just to see how things work (see the sketch below)
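
For reference, the runs with default hyper-parameters look roughly like this; ```Pendulum-v1``` stands in for the custom RIS environment so the snippet stays self-contained, and the step count is shortened here.

```python
import gymnasium as gym
from stable_baselines3 import PPO, SAC, TD3

# Stand-in continuous-control task; the real runs use the custom RIS environment.
env = gym.make("Pendulum-v1")

for algo in (PPO, SAC, TD3):
    model = algo("MlpPolicy", env, verbose=0)   # default hyper-parameters throughout
    model.learn(total_timesteps=10_000)         # the real runs go to 1M steps (8-11 h)
    model.save(f"{algo.__name__.lower()}_default_run")
```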

## **Future works**
- Try normalizing the ```Box``` action space and making it symmetric (source: https://stable-baselines3.readthedocs.io/en/master/guide/rl_tips.html); a normalization sketch follows this list
- Additionally, consider normalizing the observation space; I'm not entirely certain about its boundaries yet, so the sketch below uses running statistics instead of fixed bounds
- Try using SVD + water-filling to replace MRT (a sketch follows this list)
- Consider refactoring the code with ```PyTorch``` for GPU acceleration, since ```numpy``` runs only on the CPU and can be relatively slow (a small sketch follows this list)
- A truly discrete action space (a ```MultiDiscrete``` sketch follows this list)
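
For the two normalization items, one option is to rescale the ```Box``` action space to a symmetric ```[-1, 1]``` with ```RescaleAction``` and let SB3's ```VecNormalize``` track running observation statistics, so the exact observation bounds never have to be known in advance; ```Pendulum-v1``` again stands in for the real environment.

```python
import gymnasium as gym
from gymnasium.wrappers import RescaleAction
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize


def make_env():
    # Stand-in task; the real environment has a [0, 2*pi) Box action per RIS element.
    env = gym.make("Pendulum-v1")
    # Rescale the Box action space to a symmetric [-1, 1] range.
    return RescaleAction(env, -1.0, 1.0)


vec_env = DummyVecEnv([make_env])
# Normalize observations with running mean/std, so fixed bounds are never needed.
vec_env = VecNormalize(vec_env, norm_obs=True, norm_reward=False)
```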
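
For the SVD + water-filling item, this is a minimal ```NumPy``` sketch of the textbook procedure: decompose the effective channel, then pour power over the eigen-modes. The total power and noise values are placeholders.

```python
import numpy as np


def svd_waterfilling(H, total_power=1.0, noise_power=1e-2):
    """SVD precoding with water-filling power allocation over the eigen-modes.

    H : (num_rx, num_tx) effective channel matrix.
    Returns the precoder F = V @ diag(sqrt(p)) and the per-mode powers p.
    """
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    gains = s**2 / noise_power                    # per-mode SNR per unit power
    # Bisection on the water level mu so that the allocated powers sum to total_power.
    lo, hi = 0.0, total_power + 1.0 / gains.min()
    for _ in range(100):
        mu = 0.5 * (lo + hi)
        p = np.maximum(mu - 1.0 / gains, 0.0)     # water-filling: p_i = (mu - 1/g_i)^+
        if p.sum() > total_power:
            hi = mu
        else:
            lo = mu
    F = Vh.conj().T @ np.diag(np.sqrt(p))         # precoder columns = right singular vectors
    return F, p
```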
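
For the ```PyTorch``` refactor, the change is mostly mechanical: keep the channel tensors on the GPU and swap ```numpy``` ops for ```torch``` ops. A tiny sketch with made-up sizes:

```python
import math
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Made-up sizes: 4 BS antennas, 256 RIS elements, a single user.
H_br = torch.randn(256, 4, dtype=torch.complex64, device=device)   # BS -> RIS channel
h_ru = torch.randn(256, dtype=torch.complex64, device=device)      # RIS -> user channel
theta = torch.rand(256, device=device) * 2 * math.pi               # RIS phase shifts

# Cascaded effective channel, computed entirely on the GPU when available.
h_eff = (h_ru.conj() * torch.exp(1j * theta)) @ H_br
```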
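
For the discrete action space, one hedged option is quantizing each element's phase to a few levels and exposing a ```MultiDiscrete``` space; the 2-bit quantization here is only for illustration.

```python
import numpy as np
from gymnasium import spaces

num_RIS_elements = 16
phase_levels = 4          # 2-bit phase quantization, assumed only for illustration

# Each RIS element independently picks one of `phase_levels` discrete phases.
action_space = spaces.MultiDiscrete([phase_levels] * num_RIS_elements)


def action_to_phases(action):
    """Map discrete level indices to phase shifts in [0, 2*pi)."""
    return np.asarray(action) * (2 * np.pi / phase_levels)
```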
