###### tags: `chalmers`
# Imsys (guest lecture)
Within Trancoso's Sustainable Computing course.
## Levels (top-down)
### Cloud
- inference on data ($\neq$ training, which requires global data sharing)
### DNN
- problem: complexity
- solution: "cambrian explosion" many different families (and species) of NN solutions
### Devices
- problem: energy efficiency!
### Sensors
sensors everywhere
- problem: energy efficiency, again!
- solutions:
    - edge computing
    - distribution, two possible approaches:
        - well-defined interface
        - extend the NN to include the edge devices (one big black box)
## Trends
### NN complexity
- exponential increase in the number of parameters in the average model (doubles every 3.5 months)
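    - at that rate: $2^{12/3.5} \approx 10.8\times$ more parameters per year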
### Hardware development
- Moore's Law has effectively ended (since ~2012)
- memory has become the most limiting factor -> avoid moving data around
### NN learning algorithms
- convolution (depthwise = within a single channel, vs. point-wise = "across channels"* kernels)
*e.g. the R, G and B values of a single pixel
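
A minimal PyTorch sketch (my own illustration, not from the lecture) of how the two kernel types combine into a depthwise-separable convolution:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)  # batch, channels (R/G/B), height, width

# Depthwise: groups=3 gives one 3x3 kernel per channel,
# so each channel is convolved independently ("within a channel")
depthwise = nn.Conv2d(3, 3, kernel_size=3, padding=1, groups=3)

# Point-wise: 1x1 kernels that only mix information across channels
pointwise = nn.Conv2d(3, 8, kernel_size=1)

y = pointwise(depthwise(x))
print(y.shape)  # torch.Size([1, 8, 32, 32])
```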
### Hardware architecture (accelerators)
- hard to program (algorithms are almost set in stone)
- System on Chip (SoC)
- on/off operation with high power draw -> sudden spikes in energy consumption (energy spiking)
- Network on Chip (NoC) -> fast data transfer between topologically-near cores
## Work distribution among different hardware
```
CPU    ->  GPU    ->  TPU    ->  ΣPU
scalar     vector     tensor     kernel
```
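
As a rough illustration (the mapping of code to hardware levels is my own, and the kernel/ΣPU level has no NumPy analogue), the same dot product expressed at scalar, vector, and tensor granularity:

```python
import numpy as np

A = np.random.rand(64, 64)
B = np.random.rand(64, 64)

# Scalar (CPU-style): one multiply-accumulate at a time
c_scalar = 0.0
for k in range(64):
    c_scalar += A[0, k] * B[k, 0]

# Vector (GPU-style): a whole row x column per operation
c_vector = np.dot(A[0, :], B[:, 0])

# Tensor (TPU-style): the entire matrix product as a single operation
C = A @ B

assert np.isclose(c_scalar, c_vector) and np.isclose(c_vector, C[0, 0])
```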
### 8-bit data quantization
approximate floats with unsigned integers
- pros: much more efficient
- cons: less accurate, adds some overhead (conversion time)
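
A minimal sketch of one common scheme, affine (asymmetric) quantization in NumPy; the lecture did not say which scheme Imsys uses, so the scale/zero-point formulation here is an assumption:

```python
import numpy as np

def quantize(x, num_bits=8):
    """Map floats onto unsigned integers via a scale and a zero-point."""
    qmin, qmax = 0, 2**num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(qmin - np.round(x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return q.astype(np.uint8), scale, zero_point

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

x = np.random.randn(5).astype(np.float32)
q, scale, zp = quantize(x)
print(x)
print(dequantize(q, scale, zp))  # close to x, but rounding error remains
```

The `np.round` call is where the accuracy is lost, and storing and applying `scale`/`zero_point` is the (time) overhead mentioned above.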
### Optimization
- operation fusion (basically multiply & accumulate; see the sketch after this list)
- layer fusion (save memory by reordering the loops of consecutive convolutions)
- skipping layers -> saves memory
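
A NumPy sketch of the operation-fusion idea (my own illustration): fusing the multiply with the following accumulate means the intermediate products never have to be written out and re-read, i.e. less data moved around:

```python
import numpy as np

a = np.random.rand(4096)
b = np.random.rand(4096)

# Unfused: the multiply materialises a full-size temporary in memory,
# which the reduction then reads back in a second pass.
tmp = a * b
dot_unfused = tmp.sum()

# Fused multiply-accumulate: each product is consumed immediately,
# so the intermediate values never leave the accumulator.
acc = 0.0
for i in range(a.size):
    acc += a[i] * b[i]

assert np.isclose(dot_unfused, acc)
```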
```
e.g. TensorFlow -> NN Compiler -> Debugger + Simulator/FPGA-Emulator
```
## Future
### Research opportunities
- Accelerator Evaluation Platforms
### Industry
- Iterative Algorithm Development @Imsys
https://en.wikipedia.org/wiki/Quantization_(signal_processing)