# Custom CPP and CUDA extensions
## Why use CPP/CUDA extensions?
- PyTorch provides a convenient way to write a CPP extension.
- sometimes code runs faster when implemented as a CPP extension; in other cases, your code may need to interact with existing C or CPP libraries
- the custom CPP extension mechanism lets developers create **PyTorch Operators** out-of-source (separate from the PyTorch backend).
- the extension mechanism spares much of the boilerplate of integrating custom operations while giving high flexibility in creating PyTorch-based projects.
- once the operation has been defined as a CPP extension, you can then turn it into a native PyTorch function.
- turning the operation into a native function is only a matter of **code organization**.
## The setup tool method
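With the setup tool method, the extension is compiled ahead of time via setuptools, using the `CppExtension` and `BuildExtension` helpers from `torch.utils.cpp_extension`. A minimal `setup.py` sketch is shown below; the module name `my_op_cpp` and source file `my_op.cpp` are hypothetical placeholders, not names from the source.

```python
# setup.py — build configuration sketch for a CPP extension.
# Assumes a source file my_op.cpp defining and binding the operator
# (both names are illustrative).
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CppExtension

setup(
    name="my_op_cpp",
    # CppExtension sets up the include paths and flags needed
    # to compile against the PyTorch headers.
    ext_modules=[CppExtension("my_op_cpp", ["my_op.cpp"])],
    # BuildExtension handles the compiler invocation details.
    cmdclass={"build_ext": BuildExtension},
)
```

After running `python setup.py install` (or `pip install .`), the operator can be imported in Python with `import my_op_cpp` and called like any other function. For CUDA sources, `CUDAExtension` is used in place of `CppExtension`.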
## The fusion method
- PyTorch sees an algorithm only as a sequence of individual operations, so it launches a separate CUDA kernel for each one. With many small operations, this can create a significant amount of launch overhead.
- Therefore, the fusion method is introduced: re-write parts of the CPP extension to fuse particular groups of operations.
- Fusing means combining several function implementations into a single function, which benefits from fewer kernel launches.
- It also improves visibility of the global data flow, enabling further optimization.