This note explains the scalar decomposition step of the efficient elliptic curve (EC) multiplication algorithm proposed in this article.
Some implementations of the algorithm:
Python implementation
Go implementation
Rust implementation
In-circuit implementation in halo2wrong PR
Here is a good explanation of the whole algorithm along with an implementation: