# Final evaluation
### Work that is merged
PR: [pymc4/#306](https://github.com/pymc-devs/pymc4/pull/306)
Issues: [pymc4/#187](https://github.com/pymc-devs/pymc4/issues/187), [pymc4/#171](https://github.com/pymc-devs/pymc4/issues/171)
The main goal of the work during the summer was to provide support for various samplers and for sampling of discrete variables. We have written and merged an interface that can be easily extended with additional sampling algorithms.
First, we have provided the base class for all the `pymc4` sampling algorithms:
```python
class _BaseSampler(metaclass=abc.ABCMeta):
    def _sample(self, ...):
        ...

    @abc.abstractmethod
    def trace_fn(self, ...):
        """
        Support tracing for each sampler
        """
        pass
```
The class has an abstract `trace_fn` method that defines the tracing logic for each sampling-algorithm subclass. An example is the `NUTS` subclass:
```python
class NUTS(_BaseSampler):
    _name = "nuts"
    _grad = True
    _adaptation = mcmc.DualAveragingStepSizeAdaptation
    _kernel = mcmc.NoUTurnSampler

    def trace_fn(self, current_state, pkr):
        return (
            pkr.inner_results.target_log_prob,
            pkr.inner_results.leapfrogs_taken,
            pkr.inner_results.has_divergence,
            pkr.inner_results.energy,
            pkr.inner_results.log_accept_ratio,
        ) + tuple(self.deterministics_callback(*current_state))
```
The `_grad` attribute identifies sampling algorithms that compute gradients in their step method. We also easily support the `tensorflow_probability` adaptation logic by including an adaptation policy as a class attribute: for base NUTS the policy is `mcmc.DualAveragingStepSizeAdaptation`, and a different policy can be supplied simply by overriding that attribute. The tracing logic can likewise be modified by subclassing and overriding `trace_fn`.
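This declarative, class-attribute-driven design can be sketched self-contained. Note that the classes below (`MockKernel`, `MockDualAveraging`, `MockSimpleAdapt`, `Nuts`, `NutsSimpleAdapt`) are illustrative stand-ins invented for this sketch, not the actual `pymc4`/`tfp` classes:

```python
import abc


# Stand-ins for tfp kernel/adaptation classes (illustrative only).
class MockKernel:
    name = "kernel"


class MockDualAveraging:
    name = "dual_averaging"


class MockSimpleAdapt:
    name = "simple_adapt"


class BaseSampler(metaclass=abc.ABCMeta):
    _kernel = None
    _adaptation = None
    _grad = False

    @abc.abstractmethod
    def trace_fn(self, current_state, pkr):
        ...

    def describe(self):
        # The shared sampling loop only reads the class attributes,
        # so subclasses customize behaviour declaratively.
        adapt = self._adaptation.name if self._adaptation else "none"
        return (self._kernel.name, adapt, self._grad)


class Nuts(BaseSampler):
    _kernel = MockKernel
    _adaptation = MockDualAveraging
    _grad = True

    def trace_fn(self, current_state, pkr):
        return ("log_prob", "divergence")


class NutsSimpleAdapt(Nuts):
    # Swapping the adaptation policy is just overriding one attribute.
    _adaptation = MockSimpleAdapt


print(Nuts().describe())             # ('kernel', 'dual_averaging', True)
print(NutsSimpleAdapt().describe())  # ('kernel', 'simple_adapt', True)
```

The shared sampling machinery never needs to change: each new sampler is just a handful of class attributes plus a `trace_fn`.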
The next step was to support compound stepping. For that we implement another sampling-algorithm class:
```python
class CompoundStep(_BaseSampler):
    _name = "compound"
    _adaptation = None
    _kernel = _CompoundStepTF
    _grad = False

    def trace_fn(self, current_state, pkr):
        ...

    def _assign_default_methods(
        self,
        ...
    ):
        """
        Assign the appropriate sampling algorithm for each
        variable and merge equal samplers
        """
```
Then we need to implement a sub-class of `tfp...TransitionKernel` that provides the compound logic in `one_step`:
```python
class _CompoundGibbsStepTF(kernel_base.TransitionKernel):
    def one_step(self, current_state, previous_kernel_results, seed=None):
        ...

    def bootstrap_results(self, init_state):
        """
        Returns an object with the same type as returned by `one_step(...)[1]`.
        Compound bootstrap step.
        """
        ...
```
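The elided `one_step` body essentially loops over the per-variable kernels, lets each one advance the slice of the state it owns, and reassembles the full state. A minimal stdlib sketch of that idea, with a toy `AddStep` "kernel" standing in for a real `tfp` transition kernel (all names below are illustrative assumptions, not the merged code):

```python
class AddStep:
    """Toy kernel: 'advances' its state parts by a fixed increment."""

    def __init__(self, increment):
        self.increment = increment

    def one_step(self, state_parts, kernel_results, seed=None):
        new_parts = [s + self.increment for s in state_parts]
        return new_parts, kernel_results


class CompoundStep:
    """Applies each kernel to the state indices it owns, then reassembles."""

    def __init__(self, kernels_and_indices):
        # List of (kernel, indices-into-state) pairs.
        self.kernels_and_indices = kernels_and_indices

    def one_step(self, current_state, previous_kernel_results, seed=None):
        state = list(current_state)
        results = []
        for (kernel, idxs), pkr in zip(self.kernels_and_indices, previous_kernel_results):
            parts = [state[i] for i in idxs]
            new_parts, new_pkr = kernel.one_step(parts, pkr, seed=seed)
            # Write each kernel's updated parts back into the full state.
            for i, p in zip(idxs, new_parts):
                state[i] = p
            results.append(new_pkr)
        return state, results


compound = CompoundStep([(AddStep(1), [0, 2]), (AddStep(10), [1])])
state, results = compound.one_step([0, 0, 0], [None, None])
print(state)  # [1, 10, 1]
```

`bootstrap_results` plays the analogous role at initialization time, collecting one inner-results object per sub-kernel.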
We have also provided support for custom proposal generation functions:
```python
class Proposal(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def _fn(self, state_parts: List[tf.Tensor], seed: Optional[int]) -> List[tf.Tensor]:
        """
        Proposal function that is passed as the argument
        to the RWM kernel
        """
        pass

    @abc.abstractmethod
    def __eq__(self, other) -> bool:
        """
        Comparison of the proposal functions
        """
        pass


class CategoricalUniformFn(Proposal):
    """
    Categorical proposal sub-class whose `_fn` samples new proposals
    from a categorical distribution with uniform probabilities.
    """

    _name = "categorical_uniform_fn"

    def _fn(self, state_parts: List[tf.Tensor], seed: Optional[int]) -> List[tf.Tensor]:
        with tf.name_scope(self._name or "categorical_uniform_fn"):
            part_seeds = samplers.split_seed(seed, n=len(state_parts), salt="CategoricalUniformFn")
            deltas = tf.nest.map_structure(
                lambda x, s: tfd.Categorical(logits=tf.ones(self.classes)).sample(
                    seed=s, sample_shape=tf.shape(x)
                ),
                state_parts,
                part_seeds,
            )
            return deltas

    def __eq__(self, other) -> bool:
        return self._name == other._name and self.classes == other.classes
```
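The idea behind `CategoricalUniformFn` — proposing a new class uniformly at random, independently of the current value — can be sketched without `tfp` in a few lines of stdlib Python (the function name below is an illustrative stand-in, not the merged code):

```python
import random


def categorical_uniform_proposal(state_parts, num_classes, rng=random):
    # Propose a uniformly random class for every element of every state
    # part, ignoring the current value. Because the proposal is symmetric
    # (q(x'|x) == q(x|x')), the Metropolis accept ratio reduces to the
    # ratio of target probabilities, which is why it plugs into an
    # RWM-style kernel unchanged.
    return [[rng.randrange(num_classes) for _ in part] for part in state_parts]


random.seed(0)
proposal = categorical_uniform_proposal([[0, 1, 2], [1]], num_classes=3)
print(proposal)
```

Symmetry of the proposal is the key property: it lets discrete variables reuse the existing random-walk Metropolis acceptance machinery.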
The compound sampling algorithm accepts a list of variables and, optionally, a list of sampling algorithms (if none is given, the appropriate sampling logic is chosen automatically), and assigns the appropriate proposal generation function to each distribution.
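The "merge equal samplers" step mentioned in `_assign_default_methods` is where the proposal `__eq__` pays off: variables whose proposals compare equal can share a single transition kernel. A hedged stdlib sketch of that grouping (all names below are illustrative, not the actual `pymc4` internals):

```python
class UniformProposal:
    """Toy proposal, comparable by its number of classes."""

    def __init__(self, classes):
        self.classes = classes

    def __eq__(self, other):
        return isinstance(other, UniformProposal) and self.classes == other.classes


def merge_equal_samplers(assignments):
    """Group variable names whose proposals compare equal,
    so each group can be driven by one shared kernel."""
    groups = []  # list of (proposal, [var_names]) pairs
    for name, proposal in assignments:
        for prop, names in groups:
            if prop == proposal:
                names.append(name)
                break
        else:
            groups.append((proposal, [name]))
    return groups


groups = merge_equal_samplers([
    ("x", UniformProposal(2)),
    ("y", UniformProposal(3)),
    ("z", UniformProposal(2)),  # same proposal as "x" -> merged
])
print([(g[0].classes, g[1]) for g in groups])  # [(2, ['x', 'z']), (3, ['y'])]
```

Fewer kernels means fewer `one_step` calls per iteration, which matters when many discrete variables share the same proposal.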
### Work in progress
PR: [pymc4/#287](https://github.com/pymc-devs/pymc4/pull/287)
Support for Sequential Monte Carlo (SMC). Thanks to [@junpenglao](https://github.com/junpenglao) for providing the `tfp` implementation of the SMC algorithm; because of this, adding support in `pymc4` was much easier. Our job was to provide the `sample_smc` logic with modified `logp` functions that can calculate probabilities separately for the prior and the likelihood.
This is implemented in a separate `SamplingState` sub-class:
```python
class SMCSamplingState(SamplingState):
    """
    Subclass of `SamplingState` that adds support for
    collecting log probabilities separately for the likelihood
    and the prior.
    """

    __slots__ = ()

    def collect_log_prob_smc(self, is_prior):
        """
        Collects log probabilities for likelihood variables in SMC.
        Since SMC requires the `draws` dimension to be kept explicitly
        while the graph is evaluated, we can't combine SMC probability
        collection with the NUTS log probability collection.
        """
        ...
```
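The reason the prior and likelihood terms are collected separately is SMC's tempering scheme: the intermediate targets are `log p_beta(theta) = log p(theta) + beta * log p(y | theta)` with `beta` moving from 0 (prior) to 1 (posterior), so the two terms must stay separate per draw until they are recombined. A minimal numeric sketch with toy log-densities (not the actual `pymc4` code):

```python
def tempered_log_prob(log_prior, log_likelihood, beta):
    # SMC interpolates from the prior (beta=0) to the posterior (beta=1);
    # this is only possible when the two terms are kept separate per draw.
    return log_prior + beta * log_likelihood


# Toy values for a batch of draws (the explicit `draws` dimension).
log_priors = [-1.0, -2.0]
log_likelihoods = [-3.0, -0.5]

at_prior = [tempered_log_prob(p, l, 0.0) for p, l in zip(log_priors, log_likelihoods)]
at_posterior = [tempered_log_prob(p, l, 1.0) for p, l in zip(log_priors, log_likelihoods)]
print(at_prior)      # [-1.0, -2.0]
print(at_posterior)  # [-4.0, -2.5]
```

At `beta = 0` the tempered density is just the prior, which is why a combined `logp` (as used by NUTS) is not enough for SMC.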
Execution for `SMC` is also separated from the main execution logic, with the distinct logic living in the `_sample_unobserved` function.
### Ideas in the proposal:
- [x] Support for more samplers; tricky because of multimodal/discrete distributions
- [x] Support for a sampler method assigner as in `pymc3` *(not quite like in `pymc3`, but it shouldn't be)*
- [x] Support for optimized samplers, e.g. `BinaryMetropolis`.
  *This one turned out not to be required: the logic of `pymc3`'s `BinaryMetropolis` is provided by using the proposal function for (e.g.) `Bernoulli`.*
- [ ] Progress bar for samplers; includes some hacking on the `tfp` side too.
- [x] Add support for SMC *(work still not merged)*
- [x] Fix all the issues with discrete distribution sampling and design a more user-friendly interface. Additionally, fix issues with XLA.
- [x] Support for the `CompoundStep` and Gibbs step methods. Whether this is needed with the current design should still be discussed.