# Grammar for brms-like models sketches

## Building blocks

- Variable
  - Primarily a real number, but possibly with bounds and possibly of some other, more complex type (see below)
  - Optionally a data.frame context it can be predicted over
- Arrays of variables
  - Optionally a data.frame context of the same length
- "Clamp" a variable to a value
- Latent function
  - Probably limited to evaluation only at data-derived values
- Linear predictors
  - Only need to predict real numbers (but have to respect bounds)
- Link functions
- Subsetting
- Blocks
  - Parametrized by variables, latent functions and possibly other blocks
  - Native vs. expressed in the grammar
  - Block types/constraints/concepts

## Linear predictors

Explicit formulas (akin to non-linear formulas in brms) are the basis. R-style formulas are just syntactic sugar.

## Blocks

Basically just containers that bind their parameters together. Native blocks then define code generation outputs.

Maybe some inspiration from modular Stan would help for sub-blocks (e.g. the transition model for an HMM), i.e. blocks have members that can be accessed...

## HMM

```
s <- data.frame(state_id = c("A", "B", "C"))
# By default reference transition A -> A always present
t <- data.frame(from = c("A", "B"), to = c("B", "C"))
t$trans_id <- paste0(t$from, "-", t$to)
# tribble() is from the tibble package
d <- tribble(~serie_id, ~time, ~y, …)
```

```
hmm(
  states = s,
  series_data = d,
  obs = binary_hmm_obs(
    y /* Ref to series_data */,
    :theta /* Name for new var, linked to crossing(series_data, states) */
  ),
  trans = categorical_hmm_transition(
    t,
    :rho /* Name for new var, linked to crossing(series_data, t) */
  ),
  init = known_hmm_init("A")
)

rho ~ from * to + condition
theta ~ mo(state_id)
```

Subsetting allows for nice grouping of predictors. For example, to have a separate predictor for each transition I'd do:

```
rho[trans_id == "A-B"] ~ something
```

or, to share information only between transitions from the same source state, I'd have

```
rho[from == "A"] ~ (1 + condition | to)
```

## Joint model

```
# Define a latent function
latent <- function(t, patient) bs(t) + (1 + t | patient)

variable mu_marker(context = longitudinal)

lognormal(
  data = longitudinal /* data.frame context */,
  y = marker /* outcome reference to longitudinal */,
  mu = mu_marker /* new var name linked to longitudinal */
)

mu_marker ~ latent(time, patient)

cox(data = survival, event = ev, time = time, log_hr = :h)

h ~ 1 * latent(time, patient)
```

## Missing data

This would probably be built-in, but we can implement it from more basic building blocks:

```
# Define a new array of variables, tied to the "df" data.frame (same length)
array(name = missing_x, context = df)

# Outcome model
normal(data = df, y = resp, mu ~ x1 + x2 + missing_x)

# Missing data model
normal(data = df, y = missing_x, mu = :missing_mu)
missing_mu ~ x1 + x2

# Set some elements of missing_x as known
clamp(missing_x[!is.na(x3)], x3)
```

## Some extra complexities

Ordinal models require thresholds and multivariate models require a correlation matrix, so non-real variable types are needed. Those don't need to allow predictors. Maybe a `varies_by(var, grouping)` operation that duplicates the values (and keeps the prior) and works for all variable types when a data.frame context is present (i.e. for a real unbounded variable, `varies_by(var, group)` is functionally identical to `var ~ 0 + group`).

Categorical and MVN models require multiple predictors -> Arrays of variables
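
The claimed equivalence between `varies_by(var, group)` and `var ~ 0 + group` can be checked in plain R. This is only a minimal sketch: `varies_by` does not exist anywhere and is mimicked here by a per-group lookup, and the data frame, coefficient values and variable names are made up for illustration. It also covers only the predictor structure, not how priors would be carried over.

```r
# Illustrative data: three observations in two groups (assumed, not from the notes).
df <- data.frame(group = factor(c("a", "b", "a")))

# Design matrix for `~ 0 + group`: one indicator column per group level,
# no intercept. The columns are named "groupa" and "groupb".
X <- model.matrix(~ 0 + group, data = df)

# One coefficient per group level (arbitrary illustrative values).
beta <- c(groupa = -1.5, groupb = 2.0)

# Formula view: each row picks up the coefficient of its own group.
eta_formula <- as.vector(X %*% beta)

# "Duplication" view, standing in for the hypothetical varies_by(var, group):
# keep one value per group and copy it to every row of that group.
values_by_group <- c(a = -1.5, b = 2.0)
eta_varies_by <- unname(values_by_group[as.character(df$group)])

# Both views give the same per-row values.
stopifnot(isTRUE(all.equal(eta_formula, eta_varies_by)))
```

The lookup view does not rely on a linear predictor at all, which is why the same operation could plausibly extend to non-real variable types such as thresholds or correlation matrices.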
