<style>
.reveal {
font-size: 32px;
}
/* To change font size for a single slide, see: https://github.com/hackmdio/hackmd-io-issues/issues/87 */
.small {
font-size: 24px;
}
.red {color: red;}
code {color: red;}
strong {
color: green;
font-weight: 900
}
span.slide-number-a {
font-size: 32px;
margin: 16px;
}
.two-column-layout {
columns: 2; /* Set column number */
font-size: 28px;
max-width: 100%;
overflow: hidden;
}
</style>
# Histopathology Process Simulation
Yin-Chi Chan <[ycc39@cam.ac.uk](mailto:ycc39@cam.ac.uk)>
Institute for Manufacturing
 <img src="https://hackmd.io/_uploads/B1k9SX6RR.png" alt="drawing" width="200"/>
University of Cambridge
---
Link to this presentation
https://hackmd.io/@elasticdt/Hkua1daaR
<img src="https://hackmd.io/_uploads/SJ_K3RHkJx.png" alt="drawing" width="400"/>
---
## Background
- Collaboration with Addenbrooke's Hospital Histopathology department
  - Part of Cambridge University Hospitals NHS Foundation Trust
- Model the process of creating *stained glass slides* from biological specimens
  - For example, for diagnosis of cancer and other diseases
- Key performance indicators (KPIs):
  - **Main:** Turnaround time (*laboratory*, overall)
  - Staff/machine utilization
  - Specimens in progress
---
## Histopathology
- Analysis of tissue specimens for study/diagnosis of disease
- Specimens ⟶ blocks ⟶ slides
- Slides are then stained and digitally scanned to be analyzed by a histopathologist
- Current focus on **routine** stains using the dyes hematoxylin and eosin (H&E)
---
## Research Flowchart
<span class=small>Research process steps for defining digitalization opportunities in a healthcare setting</span>

<span class=small>Source: [Moretti *et al.*, LoDiSA 2023](https://doi.org/10.1049/icp.2023.1735)</span>
Current presentation focuses on the steps in <span class=red>red</span>
- **2\.** Process logic modeling
- **4c\.** Process simulation
- **5c\.** Process performance analysis
---
## Process modeling
- **von Neumann**: "truth [...] is much too complicated to allow anything but approximations"
- **George Box**: "All models are wrong, but some are useful"
<hr/>
- Identify high-level tasks, abstract away how they are performed
- Example: current model iteration ignores staff specializations based on tissue types
- Simple 3-point estimations for task durations: low, most likely (mode), high
- Focus on core processes, minimize decision branches
---
## Identified process stages
```mermaid
flowchart TB
subgraph x3["<b>3. Processing</b>"]
direction TB
decalc["3a. Decalcification (optional)"] --> proc["3b. Processing machine"]
end
subgraph one[" "]
direction LR
start(START) --> reception["1. Specimen Reception"] --> cutup["2. Cut-up"] --> x3 --> embedding["4. Embedding"]
end
subgraph two[" "]
direction LR
micro["5. Microtomy"] --> stain["6. Staining"] --> label["7. Cover-slipping"] --> scan["8. Digital scanning"]
end
subgraph three[" "]
direction LR
collate["9. Collation"] --> qc["10. Block & quality check"] --> alloc["11. Case allocation"] --> report["12. Reporting"] --> stop(END)
end
one --> two --> three
style start fill:green,color:white
style stop fill:red,color:white
```
---
## Simulation features
- Entities are hierarchical
  - Specimen ⟶ Block ⟶ Slide
- Steps in each stage can operate on specimens, blocks, slides, or **batches** of these
- Flow control:
  - **Branching:** different paths based on entity attributes
  - **Batching:** forming groups of like entities
  - **Collation:** only groups entities with the same parent entity
    - i.e. collect all slides of a specimen before continuing
  - **Timed gates:** only start certain jobs when a timed event is triggered (e.g. at 4:30PM daily)
- **Bootstrapping** of initial specimen states
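---
### Sketch: collation in plain Python
A minimal sketch of the collation idea, assuming each specimen knows in advance how many slides it will produce. The `Collator` class and the specimen/slide IDs are hypothetical and are not taken from the Arena model.

```python
from collections import defaultdict


class Collator:
    """Holds slides back until every slide of the same specimen has arrived."""

    def __init__(self, expected_counts):
        self.expected = expected_counts    # specimen ID -> number of slides
        self.waiting = defaultdict(list)   # specimen ID -> slides seen so far

    def receive(self, specimen_id, slide_id):
        """Return the complete slide group, or None if slides are still missing."""
        group = self.waiting[specimen_id]
        group.append(slide_id)
        if len(group) == self.expected[specimen_id]:
            return self.waiting.pop(specimen_id)
        return None


collator = Collator({"S1": 2, "S2": 1})
first = collator.receive("S1", "S1-a")    # S1 still incomplete
second = collator.receive("S2", "S2-a")   # S2 has only one slide
third = collator.receive("S1", "S1-b")    # S1 is now complete
```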
---
## <img src="https://hackmd.io/_uploads/SkElv3R6R.png" alt="drawing" width="150"/> model
- Arena is a graphical discrete event simulation (DES) tool
- DES
  - Single simulation thread with a **clock**
  - Ordered list of pending **events**, which are generated by **processes**
  - Simulation clock jumps directly from event to event, processing each event in turn
  - Events may spawn new events, e.g. processing a `load machine` event generates an `unload machine` event
- Arena arranges processes into a **flowchart-like** visualization with blocks such as Create, Seize, Delay, Release, Batch, Split,...
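---
### Sketch: a minimal DES event loop
To make the clock/event-list idea concrete, here is a minimal sketch in plain Python (not Arena). The 2.5 h machine cycle and 0.5 h reload delay are made-up values for illustration only.

```python
import heapq
from itertools import count


def run(until):
    """Pop the earliest pending event, jump the clock to it, then handle it."""
    tie = count()  # tie-breaker so events at equal times keep insertion order
    events = [(0.0, next(tie), "load machine")]  # pending events, kept as a heap
    log = []
    while events and events[0][0] <= until:
        clock, _, name = heapq.heappop(events)  # clock jumps straight to the event
        log.append((clock, name))
        if name == "load machine":
            # Handling one event schedules another (assumed 2.5 h cycle)
            heapq.heappush(events, (clock + 2.5, next(tie), "unload machine"))
        else:
            # Assumed 0.5 h before the machine is loaded again
            heapq.heappush(events, (clock + 0.5, next(tie), "load machine"))
    return log


log = run(until=6.0)
for time, name in log:
    print(f"t={time:4.1f} h  {name}")
```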
---
## Arrival processes
<div class="two-column-layout">
- Two arrival processes:
- Cancer pathway
- Non-cancer pathway

</div>
- Use a **time-varying** Poisson process with rates defined per hour
- **Rejection** sampling based on the highest hourly rate as the base process
- Rates (like most other parameters) loaded from an Excel file
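---
### Sketch: time-varying Poisson arrivals by rejection sampling
A minimal sketch of the rejection (thinning) approach, assuming rates are piecewise constant per hour and repeat daily. The rate values below are invented for illustration; in the model they are loaded from Excel.

```python
import random


def time_varying_arrivals(hourly_rates, horizon_hours, rng):
    """Arrival times (hours) for a Poisson process with per-hour rates."""
    max_rate = max(hourly_rates)  # rate of the base (homogeneous) process
    arrivals = []
    t = 0.0
    while True:
        t += rng.expovariate(max_rate)  # next candidate from the base process
        if t >= horizon_hours:
            return arrivals
        rate_now = hourly_rates[int(t) % len(hourly_rates)]
        if rng.random() < rate_now / max_rate:  # accept with prob. rate/max_rate
            arrivals.append(t)


# Hypothetical daily profile: 24 hourly rates (specimens per hour)
rates = [0.0] * 8 + [6.0] * 4 + [10.0] * 4 + [3.0] * 2 + [0.0] * 6
arrivals = time_varying_arrivals(rates, horizon_hours=24 * 7, rng=random.Random(1))
print(len(arrivals), "arrivals in one simulated week")
```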
---
## Staffing
- Arena has the concept of **schedules**
- Designating the total number of units of a resource over time
- We use this for setting staff levels
- Schedules in our model are cyclic (one week) at half-hourly resolution
- In contrast, machine resources are assumed to have fixed levels
- **End-of-shift policy**: when the number of staff is reduced, ensure non-replaced staff only leave after completing their current task
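---
### Sketch: a cyclic half-hourly schedule
A minimal sketch of how a one-week cyclic schedule at half-hourly resolution can be looked up, written in plain Python rather than Arena. The rota (3 staff, Monday to Friday, 09:00 to 17:00) is a made-up example, and the sketch does not cover the end-of-shift policy.

```python
SLOTS_PER_DAY = 48                   # half-hourly resolution
SLOTS_PER_WEEK = 7 * SLOTS_PER_DAY   # the schedule repeats weekly


def build_week(weekday_level, start_hour=9, end_hour=17):
    """Hypothetical rota: `weekday_level` staff on Mon-Fri during working hours."""
    schedule = [0] * SLOTS_PER_WEEK
    for day in range(5):
        for slot in range(2 * start_hour, 2 * end_hour):
            schedule[day * SLOTS_PER_DAY + slot] = weekday_level
    return schedule


def staff_on_duty(schedule, t_hours):
    """Staff level at simulation time `t_hours` (hours since Monday 00:00)."""
    return schedule[int(t_hours * 2) % len(schedule)]


schedule = build_week(weekday_level=3)
print(staff_on_duty(schedule, 9.0))          # Monday 09:00
print(staff_on_duty(schedule, 168 + 9.25))   # Monday 09:15, one week later
```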
---
## Task duration distributions
<div class="two-column-layout">

- We typically used **triangular** distributions for task durations
- Defined using `min`, `mode`, `max`
- **Exception**: machine tasks assumed to have fixed durations
- Mean = 1/3 × (`min` + `mode` + `max`)
- Parameters estimated from staff interviews + standard operating procedure documents
</div>
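---
### Sketch: checking the triangular mean
A quick check of the mean formula using Python's standard library. The `min`/`mode`/`max` values are hypothetical. Note that `random.triangular` takes its arguments in the order low, high, mode.

```python
import random


def triangular_mean(low, mode, high):
    """Closed-form mean of a triangular distribution."""
    return (low + mode + high) / 3


rng = random.Random(42)
low, mode, high = 2.0, 5.0, 12.0  # hypothetical task duration in minutes
samples = [rng.triangular(low, high, mode) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(f"formula: {triangular_mean(low, mode, high):.3f}, sampled: {sample_mean:.3f}")
```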
---
## Example: Machine batch jobs (1/2)
- Processing machines in our model take a long time and are typically run overnight (except for urgent specimens)
- Processing step works at the **block level**
- Machine has a capacity of 300 regular or 36 mega-sized blocks
- **Batching policy**: do not separate blocks from the same specimen
- **Hold policy for batches**: non-urgent batches are only started at the end of day, collected by staff the next morning
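---
### Sketch: batching without splitting specimens
A minimal sketch of one way to apply the batching policy: fill batches greedily in queue order and start a new batch whenever the next specimen's blocks would not fit. It assumes regular-sized blocks only (capacity 300) and ignores mega-sized blocks; the queue contents are invented.

```python
def make_batches(specimens, capacity=300):
    """Group (specimen_id, n_blocks) pairs into batches, never splitting a specimen."""
    batches, current, used = [], [], 0
    for specimen_id, n_blocks in specimens:
        if n_blocks > capacity:
            raise ValueError(f"{specimen_id} cannot fit in a single batch")
        if used + n_blocks > capacity:  # specimen does not fit: close the batch
            batches.append(current)
            current, used = [], 0
        current.append(specimen_id)
        used += n_blocks
    if current:
        batches.append(current)
    return batches


queue = [("S1", 120), ("S2", 150), ("S3", 60), ("S4", 200), ("S5", 40)]
batches = make_batches(queue)
print(batches)
```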
---
## Example: Machine batch jobs (2/2)

---
## <img src="https://hackmd.io/_uploads/SkElv3R6R.png" alt="drawing" width="150"/> simulation outputs
- Excel .xlsm file (default)
  - Outputs a large number of statistics by default (queue statistics, resource statistics, counter statistics, etc.)
  - Requires macros 🤨
- Simple text file
  - Outputs a smaller set of statistics in text form
- I/O blocks
  - Streams custom output to a file **during** the simulation run itself
  - CSV and free formats supported
  - Most versatile and portable output format
---
## **P**rocess **AN**alyzer (PAN)
- Auxiliary program bundled with Arena
- Can run a series of related simulations (change the model file, input values, etc.)
- We use this to observe the effect of changing a single variable on the system performance

---
### Example: Effect of adding a single staff member in different roles
<span class=small>Only adding staff to microtomy leads to a statistically significant change (decrease) in turnaround time</span>
<img src="https://hackmd.io/_uploads/ryFenyRa0.png" alt="drawing" width="700"/>
---
## From <img src="https://hackmd.io/_uploads/SkElv3R6R.png" alt="drawing" width="150"/> to Python <img src="https://hackmd.io/_uploads/H1B2mpAp0.png" alt="drawing" width="40"/>
- Arena is hard to integrate into a workflow
- Models (`.doe`) are compiled into binary (`.p`) via an intermediate text format (`.mod`, `.exp`), but values read from Excel are baked into the compiled model
- Command-line tools (compiler / runtime) are poorly (or not at all) documented
- Alternative — open-source simulation libraries
- We chose <img src="https://hackmd.io/_uploads/rka1qaaT0.png" alt="drawing" width="100"/> which is written in Python
- Based on **coroutines** which interact with the **event loop** using the `greenlet` library
- Each coroutine corresponds to a process
---
## An example process in Python (`salabim`)
```python
import salabim as sim


class Customer(sim.Component):
    def process(self):
        self.request(clerks)
        self.hold(30)
        self.release()  # not really required
```
- By default, all new `Customer` instances enter the `process` function automatically.
- `clerks` is a `Resource` instance; units of this instance can be requested or released
- Delays (e.g. job processing times) are represented using `hold()`, which accepts both constants and `Distribution` instances
---
## Defining some process building blocks in Python

---
## Defining tasks using method injection
- The `Process` class relates to tasks that do actual work on the specimens/blocks/slides
- Tasks are diverse, thus we need the ability to define custom processes
- Use **method injection** to associate a task's Python function to the matching `Specimen`/`Block`/`Slide`/`Batch` class
```python
# class Process(BaseProcess)
def setup(self, in_type: Type, fn: Callable[[Component], None]) -> None:
    """Set up the component, called immediately after initialisation."""
    super().setup()
    self.in_type = in_type
    setattr(self.in_type, self.name(), fn)
```
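---
### Sketch: method injection in isolation
A self-contained sketch of the same `setattr` technique without `salabim`. The `Specimen` class, the `inject` helper and the `cut_up` task are hypothetical stand-ins for the model's own classes.

```python
from typing import Callable


class Specimen:
    """Stand-in entity class; the real model uses salabim components."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.history = []


def inject(in_type: type, task_name: str, fn: Callable[[object], None]) -> None:
    """Attach `fn` to `in_type` as a method called `task_name`."""
    setattr(in_type, task_name, fn)


def cut_up(self) -> None:
    """Hypothetical task body; `self` is the specimen being worked on."""
    self.history.append("cut_up")


inject(Specimen, "cut_up", cut_up)
specimen = Specimen("S1")
specimen.cut_up()  # the injected function now behaves like a normal method
print(specimen.history)
```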
---
## Revisiting task duration distributions
- In the Arena model, we used the triangular distribution in most cases
- Python makes it easier to define new distributions not in the existing libraries
- We implemented the **PERT** distribution, which concentrates more probability mass around the mean than the triangular distribution
<img src="https://hackmd.io/_uploads/SyktYAjRA.png" alt="drawing" width="800"/>
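---
### Sketch: sampling a PERT distribution
A minimal sketch of a PERT sampler built on the standard library's beta distribution, using the usual shape parameter of 4. This is not necessarily how the model implements it, and the `min`/`mode`/`max` values are hypothetical. The triangular samples are drawn only to compare the spread.

```python
import random
import statistics


def pert_sample(rng, low, mode, high, lamb=4.0):
    """Draw one PERT value: a beta distribution rescaled to [low, high]."""
    span = high - low
    alpha = 1 + lamb * (mode - low) / span
    beta = 1 + lamb * (high - mode) / span
    return low + span * rng.betavariate(alpha, beta)


def pert_mean(low, mode, high, lamb=4.0):
    """Closed-form PERT mean: the mode is weighted `lamb` times."""
    return (low + lamb * mode + high) / (lamb + 2)


rng = random.Random(0)
low, mode, high = 2.0, 5.0, 12.0  # hypothetical task duration in minutes
pert = [pert_sample(rng, low, mode, high) for _ in range(50_000)]
tri = [rng.triangular(low, high, mode) for _ in range(50_000)]
print(f"PERT mean {statistics.fmean(pert):.3f} vs formula {pert_mean(low, mode, high):.3f}")
print(f"variance: PERT {statistics.pvariance(pert):.2f}, triangular {statistics.pvariance(tri):.2f}")
```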
---
## Automatic statistics collection with `salabim` Monitors
- Some `salabim` classes, such as `Resource`, have built-in `Monitor` objects
- We can also add our own
- For `Resource`, automatic monitors include the current total/in-use units and the size of the queue
- Can return mean/variance/etc. or the full table of values over time (`pandas` export)
- Two types of `Monitor`
- **Level:** track a value over time — $x(t_0)$, $x(t_1)$, $x(t_2)$, ...
- **Non-level:** track a series of values — $x_0$, $x_1$, $x_2$, ...
- Affects how averages are calculated (time-weighted vs. regular mean)
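---
### Sketch: time-weighted vs. regular mean
To illustrate why the monitor type matters, here is a plain-Python sketch (it does not use the `salabim` API). The queue-length trace is invented: the same three recorded values give very different averages depending on how long each value was held.

```python
def plain_mean(values):
    """Non-level view: every recorded value counts equally."""
    return sum(values) / len(values)


def time_weighted_mean(times, values, end_time):
    """Level view: values[i] holds from times[i] until the next change."""
    total = 0.0
    for i, value in enumerate(values):
        next_time = times[i + 1] if i + 1 < len(times) else end_time
        total += value * (next_time - times[i])
    return total / (end_time - times[0])


times = [0, 1, 9]            # when the queue length changed
queue_lengths = [0, 5, 1]    # value held from each change onwards
regular = plain_mean(queue_lengths)
weighted = time_weighted_mean(times, queue_lengths, end_time=10)
print(regular, weighted)
```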
---
## Integration with Building Information Modeling
- We model the time taken to move specimens (or batches) between rooms in the histopathology lab
- Distances are extracted from a geometric model of the lab
- `ifcopenshell` library (**IFC** = Industry Foundation Classes)
- Walls and doors converted into shapes and overlaid on grid
- Travel on the grid is permitted in 8 directions (like a king in chess)
- Additional path definitions for inter-floor travel (lift, stairs)
- Moretti *et al.*, <https://ssrn.com/abstract=4827727>
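---
### Sketch: shortest path on an 8-direction grid
A minimal sketch of shortest-path search on a walkable grid with king-style moves, using Dijkstra's algorithm from the standard library only. The layout is a toy example, not the lab geometry. For simplicity, diagonal steps cost √2 and may cut past wall corners, which a real model may want to forbid.

```python
import heapq
from math import inf, sqrt


def grid_shortest_path(grid, start, goal):
    """Length of the shortest 8-direction path; grid[r][c] is True if walkable."""
    moves = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return d
        if d > dist[(r, c)]:
            continue  # stale heap entry
        for dr, dc in moves:
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc]:
                nd = d + (sqrt(2) if dr and dc else 1.0)
                if nd < dist.get((nr, nc), inf):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return inf  # goal not reachable


layout = [
    "..#..",
    "..#..",
    ".....",
]  # '#' marks a wall square, '.' a walkable square
grid = [[ch == "." for ch in row] for row in layout]
distance = grid_shortest_path(grid, (0, 0), (0, 4))
print(f"{distance:.3f} grid units")
```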
---
## BIM model of histopathology lab building
<span class=small>Digital scanning (Stage 8) room is on a different floor than main lab</span>
<img src="https://hackmd.io/_uploads/ByuqJy0a0.png" alt="drawing" width="50%"/>
---
## Grid overlay in Python `shapely` library
<span class=small>Red grid squares show doors `d1` to `d16`</span>
<img src="https://hackmd.io/_uploads/HJ-_x1RT0.png" alt="drawing" width="50%"/>
---
### Graph and heatmap showing path segments between doors in the lab
<span class=small>Highlighted squares in right figure denote lift travel</span>
<img src="https://hackmd.io/_uploads/ryYX-10aA.png" alt="drawing" width="50%"/> <img src="https://hackmd.io/_uploads/Sk7U-106R.png" alt="drawing" width="40%"/>
Final step: map doors to process stages and compute shortest paths between stages
---
## Scenario comparison
<img src="https://hackmd.io/_uploads/ryt0MyCa0.png" alt="drawing" width="40%"/>
<span class=small>A small change in total travel time (avg. 70 additional seconds, due to a lift breakdown) causes a significant change in lab performance (proportion of specimens completed in 10 days)</span>
---
## Advantages of Python model
- Only open-source software was used, which reduces potential costs for healthcare administrators/analysts
- "[Shoestring](https://www.digitalshoestring.net/shoestrings-first-hospital-pilot-begins/)" paradigm — integration of existing free/low-cost technologies to deliver new tech solutions
<hr/>
## Drawbacks of Python model
- Python simulation model is missing some features from the Arena model (timed gates, bootstrapping)
- Implementable, but much more work than in Arena
- Harder to iterate on and improve the model than with a visual tool
- (Can potentially try existing open-source visual simulation tools, e.g. [JaamSim](https://jaamsim.com/) and [Warteschlangensimulator](https://a-herzog.github.io/Warteschlangensimulator/) — both Java-based)
---
## Some Python libraries used
<div class="two-column-layout">
<p>
- `ifcopenshell`: Building information modelling
- `shapely`: Geometric representation of histopathology lab
- `networkx`: Path-segment representation of histopathology lab, shortest-paths computation
- `openpyxl`: parse Excel configuration files
- `pandas`: dataframes
</p>
<p>
- `pydantic`: input validation
- `salabim`: Discrete event simulation
- `matplotlib`/`plotly`: plotting
- `jupyter`: Python notebooks
- `dash`: web UI
- `pymongo`: database
- `rq`: job queue
</p>
</div>
---
## Future work
- Data integration
- Bootstrap the initial simulation state
- Obtain data on planned disruptions (e.g. lift maintenance schedule)
- Obtain staff rota
- Asset management (e.g. processing machine cannot run if chemical stores empty)
- Model iteration/refinements and validation
---
# Thank you!
### Any questions?