# output analysis - use cases
related documents:
* this: https://hackmd.io/@rig/rJcMZpUPt
* terminology suggestion:
https://hackmd.io/@rig/HJosMHf_K
## Typical use cases
**Prioritization:** by functionalities which are
* **A:** _important_ and already _included_ in the existing scripts/tools
* **B:** _important_ but only _partly included_ in the existing scripts/tools
* **C:** nice-to-have but do _not_ need to be _prioritized_.
They have been _partly implemented_ already.
**Use cases:**
---
### @Johannes
#### Modes
- [x] What is the modal **split** ?
(for this scenario, w/ or w/o filters)
absolute and relative?
- by trip: trips_df
-> trip counts by mode (main or ldm)
- by leg: legs_df
-> leg counts by mode
+ _possible check_ for discrepancies
due to main mode classifications
- [x] What is the modal split including intermodal trips?
- trips_df -> mode_chain
by transformer, generate additional column with required mode specification
e.g.: "bike+pt", ...
- [x] What is the modal split for people living in a specific area?
(or traveling from one area to another)?
- trips df + person_selection + location_selection + other selections
> [name=Gerald Richter]**implementation:** by person attribute filter `home_location` (even from additional table),
> or source/destination traffic by trips
- [x] What is the modal shift from one scenario to another?
What was the trip mode in the alternative scenario?
- Comparing trip ids of two trip dfs
- output: e-sankey
> [name=Gerald Richter]matching criterion understood ?
>
> Jo: matching by trip_IDs between scenarios
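
The modal split and the modal shift discussed above can be sketched as follows. This is a minimal sketch assuming pandas and trips tables with the columns `trip_id` and `main_mode`; the column names and the sample rows are assumptions for illustration, not actual simulation output.

```python
import pandas as pd

# Hypothetical trips tables for two scenarios; column names are assumptions.
trips_base = pd.DataFrame({
    "trip_id": ["p1_1", "p1_2", "p2_1", "p3_1"],
    "main_mode": ["car", "car", "pt", "bike"],
})
trips_alt = pd.DataFrame({
    "trip_id": ["p1_1", "p1_2", "p2_1", "p3_1"],
    "main_mode": ["car", "pt", "pt", "car"],
})


def modal_split(trips: pd.DataFrame, mode_col: str = "main_mode") -> pd.DataFrame:
    """Absolute and relative trip counts per mode."""
    counts = trips[mode_col].value_counts().sort_index()
    return pd.DataFrame({"trips": counts, "share": counts / counts.sum()})


def modal_shift(base: pd.DataFrame, alt: pd.DataFrame) -> pd.DataFrame:
    """Cross-tabulate the modes of trips that share a trip_id in both scenarios."""
    matched = base.merge(alt, on="trip_id", suffixes=("_base", "_alt"))
    return pd.crosstab(matched["main_mode_base"], matched["main_mode_alt"])


split = modal_split(trips_base)
shift = modal_shift(trips_base, trips_alt)
print(split)
print(shift)
```

Each cell of `shift` is a (base mode, alternative mode, trip count) triple, which is the shape a Sankey diagram needs as input.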
#### Counts
- [A] How many trips take place in a particular time interval in a defined area?
- trips df + time selection + location selection + ...
> [name=Gerald Richter]again source/destination traffic by trips, within time frame
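
A combined time and location selection on the trips table could look like the sketch below. It assumes pandas, departure times in seconds (`dep_time`) and origin coordinates (`start_x`, `start_y`); these names are assumptions, and the bounding box only stands in for a real zone geometry.

```python
import pandas as pd

# Hypothetical trips table; column names and values are assumptions.
trips = pd.DataFrame({
    "trip_id": ["a", "b", "c", "d"],
    "dep_time": [7 * 3600, 8 * 3600 + 900, 8 * 3600 + 1800, 17 * 3600],  # seconds
    "start_x": [10.0, 12.0, 55.0, 11.0],
    "start_y": [10.0, 14.0, 60.0, 12.0],
})


def count_trips(trips: pd.DataFrame, t_from: int, t_to: int, bbox: tuple) -> int:
    """Count trips departing in [t_from, t_to) whose origin lies inside bbox.

    bbox is (xmin, ymin, xmax, ymax).
    """
    xmin, ymin, xmax, ymax = bbox
    in_time = (trips["dep_time"] >= t_from) & (trips["dep_time"] < t_to)
    in_area = trips["start_x"].between(xmin, xmax) & trips["start_y"].between(ymin, ymax)
    return int((in_time & in_area).sum())


n_morning = count_trips(trips, 8 * 3600, 9 * 3600, bbox=(0.0, 0.0, 20.0, 20.0))
print(n_morning)
```

Destination traffic would use the same filter on the end coordinates instead.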
#### Distances
- [B] What are the vehicle kilometers driven by people traveling (through) or living in an area
or people with a particular attribute (e.g. owners of a driving license)?
- legs df + location selection + ...
> [name=Gerald Richter]+ again: source/destination traffic
> selection by crossing through an area would be link (or geometric) analysis of `events` or `executed_plans`
#### Emissions
- [x] Emissions: What are the emissions in one scenario?
- either: with emissions module emissions.xml
format just as events, only 7-fold (emission types) size
- or: vkm * factors from Umweltbundesamt
#### Sampling features
- [C] Grossing-up factors (Hochrechnungsfaktoren) of the population
Agents represent differently sized shares of the population
__NOTE:__
- *filter* implementation by providing a simple `person_id-root` table from the initial survey data set
e.g.: hh_pers_day -> Nsamples, \<other attributes\>, ...
then having a filter doing string match
- *count* implementation by string transform of person_ids from population
- [C] Population distribution for validation with the actual population data
- e.g. trips df or person df if the first home location is stated
- further analysis: Visualisation in a map (important if ??)
__NOTE:__
- again by some `person_id-root` table
- control of the features of the population as it was done for the Singapore MATSim model:
https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/306926/1/ab790.pdf Table 1
> [name=Gerald Richter]but not fully clear what to extract. @Jo: examples?
> is that a selection by certain person_ids and then doing some mobility analyses?
#### Spatial evaluation
- [C] Traffic load in a specific area at specific times:
- `executed_plans.xml`
- or `events.xml`
- Iterate over links by time
> [name=Gerald Richter]seems to be going into events details
- [C] Emissions in a specific area at specific times
> that would be link summaries, most likely
> or what else?
- [C] Which pt interchanges do agents use (at particular times)
- legs_df -> locations (x,y coordinate) when pt leg starts and ends
__NOTE:__ filtering on `access_stop_id`, `egress_stop_id` or `link_id`
- [C] In which areas and at what times does congestion appear?
- Calculate traffic volumes per link in a particular time
- Use capacity of links
- [C] Has congestion disappeared / increased?
- Comparison between multiple scenarios
> from which output data?
> - linkstats?
> - or accumulating events?
> - or legs filter by mode + by links accumulation (**THIS** might be good)
> capacity of links for comparison?
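
The "volumes per link and time, compared with link capacity" idea can be sketched as below. This is only a sketch assuming pandas and a table of link-entry records (for example extracted from `events`) with the columns `link_id` and `time` in seconds; the column names, the capacities and the threshold of 1.0 are assumptions. Running it once per scenario and comparing the flagged rows would show where congestion appears or disappears.

```python
import pandas as pd

# Hypothetical link-entry records; times in seconds since midnight.
entries = pd.DataFrame({
    "link_id": ["l1", "l1", "l1", "l2", "l1", "l2"],
    "time": [8 * 3600 + 10, 8 * 3600 + 50, 8 * 3600 + 3000,
             8 * 3600 + 100, 9 * 3600 + 5, 9 * 3600 + 20],
})
# Hypothetical hourly link capacities in vehicles per hour.
capacity = pd.Series({"l1": 2.0, "l2": 4.0})


def hourly_load(entries: pd.DataFrame, capacity: pd.Series) -> pd.DataFrame:
    """Vehicles per link and hour, with the volume/capacity ratio."""
    volume = (
        entries.assign(hour=entries["time"] // 3600)
        .groupby(["link_id", "hour"])
        .size()
        .reset_index(name="volume")
    )
    volume["capacity"] = volume["link_id"].map(capacity)
    volume["vc_ratio"] = volume["volume"] / volume["capacity"]
    return volume


load = hourly_load(entries, capacity)
# Flag link-hours whose volume exceeds capacity (threshold is an assumption).
congested = load[load["vc_ratio"] > 1.0]
print(congested)
```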
---
### @Paul
#### distances / times
- **outputs:**
- vehicle-km per time interval (e.g. 24 h),
- person-km per time interval
- vehicle-h per time interval,
- person-h per time interval
- Global (entire network model)
- Selected area (e.g. all trips starting in the city area)
> **implementation:** by origin facility/link selection
- Link-wise (for each link / link type)
> **implementation:** by link selections
- Leg-wise (for each executed trip/leg)
> independent of trip purpose or mode used, just for the trip/leg
- Purpose-wise (for each trip purpose / destination purpose)
> **implementation:** selection by purposes, but need destination
- [x] Mode-wise (for each transport mode)
- [ ] Transfer times (change of mode?)
defines the total time spent boarding and alighting vehicles over the whole trip.
there could be two values
1. uni-modal and
- e.g. pt-pt
2. multi-modal (e.g. when changing between modes)
> **implement by:** legs + ???
> - [x] `wait_time` is just result of drt module
> - [ ] not clearly mapped
> same as depart - arrive from leg to next (for same trip) ?
> - [ ] **CHECK** in simulation
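
The open question above (transfer time as the gap between arriving with one leg and departing with the next leg of the same trip) can be tried out with a sketch like this one. It assumes pandas and a legs table with `trip_id`, `dep_time`, `trav_time` (seconds) and `mode`; these names are assumptions. Whether the simulation output really leaves such gaps between legs is exactly what still needs to be checked.

```python
import pandas as pd

# Hypothetical legs table; column names and values are assumptions.
legs = pd.DataFrame({
    "trip_id": ["t1", "t1", "t1", "t2"],
    "dep_time": [0, 700, 1500, 100],     # seconds
    "trav_time": [600, 700, 300, 900],   # seconds
    "mode": ["pt", "pt", "bike", "car"],
})


def transfer_gaps(legs: pd.DataFrame) -> pd.DataFrame:
    """Sum of gaps between consecutive legs per trip, split by transfer kind."""
    legs = legs.sort_values(["trip_id", "dep_time"]).copy()
    legs["arr_time"] = legs["dep_time"] + legs["trav_time"]
    # Departure time and mode of the following leg within the same trip.
    nxt = legs.groupby("trip_id")[["dep_time", "mode"]].shift(-1)
    legs["gap"] = nxt["dep_time"] - legs["arr_time"]
    same_mode = legs["mode"] == nxt["mode"]
    legs["kind"] = same_mode.map({True: "uni-modal", False: "multi-modal"})
    # The last leg of each trip has no successor and therefore no gap.
    with_gap = legs.dropna(subset=["gap"])
    return with_gap.groupby(["trip_id", "kind"])["gap"].sum().reset_index()


gaps = transfer_gaps(legs)
print(gaps)
```

Single-leg trips such as `t2` drop out of the result because they contain no transfer.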
#### Modes
- **outputs:**
- Modal Split (counts)
- Modal Share (by time, distance)
- Global (entire network model)
- [x] Selected area (origin in urban development area)
- [x] Trip purpose
- [x] Person group (if possible)
**implementation:** like person filter
- [x] Modal split by distance classes
"distance classes (in combination with trip distances)"
> e.g. 5 classes of distances travelled by employees
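
A modal split by distance classes could be derived as sketched below, assuming pandas and a trips table with `main_mode` and a distance column in metres (`distance_m` is an assumed name). The class boundaries are placeholders; the actual classes are still to be defined.

```python
import pandas as pd

# Hypothetical trips table; column names and values are assumptions.
trips = pd.DataFrame({
    "main_mode": ["walk", "bike", "car", "car", "pt", "car"],
    "distance_m": [400, 2500, 3000, 12000, 15000, 40000],
})

# Placeholder class boundaries in metres (left-closed intervals).
bins = [0, 1000, 5000, 20000, float("inf")]
labels = ["<1 km", "1-5 km", "5-20 km", ">20 km"]

trips["dist_class"] = pd.cut(trips["distance_m"], bins=bins, labels=labels, right=False)

# Share of each mode within every distance class (each row sums to 1).
split_by_class = pd.crosstab(trips["dist_class"], trips["main_mode"], normalize="index")
print(split_by_class)
```

Dropping `normalize="index"` yields the absolute counts (modal split), while weighting by distance or travel time instead of counting rows would give the modal share.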
#### Specials
- [x] Kinds of traffic streams
(counts, e.g. internal, origin and destination traffic)
- [x] traffic loads
- daily average
- hourly traffic
- [ ] waiting times
> legs + `wait_time`
- [ ] Number of stops (traffic QoS / service level / congestion)
> Number of stops by vehicles in the simulation, as mentioned by Markus Straub. As vehicles don't stop in MATSim,
> we need some sort of speed band within which we define a "stop", e.g. VISSIM defines a stop/traffic jam at speeds below 5-10 km/h.
> NOTE: the mean speed on the link would be a measure calculable on link exit, which also means that `events` analysis is required.
* from link travel times?
very imprecise - below min speed: estimate #stops from queue length?
**TRICKY** stuff
- [x] traffic count deviations
(invented by Geoffrey E. Havers)
[$GEH_h$ value](https://de.wikipedia.org/wiki/GEH-Wert) for assignment quality
$$
GEH_h = \sqrt{2 (M-C)^2 \over M+C}.
$$
Where $GEH_h$ refers to hourly traffic volumes, $M$ is the modeled traffic volume and $C$ the measured / observed traffic volume.
* implementation is clear
* isn't it used in calibration?
@Johannes
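
The GEH formula above translates directly into a small function. A minimal sketch using only the standard library; the function name and the handling of the case where both volumes are zero are assumptions.

```python
import math


def geh(modelled: float, counted: float) -> float:
    """GEH statistic for one pair of hourly volumes (M = modelled, C = counted)."""
    total = modelled + counted
    if total == 0:
        # Both volumes are zero: treated as no deviation (an assumption).
        return 0.0
    return math.sqrt(2.0 * (modelled - counted) ** 2 / total)


# Example with made-up hourly volumes for a single count station.
value = geh(modelled=1100.0, counted=1000.0)
print(round(value, 2))
```

Applied per count station and hour, the resulting values can be compared against a threshold; a GEH below 5 is commonly cited as an acceptable fit.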
- [ ] Catchment areas (e.g. P+R facilities)
- [ ] how exactly is P&R implemented (discussion between Paul, Stubi & Gerald)
> would this be geometric constraint, or facility based?
Could be both:
* selecting an area and show a number of facilities covered by P+R
within 15 mins
> **implementation unclear:** selection on what ?
> would at least need map of P+R facilities and surrounding facilities reachability
> * **geometric constraints** are rather easy
> * **isochrones** are more challenging
* or agent based nearby P+R facilities
> meaning by agent's household location?
> **implementation unclear:** "nearby" - again requires definition
> * see above
- [x] QSV value from the HBS
(A-F for "quality of traffic flow", **code** from ODYSSEUS **available**)
QSV = "Qualitätsstufen des Verkehrs" (traffic quality levels); the value describes the lane-based traffic density
$$
\begin{aligned}
k_{FS} &= \frac{q}{V_F} \cdot f_{FS} \\
QSV &= \mathrm{categorize}(k_{FS})
\end{aligned}
$$
Where $f_{FS}$ refers to a limiting factor based on the lane layout and the area in which the road is located
$q$ : cross-sectional vehicle flow in one direction
$V_F$ : mean speed

"anbaufrei": no houses on the left or right of the road
- [x] seal of Stubi-approval: **SOLVED**
- Ganglinien (time-series profiles)
**implement by:** plots of _something_ along several timeslices
Yes, e.g.
* link-based hourly number of vehicles
* model-wide begin-times of trips, with filter on purpose/length/main mode
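
The second example above (model-wide begin times of trips, filtered by main mode) could be produced as in the following sketch. It assumes pandas and a trips table with `dep_time` in seconds and `main_mode`; both names are assumptions. Departures after 24:00, which a simulation may contain, would fall outside the 24 hourly bins used here. The resulting series can be passed to any plotting library.

```python
from typing import Optional

import pandas as pd

# Hypothetical trips table; column names and values are assumptions.
trips = pd.DataFrame({
    "dep_time": [6 * 3600 + 30, 7 * 3600, 7 * 3600 + 1800, 8 * 3600 + 10, 17 * 3600],
    "main_mode": ["car", "car", "pt", "car", "car"],
})


def hourly_profile(trips: pd.DataFrame, mode: Optional[str] = None) -> pd.Series:
    """Number of trip starts per hour of day, optionally for one main mode."""
    if mode is not None:
        trips = trips[trips["main_mode"] == mode]
    hours = trips["dep_time"] // 3600
    # Reindex so that hours without any trip start show up as zero.
    return hours.value_counts().reindex(range(24), fill_value=0)


profile_all = hourly_profile(trips)
profile_car = hourly_profile(trips, mode="car")
print(profile_car)
```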
---
## task structure / architecture draft
### use cases

### conceptual
for building this, use strategy pattern
* data context (handler):
can load different data sources, e.g. network, population, ...
? maybe provides methods to write out some aggregates of data
or move to result writer
* provides the data stock to use in filters, aggregation steps and analysis
* extensible by user/use case
* filters / selectors:
* can work on different elements in data context
* provides methods to filter from this data
* extensible by user/use case
? result writer
see above
* aggregator:
does some transformations e.g. statistical aggregates on (filtered) data
* provides methods to do that
* extensible by user/use case
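
The three roles above (data context, filters / selectors, aggregator) could fit together as in the following sketch. It uses only the standard library and plain lists of dicts in place of real tables; all class and function names are hypothetical and only illustrate the strategy pattern, they are not an agreed design.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Filter = Callable[[Iterable[Row]], List[Row]]    # strategy: select rows
Aggregator = Callable[[Iterable[Row]], object]   # strategy: summarise rows


@dataclass
class DataContext:
    """Holds the loaded data sources by name (e.g. trips, legs, network)."""
    tables: Dict[str, List[Row]] = field(default_factory=dict)

    def add(self, name: str, rows: List[Row]) -> None:
        self.tables[name] = rows


@dataclass
class Analysis:
    """Applies interchangeable filters, then one aggregator, to one table."""
    table: str
    filters: List[Filter]
    aggregator: Aggregator

    def run(self, ctx: DataContext) -> object:
        rows = ctx.tables[self.table]
        for apply_filter in self.filters:
            rows = apply_filter(rows)
        return self.aggregator(rows)


def by_purpose(purpose: str) -> Filter:
    """Build a filter that keeps rows with the given purpose."""
    return lambda rows: [r for r in rows if r["purpose"] == purpose]


def count_by_mode(rows: Iterable[Row]) -> Counter:
    """Aggregate rows into counts per mode."""
    return Counter(r["mode"] for r in rows)


ctx = DataContext()
ctx.add("trips", [
    {"mode": "car", "purpose": "work"},
    {"mode": "pt", "purpose": "work"},
    {"mode": "car", "purpose": "leisure"},
])
result = Analysis("trips", [by_purpose("work")], count_by_mode).run(ctx)
print(result)
```

Because filters and aggregators are plain callables, a user can add a new one without touching `DataContext` or `Analysis`.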
## implementation details
as mentioned, using strategy pattern
### filtering / selecting
general agreement: (@Johannes suggestion/demand)
seems to be that it can be based on facility IDs
### extensibility:
suggestion:
do this by registering functions by a certain name
then callable by those names in an analysis configuration
as is done currently for the Levitate filtering,
which right now is not convenient to extend
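
Registering functions under a name and calling them from an analysis configuration could look like the sketch below. It uses only the standard library; the registry, the decorator, the filter name `facility_ids` and the configuration layout are all hypothetical and do not describe the existing Levitate filtering.

```python
from typing import Callable, Dict, List

# Registry mapping configuration names to filter functions.
FILTERS: Dict[str, Callable] = {}


def register_filter(name: str) -> Callable:
    """Decorator that makes a filter function available under a config name."""
    def decorator(func: Callable) -> Callable:
        if name in FILTERS:
            raise ValueError(f"filter {name!r} is already registered")
        FILTERS[name] = func
        return func
    return decorator


@register_filter("facility_ids")
def filter_by_facility(rows: List[dict], ids: List[str]) -> List[dict]:
    """Keep rows whose (assumed) 'facility_id' field is in the given list."""
    wanted = set(ids)
    return [r for r in rows if r["facility_id"] in wanted]


# Hypothetical analysis configuration: filter names plus keyword arguments.
config = [{"filter": "facility_ids", "args": {"ids": ["f1", "f3"]}}]

rows = [{"facility_id": "f1"}, {"facility_id": "f2"}, {"facility_id": "f3"}]
for step in config:
    rows = FILTERS[step["filter"]](rows, **step["args"])
print(rows)
```

A user-defined filter then only needs the decorator and an entry in the configuration, without changes to the core code.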