# output analysis - use cases

related documents:
* this: https://hackmd.io/@rig/rJcMZpUPt
* terminology suggestion: https://hackmd.io/@rig/HJosMHf_K

## Typical use cases

**Prioritization:** by functionalities which are
* **A:** _important_ and already _included_ in the existing scripts/tools
* **B:** _important_ but only _partly included_ in the existing scripts/tools
* **C:** nice-to-have but do _not_ need to be _prioritized_. They have been _partly implemented_ already.

**Use cases:**

---

### @Johannes

#### Modes

- [x] What is the modal **split**? (for this scenario, w/ or w/o filters) absolute and relative?
    - by trip: trips_df -> trip counts by mode (main or ldm)
    - by leg: legs_df -> leg counts by mode
    - additionally: _possible check_ for discrepancies due to main mode classifications
- [x] What is the modal split including intermodal trips?
    - trips_df -> mode_chain by transformer, generate additional column with required mode specification, e.g. "bike+pt", ...
- [x] What is the modal split for people living in a specific area (or traveling from one area to another)?
    - trips df + person_selection + location_selection + other selections
    > [name=Gerald Richter]**implementation:** by person attribute filter `home_location` (even from additional table),
    > or source/destination traffic by trips
- [x] What is the modal shift from one scenario to another? What was the trip mode in the alternative scenario?
    - Comparing trip ids of two trip dfs
    - output: e-sankey
    > [name=Gerald Richter]matching criterion understood ?
    >
    > Jo: matching by trip_IDs between scenarios

#### Counts

- [A] How many trips take place in a particular time interval in a defined area?
    - trips df + time selection + location selection + ...
    > [name=Gerald Richter]again source/destination traffic by trips, within time frame

#### Distances

- [B] What are the vehicle kilometers driven by people traveling (through) or living in an area or people with a particular attribute (e.g. owners of a driving license)?
    - legs df + location selection + ...
    > [name=Gerald Richter]+ again: source/destination traffic
    > selection by crossing through an area would be link (or geometric) analysis of `events` or `executed_plans`

#### Emissions

- [x] Emissions: What are the emissions in one scenario?
    - either: with emissions module
      the `emissions.xml` format is just like events, only 7-fold in size (emission types)
    - or: vkm * factors from Umweltbundesamt

#### Sampling features

- [C] Grossing-up factors (Hochrechnungsfaktoren) of the population
  Agents represent differently sized shares of the population

    __NOTE:__
    - *filter* implementation by providing a simple `person_id-root` table from the initial survey data set
      e.g.: hh_pers_day -> Nsamples, \<other attributes\>, ...
      then having a filter doing string match
    - *count* implementation by string transform of person_ids from population
- [C] Population distribution for validation with the actual population data
    - e.g. trips df or person df if first home location is stated
    - further analysis: Visualisation in a map (important if ??)

    __NOTE:__
    - again by some `person_id-root` table
    - control of the features of the population as it was done for the Singapore MATSim model: https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/306926/1/ab790.pdf Table 1
    > [name=Gerald Richter]but not fully clear what to extract. @Jo: examples?
    > is that a selection by certain person_ids and then doing some mobility analyses?

#### Spatial evaluation

- [C] Traffic load in a specific area at specific times:
    - `executed_plans.xml`
    - or `events.xml`
    - Iterate over links by time
    > [name=Gerald Richter]seems to be going into events details
- [C] Emissions in a specific area at specific times
    > that would be link summaries, most likely
    > or what else?
- [C] Which pt interchanges do agents use (at particular times)?
    - legs_df -> locations (x,y coordinate) when pt leg starts and ends

    __NOTE:__ filtering on `access_stop_id`, `egress_stop_id` or `link_id`
- [C] In which area and at what time does congestion appear?
    - Calculate traffic volumes per link in a particular time
    - Use capacity of links
- [C] Has congestion disappeared / increased?
    - Comparison between multiple scenarios
    > from which output data?
    > - linkstats?
    > - or accumulating events?
    > - or legs filter by mode + by links accumulation (**THIS** might be good)
    >
    > capacity of links for comparison?

---

### @Paul

#### distances / times

- **outputs:**
    - vehicle-km / time interval (e.g. 24h),
    - person-km / time interval
    - vehicle-h / time interval,
    - person-h / time interval
- Global (entire network model)
- Selected area (e.g. all trips starting in the city area)
    > **implementation:** by origin facility/link selection
- Link-wise (for each link / link type)
    > **implementation:** by link selections
- Leg-wise (for each executed trip/leg)
    > indifferent from trip-purpose or mode used, just for the trip/leg
- Purpose-wise (for each trip purpose / destination purpose)
    > **implementation:** selection by purposes, but need destination
- [x] Mode-wise (for each transport mode)
- [ ] Transfer times (change of mode?)
  defines the sum of time used for boarding and alighting vehicles in total over the whole trip.
  there could be two values
    1. uni-modal, e.g. pt-pt
    2. multi-modal (e.g. when changing between modes)

    > **implement by:** legs + ???
    > - [x] `wait_time` is just result of drt module
    > - [ ] not clearly mapped
    > same as depart - arrive from leg to next (for same trip) ?
    > - [ ] **CHECK** in simulation

#### Modes

- **outputs:**
    - Modal Split (counts)
    - Modal Share (by time, distance)
- Global (entire network model)
- [x] Selected area (origin in urban development area)
- [x] Trip purpose
- [x] Person group (if possible)
  **implementation:** like person filter
- [x] modal split by distance classes
  "distance classes (in combination with trip lengths)"
    > e.g. 5 classes of distances travelled by employees

#### Specials

- [x] Kinds of traffic streams (counts, e.g. internal, origin, destination traffic)
- [x] traffic loads
    - daily average
    - hourly traffic
- [ ] waiting times
    > legs + `wait_time`
- [ ] Number of stops (traffic QoS / service level / congestion)
    > Number of stops by vehicles in simulation, as mentioned by Markus Straub, as vehicles don't stop in MATSim
    > we need some sort of bandwidth where we define a "stop", e.g. VISSIM defines a stop/traffic jam at speeds below 5-10 km/h
    > NOTE: that mean speed on the link would be a measure calculable on link exit, also meaning `events` analysis is required.

    * from link travel times? very imprecise
        - below min speed: estimate #stops from queue length? **TRICKY** stuff
- [x] traffic count deviations
  [$GEH_h$ value](https://de.wikipedia.org/wiki/GEH-Wert) (named after its inventor Geoffrey E. Havers) for assignment quality
  $$
  GEH_h = \sqrt{\frac{2 (M-C)^2}{M+C}}.
  $$
  Where $GEH_h$ refers to hourly traffic volumes, $M$ is the modeled traffic volume and $C$ the measured / observed traffic volume
    * implementation is clear
    * isn't it used in calibration? @Johannes
- [ ] Catchment areas (e.g. P+R facilities)
    - [ ] how exactly is P&R implemented (discussion Paul, Stubi & Gerald)
    > would this be geometric constraint, or facility based?

    Could be both:
    * selecting an area and showing the number of facilities covered by P+R within 15 mins
        > **implementation unclear:** selection on what ?
        > would at least need a map of P+R facilities and the reachability of surrounding facilities
        > * **geometric constraints** are rather easy
        > * **isochrones** are more challenging
    * or agent based nearby P+R facilities
        > meaning by agent's household location?
        > **implementation unclear:** "nearby" - again requires definition
        > * see above
- [x] QSV value from HBS (A-F for "quality of traffic flow", **code** from ODYSSEUS **available**)
  QSV = "Qualitätsstufen des Verkehrs" (quality levels of traffic), the value describes lane-based traffic density
  $$
  k_{FS} = \frac{q}{V_F} \cdot f_{FS} \\
  QSV = categorize(k_{FS})
  $$
  Where
    * $f_{FS}$ : limiting factor based on the form of the lanes and the area the road is located in
    * $q$ : cross-sectional flow of vehicles in one direction
    * $V_F$ : mean speed

  ![](https://i.imgur.com/skkGAVi.png)
  "anbaufrei": no buildings on the left or right side of the road
    - [x] seal of Stubi-approval: **SOLVED**
- Ganglinien (time-of-day profiles)
  **implement by:** plots of _something_ along several timeslices
  Yes, e.g.
    * link-based hourly number of vehicles
    * model-wide begin-times of trips, with filter on purpose/length/main mode

---

## task structure / architecture draft

### use cases

![](https://i.imgur.com/SB4wW3N.jpg)

### conceptual

for building this, use strategy pattern

* data context (handler): can load different data sources, e.g. network, population, ...
    * ? maybe provides methods to write out some aggregates of data or move to result writer
    * provides the data stock to use in filters, aggregation steps and analysis
    * extensible by user/use case
* filters / selectors:
    * can work on different elements in data context
    * provides methods to filter from this data
    * extensible by user/use case
* ? result writer: see above
* aggregator: does some transformations, e.g.
  statistical aggregates on (filtered) data
    * provides methods to do that
    * extensible by user/use case

## implementation details

as mentioned, using strategy pattern

### filtering / selecting

general agreement (@Johannes suggestion/demand) seems to be that it can be based on facility IDs

### extensibility:

suggestion: do this by registering functions under a certain name, which are then callable by those names in an analysis configuration,
as is done currently for the Levitate filtering, which right now is not convenient to extend
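
A minimal sketch of how the name-based registration could look, assuming plain Python and lists of dicts standing in for rows of the trips df. All names here (`FILTERS`, `AGGREGATORS`, `register`, `run_analysis`, the config keys, and the columns `mode` and `start_facility`) are hypothetical and only illustrate the idea; this is not the existing Levitate filtering code.

```python
from collections import Counter
from typing import Callable, Dict, List

Trip = Dict[str, str]

# Hypothetical registries: name -> strategy function.
FILTERS: Dict[str, Callable[..., List[Trip]]] = {}
AGGREGATORS: Dict[str, Callable[[List[Trip]], Dict[str, float]]] = {}


def register(registry: dict, name: str):
    """Decorator that stores a function in `registry` under `name`."""
    def decorator(func):
        if name in registry:
            raise ValueError(f"'{name}' is already registered")
        registry[name] = func
        return func
    return decorator


@register(FILTERS, "by_start_facility")
def by_start_facility(trips, facility_ids):
    """Keep trips whose start facility is in the given facility IDs."""
    wanted = set(facility_ids)
    return [t for t in trips if t["start_facility"] in wanted]


@register(AGGREGATORS, "modal_split")
def modal_split(trips):
    """Relative share of trips per mode (empty input gives an empty dict)."""
    counts = Counter(t["mode"] for t in trips)
    total = sum(counts.values())
    return {mode: n / total for mode, n in counts.items()}


def run_analysis(trips, config):
    """Apply the filters named in `config` in order, then the named aggregator."""
    for step in config.get("filters", []):
        trips = FILTERS[step["name"]](trips, **step.get("args", {}))
    return AGGREGATORS[config["aggregator"]](trips)


# Made-up example data and analysis configuration.
trips = [
    {"mode": "car", "start_facility": "f1"},
    {"mode": "pt", "start_facility": "f1"},
    {"mode": "car", "start_facility": "f2"},
    {"mode": "bike", "start_facility": "f3"},
]
config = {
    "filters": [
        {"name": "by_start_facility", "args": {"facility_ids": ["f1", "f2"]}},
    ],
    "aggregator": "modal_split",
}
result = run_analysis(trips, config)
print(result)
```

With this layout a user adds a new filter or aggregator by decorating one function; the analysis configuration only refers to the registered names, so `run_analysis` does not need to change.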