### WISHI virtual meeting
November 7th (Thursday), 7:00-8:30 PST (16:00-17:30 CET)
Draft agenda:
* IETF 106 [hackathon planning](https://github.com/t2trg/wishi/wiki/Preparation:-Hackathon-Planning)
* Data model versioning
* Continue OneDM discussion from [the Kista work meeting](https://hackmd.io/YcgV56AMR6a33R7P6pltiQ?view)
Attending:
* Ari Keränen
* Carsten Bormann
* Michael McCool
* Michael Koster
* Bruce Nordman
* Klaus Hartke
* Ivaylo Petrov
* Niklas Widell
# Hackathon planning
Carsten: could add specific points to OneDM
McCool: for WoT will focus on last two points: discovery and vocabulary models. Question: should CoRAL be used for discovery and semantic search? CoRAL is more of an interaction model, WoT too; good to compare and contrast the two. Want to look at SDF and at the OneDM model to annotate TDs. Use annotations to find stuff. Can do with CoRE RDs etc. CoAP protocol binding and ACE probably low priority. Could re-order the bullet points.
Ari: can edit the wiki
Ivaylo: Planning to work on YOUPI draft and implementation. Also CoRE conf.
Ari: could explore how to connect YOUPI descriptions to OneDM descriptions
Koster: what we'd like to do and talk about: not just the structure of SDF but how it's used. Having a topic in the WS on how semantic annotation is put into a TD and how the system works around it is something I'd like to request.
McCool: two parts: how to turn OneDM into TD. And how we use TDs; like queries for discovery.
Koster: yes, do you use stuff straight out of SDF in queries, or the RDF version?
McCool: hope there is a tool that spits out RDF from SDF. Good to have a section in the wiki for implementations. For example, if you want to work with a CoRE RD, is there an implementation to hack with?
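Not an existing tool, just a sketch of the idea: emit RDF triples from an SDF/OneDM property definition so it can be used to annotate TDs and queried later. The namespace URI, predicate names, and the term itself are made up for illustration.

```python
# Hypothetical sketch: turn one SDF/OneDM property into RDF triples (rdflib).
# The ONEDM namespace URI and all term names below are invented, not real OneDM IRIs.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

ONEDM = Namespace("https://onedm.example/models#")   # made-up namespace

g = Graph()
prop = ONEDM.currentTemperature                       # made-up SDF property term
g.add((prop, RDF.type, ONEDM.sdfProperty))
g.add((prop, RDFS.label, Literal("Current temperature")))
g.add((prop, ONEDM.unit, Literal("Cel")))

print(g.serialize(format="turtle"))
```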
Ari: we have per hackathon: https://github.com/t2trg/wishi/wiki/IETF-105-Hackathon#proto-setups
McCool: links to basic tools and libraries useful
Ari: also WISHI implementations: https://github.com/t2trg/wishi/wiki/Wiki:-Implementations
McCool: missing CoAP and CoRE implementations. CoRAL and RD in particular.
(e.g., https://github.com/t2trg/wishi/wiki/IETF-105-Hackathon#implementations)
Koster: Christian or someone might know of RD. We should make sure we know of all reference implementations. In hackathon could spend a few hours to get one working. We want implementation space going forward. Should ask Christian
Ivaylo: there's python implementation. We also have old implementation that's not open source.
McCool: if someone could provide as a service would be OK. Open source would be even better.
Koster: someone should start one if there isn't. Could start with Python.
McCool: where are the good CoAP implementations?
Koster: it's really dependent on link format more than CoAP
Ari: Californium has one: https://github.com/eclipse/californium.tools/blob/master/cf-rd/src/main/java/org/eclipse/californium/tools/ResourceDirectory.java
(but looks rather old)
Carsten: Christian has a Python implementation. Probably the most compatible at the moment. I have an old Ruby implementation, but it requires lots of attention before it can compete.
McCool: could ask link from Christian
Ari: https://github.com/chrysn/aiocoap
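For reference, a minimal client sketch following aiocoap's standard usage pattern: fetch /.well-known/core (CoRE Link Format) from a server such as an RD. The server address is hypothetical.

```python
# Minimal aiocoap client sketch; the RD address is a placeholder.
import asyncio
from aiocoap import Context, Message, GET

async def main():
    ctx = await Context.create_client_context()
    # Discover resources via CoRE Link Format
    request = Message(code=GET, uri="coap://rd.example.com/.well-known/core")
    response = await ctx.request(request).response
    print(response.payload.decode("utf-8"))

asyncio.run(main())
```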
Koster: could do semantic query extensions for posting TDs
AP (all): let's update the implementations wiki page
McCool: old version of TD also in NodeWot.
Koster: let's put pointer to that too
# Data model versioning (Carsten)
Slides: https://github.com/t2trg/wishi/blob/master/slides/2019-11-07-versions.pdf
Sometimes it seems that old problems in data modeling can be solved by adding versioning. Usually that makes things worse. Also, we all have very different conceptions of versions.
Background is in software versions. Useful if you have a stand-alone piece of software developed by a single organization that doesn't take input from customers and can move linearly. Not the case in data modeling.
Closer to data modeling is the library issue. If you want to compose software from components, you need the pieces to be able to check that the other pieces are compatible. Now the version doesn't describe the progression of the software itself, but of the interface. Software has lots of interfaces; we typically talk about library versions. Doing all this to provide independent evolution of independent entities. Still software, but interfaces are guiding the discussion.
What kinds of changes do we have? Adding something: a new procedure you can call in the library. Different version -- to use the feature you need to know it's there. Another thing: something existing gets new semantics. For example, a library that supported 24 character sets moves to Unicode and re-interprets parameters etc. Two terms often used incorrectly: backward compatibility and forward compatibility.
Backward: your evolved code can work with legacy system. Sometimes need bug compatibility. Reason why you sometimes leave backward compatibility.
Forward: the inverse; means that during evolution you make sure your existing systems tolerate new input. Depending on perspective, this is the same as backward, or something different. If you want extensibility, you need a way for one end of the relationship to evolve independently of the other. If you get into a situation where someone might send a new version of the format, you try to make sure the existing system can tolerate this new version; otherwise there is a strong disincentive to evolve. That's why forward compatibility is important. Some changes can be ignored, or are built so they can be ignored by the less evolved part. Other changes are designed to suppress interop.
Format compatibility. An interface is a set of interactions; a format is a very simple interface: convey an instance of the format. Only some ways of handling forward compatibility are available. Can do negotiation outside of the format: for example, when an SMTP server talks to a client, they exchange feature strings that let them find out what the other endpoint can do; after that they know which formats they can use between them. Sometimes you don't have this relatively tight coupling: you simply send things and have indicated evolution. The producer declares the evolution state and the consumer has no say; it has to get by with what it gets -- or not understand. Indicated evolution is often done where the interaction is complicated and needs human input. For example, when writing a Python program you don't ask which version of Python is coming and generate different versions of the program. Assumption: indicated evolution when talking about data models.
Versioning: project evolution onto a linear number space. Purpose: prevent false interop. When you get a new version, you know you can't use it until you have upgraded your software. In library interop we have "semantic versioning". Distinguishes 3 different kinds of changes: major, minor, patch. Major: you don't want to interop, it would not work. Minor: no need to prevent interop, but an increased feature number means new features. The patch number should be inconsequential. But the problem is that this expresses intent, not reality (there are bugs). Sometimes an increased minor version prevents interop.
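As an aside, a minimal sketch of the compatibility rule semantic versioning intends (same major version, provider's minor at least the consumer's); plain "major.minor.patch" strings assumed, pre-release tags and the intent-vs-reality caveat ignored.

```python
# Sketch of the semver compatibility rule described above (intent, not reality).
def is_compatible(required: str, provided: str) -> bool:
    req_major, req_minor, _ = (int(x) for x in required.split("."))
    prov_major, prov_minor, _ = (int(x) for x in provided.split("."))
    return prov_major == req_major and prov_minor >= req_minor

assert is_compatible("1.2.0", "1.4.7")        # new minor: new features, still interops
assert not is_compatible("1.2.0", "2.0.0")    # new major: interop intentionally prevented
```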
A better way: a set of identifiers. Can be used to introduce backward compatible changes. Especially good when some implementations implement different features. Don't need all implementations to implement a particular set to participate in evolution. Even here you might want features that are required: "must understand" features. Served by a major update of the semantic version, but that combines with the linear progression of versions. Two aspects: 1) features allow more decoupling of progress; 2) versions force you into a linear progression. Some people use Github commits as versions. That avoids the need to map everything to a linear progression, but you will have combinatorial explosion. Features use a set of identifiers, and the IDs need to be managed in some way. More complicated than versions, but can be a better reflection of reality.
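A sketch of the feature-identifier approach with ignore-unknown plus must-understand features; the message layout and feature names are invented for illustration.

```python
# Consumer-side handling of feature identifiers: unknown optional features are
# ignored (forward compatibility), unknown must-understand features cause rejection.
SUPPORTED = {"temperature", "humidity", "batching"}   # features this implementation knows

def accept(message: dict) -> bool:
    must_understand = set(message.get("required-features", []))
    optional = set(message.get("optional-features", []))

    missing = must_understand - SUPPORTED
    if missing:
        print(f"rejecting: required features not understood: {missing}")
        return False

    ignored = optional - SUPPORTED
    if ignored:
        print(f"ignoring unknown optional features: {ignored}")
    return True

accept({"required-features": ["temperature"], "optional-features": ["geolocation"]})
```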
HTTP example: the mother of all version upgrades. For a decade used 1.0. Lots of features were added as header fields, with the assumption of ignore-unknown. Over time had the feature set of 1.1 but expressed with header fields that sometimes contradict each other. But essentially had all the features needed. 1.1 was actually a cleanup. Was a major update. Didn't use semantic versioning. Re-interpreted some header fields and made other requirements. Typical example of a roll-up: multiple independent features rolled together to have something simpler. No new version in 20 years. New headers included but no new versions. The HTTP version number was useless until people wanted to tackle version 2: a complete replacement of the message layer. HTTP will again completely change the layer below. Not as drastic, but no expectation that there would be interoperability. HTTP has both feature and major version evolution. No intent that HTTP 3 will replace HTTP 2; they just have different domains of application.
McCool: built-in extension mechanism with headers. Built-in system.
Carsten: yes, header fields the major extension mechanism. Versions just for really big transitions. Not used in day-to-day evolution
Example: HTML. Just a handful of browsers. Weird situation: can't do changes without backward compatibility, or forward compatibility, but the second one is more lip service. Browser vendors are motivated to have you updating. New browsers need to be able to show old pages. And it's good for old browsers to work with new pages too -- at least for a time window, like 5 years. No version numbers, but lots of feature strings like "css" and property names. Weird user agent interpretation schemes. But not something we want to emulate.
What have we learned from this? Assuming we will have multiple implementations that evolve on different time scales, a linear version mechanism doesn't make a lot of sense for format evolution. Should use features with ignore-unknown, but also need must-understand features. Not much is gained by rolling up features, but sometimes there is a reason to do that. Not the major mechanism.
McCool: another thing we are struggling with in TD: optional vs mandatory. If changed later, it can cause compatibility problems.
Carsten: yes, mandatory would be must understand and optional one that you can ignore if don't want it
McCool: changing mandatory to optional confuses things
Koster: encourages to make profiles to deal with issue
McCool: still arguing what profiles really are
Carsten: essentially profiles might be a way to provide composition for feature names. Less drastic than versioning. Whole area of composition the next big area of fun in this space.
McCool: in WoT we have multiple profiles that have subsets that overlap -- gets complicated
Koster: have experience from ZigBee: profiles that overlap. With lighting for example: home automation has lighting that works differently. Fixed by getting rid of profiles and having a set you can pick and choose from.
McCool: issue in WoT if have industrial and home profile, which to use in smart building
Koster: would be OK if they were compatible, but the minimum feature sets are different. That's why profiles were created. Almost guaranteed to have conflicts if you have different sets.
McCool: need to be very careful how to define profiles. There was push-back on multiple profiles in WoT. Needs to be nested.
Koster: like levels
Carsten: one more thing: deprecation. Need ways to get rid of mistakes. One way is to do a new major version; clean slate. Maybe some deprecation mechanisms, like a replaces mechanism, are useful. When doing models, how does the model that is getting deprecated handle that? Same problem as with RFCs, where the original can't be changed. A bit of a model management issue. Profiles were an important addition to this. Next step is to write this down in more detail. Whoever wants to help, please contact me. Will upload slides.
Niklas: a key use of deprecation: terminology used early on turned out not to be a good one. The feature space needs to be updated.
Carsten: nice thing about numbers: they don't have additional connotations. Sometimes Github commits are good for that. Renaming is something you may need to do then.
McCool: in my talk in the WS I will have a short slot on how we did this in WoT.
Ari: good timing now; related activities ongoing for IPSO models and OneDM. Anyone who wants to participate in the activity please contact Carsten. Could turn into WISHI note and RG draft.
Bruce: is there a difference in how to do versioning for a protocol vs a data model?
Carsten: data model is protocol but very specific kind of protocol. With protocols have superset of considerations compared to data model. OTOH, not all protocols are as complicated as usual data models.
# OneDM (Koster)
For the workshop will be going through what we identified as gaps in the model. Originally actions, events, properties, data types. Maybe more of an RDF model is needed as well: UML for visuals, but an RDF model for a machine-readable schema of the model. Relate events, actions, properties, objects to the schema. And that things can have objects, that can have properties, etc. Lots of discussion of the processing rules of SDF: like how we have namespaces, how JSON pointers get resolved. Had discussions on how different pointers get interpreted -- need to be specific on this. Calling these processing rules. When you combine lots of SDF documents in a namespace, you can look at them as if they were a single doc. Want different entry points to this "virtual doc" to look at things with data types but also with re-usable actions, properties, events. Maybe not reusable as such but defined in the model as reusable. Processing rules like how the schema looks and the definition of the meta model. More documentation on how things go together and how references work. More clarity on that now. Will make slides and a presentation on that. Also working on model updates. Some changes have gone into the schema already. That's for the workshop. The Friday meeting had a good T2TRG focus.
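An illustrative sketch of the "virtual doc" and pointer-resolution idea: two SDF-style documents are merged and an odmRef-style reference is resolved with a JSON-pointer-like path. All keys, names, and the merge/resolution logic are simplified stand-ins, not the normative processing rules.

```python
# Two made-up SDF-style documents; the odm* keys mirror the odmRef construct
# discussed in the meeting but are illustrative only.
common = {
    "odmData": {
        "temperatureData": {"type": "number", "unit": "Cel"}
    }
}
thermostat = {
    "odmObject": {
        "Thermostat": {
            "odmProperty": {
                "currentTemperature": {"odmRef": "#/odmData/temperatureData"}
            }
        }
    }
}

# Combine the documents into one "virtual doc" (very simplified; real processing
# rules also cover namespaces and qualified references).
virtual_doc = {**common, **thermostat}

def resolve(doc: dict, pointer: str) -> dict:
    """Follow a '#/a/b/c' style pointer through nested dicts."""
    node = doc
    for part in pointer.lstrip("#/").split("/"):
        node = node[part]
    return node

ref = virtual_doc["odmObject"]["Thermostat"]["odmProperty"]["currentTemperature"]["odmRef"]
print(resolve(virtual_doc, ref))   # -> {'type': 'number', 'unit': 'Cel'}
```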
In terms of what we should be doing with it -- questions, comments? Other gaps I'm not recalling? Have slides for odmRef and semantic bread-crumbs, but maybe we don't need to go over them now.
Ari: ID mapping and protocol binding seems like open issue
Koster: what we decided is that how IDs are constructed and used is a construct of the protocol binding. Apparent when you consider that sometimes they are re-usable and sometimes not. Apparent in how they are used in LwM2M and Zigbee; different models. Want to focus on the model, not on how short-hand IDs are used. Wanted to focus on the model and the URIs that define the terms. The JSON pointer path is part of it; they are the canonical pointer. URIs are the source of truth. IDs are a scheme to compress things on the wire. Proposed to split IDs into separate files. Some would prefer them in the same file. Could have the ID optional and have it in the file. Profiles: could be an SDF profile where the IDs work a certain kind of way. Could be hints that are embedded into the file. Could have other annotations in SDF. Does play directly on what we could do with SDF. Will make an issue that we put ID as an optional parameter.
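A hypothetical illustration of "ID as an optional quality": an ecosystem-specific short ID travels with the definition but can be dropped when producing the binding-free common model. The key names and the namespaced-quality convention are invented for illustration.

```python
# Made-up definition carrying an ecosystem-specific short ID as an optional quality.
definition = {
    "odmProperty": {
        "currentTemperature": {
            "type": "number",
            "unit": "Cel",
            "ex:shortId": 5700,   # invented ecosystem quality (IPSO-style numeric ID)
        }
    }
}

def strip_ecosystem_qualities(node):
    """Drop namespaced extension qualities, keeping only the common model."""
    if isinstance(node, dict):
        return {k: strip_ecosystem_qualities(v) for k, v in node.items() if ":" not in k}
    return node

print(strip_ecosystem_qualities(definition))
```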
Ari: should make pattern out of this? For protocol binding stuff etc.
Koster: yes, could have different ways for different ecosystems how IDs work. Different name spaces.
Ari: This can apply not only to the IDs, but to other parameters as well and how do we do that. We can call them optional.
Koster: calling them optional makes lots of sense. To formalize that, there could be something like feature sets, github-branch-ID style. Could be versioned that way. Could be managed that way and have validation of a particular scheme. Could be ecosystem-specific extensions; if all are allowed only as qualities, they would not be able to modify how things work. Like Google asking for new features in the model could be hard, but as optional qualities it would work better. In terms of IDs, they would not even need to be a string or int but could be different for each ecosystem.
Carsten: who is the authority? Grouping is important. And pointing to what this means is important.
Koster: could have different orgs responsible of different things. Like fork in github repo. Could have feature branches that would be valid SDF. Could be authority source for models.
Carsten: less thinking in terms of Github branches. Want to consolidate to a single model. Can have compartments where ecosystems can put stuff. Looking at the compartments you can figure out what each one means.
Koster: ...also it would be in a different org's namespace. Would allow people to use OneDM as an internal thing too. Can support common models and be binding-free.
Carsten: danger is people create models that have ecosystem specific info without which the model doesn't make sense. Ability to ignore the ecosystem specific things is important.
Koster: if have strict validation of canonical SDF, would be harder to do, but not impossible. Once have all required info, you define set of E/A/P -- shouldn't prevent using that model in another ecosystem. Could be the danger. Not sure how effective we can be preventing it.
Koster: also, what do we want to do with SDF? Are we doing GWs and translations? We're not device manufacturers. Not using SDF to target making devices. But in general, for our research purposes: how do applications use SDF? TD is one way to do that. How we do discovery and target different ecosystems. What are the use cases and tools? A GW project is something OneDM folks are interested in doing. How to use SDF in a system. Working on a proposal for a GW. But in the workshop could also think about how to use SDF in a system. How to orchestrate interaction. Highly related to the RG.
McCool: in WoT also looking into multi-ecosystem support. A layer on top to have unified discovery. Can we use existing systems for this?
Koster: starting with RD one option
McCool: privacy and security absolute requirements for this. Some mechanisms don't do a good job here. Will bring up in the WS.
Koster: will be available most of the time remotely for the WS
McCool: can we post WebEx in advance? Can provide speaker phone and dedicate a laptop.
Ari: we'll use github repo to share slides; and folks presenting can join the webex session. Can setup WebEx right away.