Rethinking the Notebook Cells Weekly Meeting Minutes

When: Tuesday 8AM Pacific Time
Where: https://numfocus-org.zoom.us/j/82580785418?pwd=T0lhUVhhbzVRNm1NcCtXZVBudlFrdz09#success

June 20, 2023

Name	Affiliation	GitHub
tonyfast		@tonyfast
Afshin T. Darian	QuantStack	@afshin
Gabriel Fouasnon	Quansight Labs	@gabalafou
Stephannie Jimenez	Quansight Labs	@steff456

Agenda

https://github.com/jupyter/enhancement-proposals/pull/108
- a place to host schema: notebook, kernelspec, configuration, rest (open)api, event system, lsp.
https://github.com/jupyter/enhancement-proposals/pull/97
- replaces nbformat parameters with schema to
https://gameaccessibilityguidelines.com/

June 13, 2023

Name	Affiliation	GitHub
tonyfast		@tonyfast
Afshin T. Darian	QuantStack	@afshin
Gabriel Fouasnon	Quansight Labs	@gabalafou

Agenda

If there's time, maybe walk through the nbconvert variants
- @tonyfast response to @gabalafou: We are testing different representations of notebooks and cells with automated and manual testing. The notebook variants allow us to track the different versions of notebooks we've been testing for accessibility purposes. Some of the variants were specifically designed for user testing. Other experiments are designed to explore idealized representations of the notebooks and their annotation object model.
  - @gabalafou to @tonyfast: Thanks! What I was really trying to find out when I put this on the agenda was not so much a walk through to understand the architecture and how these variants are generated, but more specifically, because I don't have time to test every variation, which variants I should explore and test. Perhaps we can cover this in the next meeting.
each variant is defined by a jupyter configuration file. the configuration files we used to generate notebooks are found in this list https://github.com/Iota-School/notebooks-for-all/blob/main/pyproject.toml#L132

Some of the variants are from a parametric study to explore how cells would be configured as ordered lists, unordered lists, and definition lists; we represent them as tables and feeds, too. Through the parametric study we could explore the space of possible semantics.
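
A rough illustration of what "the same cells, different container semantics" means in the parametric study. This is only a sketch: `render_cells` and the variant names are made up for these notes and are not the actual nbconvert configuration in notebooks-for-all.

```python
from html import escape


def render_cells(sources, variant):
    """Render cell sources as HTML, changing only the container semantics."""
    size = len(sources)
    items = [f"<pre>{escape(src)}</pre>" for src in sources]
    numbered = list(enumerate(items, start=1))
    if variant == "ordered":
        return "<ol>" + "".join(f"<li>{item}</li>" for item in items) + "</ol>"
    if variant == "unordered":
        return "<ul>" + "".join(f"<li>{item}</li>" for item in items) + "</ul>"
    if variant == "definition":
        body = "".join(f"<dt>Cell {i}</dt><dd>{item}</dd>" for i, item in numbered)
        return f"<dl>{body}</dl>"
    if variant == "table":
        body = "".join(
            f'<tr><th scope="row">{i}</th><td>{item}</td></tr>' for i, item in numbered
        )
        return f"<table>{body}</table>"
    if variant == "feed":
        # ARIA feed pattern: each child article announces its position in the set.
        body = "".join(
            f'<article aria-posinset="{i}" aria-setsize="{size}">{item}</article>'
            for i, item in numbered
        )
        return f'<div role="feed">{body}</div>'
    raise ValueError(f"unknown variant: {variant}")


print(render_cells(["print(1)", "a < b"], "feed"))
```

The visual output can stay the same across variants; what changes is what assistive technology announces, which is why each variant needs its own screen reader test.
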

Notes

Discussion of work related to scrolling and virtual windowing.
Analogy: notebook as feed.
- late edit: ordered lists might have preferred semantics over a feed, but we can address this when we test with a screen reader.
There's a separate push to make JupyterLab (Notebook?) completely usable by keyboard only.
top level main > feed
we hope to modify the semantics of the jupyter notebook interface. there would be no visual changes. we will add roles and aria to improve the primary navigation of the page with assistive technology.

Summary: we spent this session discussing what a quality annotation object model would be.

we spent this session discussing what it would take to implement a more explicit accessibility object model for the new jupyter notebook. we reviewed the accessibility affordances of the notebooks for all project. our goal is to try to capture a similar annotation object model for the jupyter notebook release and live up to the accessible v7 promise. this effort would knock some items off the @manfromjupyter audit https://github.com/jupyter/notebook/issues/6800

in the near term, it would help to split up this issue like we did 9399.

cc: @steff456

May 23rd, 2023

Name	Affiliation	GitHub
tonyfast		@tonyfast
Nick Bollweg	GTech	@bollwyvl
Afshin T. Darian	QuantStack	@afshin

Agenda

cells/outputs for curriculum
accessibility in curriculum design
jupyter REST API for execution

Notes

jupyter kernel protocol over REST API
- for now, POST for code execute_request?
- mapping protocol to x-www-form-urlencoded/multipart?
- ??? for get_inspection
- OpenAPI?
- what is needed to be plausible
  - do a POST of execute_request
    - comes back with a 201 with a new URL
    - do a GET on the response
- with the accessibility work
  - every cell is a form
  - each cell has a separate end point
- the knowledge of context of compute is tacit, not shared
- jupyterlab-blockly
  - have this cell
    - renders the cell
    - uses a kernel as a service
    - uses a cell
  - see also
    - jupyterlab-outsource
    - jupyterlite "just one single cell" app
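
The POST, 201, GET flow from the notes above, sketched in memory. Nothing here is an existing Jupyter Server API: the `ExecutionService` class and the `/executions` path are placeholders, and a real version would forward the request to a kernel instead of calling `exec`.

```python
import contextlib
import io
import itertools


class ExecutionService:
    """In-memory stand-in for a hypothetical REST execution endpoint."""

    def __init__(self):
        self._results = {}
        self._ids = itertools.count(1)
        self._namespace = {}

    def post(self, path, body):
        """POST an execute_request; reply 201 with the URL of the new resource."""
        if path != "/executions":
            return 404, {}, None
        stdout = io.StringIO()
        status = "ok"
        try:
            with contextlib.redirect_stdout(stdout):
                exec(body["code"], self._namespace)
        except Exception as error:
            status = f"error: {type(error).__name__}"
        location = f"/executions/{next(self._ids)}"
        self._results[location] = {"status": status, "stdout": stdout.getvalue()}
        return 201, {"Location": location}, None

    def get(self, path):
        """GET the stored reply for a previously created execution."""
        if path in self._results:
            return 200, {}, self._results[path]
        return 404, {}, None


service = ExecutionService()
created, headers, _ = service.post("/executions", {"code": "print(6 * 7)"})
fetched, _, reply = service.get(headers["Location"])
print(created, headers["Location"], fetched, reply)
```

If every cell is a form with its own endpoint, the form's action would be the POST target and the returned URL would be where the cell's output lives.
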
capturing tacit knowledge as shared knowledge
- e.g. outputs for sys.modules, user_ns consumed and created
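
One way the tacit context could be reported as an output, as a sketch only. `run_and_report` is a made-up helper, and finding "consumed" names by walking the AST is an assumption of this sketch rather than anything agreed in the meeting.

```python
import ast
import sys


def run_and_report(code, user_ns):
    """Run code in user_ns and report the modules and names it touched."""
    modules_before = set(sys.modules)
    names_before = set(user_ns)
    # Names the code reads that already existed in the namespace.
    loaded = {
        node.id
        for node in ast.walk(ast.parse(code))
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }
    exec(code, user_ns)
    created = set(user_ns) - names_before - {"__builtins__"}
    return {
        "modules_loaded": sorted(set(sys.modules) - modules_before),
        "names_consumed": sorted(loaded & names_before),
        "names_created": sorted(created),
    }


namespace = {"x": 2}
print(run_and_report("import math\ny = x + 1", namespace))
```
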
reusing a single cell
- once it becomes one cell, that unit can be moved around
we always wrap our REST APIs
- websocket wrapper about as easy as it is
use case:
- working on a notebook
- how do i go find a thing that i did last week?
- open a (throwaway) cell
  - do a search in python
  - talks to search API (some way)
  - see some things
    - mime bundles with the type application/jupyter-cell+json
      - drag cell (by reference+cache or copy/fork) into the current document
    - code blocks are implicitly interactive
      - promote them to a cell
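
A sketch of the search use case above: a code block is promoted to a cell, carried in a mime bundle, and copied into the current document. The mimetype is the one named in the notes and is not a registered type as far as we know; the helper names are made up, and only the copy/fork path is shown, not reference+cache.

```python
import copy

CELL_MIMETYPE = "application/jupyter-cell+json"  # from the notes; assumed, not registered


def promote_code_block(source):
    """Promote a plain code block to a notebook-style code cell."""
    return {
        "cell_type": "code",
        "metadata": {},
        "source": source,
        "outputs": [],
        "execution_count": None,
    }


def to_bundle(cell):
    """Wrap a cell in a mime bundle with a plain-text fallback."""
    return {CELL_MIMETYPE: cell, "text/plain": cell["source"]}


def drop_into(notebook, bundle):
    """Copy (fork) the cell from the bundle into the notebook."""
    notebook["cells"].append(copy.deepcopy(bundle[CELL_MIMETYPE]))
    return notebook


notebook = {"cells": []}
search_hit = to_bundle(promote_code_block("df.head()"))
drop_into(notebook, search_hit)
print(notebook["cells"][0]["source"])
```
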
if a computer only has bluetooth inputs…
- should it allow you to turn off bluetooth?
if you are running your server as a kernel…
- should you be able to turn off kernel endpoints?
this conversation hits on things happening at a bigger level
- codifying as schema/mime
  - mime
  - kernel
    - provisioning
      - the jupyter machine
  - client
the work can't happen fast
- it's boring
- but not having to advocate

May 16th, 2023

Name	Affiliation	GitHub
tonyfast		@tonyfast
Nick Bollweg	GTech	@bollwyvl

Agenda

several outstanding JEPs need issue tags

May 2nd, 2023

Name	Affiliation	GitHub
tonyfast		@tonyfast
Angus Hollands	Princeton University	@agoose77

Agenda

several outstanding JEPs need issue tags
how do we actually implement extensions?
intermediate products should have compromising representations along the compute according to POUR-CAF

April 25th, 2023

Name	Affiliation	GitHub
tonyfast		@tonyfast
Nick Bollweg	GTech	@bollwyvl

Agenda

we collaborated on another JEP for jupyter.org subdomain and repository for publishing schemas. it was submitted by @zsailor https://github.com/jupyter/enhancement-proposals/pull/108

April 4th, 2023

Name	Affiliation	GitHub
tonyfast		@tonyfast
jeremy ravenal	naas	@jravenel
Angus Hollands	Princeton University	@agoose77
Afshin T. Darian	QuantStack	@afshin

Agenda

schema provide solutions for validation and ui
angus on extra schemas
- platform to design a different notebook that will allow different input cells.
- extension authors can encode some other validation logic
- mainly useful for the front end.
what is the history of this meeting
- spun out of a workshop that discussed modifying the notebook format
- we talked about different ways to create new cell types: is there one code cell or many different cells?
- there is no nice way for myst to store metadata, there is no way to enshrine metadata in the schema
naas - push notebooks to production seamlessly. package software, data, and chats.
sell things with demos
- raw cell bolt-on
- add a MIME / text entry widget to the cell view
  - specify the mimetype of the contents
- add a custom renderer based upon mime type
  - HTML
  - SVG
  - Form generation (one way, code generation, hacky!)
- Execute cell has two steps
  - Two steps, one "executes", another "renders"
- ability to select output mimetype as well?
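
The two-step idea (one step "executes", another "renders") with renderers picked by mimetype, as a minimal sketch. The registry, the `execute` step for a raw cell, and the preference order are all assumptions for illustration and not JupyterLab's rendermime API.

```python
from html import escape

RENDERERS = {}


def renderer(mimetype):
    """Register a render function for one mimetype."""
    def register(func):
        RENDERERS[mimetype] = func
        return func
    return register


@renderer("text/html")
def render_html(data):
    return data


@renderer("image/svg+xml")
def render_svg(data):
    return data


@renderer("text/plain")
def render_plain(data):
    return f"<pre>{escape(data)}</pre>"


def execute(source, mimetype):
    """Step one: 'execute' a raw cell by tagging its contents with a mimetype."""
    return {"text/plain": source, mimetype: source}


def render(bundle, preferred=("text/html", "image/svg+xml", "text/plain")):
    """Step two: render with the first preferred mimetype that is available."""
    for mimetype in preferred:
        if mimetype in bundle and mimetype in RENDERERS:
            return RENDERERS[mimetype](bundle[mimetype])
    raise LookupError("no renderer for any mimetype in the bundle")


bundle = execute("<b>hi</b>", "text/html")
print(render(bundle))
print(render(bundle, preferred=("text/plain",)))
```

Passing a different `preferred` order is one way to read "ability to select output mimetype".
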
we talked a lot about cell types
- use raw cells for another cell type
- use the kernelspec in cells to identify cell actions
steps to getting JEP accepted
- what implementation will need updating?
Split these conversations:
- $schema - uncontentious
- extraSchemas - motivates extension validation, needs discussion
- @context, annotation, etc. additional discussions!

March 28th, 2023

Name	Affiliation	GitHub
tonyfast		@tonyfast
Nick Bollweg	GTech	@bollwyvl
Steve Purves	Curvenote	@stevejpurves
Afshin T. Darian	QuantStack	@afshin

Agenda

markdown text format
- the jep has been submitted.
- does not modify the existing schema
- doesn't define a specific markdown flavor
problem with jeps: they aren't validated
- how could we use notebooks to validate jeps? how can we validate schema?
Nick: need to be able to reference schema from schema
- portable cells that can copy and paste across documents
- treat all cells as the same
- attachments are broken (not discoverable), UI is busted
- slugifying headers isn't consistent across implementations
- eventually register cells as a mimetype
extra schema use cases
- jupyter.org could/should host ui
- the format should avoid content validation unless explicitly in the purview of the schema
- vendors could provide content validation
how to demo?
- nbformat, jupyter server
- traitlets to schema
rjsf jep?
formal schema specification

March 14th, 2023

Name	Affiliation	GitHub
tonyfast		@tonyfast
Steve Purves	Curvenote	@stevejpurves
Jason Grout	Databricks	@jasongrout
Angus Hollands	Princeton University	@agoose77
Nick Bollweg	GTech	@bollwyvl

Agenda

discuss open jeps
- https://github.com/jupyter/enhancement-proposals/pull/97
  - top level $schema is not contentious
- https://github.com/jupyter/enhancement-proposals/issues/96
  - extra schemas might be contentious
    - need to resolve the root notebook schema and extra schemas can fail
    - we need to be able to turn things on and off in case of failure
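
A sketch of the failure handling discussed above: the root schema must pass, while extra schemas may fail and can be switched off. Plain callables stand in for real JSON Schema validators here, and `validate_notebook` and the example checks are made up for these notes.

```python
def validate_notebook(notebook, root_check, extra_checks, disabled=()):
    """Fail hard on the root schema; collect extra-schema failures as warnings.

    Each check is a callable that returns a list of error strings.
    """
    root_errors = root_check(notebook)
    if root_errors:
        raise ValueError(f"root schema failed: {root_errors}")
    warnings = {}
    for name, check in extra_checks.items():
        if name in disabled:
            continue  # the user turned this extra schema off
        try:
            errors = check(notebook)
        except Exception as error:
            errors = [f"extra schema could not be applied: {error}"]
        if errors:
            warnings[name] = errors
    return warnings


def root_check(notebook):
    return [] if "cells" in notebook else ["missing cells"]


def needs_title(notebook):
    has_title = "title" in notebook.get("metadata", {})
    return [] if has_title else ["missing metadata.title"]


notebook = {"cells": [], "metadata": {}}
print(validate_notebook(notebook, root_check, {"title": needs_title}))
print(validate_notebook(notebook, root_check, {"title": needs_title}, disabled={"title"}))
```
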
deprecation notes
- bump the metaschema to draft 2020-12. the current draft 4 doesn't support deprecation; the deprecated keyword wasn't introduced until draft 2019-09.
- at least a year for the deprecation. find a good reference for the deprecation cycle as precedent
- old validators will fail when additionalProperties: false, which will require updating existing nbformat schema
- precedent in nbformat for changes
- $schema takes precedence over nbformat and nbformat_minor
  - Require that $schema validates against a URI-template that captures major, minor version
  - Encode this in the schema with const
  - Can also do this in the metaschema, though it's less important.
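
A sketch of the precedence rule and the const idea from the deprecation notes. The URI layout below is purely hypothetical (the real template is whatever the JEP settles on), and `notebook_version` and `schema_const` are illustrative names.

```python
import re

# Hypothetical URI template capturing major and minor version.
SCHEMA_URI = re.compile(
    r"^https://jupyter\.org/schema/notebook/(?P<major>\d+)\.(?P<minor>\d+)/notebook\.schema\.json$"
)


def notebook_version(notebook):
    """Read the version from $schema if present, else from nbformat fields."""
    uri = notebook.get("$schema")
    if uri is not None:
        match = SCHEMA_URI.match(uri)
        if match is None:
            raise ValueError(f"unrecognized $schema URI: {uri}")
        return int(match["major"]), int(match["minor"])
    return notebook["nbformat"], notebook["nbformat_minor"]


def schema_const(major, minor):
    """Schema fragment that pins $schema to one exact URI with const."""
    uri = f"https://jupyter.org/schema/notebook/{major}.{minor}/notebook.schema.json"
    return {"properties": {"$schema": {"const": uri}}}


notebook = {
    "$schema": "https://jupyter.org/schema/notebook/4.6/notebook.schema.json",
    "nbformat": 4,
    "nbformat_minor": 5,
}
print(notebook_version(notebook))
print(schema_const(4, 6))
```
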
what is the jep process
- ask SSC what this process will look like
- software steering council is still being formed. jeps will be a priority
discuss work in progress
- Text based Format - https://hackmd.io/CmAhY_3tRK6ge4tqANflTg
- Cell's Markdown Format - https://docs.google.com/document/d/1B8mhaHud7DMY55q1mg5sSDhZ96FGC6cbJpypYO1BocA
- Persist user expressions - https://docs.google.com/document/d/110OJnl7baNeCz6Y5KnKaA4dpLdltB_fvpr2Q0Rf_36M

March 7th, 2023

Name	Affiliation	GitHub	Favorite Schema Key
tonyfast		@tonyfast	properties
fcollonval	QuantStack	@fcollonval
Angus Hollands	Princeton University	@agoose77
Rowan	Curvenote / ExecutableBooks	@rowanc1	@context
Nick Bollweg	Georgia Tech	@bollwyvl

Agenda

first meeting of the notebook cells schema group outside of the nbformat workshop.

Meeting logistics
- use hackmd for notes
- use google meet for video because jovyan is crowded
  - this account is limited to one hour so we have a real hard stop.
- the textual format team is working in other channels to submit their jeps.
Research
- which schema draft are we using?
- should only be adding cells and metadata
- how is this file format going to be reused?
- introduction of notebook mimetype. how do we carry around the mimebundle across documents and use that information.
- how do we use attachments better? where do attachments belong?
  - could attachments just be a cell? hold the whole mimebundle
- Distinguish between saving and reading - always uphold $schema, but not extraSchemas?
- Should extraSchemas allow embedding schema?
- Do we include @context?
  - Probably a separate JEP because the value proposition is a different learning curve.
Interests
- Rowan - standardization of notebooks in scientific publishing. dealing with authorship, title, subtitles, scholarship.

to do

follow up JEP shepherd
post an issue to the team compass
add the event to the community calendar

`$vocabulary`

does this provide the convention (and therefore the tools) we need?

https://gregsdennis.github.io/Manatee.Json/usage/schema/vocabs.html

"$vocabulary": {
    "https://json-schema.org/draft/2019-WIP/vocab/core": true,
    "https://json-schema.org/draft/2019-WIP/vocab/applicator": true,
    "https://json-schema.org/draft/2019-WIP/vocab/validation": true,
    "https://json-schema.org/draft/2019-WIP/vocab/meta-data": true,
    "https://json-schema.org/draft/2019-WIP/vocab/format": true,
    "https://json-schema.org/draft/2019-WIP/vocab/content": true,
    "https://myserver.net/my-vocab": true
},

Angus' understanding of vocabulary^[1]:
- Vocabularies allow meta schemas to define custom keywords, e.g. a units keyword that adds units to an integer:
```
 {
     "type": "number",
     "units": "kg/s"
 }
```
- One must create a new metaschema that defines these vocabularies, and copies the meta-schema that it "inherits" from (or use allOf?)
- The $vocabulary section of a metaschema lists the vocabularies, and a boolean flag of whether they constitute a failure if they cannot be located. The units keyword above does not affect validation, so it can safely be ignored if the validator cannot find the URI (it's metadata). Other keyword schemas might not be so permissive:
```
 {
     "type": "number",
     "isEven": true
 }
```
  A validator that ignored isEven would incorrectly accept documents with odd integers, but the essence of the schema is still upheld. A keyword that changed the "type" would not be ignorable if the validator is to be at all useful.
  
  Modern JSON Schema introduces vocabularies, which allow you to define a group of keywords and identify them with a URI. Schema authors can then use that URI to tell implementations that they need to support the vocabulary in order to use the schema. If they can't, instead of failing validation, the implementation refuses to run the schema and indicates which vocabularies it doesn't understand.^[2]
- i.e. $vocabulary solves the problem of "is this failure an 'unrecoverable' error?".
- We could use this to introduce a top-level extraSchemas field (?)
  - Crucially, it means that validators that don't understand what to do with extraSchemas don't try and validate the document.
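
A sketch of the required-versus-optional behaviour described above: a vocabulary flagged true must be understood or the validator refuses to run, while one flagged false can be ignored. The vocabulary URIs and the `check_vocabularies` helper are hypothetical; real validators implement this internally.

```python
class UnsupportedVocabulary(Exception):
    """Raised when a metaschema requires a vocabulary the validator lacks."""


def check_vocabularies(metaschema, supported):
    """Return optional vocabularies to ignore; refuse if a required one is unknown."""
    ignored = []
    for uri, required in metaschema.get("$vocabulary", {}).items():
        if uri in supported:
            continue
        if required:
            # Refuse to process the schema rather than report a validation failure.
            raise UnsupportedVocabulary(uri)
        ignored.append(uri)
    return ignored


SUPPORTED = {"https://json-schema.org/draft/2020-12/vocab/core"}

metaschema = {
    "$vocabulary": {
        "https://json-schema.org/draft/2020-12/vocab/core": True,
        "https://example.org/vocab/units": False,  # metadata only, safe to ignore
    }
}
print(check_vocabularies(metaschema, SUPPORTED))

# A hypothetical extraSchemas vocabulary marked as required.
metaschema["$vocabulary"]["https://example.org/vocab/extra-schemas"] = True
try:
    check_vocabularies(metaschema, SUPPORTED)
except UnsupportedVocabulary as error:
    print("refusing to validate, unknown vocabulary:", error)
```
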

Challenges

Extra schemas: Failure modes
- How can our approaches fail?
  - two conflicting extra schemas
- How can users save themselves if we break stuff? what happens when code/clients break?

Reference

Notes from the workshop: https://docs.google.com/document/d/1DMMUOYEhFxoAEKITOrCUK9x0vkTy68mfZ9clof3UrMc/edit#heading=h.2q8mfjoa85k9

JEP Drafts

$schema - https://hackmd.io/@u1M5398WTl6qOUg8YdOH0Q/r1ZInYjCi
extraSchemas - https://hackmd.io/9QZ8YibfQHm9l1B6JPSQsg
Pre-proposal JEP is out
Cell types - https://hackmd.io/EmDM0wm1Tli3VVW7KrTwJQ
