
Rethinking the Notebook Cells Weekly Meeting Minutes

June 20, 2023

Name Affiliation GitHub
tonyfast @tonyfast
Afshin T. Darian QuantStack @afshin
Gabriel Fouasnon Quansight Labs @gabalafou
Stephannie Jimenez Quansight Labs @steff456

Agenda

June 13, 2023

Name Affiliation GitHub
tonyfast @tonyfast
Afshin T. Darian QuantStack @afshin
Gabriel Fouasnon Quansight Labs @gabalafou

Agenda

  • If there's time, maybe walk through the nbconvert variants

    • @tonyfast response to @gabalafou: We are testing different representations of notebooks and cells with automated and manual testing. The notebook variants allow us to track the different versions of notebooks we've been testing for accessibility purposes. Some of the variants were specifically designed for user testing. Other experiments are designed to explore idealized representations of the notebooks and their annotation object model.
      • @gabalafou to @tonyfast: Thanks! What I was really trying to find out when I put this on the agenda was not so much a walk through to understand the architecture and how these variants are generated, but more specifically, because I don't have time to test every variation, which variants I should explore and test. Perhaps we can cover this in the next meeting.

    Each variant is defined by a Jupyter configuration file. The configuration files we used to generate the notebooks are listed at https://github.com/Iota-School/notebooks-for-all/blob/main/pyproject.toml#L132

    Some of the variants come from a parametric study of how cells could be configured as ordered lists, unordered lists, and definition lists; we represent them as tables and feeds, too. Through the parametric study we could explore the space of possible semantics.

Notes

  • Discussion of work related to scrolling and virtual windowing.
  • Analogy: notebook as feed.
    • late edit: ordered lists might have preferred semantics over a feed, but we can address this when we test with a screen reader.
  • There's a separate push to make JupyterLab (Notebook?) completely usable by keyboard only.
  • top level main > feed
  • we hope to modify the semantics of the jupyter notebook interface. there would be no visual changes. we will add roles and ARIA attributes to improve the primary navigation of the page with assistive technology.

Summary: we spent this session discussing what a quality annotation object model looks like.

we spent this session discussing what it would take to implement a more explicit accessibility object model for the new jupyter notebook. we reviewed the accessibility affordances of the notebooks for all project. our goal is to try to capture a similar annotation object model for the jupyter notebook release and live up to the accessible v7 promise. this effort would knock some items off the @manfromjupyter audit https://github.com/jupyter/notebook/issues/6800

in the near term, it would help to split up this issue like we did 9399.

cc: @steff456

May 23rd, 2023

Name Affiliation GitHub
tonyfast @tonyfast
Nick Bollweg GTech @bollwyvl
Afshin T. Darian QuantStack @afshin

Agenda

  • cells/outputs for curriculum
  • accessibility in curriculum design
  • jupyter REST API for execution

Notes

  • jupyter kernel protocol over REST API
    • for now, POST for code execute_request?
    • mapping protocol to application/x-www-form-urlencoded / multipart/form-data?
    • ??? for get_inspection
    • OpenAPI?
    • what is needed to be plausible
      • do a POST of execute_request
        • comes back with a 201 with a new URL
        • do a GET on the response
    • with the accessibility work
      • every cell is a form
      • each cell has a separate end point
    • the knowledge of context of compute is tacit, not shared
    • jupyterlab-blockly
      • have this cell
        • renders the cell
        • uses a kernel as a service
        • uses a cell
      • see also
        • jupyterlab-outsource
        • jupyterlite "just one single cell" app
  • capturing tacit knowledge as shared knowledge
    • e.g. outputs for sys.modules, user_ns consumed and created
  • reusing a single cell
    • once it becomes one cell, that unit can be moved around
  • we always wrap our REST APIs
    • websocket wrapper about as easy as it is
  • use case:
    • working on a notebook
    • how do i go find a thing that i did last week?
    • open a (throwaway) cell
      • do a search in python
      • talks to search API (some way)
      • see some things
        • mime bundles with the type application/jupyter-cell+json
          • drag cell (by reference+cache or copy/fork) into the current document
        • code blocks are implicitly interactive
          • promote them to a cell
  • if a computer only has bluetooth inputs
    • should it allow you to turn off bluetooth?
  • if you are running your server as a kernel
    • should you be able to turn off kernel endpoints?
  • this conversation hits on things happening at a bigger level
    • codifying as schema/mime
      • mime
      • kernel
        • provisioning
          • the jupyter machine
      • client
  • the work can't happen fast
    • it's boring
    • but not having to advocate
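The POST-then-GET flow noted above (POST an execute_request, receive a 201 with a new URL, GET the result) can be sketched with the Python standard library alone. This is a minimal sketch under loud assumptions: the `/execute` and `/results/<id>` paths, the JSON body shape, and the use of `eval` as a stand-in for a kernel are all hypothetical and are not part of any Jupyter API.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

RESULTS = {}  # result id -> stored reply; stands in for kernel state


class CellHandler(BaseHTTPRequestHandler):
    def log_message(self, *args):
        pass  # silence per-request logging for the demo

    def _reply(self, status, payload, location=None):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        if location:
            self.send_header("Location", location)
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Hypothetical endpoint: accepts {"code": "..."} as an execute request.
        if self.path != "/execute":
            return self._reply(404, {"error": "not found"})
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        # Stand-in for a kernel: eval() is unsafe outside a demo.
        value = repr(eval(request["code"]))
        result_id = str(len(RESULTS) + 1)
        RESULTS[result_id] = {"code": request["code"], "result": value}
        location = f"/results/{result_id}"
        # 201 Created, with the URL of the new result resource.
        self._reply(201, {"location": location}, location)

    def do_GET(self):
        prefix = "/results/"
        result_id = self.path[len(prefix):]
        if self.path.startswith(prefix) and result_id in RESULTS:
            return self._reply(200, RESULTS[result_id])
        self._reply(404, {"error": "not found"})


def run_demo(code):
    """Start a throwaway server, POST the code, then GET the created result."""
    server = HTTPServer(("127.0.0.1", 0), CellHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    # Bypass any proxy configured in the environment for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        post = urllib.request.Request(
            base + "/execute",
            data=json.dumps({"code": code}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with opener.open(post) as created:
            status = created.status
            location = created.headers["Location"]
        with opener.open(base + location) as fetched:
            reply = json.load(fetched)
        return status, location, reply
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo("1 + 2")` returns the 201 status, the result URL, and a reply whose `result` field is `"3"`. If every cell were a form with its own endpoint, as discussed above, each form's action could point at a resource like this one.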

May 16th, 2023

Name Affiliation GitHub
tonyfast @tonyfast
Nick Bollweg GTech @bollwyvl

Agenda

  • several outstanding JEPs need issue tags

May 2nd, 2023

Name Affiliation GitHub
tonyfast @tonyfast
Angus Hollands Princeton University @agoose77

Agenda

  • several outstanding JEPs need issue tags
  • how do we actually implement extensions?
  • intermediate products should have compromising representations along the compute according to POUR-CAF

April 25th, 2023

Name Affiliation GitHub
tonyfast @tonyfast
Nick Bollweg GTech @bollwyvl

Agenda

April 4th, 2023

Name Affiliation GitHub
tonyfast @tonyfast
Jeremy Ravenal naas @jravenel
Angus Hollands Princeton University @agoose77
Afshin T. Darian QuantStack @afshin

Agenda

  • schemas provide solutions for validation and ui
  • angus on extra schemas
    • platform to design a different notebook that will allow different input cells.
    • extension authors can encode some other validation logic
    • mainly useful for the front end.
  • what is the history of this meeting
    • spun out of a workshop that discussed modifying the notebook format
    • we talked about different ways to create new cell types: is there one code cell or many different cells?
    • there is no nice way for myst to store metadata, there is no way to enshrine metadata in the schema
  • naas - push notebooks to production seamlessly. package software, data, and chats.
  • sell things with demos
    • raw cell bolt-on
    • add a MIME / text entry widget to the cell view
      • specify the mimetype of the contents
    • add a custom renderer based upon mime type
      • HTML
      • SVG
      • Form generation (one way, code generation, hacky!)
    • Execute cell has two steps
      • Two steps, one "executes", another "renders"
    • ability to select output mimetype as well?
  • we talked a lot about cell types
    • use raw cells for another cell type
    • use the kernelspec in cells to identify cell actions
  • steps to getting JEP accepted
    • what implementation will need updating?
  • Split these conversations:
    • $schema - uncontentious
    • extraSchemas - motivates extension validation, needs discussion
    • @context, annotation, etc. additional discussions!
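The demo idea above (a raw cell that declares its MIME type, a custom renderer chosen by MIME type, and execution split into an "execute" step and a "render" step) might look roughly like the following. It is only a sketch: the `mimetype` key on the cell, the `RENDERERS` registry, and the `execute` and `render` functions are hypothetical names for illustration, not part of nbformat or JupyterLab.

```python
from html import escape

# Hypothetical registry mapping MIME types to renderer functions.
RENDERERS = {}


def renderer(mimetype):
    """Decorator that registers a renderer for one MIME type."""
    def register(func):
        RENDERERS[mimetype] = func
        return func
    return register


@renderer("text/html")
def render_html(data):
    return data  # already HTML


@renderer("image/svg+xml")
def render_svg(data):
    return f"<figure>{data}</figure>"


@renderer("text/plain")
def render_plain(data):
    return f"<pre>{escape(data)}</pre>"


def execute(cell):
    """Step one: turn the cell source into a MIME bundle.

    Stand-in for a kernel: the source is echoed under the MIME type the
    cell declares, plus a text/plain fallback.
    """
    return {cell["mimetype"]: cell["source"], "text/plain": cell["source"]}


def render(bundle, preferred=("text/html", "image/svg+xml", "text/plain")):
    """Step two: pick the first preferred MIME type that has a renderer."""
    for mimetype in preferred:
        if mimetype in bundle and mimetype in RENDERERS:
            return mimetype, RENDERERS[mimetype](bundle[mimetype])
    raise LookupError("no renderer for any type in the bundle")


cell = {"cell_type": "raw", "mimetype": "image/svg+xml", "source": "<svg/>"}
bundle = execute(cell)
print(render(bundle))                              # rendered as SVG
print(render(bundle, preferred=("text/plain",)))   # user selects plain text
```

Passing a different `preferred` tuple is one way to model the "ability to select output mimetype" question raised above.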

March 28th, 2023

Name Affiliation GitHub
tonyfast @tonyfast
Nick Bollweg GTech @bollwyvl
Steve Purves Curvenote @stevejpurves
Afshin T. Darian QuantStack @afshin

Agenda

  • markdown text format

  • problem with jeps: they aren't validated

    • how could we use notebooks to validate jeps? how can we validate schema?
  • Nick: need to be able to reference schema from schema

    • portable cells that can copy and paste across documents
    • treat all cells as the same
    • attachments are broken (not discoverable), UI is busted
    • slugifying headers isn't consistent across implementations
    • eventually register cells as a mimetype
  • extra schema use cases

    • jupyter.org could/should host ui
    • the format should avoid content validation unless explicitly in the purview of the schema
    • vendors could provide content validation
  • how to demo?

    • nbformat, jupyter server
    • traitlets to schema
  • rjsf jep?

  • formal schema specification
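To illustrate why inconsistent header slugs matter for portable cells, here is a small sketch with two slug rules. Both rules are hypothetical and written only for this example; they are not claimed to match any particular Jupyter or Markdown implementation.

```python
import re


def slug_keep_case(header):
    """Hypothetical rule A: replace spaces with hyphens, keep everything else."""
    return header.strip().replace(" ", "-")


def slug_lower_strip(header):
    """Hypothetical rule B: lowercase, drop punctuation, hyphenate whitespace."""
    text = re.sub(r"[^\w\s-]", "", header.strip().lower())
    return re.sub(r"\s+", "-", text)


header = "What's New in v7?"
print(slug_keep_case(header))    # What's-New-in-v7?
print(slug_lower_strip(header))  # whats-new-in-v7
```

A link written against one rule would not resolve in a tool that follows the other, which is the kind of breakage a copied cell could run into.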

March 14th, 2023

Name Affiliation GitHub
tonyfast @tonyfast
Steve Purves Curvenote @stevejpurves
Jason Grout Databricks @jasongrout
Angus Hollands Princeton University @agoose77
Nick Bollweg GTech @bollwyvl

Agenda

March 7th, 2023

Name Affiliation GitHub Favorite Schema Key
tonyfast @tonyfast properties
fcollonval QuantStack @fcollonval
Angus Hollands Princeton University @agoose77
Rowan Curvenote / ExecutableBooks @rowanc1 @context
Nick Bollweg Georgia Tech @bollwyvl

Agenda

first meeting of the notebook cells schema group outside of the nbformat workshop.

  • Meeting logistics

    • use hackmd for notes
    • use google meet for video because jovyan is crowded
      • this account is limited to an hour so we have a real hard stop.
    • the textual format team is working in other channels to submit their jeps.
  • Research

    • which schema draft are we using?
    • should only be adding cells and metadata
    • how is this file format going to be reused?
    • introduction of notebook mimetype. how do we carry around the mimebundle across documents and use that information.
    • how do we use attachments better? where do attachments belong?
      • could attachments just be a cell? hold the whole mimebundle
    • Distinguish between saving and reading - always uphold $schema, but not extraSchemas?
    • Should extraSchemas allow embedding schema?
    • Do we include @context?
      • Probably a separate JEP because the value proposition is a different learning curve.
  • Interests

    • Rowan - standardization of notebooks in scientific publishing. dealing with authorship, title, subtitles, scholarship.

to do

  • follow up JEP shepherd
  • post an issue to the team compass
  • add the event to the community calendar

$vocabulary

  • does this provide the convention (and therefore the tools) we need

https://gregsdennis.github.io/Manatee.Json/usage/schema/vocabs.html

"$vocabulary": {
    "https://json-schema.org/draft/2019-WIP/vocab/core": true,
    "https://json-schema.org/draft/2019-WIP/vocab/applicator": true,
    "https://json-schema.org/draft/2019-WIP/vocab/validation": true,
    "https://json-schema.org/draft/2019-WIP/vocab/meta-data": true,
    "https://json-schema.org/draft/2019-WIP/vocab/format": true,
    "https://json-schema.org/draft/2019-WIP/vocab/content": true,
    "https://myserver.net/my-vocab": true
},
  • Angus' understanding of vocabulary[1]:
    • Vocabularies allow meta schemas to define custom keywords, e.g. a units keyword that adds units to a number:
       {
           "type": "number",
           "units": "kg/s"
       }
      
    • One must create a new metaschema that defines these vocabularies, and copies the meta-schema that it "inherits" from (or use allOf?)
    • The $vocabulary section of a metaschema lists the vocabularies, each with a boolean flag saying whether the validator must fail if it does not recognize that vocabulary. The units keyword above does not affect validation, so it can safely be ignored if the validator cannot find the URI (it's metadata). Other keyword schemas might not be so permissive:
       {
           "type": "number",
           "isEven": true
       }

      A validator that ignored isEven would incorrectly accept documents with odd integers, but the essence is still upheld. A keyword that changed the "type" could not be ignored if the validator is to be useful at all.

      Modern JSON Schema introduces vocabularies, which allow you to define a group of keywords and identify them with a URI. Schema authors can then use that URI to tell implementations that they need to support the vocabulary in order to use the schema. If they can't, instead of failing validation, the implementation refuses to run the schema and indicates which vocabularies it doesn't understand.[2]

    • i.e. $vocabulary solves the problem of "is this failure an 'unrecoverable' error?".
    • We could use this to introduce a top-level extraSchemas field (?)
      • Crucially, it means that validators that don't understand what to do with extraSchemas don't try and validate the document.
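The behaviour described above, where a validator refuses to run when it does not understand a required vocabulary but may ignore an optional one, can be sketched as follows. This is an illustration under assumptions, not a real validator: `check_vocabularies`, `UnsupportedVocabulary`, and the `https://example.org/...` URIs (including one standing in for an extraSchemas vocabulary) are hypothetical, and the core URI is copied from the snippet quoted earlier in these notes.

```python
CORE = "https://json-schema.org/draft/2019-WIP/vocab/core"
UNITS = "https://example.org/vocab/units"                 # hypothetical, optional
EXTRA_SCHEMAS = "https://example.org/vocab/extra-schemas"  # hypothetical, required


class UnsupportedVocabulary(Exception):
    """Raised when a required vocabulary is not understood."""


def check_vocabularies(metaschema, supported):
    """Decide whether a validator may process schemas using this metaschema.

    Returns the optional vocabularies that will be ignored. Raises
    UnsupportedVocabulary if any vocabulary flagged true is unknown,
    so the validator refuses to run instead of reporting a wrong result.
    """
    declared = metaschema.get("$vocabulary", {})
    missing_required = sorted(
        uri for uri, required in declared.items()
        if required and uri not in supported
    )
    if missing_required:
        raise UnsupportedVocabulary(", ".join(missing_required))
    return sorted(
        uri for uri, required in declared.items()
        if not required and uri not in supported
    )


metaschema = {"$vocabulary": {CORE: True, UNITS: False, EXTRA_SCHEMAS: True}}

# A validator that knows the extra-schemas vocabulary can proceed and
# simply ignores the optional units vocabulary.
print(check_vocabularies(metaschema, {CORE, EXTRA_SCHEMAS}))

# A validator that does not know it refuses to run.
try:
    check_vocabularies(metaschema, {CORE})
except UnsupportedVocabulary as error:
    print("refusing to validate:", error)
```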

Challenges

mimetypes
IANA
multiple schema
validation
validation report
JEP
end this meeting
  • Extra schemas: Failure modes
    • How can our approaches fail?
      • two conflicting extra schemas
    • How can users save themselves if we break stuff? what happens when code/clients break?
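One narrow version of the "two conflicting extra schemas" failure mode can be detected mechanically: two schemas that constrain the same property to disjoint types. The sketch below covers only that case, looking at top-level `properties` and the `type` keyword. `allowed_types` and `conflicting_properties` are hypothetical helper names, and the two example schemas are invented for illustration.

```python
def allowed_types(subschema):
    """Return the set of JSON types a subschema allows, or None if unconstrained."""
    declared = subschema.get("type")
    if declared is None:
        return None
    types = {declared} if isinstance(declared, str) else set(declared)
    if "number" in types:
        types.add("integer")  # every integer is also a number
    return types


def conflicting_properties(schema_a, schema_b):
    """Find shared properties for which no value can satisfy both schemas' types."""
    props_a = schema_a.get("properties", {})
    props_b = schema_b.get("properties", {})
    conflicts = {}
    for name in sorted(props_a.keys() & props_b.keys()):
        types_a = allowed_types(props_a[name])
        types_b = allowed_types(props_b[name])
        if types_a is not None and types_b is not None and not (types_a & types_b):
            conflicts[name] = (sorted(types_a), sorted(types_b))
    return conflicts


# Two invented extra schemas that disagree about what "tags" should be.
extension_one = {"properties": {"tags": {"type": "array"}, "weight": {"type": "number"}}}
extension_two = {"properties": {"tags": {"type": "string"}, "weight": {"type": "integer"}}}

print(conflicting_properties(extension_one, extension_two))
# {'tags': (['array'], ['string'])}
```

A document can still satisfy both schemas by omitting the property, unless one of them marks it as required, so a report like this is a warning to schema authors rather than a validation result.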

Reference

JEP Drafts

References


  1. https://json-schema.org/learn/glossary.html

  2. https://modern-json-schema.com/what-is-modern-json-schema