###### tags: `talk`
# Matplotlib SciPy 2020 talk
Aka "Update on redesigning Matplotlib's data model"
Hannah Aizenman and Thomas A Caswell
## Short Summary
We will present the motivation, challenges, and progress of redesigning the underlying data model for Matplotlib.
Visualization is fundamentally about mapping your data to a visual representation. As data structures and visualizations have become more complex, it has become clear that Matplotlib needs a coherent and consolidated data model to access the underlying data, independent of how it is encoded in memory.
This data model will be the basis of the future development of Matplotlib.
## The Abstract
Currently `Matplotlib` requires users to provide one or more array-like objects that are subsequently stored internally. Because `Matplotlib` expects inputs (e.g. `x` and `y`) as individual arguments, a user whose data is already in a higher-level structure (e.g. a DataFrame) must pull the data apart so that `Matplotlib` can re-assemble it internally. When we render the figure we access the stored arrays, so if the user updates those arrays, the figure will reflect the change the next time it is drawn. However, each plotting function / `Artist` is independently responsible for ingesting and storing its data, which leads to code duplication and to inconsistencies in the API for updating the data.
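As a small illustration of the situation described above, the sketch below plots the same columns as a line and as a scatter plot and then updates both. The `table` dict is only a stand-in for a higher-level structure such as a DataFrame, and the data values are made up for the example. `Line2D.set_data` and `PathCollection.set_offsets` are existing Matplotlib methods.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for structured data (for example the columns of a DataFrame).
table = {"x": np.arange(5.0), "y": np.arange(5.0) ** 2}

fig, ax = plt.subplots()

# The structure has to be pulled apart into individual arrays.
(line,) = ax.plot(table["x"], table["y"])
points = ax.scatter(table["x"], table["y"])

# Updating the same data requires a different call for each Artist.
new_y = table["y"] + 1
line.set_data(table["x"], new_y)  # Line2D: separate x and y
points.set_offsets(np.column_stack([table["x"], new_y]))  # PathCollection: (N, 2) array
```

The two update calls take the same information in different shapes, which is the kind of per-`Artist` inconsistency a shared data model is meant to remove.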
`Matplotlib` needs a consistent way for `Artist`s to access the underlying data that they are displaying.
We will report on our efforts to understand the mathematical foundations of these data objects and to design an API such that each `Artist` will have a single `DataAccess` object responsible for providing the data on demand. This will allow:
- native consumption of structured data,
- smart re-sampling of plotted data based on view limits,
- seamless updating of the underlying data, either interactively or via streams, and re-computation of derived results,
- and use of alternative data sources such as database queries or analytic functions.
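One way to picture the idea of an `Artist` that asks a single object for its data on demand is the pure-Python sketch below. It is not the API proposed in the talk: the `DataAccess` protocol, its `query` method, and the `TableData`, `FunctionData`, and `SketchArtist` classes are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol, Sequence, Tuple


class DataAccess(Protocol):
    """Hypothetical interface: return named columns for an x-range."""

    def query(self, xmin: float, xmax: float) -> Dict[str, List[float]]: ...


@dataclass
class TableData:
    """Hypothetical source backed by structured data (column name -> values)."""

    table: Dict[str, Sequence[float]]
    x: str
    y: str

    def query(self, xmin: float, xmax: float) -> Dict[str, List[float]]:
        # Read the table at query time, so later edits to it are picked up.
        rows = [
            (a, b)
            for a, b in zip(self.table[self.x], self.table[self.y])
            if xmin <= a <= xmax
        ]
        return {"x": [a for a, _ in rows], "y": [b for _, b in rows]}


@dataclass
class FunctionData:
    """Hypothetical source backed by an analytic function."""

    func: Callable[[float], float]
    samples: int = 5

    def query(self, xmin: float, xmax: float) -> Dict[str, List[float]]:
        # Re-sample the function across whatever range is currently visible.
        step = (xmax - xmin) / (self.samples - 1)
        xs = [xmin + i * step for i in range(self.samples)]
        return {"x": xs, "y": [self.func(v) for v in xs]}


class SketchArtist:
    """Hypothetical Artist: stores no arrays, only a reference to its data."""

    def __init__(self, data: DataAccess) -> None:
        self.data = data

    def draw(self, view_limits: Tuple[float, float]) -> List[Tuple[float, float]]:
        columns = self.data.query(*view_limits)
        return list(zip(columns["x"], columns["y"]))


table = {"time": [0.0, 1.0, 2.0, 3.0], "value": [10.0, 11.0, 12.0, 13.0]}
series = SketchArtist(TableData(table, x="time", y="value"))
before = series.draw((1.0, 2.0))
table["value"][1] = 99.0  # update the underlying data in place
after = series.draw((1.0, 2.0))

curve = SketchArtist(FunctionData(lambda v: v * v, samples=3))
wide = curve.draw((0.0, 2.0))
zoomed = curve.draw((0.0, 1.0))  # same sample count, narrower view
```

Because the sketch artist only ever calls `query`, the same drawing code works for a table and for a function, and each draw sees the current data and the current view limits.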
This data model will form the basis for the next generation of Artists, both in core Matplotlib and in domain-specific libraries. Decoupling the `Artists` from the details of data storage and