# example queries

> find me a health care facility where i can take this child, in enough time to save their life, and with attention to cost.

> 100 MB and 6 million different prices, because that's how many prices a hospital needs to maintain

## models to test

directed acyclic graph transformers (https://arxiv.org/abs/2210.13148)

# benchmark

- contain cost
  - egress fees
  - limit on cost to incur per use (query side)
  - electricity
- latency
  - SLA on how fast the query needs to come back
  - memory requirement (may take longer)
  - inventory of queries to benchmark, with an SLA for each (or a blanket SLA)
  - right now it's 30 ms on ClickHouse
- accuracy: precision/recall
  - eyeball results, see if each is right or a complete hallucination
  - do we need to use SGLang (https://github.com/sgl-project/sglang), aici (https://github.com/microsoft/aici), or pseudo-semantic losses for constrained generation?
- https://onefact.github.io/healthcare-data/
  - UML is a helpful picture; GraphQL is not visual (JSON, https://github.com/rails/jbuilder)
- jaxtyping + beartype: runtime checking of static type annotations (sketch at the end of these notes)
- ETL tool to crawl data (Glue): CSV, Parquet -> the crawler can work through the data, find all the attributes, and express them as a set of tables in a catalog (sketch at the end of these notes)
  - data lake idea: not certain what it looks like; put the data in there, discover its schemas

# reconvene in august

![image](https://hackmd.io/_uploads/Sk7x-WlcC.png)

# examples

## motherduck

> MotherDuck does have the ability to do query planning between local and remote storage.
> Today, we use rules within the query planning phase (predominantly around where the data is currently located).
> We have plans to add more context so the query optimizer can make more optimal decisions (CPU utilization, available RAM, network bandwidth, etc.).
> Do you have an example workflow that you are looking to enable?
> For example, are you looking to store most of your data in the cloud and then cache some locally when a user interacts with your platform?
> We have the capabilities for you to do that today with some explicit caching commands (creating a local temp table, for example).
> There isn't an automatic spill step from local memory to cloud memory at the moment.
> In the browser, today we are fully in-memory, but we have plans to use persistent local browser storage in the near future.
> Non-browser clients like Python/etc. can use larger-than-memory databases.
> Great to hear that you are able to filter down your queries!
> We have not enabled the VSS extension in MotherDuck yet; it is on the near-term roadmap.
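A minimal sketch of the explicit-caching pattern described above (a local temp table materialized from cloud data), using the DuckDB Python client. The database name `hospital_prices`, the `prices` table, its columns, and the filters are placeholders, and a `MOTHERDUCK_TOKEN` environment variable is assumed to be set.

```python
import duckdb

# "md:" connects to MotherDuck (token read from MOTHERDUCK_TOKEN);
# "hospital_prices" is a placeholder database name.
con = duckdb.connect("md:hospital_prices")

# Explicit caching: materialize a filtered slice of the cloud-resident table
# as a local temp table, so repeated queries hit local storage, not the cloud.
con.execute("""
    CREATE TEMP TABLE local_prices AS
    SELECT hospital_id, code, description, payer, price
    FROM prices            -- cloud table; name and columns are placeholders
    WHERE state = 'NY'     -- filter down before pulling data locally
""")

# Subsequent queries run against the local copy.
rows = con.execute("""
    SELECT hospital_id, MIN(price) AS lowest_price
    FROM local_prices
    WHERE code = '470'     -- placeholder billing code
    GROUP BY hospital_id
    ORDER BY lowest_price
    LIMIT 10
""").fetchall()
print(rows)
```

There is no automatic spill between local and cloud storage, so the temp table has to be created (and refreshed) explicitly, and it disappears when the connection closes.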
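## jaxtyping + beartype (sketch)

For the runtime type-checking bullet in the benchmark section: a minimal sketch of enforcing array shape and dtype annotations at call time with jaxtyping and beartype. The cosine-similarity function, the embedding dimension, and the arrays are illustrative, not part of any existing pipeline.

```python
import numpy as np
from beartype import beartype
from jaxtyping import Float, jaxtyped


# Shape and dtype annotations are checked at call time by beartype;
# dimension names ("dim", "n") must be consistent across the signature.
@jaxtyped(typechecker=beartype)
def cosine_scores(
    query: Float[np.ndarray, "dim"],
    price_embeddings: Float[np.ndarray, "n dim"],
) -> Float[np.ndarray, "n"]:
    """Cosine similarity of one query vector against each price-row embedding."""
    sims = price_embeddings @ query
    sims /= np.linalg.norm(price_embeddings, axis=1) * np.linalg.norm(query) + 1e-9
    return sims


# Illustrative data: 6,000 price-row embeddings of dimension 384.
scores = cosine_scores(np.random.rand(384), np.random.rand(6000, 384))
print(scores.shape)  # (6000,)

# Passing a mis-shaped query, e.g. np.random.rand(10, 384), raises a
# type-check error at the call site instead of failing deep inside the math.
```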
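## glue crawler (sketch)

For the ETL/Glue bullet in the benchmark section: a minimal sketch of pointing a Glue crawler at CSV/Parquet files so their schemas are discovered and registered as catalog tables, using boto3. The bucket path, IAM role ARN, region, and names are all placeholders.

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")  # placeholder region

# Create a crawler pointed at CSV/Parquet objects under an S3 prefix.
# It infers the schema of each file set and registers it as a table in the
# named catalog database. Role ARN, bucket, and names are placeholders.
glue.create_crawler(
    Name="hospital-prices-crawler",
    Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    DatabaseName="hospital_prices_catalog",
    Targets={"S3Targets": [{"Path": "s3://example-bucket/hospital-prices/"}]},
)

# Run it; discovered tables land in the Glue Data Catalog, which is where the
# "data lake" idea above (put the data in, discover its schemas) can start.
glue.start_crawler(Name="hospital-prices-crawler")
```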