Dynaconf.md
- Your host as usual is Tobias Macey and today I'm interviewing Bruno Rocha about Dynaconf, a powerful and flexible framework for managing your application's configuration settings
## Interview
- Introductions
Hi, my name is Bruno Rocha and I live in Portugal. A few things about me: I am a member of the Python Software Foundation, I am a Senior Engineer on Red Hat Ansible, I run YouTube and Twitch channels targeting a Portuguese-speaking audience, and besides that I like bicycles.
- How did you get introduced to Python?
It was in 2003. I worked in a factory and my job was the maintenance of hundreds of Linux desktops. We used a Linux distribution called Kurumin Linux, and one of its features was called Magic Icons, a graphical interface for running scripts, mostly written in Bash or Perl. I created many of those for day-to-day office task automation like printing, merging PDFs, converting files, etc. Then I started my first open source contributions, and some of those scripts were written in Python, so I had to start learning it and migrating Perl scripts to Python.
- Can you describe what Dynaconf is and the story behind it?
Well, Dynaconf is a library; I like to define it as a settings client. Settings are usually defined in multiple locations: static files like .py, .toml, .yaml, .json and other common formats, and they might also be overridable by environment variables or even external services like secret vaults and in-memory databases. Dynaconf is the client that reads settings from all those places and puts them together so your application reads them from a single object.
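As a rough illustration of that idea, here is a toy sketch using only the standard library (this is not Dynaconf's actual implementation): environment variables are layered on top of file-based defaults, and the application reads everything from one object.

```python
import os
from collections import ChainMap

# Toy sketch of the "settings client" idea: later layers override earlier ones.
# Dynaconf's real loaders handle many formats and external services; here we
# just layer prefixed environment variables on top of static defaults.

file_defaults = {"DB_HOST": "localhost", "DEBUG": "false"}  # e.g. from a settings file

def env_layer(prefix="MYAPP_"):
    # Collect MYAPP_* environment variables, stripping the prefix.
    return {k[len(prefix):]: v for k, v in os.environ.items() if k.startswith(prefix)}

os.environ["MYAPP_DB_HOST"] = "db.prod.internal"  # simulate an exported variable
settings = ChainMap(env_layer(), file_defaults)   # env wins over file defaults

print(settings["DB_HOST"])  # env var overrides the file default
print(settings["DEBUG"])    # falls back to the file default
```

The `MYAPP_` prefix and key names are invented for the example; the point is only the single merged view over multiple sources.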
The story behind it is that I worked at a company running Django and Flask applications on AWS. The main website and APIs used load balancing with auto scaling, and eventually there were hundreds of EC2 instances running. We wanted to share the same settings across all the applications: not only the fixed settings, like having a single settings file in the AWS environment, but also the ability to read from a central service. At that point we used a Redis server to store a subset of key:value pairs that could change without the need to restart the applications and instances.
So I created a Python module called `dynamic_config` inside the main project and started building a `Settings` class using dynamic and functional programming strategies. It was 2014, Heroku had published a guide called The Twelve-Factor App, and I based the first implementation on factor 3 of that guide, which says that config belongs in the environment.
At some point we decided to promote that internal module into a standalone project and renamed it from dynamic_config to dynaconf.
- What are your main goals for Dynaconf?
The project is evolving organically, based on the needs and feedback of users and contributors. I can say that the goal is to be reliable and simple to use: a go-to choice when you think about how to manage settings in a Python application.
- What kinds of projects (e.g. web, devops, ML, etc.) are you focused on supporting with Dynaconf?
Mainly web applications running in the cloud, whether on VMs, containers, or Kubernetes; twelve-factor SaaS apps are the main target.
We also have users running it in different kinds of projects, like machine learning pipelines, CLI applications, desktop applications, and testing frameworks.
- Settings management is a deceptively complex and detailed aspect of software engineering, with a lot of conflicting opinions about the "right way". What are the design philosophies that you lean on for Dynaconf?
Environment first is the most important thing, so we keep following the twelve-factor app guide.
I also try to keep Dynaconf decoupled from and transparent to the projects using it: its settings API behaves like a normal Python module or a dict-like object.
The idea is that you can take an existing project and add or remove Dynaconf without much work editing multiple parts of your code.
To achieve that we use dynamic Python features like lazy evaluation and overloading of the attribute access protocols, so it doesn't get in your way: you add it and your project keeps running.
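That dual module-like and dict-like access can be sketched by overloading the attribute and item protocols. This is a simplified stand-in for the concept, not Dynaconf's real code; the class name and keys are invented for the example.

```python
class TransparentSettings:
    """Toy settings object: attribute access and dict access hit the same data."""

    def __init__(self, data):
        # Normalize keys so lookups are case-insensitive.
        self._data = {k.lower(): v for k, v in data.items()}

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self._data[name.lower()]
        except KeyError:
            raise AttributeError(name)

    def __getitem__(self, key):
        return self._data[key.lower()]

    def get(self, key, default=None):
        return self._data.get(key.lower(), default)

settings = TransparentSettings({"DEBUG": True, "PORT": 8080})
print(settings.DEBUG)              # module-style attribute access
print(settings["port"])            # dict-style access, case-insensitive
print(settings.get("missing", 0))  # dict-style .get() with a default
```

Because the object quacks like both a module and a dict, existing code that did `settings.SOME_KEY` or `settings["some_key"]` keeps working unchanged.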
Another important philosophy is maintaining reasonable backwards compatibility, so even when we change the library we keep compatibility layers to avoid breaking existing user projects on upgrade.
- Many engineers end up building their own frameworks for managing settings as their use cases and environments get increasingly complicated. What are some of the ways that those efforts can go wrong or become unmaintainable?
Dealing with multiple sources of data, layered environments (for example: testing, production, development), and merging data from all those sources is really hard.
- Can you describe how Dynaconf is implemented?
Dynaconf is a class that implements lazy evaluation. The user declares configurations, paths, and validators when creating a `settings` instance, and at that point nothing is done until the first variable is accessed or validation is explicitly started.
When it starts building the settings it runs a pool of configured loaders. The loaders are written using the Strategy pattern: there is a base loader that can be initialized with custom reader, writer, and loader methods, so it reads data from all the files and external data sources and puts the data in a ChainMap. (A ChainMap is a stack of dictionaries; a key lookup falls through the dicts in order until the key is found.)
The process of loading each key from the sources involves some parsing, because you can define type annotations for exported environment variables. It involves type inference using TOML when the type is not defined, it evaluates lazy formatting because some values may contain Jinja templates or math expressions, and the last step is merging each dataset with the previously loaded data.
- How have the design and goals of the project evolved since you first started it?
Sure, it was very simple when I started, and then it grew into a complete settings framework; now it has its own settings mini-language and strategies for annotating and merging data.
Recently we added a dynaconf `CLI` to allow users to inspect the state of the settings.
- What is the workflow for getting started with Dynaconf on a new project?
If you are just starting a new project, the best way to get the boilerplate is to run `dynaconf init`. It creates three files: a config.py where the Dynaconf object is instantiated, a settings.toml file for default settings, and a .secrets.toml file for development secrets.
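The generated config.py looks roughly like this (a sketch; the exact contents, such as the env var prefix, may differ between Dynaconf versions, so check the file the command actually writes):

```python
from dynaconf import Dynaconf

settings = Dynaconf(
    envvar_prefix="DYNACONF",
    settings_files=["settings.toml", ".secrets.toml"],
)
```

From there the rest of the application just imports `settings` from config.py.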
If you have an existing project you can manually instantiate the Dynaconf class and pass all the parameters you need to customize it. This gives you more flexibility; you can, for example, wrap Dynaconf in your existing settings module.
If you are using a web framework like Django, Flask, or FastAPI, then you can use the specific extensions available for those frameworks.
- How does the usage scale with the complexity of the host project?
The goal is for the settings object created by Dynaconf to be really straightforward, behaving just like a normal Python module or a dictionary, so there is no huge increase in complexity.
In bigger projects users tend to add more keys across multiple layered environments, and that requires adding validators to ensure the reliability of the settings state.
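A validation pass over a settings mapping can be as small as the sketch below. This is a stdlib toy for illustration; Dynaconf ships its own `Validator` class with a much richer API, and the keys here are invented.

```python
class Validator:
    """Toy validator: checks one key for existence and type."""

    def __init__(self, key, must_exist=False, is_type_of=None):
        self.key, self.must_exist, self.is_type_of = key, must_exist, is_type_of

    def validate(self, settings):
        if self.key not in settings:
            if self.must_exist:
                raise ValueError(f"{self.key} is required but missing")
            return
        value = settings[self.key]
        if self.is_type_of and not isinstance(value, self.is_type_of):
            raise TypeError(f"{self.key} must be {self.is_type_of.__name__}")

settings = {"PORT": 8080, "DEBUG": True}
for v in [Validator("PORT", must_exist=True, is_type_of=int),
          Validator("DEBUG", is_type_of=bool)]:
    v.validate(settings)
print("settings are valid")
```

Running all validators eagerly at startup turns a misconfigured environment into an immediate, descriptive failure instead of a runtime surprise.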
- What are some strategies that you recommend for integrating Dynaconf into an existing project that already has complex requirements for settings across multiple environments?
Try to add Dynaconf to the project in a frictionless way: go to your existing settings module and wrap Dynaconf in it. Python offers multiple dynamic lookup utilities, and in some cases you can even use module impersonation. (Some examples are in the advanced section of the docs.)
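Module impersonation can be sketched with a module-level `__getattr__` (PEP 562). This is an illustrative toy, not Dynaconf's integration code: the backing store here is a plain dict, where in practice it could be a Dynaconf object, and the module is built dynamically only so the example is self-contained.

```python
# Sketch: legacy code does `import settings; settings.DB_HOST`. A module-level
# __getattr__ (PEP 562) forwards unknown attributes to a backing store, so old
# imports keep working while lookups are delegated to the new settings source.
import sys
import types

_backing_store = {"DB_HOST": "localhost", "PORT": 8080}  # stand-in for Dynaconf

def _module_getattr(name):
    try:
        return _backing_store[name]
    except KeyError:
        raise AttributeError(name)

# Build the module in-memory for demonstration; in a real settings.py you
# would simply define __getattr__ at the top level of the file.
mod = types.ModuleType("settings")
mod.__getattr__ = _module_getattr
sys.modules["settings"] = mod

import settings
print(settings.DB_HOST, settings.PORT)
```

Existing call sites never learn that the module's attributes are now computed, which is what makes the migration frictionless.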
My other recommendation is to add a functional test to the Dynaconf repository. There is a folder called `functional_tests` where users put small applications that reproduce their use cases; those apps are executed and validated as part of Dynaconf's CI, so we ensure your use case doesn't break when we change the library.
- Secrets management is one of the most frequently under- or over-engineered aspects of application configuration. What are some of the ways that you have worked to strike a balance of making the "right way" easy?
The best way to manage secrets is to use encrypted vault services. That will depend on your environment, but examples are AWS Secrets Manager, HashiCorp Vault, or encrypted environment variables. Dynaconf supports loading data from HashiCorp Vault out of the box.
To make life easier in the development environment, Dynaconf supports a .secrets.toml file that is meant to be local only, never pushed to the repository, and contains the development secrets (the keys in that file can be encrypted via the Dynaconf CLI using a private key).
So developers don't need to run a vault service, and in production those secrets can still come from a real secure environment.
Another thing Dynaconf does is identify variables as secrets, so they are not printed out in logs, stdout, or CLI output.
- What are some of the more advanced or under-utilized capabilities of Dynaconf?
Besides the basics, Dynaconf can do a few more things:
- With env vars it lets you override keys in nested data structures. Say your settings contain a dictionary with three levels of keys; you can separate your env vars with `__` for each nesting level you want to reach.
- Dynaconf supports lazy values, so you can evaluate expressions (for example, to get the number of CPU cores or to compose with other env vars).
- Besides the custom loaders, Dynaconf has a hook system; you can define hooks in your application to do data loading or transformation.
- The most under-utilized and most important feature is validators. Dynaconf is schema-less by default, and it is easy to define validators for known keys or even define a schema to load only known keys from the sources.
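The double-underscore rule from the first bullet can be sketched like this (a toy parser, not Dynaconf's implementation; the `DYNACONF_` prefix matches the library's documented default, but the keys are invented):

```python
def apply_nested_override(settings, env_key, value, prefix="DYNACONF_"):
    # Toy version of the __ rule: DYNACONF_A__B__C=value walks
    # settings["a"]["b"] and sets key "c", creating dicts along the way.
    path = env_key[len(prefix):].lower().split("__")
    node = settings
    for part in path[:-1]:
        node = node.setdefault(part, {})
    node[path[-1]] = value
    return settings

settings = {"database": {"options": {"timeout": 10, "retries": 3}}}
apply_nested_override(settings, "DYNACONF_DATABASE__OPTIONS__TIMEOUT", 30)
print(settings["database"]["options"])  # timeout overridden, retries kept
```

Note that only the targeted leaf changes; sibling keys at every level survive, which is the merging behavior you want from a nested override.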
- What are the most interesting, innovative, or unexpected ways that you have seen Dynaconf used?
I was hired to do a consultancy at a company that collects data using a mobile data collector. Each of their collecting workers generates a .csv file to be uploaded at the end of the day; the upload form is a single field that accepts multiple files, and they wanted to gather more information to process on their Python backend, such as the serial number of the device, the send time, the name of the user, etc.
The mobile application they use is legacy and proprietary, and they can't really ask for changes in the client software. So what they did was add an info.toml file to each device and ask the operator to select that file along with the .csv.
On the backend they added the .toml to a Dynaconf folder; the file, with the username as its section, added variables to the settings, and then they executed the Airflow pipelines to process it.
It worked really well: the Dynaconf settings file replaced an HTML form.
However... users discovered that they could override more variables by editing that little .toml file, and things started getting insecure.
So I worked with them for two hours and we defined a custom file format and a custom loader that allowed only a subset of settings to be loaded.
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Dynaconf?
Interesting: docs. Sometimes it is bad to have too much documentation; users are no longer interested in API docs, they want tutorial docs, mostly in a book or lecture style.
Unexpected: Django's settings are really difficult to customize. While in Flask you can easily subclass and override the Config class, in Django the only way to customize the settings interface is by patching.
Challenging: merging data from multiple sources is a big challenge. It is the source of our most difficult bugs, and the largest discussions are around decisions about data merging.
- When is Dynaconf the wrong choice?
Dynaconf is a good fit for any project that needs to decouple settings from code, from a simple project with a single .toml or .yaml settings file to a more complex structure using external loaders and hooks.
However, if the project is a library or a plugin to be embedded in a framework, then I don't think Dynaconf is a good choice. Actually, I don't think libraries or plugins should define settings at all; they need to rely on settings provided by the consumer application. I have seen people trying to use Dynaconf for libs and they get frustrated.
- What do you have planned for the future of Dynaconf?
First, I would like to thank the contributors and co-maintainers of the project.
We recently defined the roadmap for version 4.0 and there are three big changes coming.
* We are implementing a new way to define and validate settings, based on Pydantic and type annotations. The current way of using Dynaconf will keep working; we are just adding another option for when you want your settings to be based on a schema.
* We are splitting the codebase, removing some core features and turning them into plugins. The dynaconf package will keep being the all-in-one bundle, but when you need a smaller dependency, for example just to load vault secrets, you will install dynaconf-core + dynaconf-vault + any other plugins.
* The data merging algorithm is where most of the computing is performed right now. Dynaconf has already been implemented in Rust (https://crates.io/crates/hydroconf), and we have plans to rewrite the data merging in Rust because it is much faster.
## Keep In Touch
## Picks
- Tobias
- [
- Bruno
- I recently watched the series Severance and it is really great; the first two episodes are slow, but after those it gets really interesting.
- Sharon Kovacs (music)
- Learn Rust; the future of Python is full of Rust.
## Links
- [
The intro and outro music is from Requiem for a Fish [The Freak Fandango Orchestra](http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / [CC BY-SA](http://creativecommons.org/licenses/by-sa/3.0/)