how to make changes to code
===========================

Sounds simple. At some point: isn't.

Code gets harder to change as more people and more other code rely on it. Some kinds of usage, and some ways of organizing and managing code, can make the change taxes get nastier faster... or, stay more manageable. It's important to understand these and organize code in ways that avoid tax multiplications.

Some patterns of action can avoid some change taxes in some situations. It's important to populate your mental toolbelt with these, and it's useful to be able to *name them* so you can quickly tell collaborators what your angle of approach is (and why).

---

The document below has four major sections:

- [Taxable Resources](#Taxable-Resources) -- Identifying them, understanding what they're worth, and what kind of effects we see from taxes beginning to affect them.
- [Tax Sources](#Tax-Sources) -- Identifying things that produce tax effects, and understanding what kind of burdens they place and how those can slow you down.
- [Patterns of Action](#Patterns-of-Action) -- Things you can do to ship changes, and when approaches tend to work (and what drawbacks they might have). A playbook, in other words.
- [Tax Optimization](#Tax-Optimization) -- Longer-term actions to take when Tax Sources can't be reduced directly but the impacts still need to be minimized.

They're all related -- some patterns of action are more useful for avoiding some tax sources than others, and some patterns of action may actually _produce_ some tax sources (especially when used repeatedly/excessively over time), etc. Don't feel like you need to read this in order; jumping around is great.

---

This document may be a work-in-progress. You will find todos below. Some sections may need renovations. Take it for what it's worth, and enjoy whatever parts you find useful. :)

---

---

---

Taxable Resources
-----------------

Before we get into [Tax Sources](#Tax-Sources), [ways to do work](#Patterns-of-Action), or [Tax Optimization options](#Tax-Optimization), first let's talk about _what can be taxed_. What are the resources we're worried about conserving?

A lot of discussions about "making changes to code" tend to focus on development velocity, but that is _not_ the only resource to be conserved.

Examples of taxable subjects (and what happens to them as they get taxed) include:

### Development Velocity

Development velocity refers to how quickly a system can be developed. Higher development velocity is good; increasing development velocity taxes means increasing the difficulty of the sheer creation of changes to code. Ideally, development velocity *always* remains high.

Development velocity is especially important in nascent projects, but remains important at all phases of a project's lifecycle -- later in a project's life, low development velocity means new features take longer to implement, and even in a relatively finished project, a low development velocity means an inability to respond quickly to bug reports.

### Design Inertia

Design inertia refers to the difficulty of imagining other ways the system could be. Higher design inertia means greater difficulty in reconsidering any of the key design choices.

Design inertia is a little different from development velocity. High design inertia means reduced probability of finding "the right solution" to some problem space.
This will manifest as either user-facing inconsistencies, or sometimes even as outright product failure if the design inertia takes root in a project before "[PMF](https://en.wikipedia.org/wiki/Product/market_fit)" has been found.

The design inertia in any system *will* increase over time. This is not necessarily a bad thing. However, it's something that's important to be conscious of -- if design inertia is rising *too* quickly, then it can be problematic.

High design inertia -- especially the early-onset problematic variety of design inertia -- can have negative feedback onto development velocity and onto system comprehensibility, which makes design inertia something to be very careful of.

### System Comprehensibility

Comprehensible systems are important. This is such a basic claim it's almost tautological, but still, it needs to be said!

The more that a system shifts comprehension burdens to users, the more it will drive user adoption rates down; over time, inconsistencies and incomprehensible behaviors may even drive existing users of the system away.

Sometimes systems contain legitimate, irreducible complexity, which comes from the domain they work in. Sometimes systems contain accidental complexity. Accidental complexity can come from development missteps that go uncorrected; design inertia that leads things down a poorly fitting path in which the system's design doesn't match its domain well; etc. The goal with system comprehensibility should be to have as little accidental complexity as possible.

When we talk about system comprehensibility, we're mostly concerned with the user's perspective, but low system comprehensibility can also have negative feedback onto development velocity.

### Design Churn

Design churn refers to how often a system changes in ways that users experience and need to invest energy or attention to adapt to. A low rate of design churn is generally preferable.

Design churn isn't really a taxable -- it's not a resource we start out with and try to conserve. However, it's interesting to identify it as a resource, because it's something we can choose to spend, and yet must spend sparingly.

Design churn is something that occasionally has to be spent in order to reduce design inertia, or to combat system incomprehensibility, or sometimes even as a direct response to defect rate. Spending design churn too frequently can result in driving users away in similar ways to system incomprehensibility, which means design churn has to be metered carefully.

### Defect Rate

Defect rate is another way of saying "how many bugs are there? how often are the bugs encountered?", and, possibly, "how bad are the outcomes?".

This isn't really a taxable, like the other headings in this section -- it's not really a resource that we start out with and try to conserve. We won't actually spend a lot of time talking about defect rate specifically in this document, but it may be worth keeping it in mind because it's a negative outcome that can come from poor choices about tax management (and probably the most obviously visible outcome, at that).

---

---

---

Tax Sources
-----------

"Tax Sources" means things that can be a source of burdens that slow down work -- but I also call them "tax" rather than "cost" because that's kinda how most of what we'll discuss here behaves: these things start affecting you only after you've got some product that's being used (some "income"), and most of them scale with the number of users, too.
### CLIs

CLIs -- Command Line Interfaces -- are a form of interface for software to interact with humans.

CLIs are a form of tax for software development, because once they have users, there begin to be costs associated with surprising those users by making changes.

_Any_ kind of user interaction surface area, once used, creates a source of inertia, and becomes something you'll have to consider during any future prospective changes.

### UIs

UIs and GUIs -- Graphical User Interfaces -- are a form of interface for software to interact with humans.

UIs are a form of tax for software development in exactly the same way as CLIs. Once a UI has users, there begin to be costs associated with surprising those users by making changes. _Any_ kind of user interaction surface area, once used, creates a source of inertia, and becomes something you'll have to consider during any future prospective changes.

UIs can be somewhat milder than other tax sources, because it's possible to invest more development effort and produce UI flows that introduce users to changes gradually, or simply notify users of the change. This ability to communicate to a human means UIs can experience slightly less change resistance compared to systems which have more mechanical (read: more fragile) interactions.

### Serial APIs

Serial APIs -- application/program interfaces which involve some sort of message-passing (rather than direct function calls and shared memory) -- are a category of ways that programs communicate with other programs.

Serial APIs are a form of tax for software development, because once they have users, there begin to be costs associated with surprising those users by making changes.

#### Comparative tax rate for Serial APIs

Serial APIs generally have a higher tax rate than CLIs or UIs, because they're usually consumed by other programs. Having programs rather than humans as consumers means there are generally fewer avenues for *hinting* about upcoming changes, and thereby helping downstream consumers do upgrades smoothly over time. Similarly, when something changes, whereas a downstream consumer that's _human_ may experience surprise and mild irritation, a downstream consumer that's a _program_ will generally simply _break_, which can often be a bigger and sharper-edged problem.

Serial APIs still have a lower tax rate than Programming APIs (which we'll talk about more next), however. Since serial APIs are used at boundaries between processes, or across networks, a decent amount of error-handling code is generally already present in the area. Some of this is because it's _required_ for dealing with the potential of network communication failures. Some of it is also because programmers are relatively trained to expect a serial API to change over time. The upside of this is: when a serial API does change, its consumers are already somewhat braced for it, and have natural places in their code to detect and handle the difference.

#### Examples of serial APIs

There are many kinds of Serial APIs. For the purpose of this document, we'll mostly consider them in aggregate. But, to have some solid examples to keep in mind:

JSON streams passed between a process and its parent process over stdin and stdout are a Serial API.

HTTP APIs are Serial APIs. (HTTP APIs in particular have some interesting benefits in that there are built-in signalling planes vs data planes -- headers and status codes vs bodies -- which means there is a wide variety of ways that things like protocol versioning can be signalled... however, this is a more specific detail than we'll spend much time on in the rest of this document.)
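To make the first of those examples concrete, here's a minimal sketch, in Go, of a process emitting a JSON stream on stdout. (The message shape and field names are invented purely for illustration.) Note the thing that will matter shortly: the serial surface area is the bytes on the wire, not the `Event` struct -- a consumer is free to parse those bytes with whatever structures it likes.

```go
package main

import (
	"encoding/json"
	"os"
)

// Event is a hypothetical message shape, for illustration only.
// The *serial API* is the JSON emitted below, not this Go type:
// consumers never see this struct, only the bytes.
type Event struct {
	Kind    string `json:"kind"`
	Payload string `json:"payload"`
}

func main() {
	enc := json.NewEncoder(os.Stdout) // emits one JSON document per line
	for _, ev := range []Event{
		{Kind: "started", Payload: "hello"},
		{Kind: "finished", Payload: "goodbye"},
	} {
		if err := enc.Encode(&ev); err != nil {
			os.Exit(1)
		}
	}
}
```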
CLIs that are heavily consumed by other scripts and programs, instead of by humans interactively, also effectively become a Serial API rather than counting as a CLI for tax purposes -- in other words, a CLI can become a more expensive development tax when its consumers become more automated.

Next we'll talk about Programming APIs. These may seem similar to Serial APIs! The one critical distinction to keep in mind is that **Serial API message shapes can be independent of how you organize your code for handling them**, whereas that's not generally true for Programming APIs -- and this makes handling change management for Serial APIs **much** easier than it is for Programming APIs.

### Programming APIs

Programming APIs are the definitions of functions and data types in a programming language. Programs are composed out of Programming APIs.

Compared to UIs and CLIs: Programming APIs are generally much harder to change than CLIs or UIs, because you can't "just maintain the old behavior" without:

- increasing the amount of code you write (...which will increase the amount of other development taxes you pay over time!)
- increasing the amount of information and code a library *consumer* might need to understand
- experiencing *significant* **design inertia** burdens

As with Serial APIs, it's also very hard to "hint" to users that it's time to change. This is easy (or at least possible) in UIs and CLIs; but libraries, which are consumed by other code rather than humans, have fewer options.

Programming APIs can contain a *lot* of surface area, too. The definition can sometimes be stretched to include debugging and other programming tools (and if this applies to you, yeah, your tax rates are going _way_ up). For example, in some languages, it's possible for a "stack trace" to become part of the surface area that your downstream users consider your API. (This is not considered normal -- but [Hyrum's Law](https://www.hyrumslaw.com/) is a well-known eponymous law for a reason.) Because of this tendency towards surface area expansion, even against the designer's will, programming APIs can be pernicious tax sources.
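To make the "maintain the old behavior by writing more code" bullet above concrete: here's a hedged sketch, in Go, of the common pattern of adding a new function alongside an old one rather than changing it in place. (The names and signatures here are hypothetical; the parallel-function pattern itself is real -- Go's standard library grew `QueryContext` next to `Query` in `database/sql` this way.)

```go
package fetch

import "context"

// Fetch is the original exported function. Downstream code compiles
// against this exact name and signature, so changing it in place would
// break every caller.
func Fetch(url string) ([]byte, error) {
	return FetchContext(context.Background(), url)
}

// FetchContext carries the "changed" design, added *alongside* the old
// name instead of replacing it. Maintaining the old behavior means
// carrying both indefinitely: more code, more docs for consumers to
// read, and more design inertia.
func FetchContext(ctx context.Context, url string) ([]byte, error) {
	_ = ctx // real implementation elided for this sketch
	return nil, nil
}
```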
#### Why not just treat every edge in a programming API like a serial API?

If you're thinking to yourself "cool, so, serial APIs were lower tax because they have more error handling around them; why not just treat every programming API the same way?"...

Yeah, great idea. It's just a bit (*cough*) easier said than done.

Error handling is expensive. Gathering error handling together at serial API boundaries is a natural accretion effect. It's pretty convenient to lean in on that; the accretion is a form of cost amortization, and serial API boundaries are a natural crystal seed. One could create other crystal seeds, but the utility of amortization remains present. Which means pouring more crystallization seeds into the solution of your programming environment doesn't necessarily buy you much more. (Forgive the horrifically mixed metaphors.)

There _are_ some programming languages which attempt to make every function call as failable as a serial API. (Erlang is somewhat famous for this.) For practical purposes, though, in the ballpark of the year 2020, these languages remain (perhaps unfortunately) niche.

### Programming ABIs

(This section is optional -- you can skip it. If you know what an "ABI" is and you're wondering if it's a tax: read on (and spoiler: yes). If you don't know what an "ABI" is: this may be a level of detail you'll be fine without.)

First, some definition and disambiguation: "ABI" is short for "Application _Binary_ Interface" (as contrasted to "Application _Programming_ Interface"). In practice, "ABI" means "an API that works with raw memory and some kind of message passing that involves pointers and integers", while "API" means "there is something here that the compiler will check for you".

For example, the Linux kernel surface area is often referred to as an ABI. It's not a "programming API", because there are large lists of numerical constants that define how to communicate with it, and these are not programming-language specific. And they're definitely not human readable; they also tend to be used for in-memory communication, and they can contain pointers and cycles (so it's not a "serial API" either).

Broadly, I'm going to skip over these as a distinct topic, because we can get most of the intuitions by just saying they're still Serial APIs, _maybe_ with a slight hint of Programming API mixed in.

The main differences between an ABI and a Serial API are:

- Concretely: ABIs can have pointers. And thus cycles.
- Concretely: ABIs have a tendency to use lots of "magic numbers", rather than readable strings, to signal meanings. (This is only a correlation -- but a strong one, because ABIs are usually seen where there's performance sensitivity.)
- As a result: ABIs tend to be harder to debug. You can't necessarily serialize the things being passed around (because of the pointers, and the cycle possibility); and translating any "magic numbers" back into semantics for a human just adds layers of work.

The main differences between an ABI and a Programming API are:

- An ABI tends to have slightly *more* flexibility (closer to a Serial API), because there's generally a way to "call that function and see what happens" -- even if it's "not there" -- whereas a Programming API is usually explicitly fragile in the same context.

### Repositories

Each version control repository is a tax. Some of the costs are simple administrivia; some of them become systemic.

- Each repo is something you and your contributors will have to push and pull updates in and out of.
- Each repo generally needs its own CI setup, etc.
- Each repo (assuming we're talking GitHub) gets its own issue tracker, etc -- that means more places to engage and more distractions.
- If changes need to propagate through multiple repos, the need to jump across repos can produce enormous amounts of work. (And this tax can become an exponential one (!!) -- see [Taking things too far: uncountable repos](#Taking-things-too-far-uncountable-repos))

There are upsides to creating new repos, too:

- Since repos are the unit of granularity for CI, more repos means faster CI turnaround times per repo.
- [Adding New Repos](#Adding-New-Repos) is one of the Patterns of Action we can use to help write new code!

### Documentation links

(Don't laugh, I'm serious.)

Documentation links should ideally be relatively stable. _How_ stable exactly may vary based on the topic and content, but generally, stable is good.

Documentation links can induce more pain when changing than something like changing an API does, because there are no compilers and generally no tooling to help catch and address any breakages from these changes. On the brighter side, however, documentation is always consumed by humans, which means there's similar room for human adaptiveness and forgiveness as with UIs.
If you lean heavily on documentation which is auto-generated from code, it can add friction to code changes, because you may need to worry that changing an API can *also* result in a loss of discoverability of information about a topic.

This tax source creeps up slower than most of the others, because when your API/UI surface areas are small, consumers are more able to just "poke around" and "figure it out"; because when your project is young, there's simply been no time for things like out-of-date docs (or Stack Overflow posts!) to accrue; and also, when you don't have many consumers, it's typically also the case that they're either new, or some kind of "power user", and thus either weren't exposed to older documentation links, or will be slightly less bothered just because they're so invested. Nonetheless: eventually those early affordances will wear down, and this tax source will creep up on you.

One can just choose to refuse to pay this particular tax, unlike most of the others we've discussed. It may not be a good idea to do so, however -- or at least, not for long. It can be deferred for a while, but in the long run, users will stop being willing to lean on something that changes frequently if it doesn't at least provide stable documentation to help them cope with it.

---

---

---

Patterns of Action
------------------

### Changing In Place

"Changing code in place" refers to the obvious, basic, normal thing: add or remove code, alter and rename things, however you want.

The upside of this approach is it's simply the easiest to do. It's work without worrying.

- The development velocity tax is minimal.
- The design inertia tax is present, but low.

The downside/limitation of this approach is that it produces the most design churn and pushes potentially large amounts of work to downstream consumers. For projects with a sufficiently large number of downstream consumers (and especially if those downstream consumers are maintained by different agencies), this can simply become nonviable.

### Additive-Only Mode

"Additive Only Mode" is when you write code or develop serial APIs in an existing project in such a way that it changes no exported symbols or messages whatsoever.

This can be desirable because it minimizes the apparent magnitude of change that's experienced from outside the project, "making fewer waves", so to speak.

The downside is that it's both a huge amount of work, and significantly _changes_ the kind of work that will get done. It's very difficult to write additive-only code:

- The development velocity tax is high, for [Serial APIs].
- The development velocity tax is *excruciating*, for [Programming APIs].
- The design inertia tax is *infinity*.

There are some (many!) kinds of change which simply *cannot* be done in this mode. Additive-only approaches mean semantic changes to APIs are impossible; or, to do them, that you have to *make a new API* (which is an option, but means you're creating more code, which is going to mean more other taxes in the future). For this reason, we can say the design inertia tax is nearly infinite.

The "Additive Code" strategy is most intensely applicable to the maintenance of [Programming APIs]. Wrangling [Serial APIs] can also sometimes benefit from using an additive-only approach to the serial messages to minimize the apparent magnitude of change, but this has a different ROI curve compared to how additive approaches affect code.
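As a minimal sketch of what additive-only evolution of a serial message can look like (reusing the hypothetical `Event` shape from earlier): existing fields keep their names and meanings, and new information arrives only as new, optional fields, so old consumers keep working unchanged.

```go
package messages

// Event, revised additively. The original fields are untouched; the
// only change is a new optional field. Old consumers ignore the
// unknown key; new consumers treat its absence as "not provided".
// (Field names are hypothetical.)
type Event struct {
	Kind    string `json:"kind"`
	Payload string `json:"payload"`
	TraceID string `json:"traceId,omitempty"` // added in this revision
}
```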
Because Serial APIs already have decouplings between the external surface area and the code structure that developers of the API deal with, the design inertia tax is significantly lower for locking a Serial API into additive-only mode.

Additive code is the safest and most reliable way to make changes, but also the most costly. Because of its sheer costs, an organization should be aware of areas of their development which can only be touched in this mode; those are huge cost centers for development velocity overall (and worse, your developers are probably actively avoiding them, which may be having knock-on effects on your design health overall). If you're seeing large regions where "Additive Code Only" is a de facto policy, then it may be wise to check in on whether [The Fear] has taken hold of you.

// todo: consider splitting this up into sections for 'additive code' vs 'additive messages'. Latter can be more subtle. Unfortunately, a whole book could also be written about it. The concrete details are too important and too detailed, so it's hard to talk about without getting over-focused on e.g. thinking of it only in JSON, or only in $foobar formats, or only in the $frobnoz parser context.

#### Preparing for Additive Code

// todo: it's not bad if you can *plan* for it. in other words, planning your way out of the design inertia tax component, in advance: that's awesome. if it's cheap.
// ofc, this raises upfront development cost, in exchange for that long run tax break.
// and it's only a chance: one is _betting_ that the plan works. if it doesn't, you paid higher dev costs, and don't get the tax break -> bummer.
// the most important pitfall to avoid here: don't just pay higher dev costs, and be mindful not to pay in some currency that also raises other taxes. if you do that: that's where "architecture astronautism" has kicked in and will really pants you.

### Adding New Repos

Adding a new repo is one of the "[generational](#Meta-Pattern-Generational-Approaches)" approaches.

- The development velocity tax is moderate (setting up a new repo does generally incur some extra work).
- The design inertia tax is near zero -- lower than for change-in-place, and wildly lower than for additive-only.

// todo: more

"Adding new repos" is distinguished from "[Additive Code](#Additive-Code)", even though both are "adding" new code, because of the low design inertia tax. Adding a new repo makes a new space for design evolution; whereas with "additive-only code", we're talking about working in a confined way in an existing space.

### Adding Sibling Packages

Adding sibling packages is one of the "[generational](#Meta-Pattern-Generational-Approaches)" approaches. It's like the [approach of adding new repos](#Adding-New-Repos), but significantly lighter: it simply... doesn't involve generating a new repo.

This approach can feel a little janky, because in sheer naming, it often involves a "foo2" folder next to a "foo" folder. However, once one looks past the cosmetics, the practicalities are pretty good.
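From a consumer's point of view, the practicalities look something like this sketch (module path, package names, and functions are all hypothetical): both generations remain importable side by side, so call sites can migrate one at a time.

```go
package consumer

import (
	foo "example.com/project/foo"   // the existing generation: untouched
	foo2 "example.com/project/foo2" // the next generation: free to diverge
)

// During a migration, code can lean on both generations at once,
// and nothing forces the whole codebase to cut over in one commit.
func useBoth() {
	foo.DoThing()  // hypothetical
	foo2.DoThing() // hypothetical
}
```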
This approach is suitable for maintainer and core-contributor work, but probably not as suitable for community code contributions. Adding new sibling packages in the same repo creates some minimal amount of ongoing forced maintenance, or at the very least adds some amount of design inertia, so it should only be used for fairly finished or maintainer-originated work, in order to make sure the burdens are centered on the right shoulders. Community-originated work which considers taking this approach can certainly ask the maintainers if they're open to it, but should probably usually use the new-repo approach.

### Meta Pattern: Generational Approaches

"Generational approaches" means writing new code or new APIs in a new place, with minimal regard to the existing API.

The upside of generational approaches is that they reduce **design inertia** burdens to near zero. (This is a *huge* upside. There are some kinds of progress that simply *cannot* be made without zeroing this.)

Generational approaches also allow deferring development work costs associated with propagating updates. Because the new API is in a new place, the work can be deferred indefinitely. (This is both an upside, in that it creates planning flexibility (and especially, doesn't force timelines on downstream consumers that are maintained by a different agency), and a risk, because the update work is potentially *indefinitely* deferrable, and in that case, it produces many new tax sources.)

Generational approaches are also great for *de-risking* development, because the work can be started, and then paused at any time in partially complete states, or even abandoned entirely. Because changes aren't forced onto downstream consumers, a lot of risk is either removed or becomes relocatable. Downstream consumers either start using the next generation code, or... they don't; and in the latter case, sometimes that's even just a legitimate indication that the attempt at the new generation just wasn't up to snuff.

### Using Import-Path Versioning

// todo: this section either needs to be not golang specific, or, if that's not possible, needs to move.

Some toolchains (namely, golang) offer a form of package versioning based on import paths. The intended upside of this is that downstream consumers of a package can say which major version they want by using a version number in the import path they use for your module. The toolchain effectively serves a copy of the code from that version whenever a project asks for it. At the same time, the developer of the project keeps using their version control normally, not taking on taxes such as maintenance branches.

It's unclear how relevant the freedom from maintenance branches is, but the general idea seems to be to make people less scared of making significant/breaking changes and advancing the major version as they do so.

(It's a bit like the generational approaches, really, but the toolchain is attempting to give it to you "for free", without either new repos nor new sibling directories.)
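Mechanically, in Go's case, selecting a major version looks like this sketch -- the `/v2` element in the import path is the version selector (the module path here is hypothetical):

```go
package consumer

import (
	foov1 "example.com/foo"    // major version 1 (no suffix, by convention)
	foov2 "example.com/foo/v2" // major version 2, selected via the path
)

// Both major versions can even coexist in one build -- which is exactly
// the "diamond-breaking" property discussed below.
func useBoth() {
	foov1.DoThing() // hypothetical
	foov2.DoThing() // hypothetical
}
```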
// todo: use Apache Commons Collections from java as another example of this, which wasn't tool-induced nor particularly tool-assisted.

// todo: move all golang references into a subsection, which can focus on how tool assistance for generating more import-path versioning incidences is solving the wrong problem and greasing the wrong chute.

Major downside: it forces changes to be breaking even when they otherwise wouldn't be. By putting a version hunk _in the import path_, it aggregates all changes into one big change, and that new aggregated "change" is expressed on _every single function and type_ in the module... _even on things that didn't **actually** change_.

There _is_ a time and a place, sometimes, for using import-path versioning. Putting versions in import paths can be used for "breaking" diamond-dependency problems. ("Diamond-breaking" is a _good_ thing in this context. Roll with it.) That does provide value!

...Or at least, it provides value in the situation where you had a diamond-dependency problem. However: _diamond-dependency problems aren't all that common._ And even when they do occur, generally, if at all possible, the ideal solution is to rearrange import graphs until they disappear.

On the other hand: putting versions in import paths can also fabricate problems where there otherwise wouldn't have been any: when there are two siblings in an import graph which communicate with each other using types from the import-path-versioned library, the versions appearing in their import paths must agree -- and this can mean an import tree ends up forced into a broken state even when the actual symbols interacted with would've been fine otherwise.

Long story short: use import-path versioning judiciously. Reach for it as a last resort. Exhaust all other alternatives first.

### Using Content-Addressed Import Versioning

Like above, but the downside is just hypermagnified. I think this is in fact never a good idea.

(Content-addressed dependency management: good; putting that fact into import paths: not good. The odds of manufacturing a pointless but unsolvable(-once-manufactured) "diamond dependency" conflict with this approach are extremely high.)

---

---

---

Tax Optimization
----------------

What do you do when the [tax sources](#Tax-Sources) in your working environment are reaching problematic, work-impeding levels, and it's not an option to just "remove some"?

### Remove Tax Sources

The #1 action to take to minimize development taxes is always: remove tax sources.

If you can reduce the number of UIs and CLIs you have -- reduce the number of APIs you support -- reduce the number of libraries you consider public -- reduce the number of repos you have -- all of these will immediately and directly reduce the taxing effect you feel on your development efforts.

This is rarely easy (and may conflict directly with business goals, so may sometimes be outright impossible), but if you can do it, it's absolutely going to be the highest bang for buck.

### Build Propagation and Test Tools

If your problem is [uncountable repos](#Taking-things-too-far-uncountable-repos), then: start building change propagation tools. (Or at least playbooks: the expectations for changes should be clear, communicated, and humanly executable.)

The idea is simply that when you make commits in some burdensome repo, there should be a known and documented (or ideally, fully scripted) set of steps to take which bring those commits into some known list of downstream repos, and run the tests in those downstream repos against the changes.
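A fully scripted version of such a playbook can be as blunt as this sketch (the repo list, paths, and module name are all hypothetical): point each known downstream at the commits under test, run its tests, and stop loudly at the first failure.

```go
// propagate.go -- a minimal, hypothetical change-propagation script.
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// The known list of downstream repos, checked out as sibling directories.
var downstreams = []string{"../repo-a", "../repo-b", "../repo-c"}

func run(dir, name string, args ...string) error {
	cmd := exec.Command(name, args...)
	cmd.Dir = dir
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}

func main() {
	for _, repo := range downstreams {
		fmt.Println("==> propagating into", repo)
		// Point the downstream at the commits under test...
		if err := run(repo, "go", "get", "example.com/upstream@main"); err != nil {
			os.Exit(1)
		}
		// ...and run its tests against them.
		if err := run(repo, "go", "test", "./..."); err != nil {
			os.Exit(1)
		}
	}
}
```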
### Form Change Families

// todo: give a name to the idea that you can split some things across repos but make sure there's a horizontal clade that is expected to be upgraded at once when the clade's hub repo experiences a major change. lower tax than the same number of repos but with more depth.

```graphviz
digraph hierarchy {
	nodesep=.6 // increases the separation between nodes

	C1, C2, C3 [shape=box]
	{C1, C2, C3} -> FooAPI

	FooAPI -> { FooFeatA FooFeatB FooFeatC FooFeatD } [color=Blue, style=dashed]
	FooAPI -> FooCore
	{FooFeatA FooFeatB FooFeatC FooFeatD} -> {FooCore} [color=Blue, style=dashed]
	//{FooFeatA FooFeatB FooFeatC FooFeatD} -> {FooAPI} [color=cadetblue4, style=dashed]

	FooCore -> { Dep1, Dep2, Dep3, Dep4 }
	Dep1, Dep2, Dep3, Dep4 [shape=box]

	subgraph cluster0 {
		color=Grey
		FooFeatA, FooFeatB, FooFeatC, FooFeatD //[shape=box]
		FooAPI
		FooCore
		//subgraph cluster1 { // oddly has effect similar to rank=same
		//	color=black
		//	FooAPI
		//	FooCore
		//}
		//{rank=same; FooCore FooFeatA FooFeatB FooFeatC FooFeatD} // Put them on the same level
	}
}
```

This is especially easy if you can create an architecture like the one above, where many features (probably "plugins" in vibe if not in name) can be united under a common API: then, it is quite clear that the unified API is the change-family definer.

However, this architecture isn't strictly necessary. Even if `C1` was reaching directly in to contact `FooFeatA`, and `C3` directly contacting `FooFeatA` and `FooFeatD`... knowing where the change-family border is can still help, because when developing in any of those repos within the boundary, the developers working there know they need to make sure everything within that box is cross-compatible and consistent.

Defining a change-family has a bonus effect of making an area where it's especially clear how to build and apply [Build Propagation and Test Tools](#Build-Propagation-and-Test-Tools)!

### Systematically prefer breadth in dependency trees over depth

Many of the taxes having to do with large dependency trees scale in cost primarily with depth. (This is especially true of the taxes that make _change_ difficult.)

If you need to break down a design into lots of repositories and packages: have a strong preference for making things siblings rather than introducing more dependencies if they'll form a "tall" tree.

### Discuss Change Origination

This one is just about communication. Discuss where changes are allowed to originate. Just _have_ the discussion in your team. It'll reveal a lot.

It's altogether too easy to slip into a pattern of believing that the farthest downstream (coincidentally: typically the closest to customer!) projects, and the APIs as they consume them, dictate what the APIs must be. However, this logic, if used to the exclusion of all other logic, can result in significant paralysis. Self-awareness is important here, and can be sufficient to break out of any traps.

### Create Clear Boundaries Within a Repo

In some cultures, people will freely and frequently reach for new repos as a way to create boundaries between functional groupings. This can work, _but it's not the only way_.

Consider building tests which assert on the dependency tree of a package (or group of packages). This has multiple benefits: it adds clarity to boundaries and relationships; it lets you create boundaries without creating whole new repos (which would be an action that creates tax sources! and is thus worth avoiding); and as a bonus, it gives a single clear locus that's a good place to *document* why these boundaries and layerings are intended.
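Here's a hedged sketch of what such a test can look like in Go (the package paths are invented for illustration; the `go list -deps` subcommand is real): it fails if the "core" layer ever grows a dependency on the feature layer above it, and its comments are a natural home for documenting *why* the boundary exists.

```go
package boundaries_test

import (
	"os/exec"
	"strings"
	"testing"
)

// foocore is the bottom layer: features may depend on it, never the
// reverse. (Documenting that rationale right here is part of the point.)
func TestFooCoreStaysBelowFeatures(t *testing.T) {
	out, err := exec.Command("go", "list", "-deps", "example.com/foo/foocore").Output()
	if err != nil {
		t.Fatal(err)
	}
	for _, dep := range strings.Split(strings.TrimSpace(string(out)), "\n") {
		if strings.HasPrefix(dep, "example.com/foo/foofeat") {
			t.Errorf("boundary violation: foocore depends on %s", dep)
		}
	}
}
```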
### Seasons

Winter ("code freeze") and Spring ("change bloom"). Possibly Fall ("additive only").

If you're going to have winter... make sure you know how to have spring again at some point in the future, too!

Talk explicitly about these cycles. Give them planned durations. Communicate the durations. People can work well on seasonal schedules as long as they're clearly communicated.

// todo: expand

### Explicitly ignore Hyrum's Law

There's a short paragraph about [Hyrum's Law](https://www.hyrumslaw.com/) in the [Programming APIs](#Programming-APIs) section, and how if it affects you, your tax sources from a Programming API can skyrocket.

The "law" is simple enough that we can just repeat it here:

> ```
> With a sufficient number of users of an API,
> it does not matter what you promise in the contract:
> all observable behaviors of your system
> will be depended on by somebody.
> ```

So what to do about this?

_Ignore it._

Sometimes the most direct solution is best. And sometimes no solution is the best solution. This is one of those (rare) scenarios.

One might argue it's even better to explicitly caution your API consumers not to rely on behaviors not specified by the contract. I won't try to talk anyone out of that, but I think it can be a fairly unlimited cost sink with rapidly diminishing ROI, because it misses the key observation of the "law": users _will_ do this _anyway_. So then: the question is only what you will do about it, and how you let it affect (and tax) your own development processes. And I suggest you ignore it.

Don't let anyone invoke Hyrum's Law on you and raise your taxes with it, unless you actually agree with them about the impacts and believe it's worth it to weigh them more highly than the freedom to make changes. You _can_ just shrug this one off; it's an option. It's totally morally acceptable to leave the burden of misconduct purely on the shoulders of the ones who went out of bounds. Here, "out of bounds" means "leaned on Hyrum's Law".

(The calibration here, of course, is: if you feel that _you_ committed the misconduct by having an underspecified API in the first place... then, okay. Maybe taking on more work to compensate for it, for the benefit of your users, is acceptable. But there should be limits to this. It should probably not be the default stance.)

---

---

---

Errata
------------

### Taking things too far: one monorepo

Nuff said.

Some repo boundaries can give you flexibility. Zero repo boundaries is typically not the right number for a large organization.

Now -- It has to be confessed: Some organizations -- including some very famous organizations -- do work this way. However, don't mistake your organization for those famous organizations. Organizations which do work in one large monorepo have oodles of tooling and automation which makes that possible. If you attempt the same thing with even remotely close to the same number of people, but without the tooling, you will have a bad time.

### Taking things too far: uncountable repos

More repos, more taxes. But it's also a bit subtler than that. _It's not linear._

The problem isn't even that the repos become uncountable; the problem is better measured by: "how many repos do I have to propagate a change down through before I'm sure something won't force me to revert it?" In other words, it's the depth of dependencies, and how many repos that crosses, that's consequential.

The costs here are exponential. One downstream is a cost. A granddownstream is a sizable burden that will visibly affect development velocity. A great-granddownstream is an albatross. If you ever see a great-great-granddownstream, then it's all but guaranteed that development is going to be ground to a dead halt.
While the upside of having more repos is that each one is individually easier to make changes within, the cumulative effect is quite different. Fewer repos can be better.

### Even Additive-Only Code can be "breaking changes"

Consider that adding new methods to a class in Java can change how overloaded calls are resolved, causing effective "changes" to what functions are called even when you're only "adding" code.

Consider that adding new functions on a type in Golang can cause interfaces to be matched in code handling this type, causing different logical branches to be taken even though you're only "adding" code.
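Here's a minimal sketch of that second (Golang) case -- the type and method names are invented for illustration:

```go
package main

import "fmt"

type Token struct{ Name string }

// Suppose this method is added in a later, "additive-only" release:
// no existing signature changed, and yet downstream behavior can.
func (t Token) String() string { return "token:" + t.Name }

// describe is downstream code with an interface-dependent branch.
func describe(v interface{}) string {
	if s, ok := v.(fmt.Stringer); ok {
		return s.String() // taken once Token gains a String method
	}
	return fmt.Sprintf("%v", v) // taken before the "additive" change
}

func main() {
	fmt.Println(describe(Token{Name: "demo"}))
}
```

Before the `String` method existed, `describe` returned `{demo}`; after the purely "additive" release, it returns `token:demo` -- downstream behavior changed without any edit to existing symbols.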
These aren't necessarily "game over" scenarios, nor a reason not to try to make minimal changes. But it's a valuable admonition about the boundaries of caution. Even the additive-only approach to code can cause "breaking" changes if your definition of "breaking" is expansive enough.

The lesson to take away from this is that "breaking changes" is not always a hard line. Defining it always needs some discussion. A knee-jerk reaction to the word "breaking" can be problematic, because there are many situations where it will be an over-reaction. Don't let knee-jerk reactions and over-simplifications of this concept into your culture.

### Deprecation warnings

Deprecation warnings aren't a pattern of action. Removal policies are.

Deprecation notices do not, by themselves, help you write or maintain code. Do not mislead yourself on this. Deprecated code is still code that's exerting a maintenance burden, and *definitely* a documentation and user comprehension burden (probably double-time), and even a design inertia burden, right up until the moment it's gone.

Deprecation is only useful if you *follow through on it* and actually *remove* deprecated code.

### Deprecation is low-value unless you have granddependents

Deprecation periods -- a period when an API becomes unrecommended to use, but is still included in code -- are intended to give downstream consumers a safe period of time in which to update their use.

Here's a key thing, though: deprecation periods *don't matter* if your consumers are one step downstream of you. Consumers that are one step downstream from you can just choose not to update your library until they're ready. It's not valuable to bend over backwards for them. These folks already have viable change management options.

Deprecation periods matter when your dependents have their own dependents, and those granddependents either see your API, or have two dependencies that both depend on you. When you have granddependents, it's now possible to have "diamond breaking" problems in the dependency graph: if two things in the graph depend on different versions of your library that aren't compatible, you can end up with situations where nothing compiles until *all* the involved libraries are updated at once. And while of course everyone's ultimate goal is to update everything (*cough* right?), it's still not productive to create scenarios that force this, because it's a coordination nightmare.

But again: all that only happens if you have granddependents. You can't have "diamonds" in the dependency graph if the graph's max depth is one!

Don't bother with deprecation periods unless you're in a position in your ecosystem where you have granddependents. It'll slow development for no particularly good reason. Do it if there's a reason: not before.

(This admonition may be dampened a bit if you live in a language ecosystem that has good tools for deprecation warnings, because in that case, one can update a dependency, check everything still compiles and tests, and *then* start grinding away to address deprecation warnings; and doing these things in phases could be a productive flow cycle. However, this author has never actually seen tooling for this which is good enough in UX to really make a strong case for it. It's generally still far too easy to just ignore deprecation warnings.)

### The Fear

"The Fear" is what happens when you have [uncountable repos](#Taking-things-too-far-uncountable-repos) and also lack any tooling for propagating and testing changes across those repos.

The Fear must be avoided and minimized at all costs. The Fear is the mind-killer. The Fear is the little death that brings total developmental paralysis.

If you have The Fear:

1. ADMIT IT. This is critically important. If you begin making choices based on The Fear, you need to know this, admit this, and communicate this fact to your team.
2. [Develop tools to assist](#Build-Propagation-and-Test-Tools). You need a playbook for propagations. If you can script it and include this in tests, it's huge leverage for combating The Fear.
3. Did you reach this problem by using "[generational approaches](#Meta-Pattern-Generational-Approaches)"? Then you've forgotten to let old code die. Find a way.

### "Firing" Users

Your project/product doesn't need to be everything to everyone.

Sometimes you'll have early users who liked an early form of your project... but as the project's vision becomes more refined, you might find that some of these users, and what they wanted, are not very well aligned with the refined vision of the project.

That's okay. You can "fire" users.

There's a million ancient adages for this, because it's important. "More wood behind fewer arrows." You've heard others. It still applies when it's about choosing which people to please and which users to serve. Building a product that's _really good_ at _something_ means reining it in rather than trying to make it _sorta good_ at _too many_ things.

### Going full Galaxy Brain

All Serial APIs are Programming APIs. All Programming APIs are ABIs. All CLIs are Programming APIs. All Serial APIs are CLIs. All is One. One is All. ... and therefore all of this advice is bad and useless. :space_invader: :brain:

Yep. Alright, true. As vagueness approaches infinity, all things are equal, and all statements become simultaneously trivial, and yet infinitely rebuttable by counterexamples.

Nonetheless. There are a range of things in the world today that a randomly selected developer will probably look at and say "that's a serial API", and look at another thing and say "that's a programming API", and there's a very high probability that if we randomly select a second developer, and point at the same things, they'll agree. We're trying to use descriptive language here in that sort of agreeable way.

TODOs
-----

### resource section

The resource section still feels a little unclear. It commingles talking about {taxable things}, {resources that we can use to pay down taxes}, and {things that are costs, but not resources}. This may make sense, because some things are in multiple categories at once, but not all are, and the clarity about this is not high enough. Just renaming it "units of account" or something like that may improve it.