# On Software Dependency Engineering

_Note: I have worked at Google for the majority of my professional life, which will color my views. Take this with a pinch of salt, and remember that everything here is my own view and does not represent that of Google or anyone else._

This document is here to convince you, dear reader, of three things:

> 1. Open source dependencies that your software relies on are code that you did not write, but they are code that you _have a localized responsibility for_.
> 1. The software engineering world should recognize the sub-field of **Software Dependency Engineering** and the role of the **Software Dependency Engineer**.
> 1. Managing dependencies is _not toil_. Dependencies are not free and extract an ongoing maintenance cost. Dependency management is something that cannot be avoided and cannot be fully automated away.

There is a _lot_ to unpack here, so let's start now.

## Quick dependency management primer

There are precious few software projects that do not have any dependencies at all (past the standard library of their languages).

> Most projects rely on at least some open source software.

Let us think about an imaginary project we own. Once a project relies on open source software, it's now at significant risk of various forms of [dependency hell](https://en.wikipedia.org/wiki/Dependency_hell), but perhaps the thorniest one is the diamond dependency problem:

![Diamond Dependency Problem](https://hackmd.io/_uploads/SyES-cIU2.png)

_Credit: [SourceGraph](https://about.sourcegraph.com/blog/nine-circles-of-dependency-hell)_

Here, A depends on libraries B and C. B and C _state_ that they depend on different versions of library D, but only one version of D can exist in the final binary. Resolving such an issue can be savage: often the only realistic resolution is to head upstream to the project using the minimum version (here, B) and file an issue with them asking to update to the later version.
Then we wait to see if the B maintainers are willing to do it. But remember, this diamond dependency issue is a problem with _our_ build graph, no one else's.

The potential for dependency hell grows rapidly with the number of dependencies a project relies on. Take the Kubernetes Go client (just the client, not the cluster software itself). With transitive dependencies, the project relies on upwards of 80 projects. Any two of them can introduce a diamond dependency. And this is a small example. There are much larger projects than this.

## Dependency responsibility

With the diamond dependency problem in our imaginary project, the problem was upstreamed to B, even though the original issue was that the diamond dependency affected our build, not theirs.

> Do we have the right to offload our build issue onto open source maintainers? Do they have the responsibility to deal with that issue?

I would argue _no_ and _no_. Open source projects exist as a gift to the world. There is no responsibility for maintainers of those projects to do anything.

If we are willing to accept ownership of the dependency hell problem, we now reach the conclusion of position 1:

> Open source dependencies that your software relies on are code that you did not write, but they are code that you _have a localized responsibility for_.

It is incumbent on us to solve the problem. Perhaps we send a pull request upstream (the most responsible, but perhaps slowest, method). Perhaps we manually patch the projects to play nicely together (for diamond dependencies, it is entirely possible that either project would work fine with a different version; the declared versions may simply be what the maintainers _stated_ at the time). Perhaps we decide we don't truly need a dependency anymore and figure out how to do without.

## How to work with dependencies

Before I discuss the concept of the Software Dependency Engineer, I'm going to start with a little backstory.

### The Software Engineer in Test

Almost lost to the mists of time, Google had a job role called "Software Engineer in Test". Engineers in this role were responsible for work like writing test harnesses and environments, consulting with feature developers to ensure they were writing effective tests, and making sure that tests ran fast enough without flakes.

The beauty of the job role was that it succinctly communicated what these engineers were expected to do, and what they were _not_ expected to do. These engineers were not expected to be working on user-facing features. That freedom allowed them to focus and hone their skills, doing work that most software engineers can't.

Can you write a test harness? Maybe. Can you write one that you _know_ is good? I would hazard probably not. I can't, and I've been professionally writing software for over ten years. The Software Engineer in Test had a very specific and recognized skill set.

(We see other fields of specialization as well, such as the Site Reliability Engineer, but I would note that I think such a role perhaps has more overlap with standard software engineers than the Software Engineer in Test.)

## Software Dependency Engineering

Given how difficult dependency management is, we now reach the second conclusion:

> The software engineering world should recognize the sub-field of **Software Dependency Engineering** and the role of the **Software Dependency Engineer**.

A Software Dependency Engineer is responsible for managing a project's dependency graph. The Software Dependency Engineering skill set spans automation, build systems, policy and external outreach. Here's a quick and non-exhaustive list of possible tasks:

* Create automated systems to update dependencies.
* Create automated systems to warn of vulnerable dependencies.
* Resolve dependency hell issues.
* Communicate with dependency maintainers about breakages or bugs discovered while attempting to upgrade, including a proposed patch if possible.
* Work in a cross-cutting manner across the whole project to resolve build/compile failures from attempts to update a dependency. This is _significantly difficult_ for any reasonably sized project, especially one that is polyglot in its programming languages.
* Provide feedback to teams on the _quality_ of a proposed dependency, where quality may mean things like popularity, responsiveness to filed issues or pull requests, evidence of strong code review practices, and analysis of API stability. If a cost is going to be paid for maintaining a dependency, let's make sure the cost is worth it.
* Ensure that the legal and, more importantly, _ethical_ obligations of the dependencies' licenses are upheld.

This is a series of specialized skills, and any given software engineer is unlikely to be effective at all of them. I don't know about you, but to me this sure sounds like a full-time job just to keep up with it all.

Now, we are at the final position:

> Managing dependencies is _not toil_. Dependencies are not free and extract an ongoing maintenance cost. Dependency management is something that cannot be avoided and cannot be fully automated away.

It has been too easy to discount dependency management as toil. By doing so, the issue is inherently diminished, and we collectively put our heads in the sand and hope no disaster befalls us.

## Conclusion

I hope this document has done at least a half-decent job of making this case.

I am worried that the problem of dependency management will only get worse. Without a serious change in mindset about how dependencies are managed and how much time needs to be invested to do so, projects careen towards disaster.
By recognizing the role of the Software Dependency Engineer, versed in the field of Software Dependency Engineering, we can collectively not only mitigate the risks of dependencies but also accelerate our ability to use them effectively.

## Appendix: Notes

* This article was triggered after reading Titus Winters' chapter on [Dependency Management](https://abseil.io/resources/swe-book/html/ch21.html) in the [Software Engineering At Google book](https://www.oreilly.com/library/view/software-engineering-at/9781492082781/).
* I keep referring to open source dependencies, as I feel this is most applicable to most readers. But open source is not a prerequisite. For example, the Google monorepo is enormous, and largely operates as a series of independent projects that have dependencies on each other. At Google, the policy is that if a dependency breaks a dependent, it is incumbent on the dependency to fix the dependent. This is an inversion of what I speak about here. The reason for the difference is that there is no way for an open source library to know upfront who it breaks, especially for projects that are not themselves open source, and open source maintainers have no responsibility to their dependents.
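
## Appendix: A diamond dependency sketch

To make the diamond dependency problem from the primer a little more concrete, here is a minimal Python sketch of how one might detect it in a build graph. It is an illustration only, under some loud assumptions: the `DECLARED` table and the `find_version_conflicts` helper are hypothetical names I made up, the version numbers are invented, and versions are compared as exact strings. Real package managers resolve version ranges with far more nuance than this.

```python
from collections import defaultdict

# Hypothetical declared dependencies: package -> {dependency: declared version}.
# The package names mirror the A/B/C/D example in the primer; the version
# numbers are invented purely for illustration.
DECLARED = {
    "A": {"B": "1.0", "C": "1.0"},
    "B": {"D": "1.0"},
    "C": {"D": "2.0"},
    "D": {},
}


def find_version_conflicts(declared, root):
    """Walk the transitive graph from `root` and return every package that is
    requested at more than one version, as {package: {version: [requesters]}}.

    Assumption: versions are exact strings, not ranges.
    """
    requests = defaultdict(lambda: defaultdict(set))
    seen = set()
    stack = [root]
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        for dep, version in declared.get(pkg, {}).items():
            # Record who asked for which version of this dependency.
            requests[dep][version].add(pkg)
            stack.append(dep)
    # Keep only the dependencies requested at two or more distinct versions.
    return {
        dep: {version: sorted(who) for version, who in versions.items()}
        for dep, versions in requests.items()
        if len(versions) > 1
    }


conflicts = find_version_conflicts(DECLARED, "A")
for dep, versions in conflicts.items():
    for version, requesters in sorted(versions.items()):
        print(f"{dep} {version} is requested by {', '.join(requesters)}")
```

Note what the sketch does and does not do: it can tell us that B and C disagree about D, but it cannot tell us whether B would actually work with the later version of D. Answering that is exactly the kind of work that lands on whoever owns the build graph.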