Team Topologies
===

###### tags: `formación`

[TOC]

# PART I: Teams as Means of Delivery

## 1. The Problem with Org Charts

### Beyond the Org Chart

People don’t restrict their communications to the lines drawn on the org chart. **We reach out to whomever we depend on to get work done**, and we bend the rules when required to achieve our goals. That’s why actual communication lines look quite different from the org chart.

There are three different organizational structures in every organization:

- **Formal structure**: the org chart. Facilitates compliance.
- **Informal structure**: the “realm of influence” between individuals.
- **Value creation structure**: how work actually gets done based on inter-personal and inter-team reputation.

Instead of a single structure, what is needed is a model that is adaptable to the current situation, one that takes into consideration how teams grow and interact with each other.

### Systems Thinking

1. Optimize for the whole.
2. Look at the overall flow of work.
3. Identify what the largest bottleneck is today.
4. Eliminate it.
5. Repeat.

### Cognitive Load and Bottlenecks

A person has a limit on **how much information they can hold in their brain at any given moment**. The same applies to any team.

When cognitive load isn’t considered, teams are spread thin trying to cover an excessive amount of responsibilities and domains. Such a team lacks the bandwidth to pursue mastery of its trade and struggles with the costs of switching contexts.

The number of services and components for which a product team is responsible (in other words, the demand on the team) typically keeps growing over time. However, the development of new services is often planned as if the team had full-time availability and zero cognitive load to start with. This neglect is problematic because the team is still required to fix and enhance existing services.
Ultimately, the team becomes a delivery bottleneck, as its cognitive capacity has been largely exceeded, leading to delays, quality issues, and often a decrease in team members’ motivation.

We need to put the team first, advocating for restricting its cognitive load. Explicitly thinking about cognitive load can be a powerful tool for deciding on team size, assigning responsibilities, and establishing boundaries with other teams.

## 2. Conway's Law

Conway’s law tells us that an organization’s structure and the actual communication paths between teams persist in the resulting architecture of the systems built. This defeats any attempt to design software as an activity separate from the design of the teams themselves.

Communication paths within an organization effectively restrict the kinds of solutions that the organization can devise.

:::info
Tell me how your team/organization is structured and I will tell you what the products it builds look like.
:::

### The Reverse Conway's Maneuver

If we want our organization to discover and adopt certain designs or discourage other ones, we can reshape the organization to help make that happen.

### Software Architectures that Encourage Team-Scoped Flow

For a safe, rapid flow of changes, we need to consider team-scoped flow and design the software architecture to fit it. The fundamental means of delivery is the team, so the system architecture needs to enable and encourage fast flow within each team. Thankfully, in practice, this means that we can follow proven software-architecture good practices:

* Loose coupling: components do not hold strong dependencies on other components
* High cohesion: components have clearly bounded responsibilities, and their internal elements are strongly related
* Clear and appropriate version compatibility
* Clear and appropriate cross-team testing

:::info
Teams with the characteristics we expect of good software. The designs a team produces mirror its organizational structure.
:::

### Organization Design Requires Technical Expertise

Given that there is increasing evidence for the homomorphism behind Conway’s law, it is very ineffective (perhaps irresponsible) for organizations that build software systems to decide on the shape, responsibilities, and boundaries of teams without input from technical leaders.

:::info
What sense does it make for HR or the business side to decide the structure of a technical team?
:::

### Restrict Unnecessary Communication

Not all communication and collaboration is good. Thus it is important to define “team interfaces” to set expectations around what kind of work requires strong collaboration and what doesn’t.

Many organizations assume that more communication is always better, but this is not really the case. The aim is **focused communication**. If we can achieve low-bandwidth communication (or even zero-bandwidth communication) between teams and still build and release software in a safe, effective, rapid way, then we should.

### Everyone Does Not Need to Communicate with Everyone

If the organization has an expectation that “everyone should see every message in the chat” or “everyone needs to attend the massive standup meetings” or “everyone needs to be present in meetings” to approve decisions, then we have an organization design problem.

Conway’s law suggests that this kind of many-to-many communication will tend to produce monolithic, tangled, highly coupled, interdependent systems that do not support fast flow. More communication is not necessarily a good thing.

## 3. Team-First Thinking

Teams working as a cohesive unit perform far better than collections of individuals. Who is on the team matters less than the team dynamics; and when it comes to measuring performance, **teams matter more than individuals**.

Use small, long-lived teams as standard.

### Smaller Size Fosters Trust

Anthropological research shows that the type and depth of relationship we can have with people has clear limits.
#### Dunbar's number

* ~5: limit of people with whom we can hold close personal relationships and working memory
* ~15: limit of people with whom we can experience deep trust
* ~50: limit of people with whom we can have mutual trust
* ~150: limit of people whose capabilities we can remember

Consequences:

* A single team: around 5 to 8 people (based on industry experience)
    * In high-trust organizations: no more than 15 people
* Families (“tribes”): groupings of teams of no more than 50 people
    * In high-trust organizations: groupings of no more than 150 people
* Divisions/streams/profit & loss (P&L) lines: groupings of no more than 150 or 500 people

![](https://i.imgur.com/gLBHfQe.png)

### Work Flows to Long-Lived Teams

The best approach to team lifespans is to keep the team stable. Teams should be **stable but not static**, changing only occasionally and when necessary.

The Tuckman model describes four stages of team development:

1. **Forming**: assembling for the first time
2. **Storming**: working through initial differences in personality and ways of working
3. **Norming**: evolving standard ways of working together
4. **Performing**: reaching a state of high effectiveness

However, in recent years, research by people like Pamela Knight has found that this model is not quite accurate, and that storming actually takes place continually throughout the life of the team. Organizations should continually nurture team dynamics to maintain high performance.

### The Team Owns the Software

Team ownership helps to provide the vital “continuity of care” that modern systems need in order to retain their operability and stay fit for purpose.

#### Multiple horizons

* **Horizon 1** covers the immediate future with products and services that will deliver results the same year.
* **Horizon 2** covers the next few periods, with an expanding reach of the products and services.
* **Horizon 3** covers many months ahead, where experimentation is needed to assess market fit and suitability of new services, products, and features.

**Every part of the software system needs to be owned by exactly one team**. This means there should be no shared ownership of components, libraries, or code.

Individual team members should not feel like the code is theirs to the exclusion of others. Instead, teams should view themselves as stewards or caretakers as opposed to private owners. Think of code as gardening, not policing.

### Team Members Need a Team-First Mindset

The people within our teams also need to have (or develop) a team-first mindset. However, even with coaching, **some people are unsuitable to work on teams** or are unwilling to put team needs above their own. These people are “team toxic” and need to be removed before damage is done.

:::info
Key when hiring or deciding who is on the team and who is not.
:::

### Embrace Diversity in Teams

Teams with members of diverse backgrounds tend to produce **more creative solutions** more rapidly and tend to be better at empathizing with other teams’ needs.

### Reward the Whole Team, Not Individuals

Looking to reward individual performance in modern organizations tends to drive poor results and damages staff behavior. With a team-first approach, the whole team is rewarded for their combined effort. Likewise, the whole team rather than each individual gets a **single training budget**.

### Restrict Team Responsibilities to Match Team Cognitive Load

Cognitive load was characterized in 1988 by psychologist John Sweller as “*the total amount of mental effort being used in the working memory*”.
Sweller defines three different kinds of cognitive load:

* **Intrinsic cognitive load**: relates to aspects of the task fundamental to the problem space (e.g., “What is the structure of a Java class?” “How do I create a new method?”)
* **Extraneous cognitive load**: relates to the environment in which the task is being done (e.g., “How do I deploy this component again?” “How do I configure this service?”)
* **Germane cognitive load**: relates to aspects of the task that need special attention for learning or high performance (e.g., “How should this service interact with the ABC service?”)

Broadly speaking, for effective delivery and operations of modern software systems, organizations should attempt to:

* **minimize intrinsic cognitive load** through training, good choice of technologies, hiring, pair programming, etc.
* **eliminate extraneous cognitive load**: automate away boring or superfluous tasks or commands that add little value to retain in the working memory, **leaving more space for germane cognitive load** (which is where the “value add” thinking lies).

### Limit the Number and Type of Domains per Team

#### Types of Domains

* **simple**: most of the work has a clear path of action.
* **complicated**: changes need to be analyzed and might require a few iterations on the solution to get it right.
* **complex**: solutions require a lot of experimentation and discovery.

#### Heuristics

* Assign each domain to a single team.
* A single team should be able to accommodate two to three "simple" domains.
* A team responsible for a complex domain should not have any more domains assigned to it, not even a simple one.
* Avoid making a single team responsible for two complicated domains.

### Match Software Boundary Size to Team Cognitive Load

Instead of designing a system in the abstract, we need to design the system and its software boundaries to fit the available cognitive load within delivery teams.
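
The domain heuristics listed earlier are one concrete way of fitting responsibilities to a team's cognitive load, and they are simple enough to write down as a check. The following is only a sketch under assumptions: `Complexity`, `Domain`, `check_team_domains`, and `find_shared_domains` are hypothetical names invented for this example, and the limit of three simple domains is taken from the "two to three" wording in the notes.

```python
from dataclasses import dataclass
from enum import Enum


class Complexity(Enum):
    SIMPLE = "simple"
    COMPLICATED = "complicated"
    COMPLEX = "complex"


@dataclass(frozen=True)
class Domain:
    name: str
    complexity: Complexity


def check_team_domains(domains: list[Domain], max_simple: int = 3) -> list[str]:
    """Return the heuristic violations for one team's domains (empty if none)."""
    counts = {c: 0 for c in Complexity}
    for domain in domains:
        counts[domain.complexity] += 1

    problems = []
    # A complex domain should be the only domain the team owns.
    if counts[Complexity.COMPLEX] >= 1 and len(domains) > 1:
        problems.append("a team with a complex domain should own no other domain")
    # Two complicated domains on one team should be avoided.
    if counts[Complexity.COMPLICATED] >= 2:
        problems.append("avoid two complicated domains on one team")
    # Assumed cap: "two to three" simple domains per team.
    if counts[Complexity.SIMPLE] > max_simple:
        problems.append(f"more than {max_simple} simple domains")
    return problems


def find_shared_domains(assignments: dict[str, list[Domain]]) -> set[str]:
    """Return the names of domains assigned to more than one team."""
    owner: dict[str, str] = {}
    shared: set[str] = set()
    for team, domains in assignments.items():
        for domain in domains:
            if domain.name in owner and owner[domain.name] != team:
                shared.add(domain.name)
            owner.setdefault(domain.name, team)
    return shared
```

A check like this does not measure cognitive load; it only flags assignments that the heuristics already consider risky, which can be a useful prompt for a conversation with the team.
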
This team-first approach to software boundaries leads to favoring certain styles of software architecture, such as small, decoupled services.

To **increase the size of a software subsystem or domain for which a team is responsible**, tune the ecosystem in which the team works in order to maximize the cognitive capacity of the team (by **reducing the intrinsic and extraneous types of load**):

* Provide a team-first working environment (physical or virtual).
* Minimize team distractions during the workweek by limiting meetings, reducing emails, assigning a dedicated team or person to support queries, and so forth.
* Change the management style by communicating goals and outcomes rather than obsessing over the "how".
* Increase the quality of developer experience (DevEx) for other teams using your team’s code and APIs through good documentation, consistency, good UX, and other DevEx practices.
* Use a platform that is explicitly designed to reduce cognitive load for teams building software on top of it.

#### Tip

Having workspaces that clearly indicate the type of work going on also helps reduce disturbance and unnecessary interruptions.

:::info
In Bifer's case, something as simple as this: pairs that work together should sit together. Teams should be physically co-located.
:::

# PART II: Team Topologies that Work for Flow

## 4. Static Team Topologies

In order to be as effective as possible, we need to consciously design our teams rather than merely allow them to form accidentally or haphazardly. We call these consciously designed team structures team topologies.

### Team Anti-patterns

* **Ad hoc team design** (DBA teams, middleware teams, ...).
* **Shuffling team members**. The cost of forming new teams and switching context repeatedly gets overlooked.

Organizations must design teams intentionally by asking: given our skills, constraints, cultural and engineering maturity, desired software architecture, and business goals, which team topology will help us deliver results faster and safer?

### Successful Team Patterns

We consider a **feature team** to be a cross-functional, cross-component team that can take a customer-facing feature from idea all the way to production, making it available to customers and, ideally, monitoring its usage and performance.

The key for the team to remain autonomous is for **external dependencies to be non-blocking**, meaning that new features don’t sit idle, waiting for something to happen beyond the control of the team.

Non-blocking dependencies often take the form of **self-service capabilities**. These can be consumed independently by the product teams when they need them.

## 5. The Four Fundamental Team Topologies

* Stream-aligned team
* Enabling team
* Complicated-subsystem team
* Platform team

When used with care, these are the only four team topologies needed to build and run modern software systems.

![](https://i.imgur.com/dygdiYj.png)

### Stream-Aligned Teams

A stream-aligned team is a team aligned to a single, valuable stream of work.

**The purpose of the other fundamental team topologies is to reduce the burden on the stream-aligned teams**.

#### Capabilities within a Stream-Aligned Team

* Application security
* Commercial and operational viability analysis
* Design and architecture
* Development and coding
* Infrastructure and operability
* Metrics and monitoring
* Product management and ownership
* Testing and quality assurance

It’s critical not to assume each capability maps to an individual role in the team. We’re talking about being able, as a team, to understand and act upon the above capabilities. This might mean having a **mix of generalists and a few specialists**.

### Enabling Teams

An enabling team is composed of specialists in a given technical (or product) domain, and it helps stream-aligned teams bridge capability gaps in that domain.

The end goal of an enabling team is to increase the autonomy of stream-aligned teams. If an enabling team does its job well, the team that it is helping should no longer need that help after a few weeks or months; there should not be a permanent dependency on an enabling team.

Enabling teams do not exist to fix problems that arise from poor practices, poor prioritization choices, or poor code quality within stream-aligned teams.

### Complicated-Subsystem Teams

A complicated-subsystem team is responsible for building and maintaining a part of the system that depends heavily on specialist knowledge.

The goal of this team is to reduce the cognitive load of stream-aligned teams working on systems that include or use the complicated subsystem.

### Platform Teams

The purpose of a platform team is to enable stream-aligned teams to deliver work with substantial autonomy.

“Ease of use” is fundamental for platform adoption and reflects the fact that platform teams must treat the services they offer as products that are reliable, usable, and fit for purpose, regardless of whether they are consumed by internal or external customers.

Platform teams are expected to focus on providing a smaller number of services of acceptable quality rather than a large number of services with many resilience and quality problems.

The mission for a platform team is to provide the underlying internal services required by stream-aligned teams to deliver higher-level services or functionalities, thus reducing their cognitive load.

#### Avoid Silos in the Flow of Change

:::danger
Teams composed only of people with a single functional expertise should be avoided.
:::

We combine stream-aligned teams that support and operate software in production together with platform teams that provide the underlying “substrate” for stream-aligned teams.
Work is never handed off to another team for a later stage in the flow.

#### A Good Platform is just Big Enough

The platform always serves the needs of consuming applications and services, not the other way round.

Too often, a platform is left to former system administrators to build and run without using well-defined software development techniques.

:::warning
There should be someone with more of a development mindset.
:::

We should aim for the thinnest viable platform (TVP) and avoid letting the platform dominate.

The most important part of the platform is that it is built for developers. It is essential to ensure that the platform teams have a focus on user experience (UX) and particularly developer experience (DevEx).

Attention to good UX/DevEx will make the platform compelling to use, and the platform will feel consistent in the way the APIs and features work. How-to guides and other documentation will be comprehensive (but not verbose), up to date, and focused on achieving specific tasks, not documenting every last corner and niche of the platform.

## 6. Choose Team-First Boundaries

A **fracture plane** is a natural seam in the software system that allows the system to be split easily into two or more parts.

A **bounded context** is a unit for partitioning a larger domain (or system) model into smaller parts, each of which represents an internally consistent business domain area.

The business domain fracture plane aligns technology with business and reduces mismatches in terminology and “lost in translation” issues, improving the flow of changes and reducing rework.

# PART III: Evolving Team Interactions for Innovation and Rapid Delivery

## 7. Team Interaction Modes

Three core team interaction modes simplify and clarify the essential interactions needed between teams building software systems:

* **Collaboration**: two teams work closely together for a defined period to discover new patterns, approaches, and limitations.
  Responsibility is shared and boundaries blurred, but problems are solved rapidly and the organization learns quickly.
* **X-as-a-Service**: one team consumes something (such as a service or an API) provided “as a service” from another team. Responsibilities are clearly delineated and, if the boundary is effective, the consuming team can deliver rapidly. The team providing the service seeks to make their service as easy to consume as possible.
* **Facilitating**: one team helps another team to learn or adopt new approaches for a defined period of time. The team providing the facilitation aims to make the other team self-sufficient as soon as possible, while the team receiving the facilitation has an open-minded attitude to learning.

![](https://i.imgur.com/NH15Ew5.png)

A key decision is whether to collaborate with another team to achieve an objective or to treat the other team as providing a service.

:::info
Intermittent collaboration finds better solutions than constant interaction.
:::

**Example**

![](https://i.imgur.com/4YqvPP9.png)

:::info
Teams should ask: "What kind of interaction should we have with this other team?"
:::

### Collaboration

|Advantages|Disadvantages|
|----------|-------------|
|Rapid innovation and discovery|Wide, shared responsibility for each team|
|Fewer hand-offs|More detail/context needed between teams, leading to higher cognitive load|
||Possible reduced output during collaboration compared to before|

:::danger
**CONSTRAINT**
A team should use collaboration mode with, at most, one other team at a time.
:::

:::success
**TYPICAL USES**
Stream-aligned teams working with complicated-subsystem teams; stream-aligned teams working with platform teams; complicated-subsystem teams working with platform teams.
:::

### X-as-a-Service

|Advantages|Disadvantages|
|----------|-------------|
|Clarity of ownership with clear responsibility boundaries|Slower innovation of the boundary or API|
|Reduced detail/context needed between teams, so cognitive load is limited|Danger of reduced flow if the boundary or API is not effective|

:::danger
**CONSTRAINT**
A team should expect to use the X-as-a-Service interaction with many other teams simultaneously, whether consuming or providing a service.
:::

:::success
**TYPICAL USES**
Stream-aligned teams and complicated-subsystem teams consuming Platform-as-a-Service from a platform team; stream-aligned teams and complicated-subsystem teams consuming a component or library as a service from a complicated-subsystem team.
:::

### Facilitating

|Advantages|Disadvantages|
|----------|-------------|
|Unblocking of stream-aligned teams to increase flow|Requires experienced staff to not work on “building” or “running” things|
|Detection of gaps and misaligned capabilities or features in components and platforms|The interaction may be unfamiliar or strange to one or both teams involved in facilitation|

:::danger
**CONSTRAINT**
A team should expect to use the facilitating interaction mode with a small number of other teams simultaneously, whether consuming or providing the facilitation.
:::

:::success
**TYPICAL USES**
An enabling team helping a stream-aligned, complicated-subsystem, or platform team; or a stream-aligned, complicated-subsystem, or platform team helping a stream-aligned team.
:::

## Team Behaviors for Each Interaction Mode

Each team-interaction mode works best with a corresponding set of team behaviors. As with a band, the team is made up of the same people, but the style they adopt as a group changes depending on the kind of effect they need to have.

Behavioral studies suggest that humans work best with others when we can predict their behavior. As humans, we can build trust by providing consistent experiences for others in the organization.
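
One way to keep team behavior predictable is to record each team's current interactions and compare them with the constraint stated for each interaction mode. The sketch below is illustrative only: `Mode`, `MAX_CONCURRENT`, and `over_limit` are hypothetical names, and the limit of three for facilitating is an assumption, since the notes only say "a small number".

```python
from collections import Counter
from enum import Enum


class Mode(Enum):
    COLLABORATION = "collaboration"
    X_AS_A_SERVICE = "x-as-a-service"
    FACILITATING = "facilitating"


# Upper bound of simultaneous interactions per mode; None means no limit.
# The value for FACILITATING is an assumption ("a small number" in the notes).
MAX_CONCURRENT = {
    Mode.COLLABORATION: 1,
    Mode.X_AS_A_SERVICE: None,
    Mode.FACILITATING: 3,
}


def over_limit(interactions: list[tuple[str, Mode]]) -> dict[Mode, int]:
    """Given one team's (other_team, mode) pairs, return the modes over their limit."""
    counts = Counter(mode for _, mode in interactions)
    return {
        mode: n
        for mode, n in counts.items()
        if MAX_CONCURRENT[mode] is not None and n > MAX_CONCURRENT[mode]
    }
```

For example, a team consuming two services while collaborating with one other team yields an empty result, whereas a team collaborating with two teams at once is reported for the collaboration mode.
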
Clear roles and responsibility boundaries help build this trust by defining expected behavior and avoiding what some refer to as “invisible electric fences.”

Team behaviors for each mode:

* Collaboration: **High interaction** and **mutual respect**
* X-as-a-Service: **Emphasize the user experience**
* Facilitating: **Help** and **be helped**

## 8. Evolve Team Structures with Organizational Sensing

A successful modern organization needs to be able to shape-shift to deal with changing circumstances by designing for adaptability.

Interaction modes of different teams should be expected to change regularly, depending on what the teams need to achieve.

![](https://i.imgur.com/Nk3UmZu.png)

In a discovery phase, some degree of collaboration is expected, but close collaboration often doesn’t scale across the organization. The aim should be to try to establish a well-defined and capable platform that many teams can simply use as a service.

The team topologies within an organization change slowly over several months, not every day or every week. Over a few months, change should be encouraged in the team interaction modes, and a corresponding change should be expected in the software architecture.

It’s often difficult to have the required organizational self-awareness to detect when it’s time to evolve the team structure.

## Trigger: Software Has Grown Too Large for One Team

Successful software products tend to grow larger and larger as more features get added and more customers adopt the product. While initially it is possible that everyone in the product team has a fairly broad understanding of the codebase, that becomes increasingly difficult over time.

This can lead to an (often unspoken) specialization within the team regarding different components of the system. Requests that require changes to a specific component or workflow **routinely get assigned to the same team member(s)**, because they will be able to deliver faster than other team members.
The routine aspect can also **negatively affect individual motivation**.

## Trigger: Delivery Cadence Is Becoming Slower

## Trigger: Multiple Business Services Rely On a Large Set of Underlying Services

## Treat Teams and Team Interactions as Senses and Signals

Organizational sensing uses teams and their internal and external communication as the “senses” of the organization.

Without stable, well-defined neural communication pathways, no living organism can effectively sense anything.

In today’s network-connected world, high-fidelity sensing is crucial for organizational survival, just as an animal or other organism needs senses to survive in a competitive, dynamic natural environment.

## The Business as Usual Antipattern

Avoid “maintenance” or “business as usual” (BAU) teams whose remit is simply to maintain existing software.

Having separate teams for new work and BAU also tends to prevent learning between these two groups.

Each stream-aligned team should expect to look after one or more older systems in addition to the newer systems they are building and running.

## Final Thoughts

So many organizations experience problems with software delivery because they have an unhelpful model of what software development is really about.

An obsession with “feature delivery” ignores the human-related and team-related dynamics inherent in modern software, leading to a lack of engagement from staff, especially when the cognitive load is exceeded.