# T1 Platform Changes
I was thinking about this discussion around projects and how they fit into the intake process and drafted the idea for a model that could help.
# Core Concept
Each scrum team is a single core unit that moves together across the platform and simultaneously takes on **both projects and releases**. Coordination of the scrum teams is managed by the Platform Oversight team, **who will also operate as a scrum team** (SOS).
# Scrum Teams
All scrum teams will have a velocity allocation that is split between two buckets:
- what we consider “project work”.
- what we consider “maintenance”.
Any team can take on at most 1 project; depending on how involved that project is, the team splits its remaining time with maintenance.
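As a rough illustration of the two-bucket split, here is a minimal Python sketch. The names `Allocation` and `split_velocity` are hypothetical, and the rule that the two percentages must add up to 100 is my assumption based on the allocations used later in the scenarios.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    """Share of a team's velocity per bucket, as whole percentages (hypothetical model)."""
    project: int
    maintenance: int

    def __post_init__(self) -> None:
        # Assumption: the two buckets always account for the whole team.
        if min(self.project, self.maintenance) < 0:
            raise ValueError("allocations cannot be negative")
        if self.project + self.maintenance != 100:
            raise ValueError("project + maintenance must equal 100")


def split_velocity(velocity_points: int, allocation: Allocation) -> tuple[float, float]:
    """Return (project_points, maintenance_points) for one sprint."""
    project_points = velocity_points * allocation.project / 100
    return project_points, velocity_points - project_points
```

For example, a team with a velocity of 40 points at a 40/60 split would have `split_velocity(40, Allocation(40, 60))` return `(16.0, 24.0)`.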
# WHOA WHOA, WHA- HOW COU-
I said keep an open mind! Let's first identify the problems at the heart of this issue and how this model helps "solve" them.
### Understanding "Maintenance" vs "Projects"
Maintenance is a catch-all phrase for anything that isn't a project. You may have a different definition in mind, but the reality is that we have large projects (think SIT, MPR, offers), we have small projects (think new module builds or anything else that spans multiple sprints), and we have small enhancements and bug tickets that we finish in bulk.
Additionally, large projects and some small projects have stakeholder reviews, whilst the small enhancements and bug tickets don't have formal reviews, just an email to the client and app leads.
### Clarity in total capacity
One of the problems we have right now is that it's hard to tell how much work we can take on as a team.
In the model where we have "dedicated maintenance" teams, there might be times where there is little to no work, but we can't assign them to "dedicated projects" because what if there is a maintenance need? This leads to a situation where allocation is lopsided while we wait to onboard new producers, developers, etc. to take on dedicated projects.
In the proposed solution, we can still have teams working on maintenance, but we're treating it more as a load-balancing mechanism. Additionally, we know that, more than likely, **the most we can take on** as a team is whatever N scrum teams can handle before things start to back up on all fronts, at which point we respond by building an additional scrum team to handle incoming work.
### Transparency in resourcing at the smaller scale
The entire scrum model revolves around team efficiency, and that is due in large part to the scrum masters. We have excellent scrum masters who are intimately aware of what their team's velocity is and can forecast what it will look like in the next sprint - who better to ask if they are able to take on an additional ticket or take on a project? Instead of coordinating multiple parties to figure out if a non-specified, highly valuable backend developer will be available for a task in the next week, work with a scrum master to prioritize your ticket; if it can't happen in the time frame you need, the Platform Oversight team will allocate it to the next available scrum master.
### Reacting to new requests
We currently have a difficult time processing new requests for many reasons, one of which is "who is going to work on it?" This model does away with the idea of waiting for resources to free up from one project and moving them to other projects - a metaphorical aligning of the stars to get a team together.
The proposed model makes the answer simple: any available scrum team. How do we know if a scrum team is available?
- Are they already assigned a project? Yes.
- Are they already taking on high-priority backlog tickets? Yes.
- Then move on to the next team. If no team is available, we'll need to hire.
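The checklist above could be sketched in Python roughly as follows. This is only one reading of it: I'm assuming a team counts as unavailable when the answer to both questions is yes, and the names `ScrumTeam`, `is_available`, and `find_available_team` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScrumTeam:
    name: str
    has_project: bool
    has_high_priority_backlog: bool


def is_available(team: ScrumTeam) -> bool:
    # Assumption: a team is busy only when both checklist answers are "yes".
    return not (team.has_project and team.has_high_priority_backlog)


def find_available_team(teams: list[ScrumTeam]) -> Optional[ScrumTeam]:
    """Return the first available team, or None when it is time to hire."""
    for team in teams:
        if is_available(team):
            return team
    return None
```

If `find_available_team` returns `None`, every team is already at capacity, which is the "we'll need to hire" case.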
### Reacting to emergency/last minute requests
Emergency requests usually fall under the maintenance umbrella and take the form of high-priority bugs that we need resolved ASAP, whether because they're on production or threatening the next release. These types of events don't typically have time for standard intake and may require a specific resource to investigate an issue, or potentially all hands on deck. As a result, it's beneficial for us to account for this by first allocating time for maintenance on every team to begin with, and then transparently communicating with the scrum masters that there is an issue that requires a member of their team. All our issues get solved, with the added benefit of knowing exactly what impact such an event will have on the platform.
# Changes
## Platform Strategy Oversight and SOS changes
We currently hold a scrum of scrums standup and an intake two days a week; this is great, but it could be improved if we treated the PSO team as a true scrum of scrums. This does not necessarily mean that we must change the number of meetings, but rather that we should formalize some processes and roles.
### Roles:
- Scrum Master: Keeps everybody organized and helps clear blockers at the platform level
- Product Owner: Has final say in matters pertaining to the platform and helps provide overall direction
Traditionally, these two roles are separated, but at a higher level I don't find the distinction 100% necessary, and the two can be merged into one role. Additionally, BAs help with a lot of the intake process, so there is less responsibility on the scrum master, again at that high level.
### Processes:
- Standup: Can stay as is, meet twice a week since macro items don't tend to change as frequently.
- Backlog refinement + Sprint Planning: We can split these up or keep them combined, but this is essentially our current intake process.
- Sprint Review + Retro: We are currently missing a meeting that allows us to go over what was accomplished during the previous 2-week cycle at a high level, so it may be a good idea to hold a meeting that discusses progress, and a retro to review feedback and share ideas across scrum teams.
Note that traditionally a scrum of scrums **should be run as a scrum team,** and we can entertain that idea if the team is interested but I feel it's more important that we capture the core concepts rather than replicating the process exactly; as long as all our needs are met, the format and number of meetings are free to change.
## Scrum Team changes
Scrum teams will now be available for project work if the following criteria are met:
- The scrum team isn't already allocated to a project that requires stakeholder meetings
- LDM and PSO are aware and accept the impact to "maintenance".
Teams can adjust their allocation anywhere from 100% Project to 100% Maintenance, at the discretion of the scrum master and PSO Team.
# My attempts to read your mind
## Won't we have a ton more meetings?
Not a ton more. Some of the reasons why this method is theoretically feasible:
- PSO already has meetings that are similar to SOS but don't have the formality:
  - SOS standups should probably occur more frequently but may not need to.
  - Intake is essentially grooming and sprint planning combined.
  - All we are missing is retro, which we do in standup and intake.
- This method allows scrum teams to replace existing scrum team ceremonies with projects.
- The key insight here is that small items in the backlog and small projects that don't involve stakeholders don't generally cause as much friction for scrum teams, and can be tackled either in a dedicated "maintenance" setting or as a small segment of a project team's responsibilities.
- We should have **no more than one large project per team**, since these usually have a separate set of stakeholder interactions that do not scale.
## How does this affect our release cadence?
It shouldn't! The release manager will still be part of PSO, and we'll still have regular releases every two weeks. In the unlikely event that all teams have taken up projects and there is absolutely nothing to deploy, we just skip that release.
## This will slow down maintenance!
No and yes. Maintenance only slows down if the scrum teams currently handling it are reallocated.
The unfortunate fact of the matter is that something is going to slow down if we have X resources and X+ things to do. In this case, we'll just be aware of it earlier and more transparent about it in the long run, rather than reacting to it once the client points it out.
## How do we know this will even work?
Surprise! We already do a lot of these things today without knowing it. The concept is just a formalization of the process with QOL adjustments to the PSO so things can run smoother. Don't believe me?
We currently have project teams consisting of key senior UI members - if an urgent dealer tagging ticket comes in, or a production bug is found, who do you think addresses it? We don't pull more availability out of a hat; the resourcing is already shared, but it's done via back channels, and the project scrum master might not be aware until their project starts to slip.
We have a scrum master today who is a scrum master for BOTH a project team and a maintenance team, and they have to run two sets of scrum as well as attend PSO meetings - this model gives them one team to work with, one set of scrum meetings, and one set of PSO meetings.
We had a new module build (Bespoke Visualizer) that spanned a couple of months, which you could argue could have been a small project; during this time that scrum team was able to take on regular maintenance work without any real issues.
## Why the changes to PSO?
There are two specific reasons why PSO should change to more closely match its scrum team counterparts.
First, one of the problems we face today is that tech leads like me operate outside of a scrum team for the most part. Bigger projects that can't be assigned to a scrum team, but also require heavy tech lead involvement (think SSR or A/B testing), need to be tracked if we are going to make progress on them. "But you should manage your time and just get it done," you might be tempted to say, but the reality is that day-to-day matters that might not be visible to the team very often occupy our time. To combat that and ensure steady progress is maintained when we decide to start working on these projects, all work should operate in a scrum format with a timebox and a scrum master to clear blockers.
Second, running PSO as a true scrum of scrums increases visibility to all parties, including digital production leadership and thus the business. "Where are we with SSR?" - a tech lead is giving a presentation on it this sprint review. "How are we doing with Accessibility?" - a scrum master is giving a burn down report at the sprint review. Sprint Reviews and Retros do not necessarily have to be "code sharing" but can be treated as a review of major items for the platform every sprint, and a space to share feedback and adjust for the upcoming sprint.
# Scenarios
Join me on this narrative as I try to give examples of how this model could work! The following will be loosely based off real events and setups, combined with some theoretical exercises. For ease of understanding, we'll be referring to scrum teams as Team A, Team B, etc. References to allocation will take the form of [Project%, Maintenance%].
The Teams:
- Team A: Search Inventory [95%, 5%]
- Team B: Maintenance [0%, 100%]
- Team C: Maintenance [0%, 100%]
The client has requested that we prioritize Accessibility. PSO reviews the backlog and determines that there are a lot of open tickets and that we should have a team fully dedicated to Accessibility, but must keep room open for other work. It's decided that Accessibility should be spun up as a project and allocated to Team B to burn down as much as they can by end of fiscal.
The Teams:
- Team A: Search Inventory [95%, 5%]
- Team B: Accessibility [95%, 5%]
- Team C: Maintenance [0%, 100%]
A request from TIPS comes in: a senior producer is requesting updates for a prototype application. PSO reviews it in grooming and determines that the scope of the request is minimal, doesn't require stakeholder reviews, and can just be allocated to Team C.
The Teams: (unchanged)
Business announces that we've been awarded Drivers, a project that requires its own scrum team. PSO has determined that we'll want a separate team for this since it has its own scope, and we need to keep maintenance available.
The Teams:
- Team A: Search Inventory [95%, 5%]
- Team B: Accessibility [95%, 5%]
- Team C: Maintenance [0%, 100%]
- Team D: Drivers [95%, 5%]
App leads want us to speed up the migration of LOD to OAT. PSO determines it's a significant amount of work, but the reality is that most of it is backend related. We convert Team C into an OAT project team, but only at 40% allocation in response to the specific needs.
The Teams:
- Team A: Search Inventory [95%, 5%]
- Team B: Accessibility [95%, 5%]
- Team C: OAT [40%, 60%]
- Team D: Drivers [95%, 5%]
Tragedy - COE performs an update that requires the entire backend team to fix the way they pull data from databases. Simultaneously, Apple rolls out a buggy update that causes many visual bugs. The tech team must fully switch gears to address these issues, as they are the highest priority! Because we padded every team by roughly 5%, we can afford approximately 4 hours of undedicated time, which Teams A, B, and D have left unallocated as a buffer for these situations. This is made possible by the fact that Team C is still working at 60% allocation on maintenance. In scenarios where project work is impacted, scrum masters are fully aware and communicate with stakeholders appropriately. Issues are resolved and teams continue their work with little impact to project work.
The Teams: (unchanged)
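One way to arrive at the "approximately 4 hours" buffer in the emergency scenario above is sketched below. It assumes the figure is per person per sprint, with a two-week sprint and a 40-hour work week; the 40-hour week and the helper name `buffer_hours` are my assumptions, not something the scenario states.

```python
def buffer_hours(maintenance_pct: float,
                 sprint_weeks: int = 2,
                 hours_per_week: float = 40.0) -> float:
    """Hours per person per sprint left for unplanned work (assumed model)."""
    sprint_hours = sprint_weeks * hours_per_week
    return sprint_hours * maintenance_pct / 100


# A team padded at [95%, 5%] leaves each member about 4 hours per sprint.
print(buffer_hours(5))  # 4.0
```

The same helper shows why Team C matters in that scenario: at 60% maintenance, `buffer_hours(60)` gives 48.0 hours per person under the same assumptions.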
A tagging request comes through, the kind that a developer on Team A typically takes care of. PSO discusses with Scrum Master A whether they can take on such a request. Scrum Master A reviews velocity, determines their team can take on the work, and adds it to the team backlog along with the rest of the project work.
The Teams: (unchanged)
Lexus Mexico needs a handful of new modules built and would like us to work with their team to integrate as soon as possible. PSO discusses and realizes that most of this work is UI only. OAT is a dedicated task, yes, but it does not require stakeholder reviews, so Team C offers to take on LMex and OAT simultaneously. PSO sees that this would put Team C at [90%, 10%], resulting in a massive reduction in maintenance output. Account goes to have a conversation with business, and it's determined that LMex is a priority, but in response we're going to reduce Accessibility by 50%.
The Teams:
- Team A: Search Inventory [95%, 5%]
- Team B: Accessibility [45%, 55%]
- Team C: LMex + OAT [95%, 5%]
- Team D: Drivers [95%, 5%]
We are awarded another project from business, a very high-visibility one code-named "Project Z". PSO recommends that Accessibility be placed on pause, but legal indicates that we must maintain velocity or risk a lawsuit, and the business additionally wants to retain regular maintenance availability in case of last-second MYCO requests.
With 55% left, we determine that there just aren't enough resources available to us to take on another project as is. We spin up a new Team E.
The Teams:
- Team A: Search Inventory [95%, 5%]
- Team B: Accessibility [45%, 55%]
- Team C: LMex + OAT [95%, 5%]
- Team D: Drivers [95%, 5%]
- Team E: Project Z [95%, 5%]
# Conclusion
I recognize that the inefficiencies I identified in our processes are viewed through a highly specific lens, from the position of one person in a large team, and that the scenarios provided are potentially an oversimplification of nuanced business dynamics.
I believe that the changes proposed can only help our team in the long run, but I'm interested in hearing your thoughts. Think of any oversights? Scenarios that you feel could muddy things up? Send me your feedback and hopefully we can collaborate to find a solve.