Three key things for delivering software predictably
October 18, 2021
The vast majority of software engineering is translating business requirements into products that delight users and deliver business value. In most companies, most of the innovation happens on the business and product side, while engineering is largely about making correct decisions using the vast technical knowledge and resources that already exist. Hence, businesses expect software engineering teams to deliver software predictably, efficiently, and with quality, which seldom happens. Isn’t it ironic that the most modern branch of engineering, built on extremely precise machines, is so prone to errors?
To explain each briefly: predictability is the ability to correctly estimate project completion dates, estimate the impact of unplanned work on planned projects, estimate the amount and kind of resources needed before starting a project, and predict resource availability. Efficiency is how much you can achieve with the limited resources you have. More mature organizations measure this in terms of outcomes and impact rather than regressive metrics such as lines of code or number of APIs. Lastly, quality has two dimensions: performance and defects. Let us focus on predictability for the remainder of the article.
Predictability depends upon planning and execution. The plan must be as close to reality as possible and provide clear and verifiable milestones. The execution must be as close to the plan as possible, and any deviation from the plan must be clearly understood. Finally, neither is a static process; each must constantly feed into the other.
Plans are nothing, planning is everything.
– President Dwight D Eisenhower
Many things contribute to predictability. As I built and scaled several engineering teams at MindTickle while it grew from 20 to 200 engineers and from a valuation of $50M to over $1B, managing software deliveries became increasingly difficult, and I learned some valuable lessons in the process. In hindsight, I can boil them down to three key things:
Your ability to iterate
Unknowns act as an upper bound on your planning accuracy. Any project involves extending the system horizontally or vertically, and either kind of extension is prone to unknowns. You extend a system horizontally by adding new services, components, etc. This is usually easier if the components of the existing system have clear boundaries and interfaces, because the existing system then contributes fewer unknowns and such tasks become more predictable. You extend a system vertically by changing existing components, classes, APIs, etc. This is easier if the code is easy to understand and the impact of a change is predictable. Unit tests help greatly by providing an upper bound on how much, and what, can break. The sooner in the development cycle a developer can discover and fix defects, the fewer the unknowns and the greater the confidence in the milestones already achieved.
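To make the two kinds of extension concrete, here is a minimal Python sketch. Every name in it (Notifier, EmailNotifier, SmsNotifier, notify_all) is hypothetical and chosen only for illustration. It shows a new component being added behind an existing interface, and a unit test that pins down what callers rely on, so a later change to any implementation fails fast if it breaks that contract.

```python
from typing import Protocol


class Notifier(Protocol):
    """Boundary that existing code depends on (hypothetical interface)."""

    def send(self, user: str, message: str) -> str: ...


class EmailNotifier:
    """Existing component."""

    def send(self, user: str, message: str) -> str:
        return f"email to {user}: {message}"


class SmsNotifier:
    """Horizontal extension: a new component behind the same interface."""

    def send(self, user: str, message: str) -> str:
        return f"sms to {user}: {message}"


def notify_all(notifier: Notifier, users: list[str], message: str) -> list[str]:
    """Caller that depends only on the interface, not on a concrete class."""
    return [notifier.send(user, message) for user in users]


def test_notify_all_contract() -> None:
    """Unit test that bounds what a change to any implementation can break."""
    for notifier in (EmailNotifier(), SmsNotifier()):
        results = notify_all(notifier, ["ana", "raj"], "build passed")
        assert len(results) == 2
        assert all("build passed" in line for line in results)


test_notify_all_contract()
```

Because notify_all knows only about the interface, adding SmsNotifier required no change to existing code, and the test states the one behaviour that any future edit must preserve.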
Secondly, an inability to iterate has a severe compounding effect on your ability to solve any other problem. Your existing ecosystem consists of your code, documentation, test cases, product requirements, functional and non-functional specifications, CI/CD, infrastructure, code merge processes, project management processes (scrum, agile, etc.), software defect lifecycle processes, product development lifecycle processes, support processes, and so on (or the lack of any of these). Each of them contributes to predictability, efficiency, and quality in varying proportions, and together they may look like a maze. But improving any of them involves a change to the existing system, and if you cannot change your system predictably, the problems start snowballing.
So, what does this mean in practice? When designing a system or writing code, we introduce abstractions and boundaries so that changing those pieces is easier later. But if you design for everything to be changeable, you end up in an abstraction hell that is a maze to navigate. If you are too short-sighted and build only what is immediately required, you will need to replace most of it. So, deciding how much flexibility to introduce into the system requires judgment. All abstractions come at a cost, and if everything is flexible, then nothing is. The understanding of what should be flexible should come from a deeper understanding of the product and its future strategy. This means understanding which areas the product may expand into, what kind of scale it may be subjected to, and which business requirements are well validated and need to be built solidly versus which are yet to be validated and must be built to experiment and measure.
Clear and verifiable milestones
Whether you use scrum, agile, kanban, scrumban, or any other project management methodology, the secret sauce is to have clear and verifiable milestones. These milestones must be verifiable by any stakeholder in the form of sprint demos, working features, etc. The way to do this is to structure the milestones as features that are delivered incrementally, part by part, and can be demoed to all the stakeholders, rather than as engineering tasks that are integrated at the end and that only the engineer can claim are completed. The most critical element of execution is knowing the current progress or status. A progress report with clear, verifiable milestones is much more effective than the percentage of listed tasks that have been completed. It is desirable to integrate horizontally more frequently and to get feedback on defects and user experience from all the stakeholders early, rather than stacking it all up until the end. It is a common story that when different areas such as backend and frontend finish their respective tasks and then integrate them into a working product, several new unplanned tasks are discovered.
Secondly, all the milestones need to add up to the final deliverable. This is also more likely if the milestones reflect product deliverables rather than engineering tasks. Ultimately, project success is about achieving the intended business outcome, which means delivering the planned tasks as well as incorporating the early feedback and discoveries made during development itself.
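The gap between the two kinds of progress report can be shown with a small Python sketch. The data model (Task, Milestone, a demoed flag) and the sample plan are assumptions made up for illustration, not a description of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    done: bool


@dataclass
class Milestone:
    name: str
    tasks: list[Task]
    demoed: bool = False  # True only once stakeholders have seen it working


def task_percent(milestones: list[Milestone]) -> float:
    """Percentage of listed engineering tasks marked as done."""
    tasks = [task for milestone in milestones for task in milestone.tasks]
    if not tasks:
        return 0.0
    return 100.0 * sum(task.done for task in tasks) / len(tasks)


def verified_milestones(milestones: list[Milestone]) -> list[str]:
    """Names of milestones that a stakeholder has actually verified."""
    return [milestone.name for milestone in milestones if milestone.demoed]


# Hypothetical plan: every backend task is done, but only one feature
# has been integrated end to end and demoed.
plan = [
    Milestone("Login", [Task("backend", True), Task("frontend", True)], demoed=True),
    Milestone("Search", [Task("backend", True), Task("frontend", False)]),
    Milestone("Export", [Task("backend", True), Task("frontend", False)]),
]

print(f"tasks done: {task_percent(plan):.0f}%")
print(f"verified milestones: {len(verified_milestones(plan))} of {len(plan)}")
```

In this made-up plan, two thirds of the tasks are complete, yet only one of three milestones is something a stakeholder can verify, and the integration work for the other two has not started.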
Ability to share tasks among team members
The second practical aspect is the availability of resources to execute the plan. Resource availability has two dimensions: number and skill. The skill dimension becomes a bottleneck when you too often need a specific engineer to do a specific task. This usually happens because of accumulated but undocumented tribal knowledge. It also results in highly inefficient teams, since it becomes very difficult to distribute tasks evenly across individuals for a given milestone. That in turn leads to a regression from clear, verifiable milestones to a laundry list of engineering tasks. The result is several projects running in parallel and projects lingering for a long time because of a few pending items, so that projects and plans go stale and business outcomes suffer. It also makes hiring and onboarding people difficult and error-prone. If, instead, the team members can finish a pool of tasks collectively, planning and execution become much simpler and more accurate. Keep things simple and similar when it comes to architecture and code. Make it easy for everyone to contribute. That is also a litmus test for code quality and the ability to iterate faster.
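One rough way to spot this kind of dependence on specific engineers is to list, for each task in a milestone, who could realistically pick it up. The following Python sketch assumes such a mapping exists; the task names, engineer names, and helper functions are all hypothetical.

```python
from collections import Counter


def single_owner_tasks(capable: dict[str, set[str]]) -> dict[str, str]:
    """Map each task that exactly one engineer can do to that engineer."""
    return {
        task: next(iter(people))
        for task, people in capable.items()
        if len(people) == 1
    }


def bottleneck_load(capable: dict[str, set[str]]) -> Counter:
    """Count, per engineer, the tasks that nobody else can pick up."""
    return Counter(single_owner_tasks(capable).values())


# Hypothetical milestone: task name -> engineers able to take it.
capable = {
    "billing export": {"asha"},
    "search ranking": {"asha"},
    "login page": {"asha", "ben", "chen"},
    "report UI": {"ben", "chen"},
}

for engineer, count in bottleneck_load(capable).items():
    print(f"{engineer} is the only person who can do {count} task(s)")
```

A milestone where one name dominates this count cannot be shared evenly, which is an early sign that knowledge needs to be documented or spread before the plan depends on it.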