A couple of days ago, I wrote the article Well Crafted Code Will Ship Faster. It contains some reasoning about why well crafted code is also cheaper and faster, and some ideas about what you can try next to get there (like, “first do things right, then do the right things”). This article sparked an interesting discussion on Twitter.
We should stop using misleading generalizations for concepts that exist on a scale. That's my point.
I agree the science we have is crap but does that mean we should stop trying?
I really liked the discussion, but I think some people did not understand what I wanted to say with my original article. I guess this is mainly because I did not express myself clearly enough. And in some cases, I just disagree.
During this discussion, Chad Fowler also sent me the link to his very interesting presentation McDonalds, Six Sigma, and Offshore Outsourcing: Unexpected Sources of Insight. You should watch it, it’s great… It’s just that I would do some things differently (I would never quote the Chaos Report ;) ), and I disagree with some of his points.
Here is a quick summary of all the criticism / feedback / ideas I got in the Twitter discussion:
- Most of the concepts in the article ("Quality", "Well crafted code", ...) are not universal, but highly subjective and opinionated and context dependent.
- Some things I wrote ("defect", "efficiency") are so generic that they are almost meaningless.
- The studies I quoted only apply to "industrial style" software development, not small teams / startups / ...
- The science we have is crap.
- We should stop using misleading generalizations for concepts that exist on a scale.
- A more nuanced discussion would better reach the target.
- Internal quality is nearly unimportant for software to function well / Your users don't care about internal quality.
Here I want to clarify some of the things I wrote, define some of the concepts and provide more arguments where I disagree. This is going to be a loooong article, but please bear with me. I will try to show you my perspective, and I hope you can learn something. Or start another interesting discussion when you disagree with me ;)
All The Answers...
Here comes, basically, the disclaimer of this article :) During our Twitter discussion, Chad Fowler wrote:
I've done a lot of thinking about this. I don't have answers but I really enjoy it :)
I have done a lot of thinking about this too. And I have some answers. Here, I tried to write them down. But I don’t want to claim that those are universal truths.
Those answers work well for me, right now, most of the time. And from discussions with other developers, I know they work well for others too. I hope they can help you to:
- Start thinking about those topics / Show you one perspective on the topic.
- Start discussions with your coworkers / managers.
- Start investigating where you are losing time (and money).
- Start improving.
Cycle Time and Lead Time
I am using the definitions for these two terms that are most commonly used in software development:
Lead Time is, roughly speaking, the time from when we first have an idea about a feature until our users get the updated software with the implementation of the feature present.
Cycle Time is, roughly speaking, the time from when we start development of a feature until we finish its implementation in a tested, documented, potentially shippable product.
Note that this definition differs a bit from the definition in manufacturing, where “cycle time” is the average time between products created in a process, i.e. “Assembly line 3 produces a new razor every 5 seconds”. I actually like this definition more, but nobody in software development uses it, so let’s stick with the definition above.
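To make the two definitions concrete, here is a minimal sketch in Python. All the dates are made-up milestones for one hypothetical feature:

```python
from datetime import date

# Made-up milestones for one hypothetical feature.
idea_raised = date(2017, 1, 2)    # the feature is first proposed
dev_started = date(2017, 2, 6)    # development begins
dev_finished = date(2017, 2, 20)  # tested, documented, potentially shippable
shipped = date(2017, 3, 1)        # users actually have it

cycle_time = (dev_finished - dev_started).days  # development only
lead_time = (shipped - idea_raised).days        # from idea to users

print(cycle_time, lead_time)  # 14 58
```

Note how much of the lead time here is spent outside the development cycle: everything before `dev_started` and after `dev_finished` also keeps the feature away from the users.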
...Will Ship Faster
By “will ship faster” I mostly mean “will have shorter average cycle times”. Well crafted code does not influence things that happen before we even start a feature. But it definitely has an influence on how long we need to develop, test and document the feature. So it will influence our cycle time.
The lead time will be shorter too, since cycle time is a part of the lead time: It will shrink by the amount of time we saved in cycle time. With well crafted code, it might also be easier to go from “potentially shippable” to “shipped”, so in some cases the lead time will shrink by even more than that.
Speed and Cost
With a stable team, cost is a function of speed: The cost is dominated by the salaries of the team members. So, if “will ship faster” holds, we will also be cheaper.
Everything is more complicated when the team is not stable: When the team size or composition changes, the “per person output” becomes lower. This effect is probably worse when you grow a team than when you shrink it (see also Original Scope On Time On Budget - Whatever, section “The Problem With On Budget”).
Speed and Scope
If you are faster, you will deliver more. Obvious, right? But there is an important aspect here that we often forget:
Requirements change (~27% within a year). If you have long lead times, some of the features you ship are already outdated by the time the users get the software. So if you can deliver faster (reduce the lead times of your features), you have a better chance of delivering something the user actually wanted. You have a better chance of delivering quality software (see below).
So, if you are faster, you not only deliver more features. You also deliver fewer features that are outdated by the time the user gets them. You deliver even more useful features.
The Cost Of A Feature
Say you have a long-running software development effort. The team is stable and experienced. You need to add a feature F_n, where n is the index of the feature (F_1: You implement it as the first feature, right after starting; F_100: You have already implemented 99 other features before starting this one). Does it make a difference whether you implement the feature in the first month or after 6 years of development?
Yes it does. In the beginning you lose some time because you don’t have all the infrastructure in place: You need to set up version control, the CI server, and the test environments. You will change your architecture and design a lot, because you are trying to figure out what is right. You will lose some time because you have to create a first sketch of all the layers and multiple components of your software.
Later, you lose some time because you don’t immediately understand the effects of your changes. You have to search for all the places where you have to make changes. Make sure you don’t cause any side effects. Read the documentation and find out where it is still accurate. Work around that one weird design decision that was added because of BUG1745. Find out why there is some inconsistency in the architecture and whether it affects you. You are slower (see “cycle time” above) because of Accidental Complication.
Developing a feature in one year will cost more than developing the same feature now. We want to minimize the difference, though.
The big question here is: How much slower is implementing feature F (“Log all logins of privileged users”) as F_100 than it would be if it was F_1? Two times? Ten times? Not slower at all? How much slower would F_1000 be? I have seen some very large values for the slowdown factor in past projects. If you like to use the term “Technical Debt”, you could say the slowdown was caused by not paying back the debt.
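One way to think about the slowdown factor is a toy model where every feature adds a little accidental complication, making each subsequent feature slightly more expensive. The 1% growth rate per feature below is a purely illustrative assumption, not a measured value:

```python
# Toy model: every feature adds a little accidental complication, so each
# subsequent feature gets slightly more expensive. The 1% growth per
# feature is a purely illustrative assumption.
def feature_cost(n, base_cost=1.0, growth=0.01):
    """Relative cost of implementing feature F_n."""
    return base_cost * (1 + growth) ** (n - 1)

slowdown_100 = feature_cost(100) / feature_cost(1)
slowdown_1000 = feature_cost(1000) / feature_cost(1)
print(round(slowdown_100, 1))  # 2.7 - F_100 costs almost 3x as much as F_1
print(round(slowdown_1000))    # F_1000 is tens of thousands of times more expensive
```

The point of the sketch is only that a tiny per-feature slowdown compounds dramatically, which matches the very large slowdown factors I have seen in past projects.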
Minimizing F_n (For Large n)
To minimize the cost of adding features later, you have to:
- Make sure everybody understands the architecture of the system.
- Make sure everybody understands the current functionality that is implemented and that the description of this functionality is accurate.
- Minimize rework caused by not understanding requirements correctly.
- Make sure developers can easily find all the places where they have to make changes when they are working on a feature / defect.
- Make sure changes can be made locally (i.e. there are no side effects that ripple through the system when you make a simple change).
- Make sure developers can find out quickly when there are side effects or regressions.
- Make sure that no defects escape to production, and if they escape, that you find and fix them quickly.
I am pretty sure all those things are necessary, but I guess they might not be sufficient (i.e. you have to do even more in your specific situation).
Quality

Quality is really hard to define. I know, and I agree. But it is not entirely subjective.
There are two important aspects of quality: External quality is the quality as experienced by the user, and internal quality is the quality as experienced by the developers (I wrote more about this in my free email course Improve Your Agile Practice).
For now, I want to call internal quality “well crafted code” (see below), and when I say “quality” from here on, I mean external quality. I think there are two important conditions for quality:
Absence of defects: Software with fewer defects has higher quality.
Does what the user wants: It fulfills the requirements the users currently have, which may or may not be the requirements they wrote down 6 years ago (see above).
Both conditions are necessary, but not sufficient. Quality has lots of other aspects too, some of which are, in fact, subjective.
Defects

Defects are hard to define too. I often hear the very simple definition “A defect is when the software does not conform to its specification”.
Which leads to behavior like: “Yes, the software does not work correctly, but it was specified exactly like that in the requirements, so it is not a defect”. If you still have discussions like that, you probably value processes and tools more than individuals and interactions. Comprehensive documentation more than working software. Contract negotiation more than customer collaboration. Following a plan more than responding to change. You are probably not very agile. Maybe you are not ready for this “well crafted code” discussion yet. But the good news is: You can start to improve now.
Honestly, “does not conform to specification” does not make sense as a definition of a defect when our goal is providing a steady stream of value to the customer. On the other hand, “was reported as a defect” is also not enough.
Do you have a product vision and a list of things your product definitely does not do? (Hint: you should have.) Then I would classify a defect as: “Was reported by someone as a defect; given our current product vision, a reasonable person would expect the behavior described in the defect report; does not fall into the list of things our product definitely does not do”.
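That three-part classification could be sketched as a simple checklist. In the sketch below, whether “a reasonable person would expect the described behavior” is of course a human judgment, so it is just a boolean input; the non-goals list is made up:

```python
# Sketch of the classification above. Whether "a reasonable person would
# expect the described behavior" is a human judgment, so here it is simply
# a boolean input. The non-goals list is made up.
EXPLICIT_NON_GOALS = {"offline mode", "multi-tenancy"}

def is_defect(reported_as_defect, expectation_is_reasonable, affected_area):
    return (reported_as_defect
            and expectation_is_reasonable
            and affected_area not in EXPLICIT_NON_GOALS)

print(is_defect(True, True, "login form"))    # True
print(is_defect(True, True, "offline mode"))  # False
```

The value of such a checklist is not the code, but that it forces you to write down the product vision and the explicit non-goals in the first place.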
Low Defect Potential, High Defect Removal Efficiency
Those two terms are from a study that was quoted in a book I quoted in the original article. I don’t know how they defined the two terms in the original study, and maybe “The science we have is crap” anyway, so I want to come up with my own definitions that make sense to me.
Low defect potential: The chances of introducing a defect with a change are lower than in comparable teams / systems.
High defect removal efficiency: The average lead time and cycle time for fixing defects are lower than in comparable teams / systems.
Both are necessary to deliver a high quality product, since high quality means (among other things) absence of defects.
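Under these definitions, both properties can be tracked with very simple numbers. Here is a minimal sketch (all data is made up):

```python
# Minimal sketch: defect potential as defects introduced per change, and
# removal efficiency via the average lead time of a fix. All data is made up.
changes_shipped = 200
defects_introduced = 8
defect_potential = defects_introduced / changes_shipped  # defects per change

fix_lead_times_days = [1, 2, 1, 5, 3, 1, 2, 1]  # one entry per fixed defect
avg_fix_lead_time = sum(fix_lead_times_days) / len(fix_lead_times_days)

print(defect_potential, avg_fix_lead_time)  # 0.04 2.0
```

The absolute numbers mean little on their own; the definitions above are explicitly relative, so you would compare these values against comparable teams / systems, or against your own past.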
The Cost Of Defects
Defects are expensive. Your team has to find them, fix them, and deliver the fixes. If a defect escapes to production, you often have to put out fires in operations while you are developing the fix. Your users cannot do their work - they are losing time and money too. And you lose reputation.
The later you find a defect, the more expensive it becomes. A defect found in the specification… - You know the numbers. But if a defect escapes to production, it does not stop there. It does matter if you find a defect within a week or after two years!
When you find a defect in production a week after deploying the code, it is probably still rather cheap to fix: All the original developers are still there, they still remember what they did, the documentation is still accurate. And: there is not much new code that depends on the code with the defect. When you find the defect after two years, all of those will have changed. It is way more expensive to fix it.
Defects are ridiculously expensive.
High-Quality Software and Speed
When your software has low (external) quality, i.e. has defects or does not do what the users want, you have to do a lot of rework. Rework is expensive, and while you do rework you cannot work on new features, so it slows you down.
High external quality in our software allows us to reduce the lead time of new features.
But: Some rework is necessary. Often we don’t know exactly what our users want. And they don’t know either. Most of the time, we don’t know beforehand what the perfect software would look like and how it would behave. So we have to deliver unfinished software to gather feedback, and then improve based on that feedback.
Well Crafted Code
Maybe “well crafted” is somewhat subjective again, but we all know crappy code when we see it. So there must be some objective aspects that we can find about well crafted code.
Well crafted code is tested, and the tests are easy to understand and easy to read. The code has tests on different levels, and tests that tell me about different aspects of the system (what is the functionality from a user’s point of view, how does the system interact with the outside world, how does it behave under load, how do all the single units in the system work, …).
Well crafted code is self documenting. When I read the code together with its tests, I want to be able to understand what is going on, without reading external documentation (wikis, …).
Well crafted code does only what it absolutely has to do. Its design does not anticipate any possible future. It has a minimum number of design elements.
Well crafted code follows good software design and has a consistent, documented architecture.
Well crafted code now looks different than it did one, two or 10 years ago, because it was refactored continuously.
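As an example of tests that read as documentation, here is a small, entirely hypothetical domain (account lockout), tested with Python’s built-in unittest. The test names alone tell you how the code behaves:

```python
import unittest

# Hypothetical domain code. The tests below document its behavior;
# no external wiki page is needed to understand it.
def login_attempts_remaining(failed_attempts, max_attempts=3):
    return max(max_attempts - failed_attempts, 0)

class AccountLockoutTest(unittest.TestCase):
    def test_a_fresh_account_has_all_attempts_left(self):
        self.assertEqual(login_attempts_remaining(failed_attempts=0), 3)

    def test_the_account_locks_after_the_maximum_number_of_failures(self):
        self.assertEqual(login_attempts_remaining(failed_attempts=3), 0)

    def test_extra_failures_do_not_yield_negative_attempts(self):
        self.assertEqual(login_attempts_remaining(failed_attempts=5), 0)
```

Running such tests with any test runner (e.g. `python -m unittest`) gives fast, focused feedback, which is exactly what makes the code cheap to change later.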
Good Software Design
Good software design is, in my opinion, less subjective than some other terms I described above.
You can follow a set of rules to arrive at a good design, like the “Four Rules Of Simple Design” by Kent Beck. They don’t tell you exactly what you have to do in which situation, but they tell you which things you should think about when designing software.
You can learn about design patterns (if you use object oriented languages) or equivalent patterns and techniques for other types of languages.
You can apply some principles, like reducing coupling, improving cohesion, or the SOLID principles (if you use object oriented languages).
You can try to find better names and write more self documenting code by applying domain driven design.
Most of the things above are not subjective, some are even measurable. Maybe there are other ways to arrive at good design, but the ones above seem to work for many developers and teams.
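As one concrete example, here is a minimal sketch of the dependency inversion part of SOLID (all class names are hypothetical): the report generator depends on an abstraction instead of a concrete storage backend, so swapping the backend is a local change with no ripple effects.

```python
from abc import ABC, abstractmethod

# Hypothetical example of dependency inversion: ReportGenerator depends on
# the Storage abstraction, not on a concrete backend.
class Storage(ABC):
    @abstractmethod
    def save(self, name, content): ...

class InMemoryStorage(Storage):
    def __init__(self):
        self.files = {}

    def save(self, name, content):
        self.files[name] = content

class ReportGenerator:
    def __init__(self, storage: Storage):  # the dependency is injected
        self.storage = storage

    def generate(self, title):
        self.storage.save(title, f"Report: {title}")

storage = InMemoryStorage()
ReportGenerator(storage).generate("logins")
print(storage.files)  # {'logins': 'Report: logins'}
```

The same structure also makes the code easy to test: the in-memory backend here doubles as a fast test double, so no real storage is needed in a unit test.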
It's Not Much Slower Anyway
The Microsoft study on TDD found that teams doing TDD were between 15% and 35% slower than teams not doing it (and that all of them produced higher quality). This is not very much. But you might say “All science we have is crap” again, so I’ll try to argue that writing well crafted code is not much slower. (Provided you don’t do over-engineering or gold plating - but I wouldn’t count that as “well crafted code” anyway.)
Here is a situation I have experienced several times, both myself and watching fellow developers: You bang in a feature, without writing automated tests or really caring about quality, because you are in a hurry. And you are really quick! You start the application, and almost everything works - great, you just have to fix this one little error, and then… You fire up the debugger. And you spend the next few hours debugging.
If you had done it right in the first place, you could have saved a big part of that debugging time. You would have fired up the debugger less often, and when you had to, you would only have debugged a small, focused test, not the whole application.
Well Crafted Code and Speed
Let’s look back to “Minimizing F_n (For Large n)”:
- Self documenting code and consistent names from the domain make it easier for everybody to understand the architecture of the system.
- Executable specifications (a part of the test suite) make sure everybody understands the current functionality that is implemented. And that "documentation" cannot be outdated, because otherwise the tests would fail.
- A consistent design and architecture makes sure developers can easily find all the places where they have to make changes when they are working on a feature / defect.
- The SOLID principles and high cohesion / low coupling make sure changes can be made locally (i.e. there are no side effects that ripple through the system when you make a simple change).
- The comprehensive test suite makes sure developers can find out quickly when there are side effects or regressions.
- Focusing on quality from the beginning makes sure that not many defects escape to production. Those that do escape can be fixed quickly, because making changes is easy (see above).
Your Users Do Care About Internal Quality - Indirectly
“Internal quality is nearly unimportant for software to function well, and it’s only function that counts. Your users don’t care about internal quality.” Really?
At some point, your users will recognize that it takes longer and longer until they get software that contains the new features they requested. And they have to pay more and more for it. And at that point, you will recognize that they actually do care about things like cycle time, lead time, and the marginal cost of features. Even if they don’t use exactly those words.
But by then it’s too late for you - At that point, you have a “Rescuing legacy code” project. And it will take time and money to get back on track - a lot of time and money.
Well, you could rewrite the whole thing from scratch, but that’s probably not a good idea either. I mean, some companies have pulled it off (I have consulted on some rewrite projects that actually delivered in the end, too…), but it will be more expensive and take longer than you think.
Putting it All Together
Yes, some of the terms I used in my original article (like “quality”, “well crafted”, “defect”, …) are a bit fuzzy and generic. But I still mean it:
Software with a high external quality will ship faster because we spend less time on rework and have more time to work on relevant stuff. Through reduced lead times, we can deliver more features before their original requirements become obsolete.
And about well crafted code:
Well crafted code (i.e. software with a high internal quality) will ship faster because writing new code and fixing defects is less risky (no side effects, immediate feedback from automated tests) and also faster, because we can quickly find the places we have to change (and there are fewer of those places than in crappy code).
Now you can go back to my original article, which hopefully makes more sense now. There you will find some practical considerations and also things you could try right now.
What If My Software Is Really Simple?
A former colleague once said TDD would not make sense for them “because our iOS app only consists of a user interface and some server calls, and you cannot really test those. There is no logic in between. And it’s really easy to test manually.” Well, maybe you can get away with it when you have an app like that.
But if you want to deliver such an app really often (multiple times per day, which is not possible with iOS apps anyway), you would need automated tests again, so the situation is not that clear cut.
Anyway, you can unit test those. I know at least one developer - Rene Pirringer - who creates iOS apps in a test driven way. And he also tests his user interfaces with fast, automated tests. He is really enthusiastic about what he is doing, and he told me he would not want to work in a different way again.
What If The Time Horizon Is Really Short?
What if the time left until the deadline is really short and our short term goals are so much more important than our long term strategy (i.e. “We are running out of money next month so we need the credit card form now”)?
Well, then you might get away with it, but how much time can you actually save? Doing it right is not that much slower anyway (see above).
Also, your actions now will come back and bite you later. And you don’t know when. You cannot possibly estimate when that really hard defect will come or when you’ll get really stuck on a feature because of the things you’re doing now. It might be in three years, but it might also be as soon as next week! And your actions now have potentially unlimited downside! (Well, almost. Unlimited is really a lot…)
So, even in this situation, think hard if being a few percent faster is actually worth it…
Throw Away Code (a.k.a. Implement Everything Twice)
You could implement a feature in a quick and dirty way (as a Spike, maybe time-boxed, to learn something), then throw everything away and do it “right” the second time. The idea behind this is that the second time will be much faster than the first time, even though you’re doing it “right” now: You can incorporate all the learnings from the first time.
The only problem here (which is basically the same problem as with “Technical Debt”) is: You really have to follow through. You really have to go back, throw away the bad code, and do it again. And in many organizations, you will have a hard time arguing for throwing away “perfectly working code” when the time pressure gets bad…
Read more about how software architecture and design impact your agility in my book “Quick Glance At: Agile Anti-Patterns”: Buy it now!