Metrics: Never Mind the Thresholds
Let's face it: managers love statistics. Having a bunch of numbers to look at, to track or to manipulate gives you the feeling of being in control of things, or at least of having a grasp on reality.
So when a manager asks me what our average cyclomatic code complexity is today, I don’t blame him. Still, I respond by asking why he wants to know.
I think I might literally roll on the floor laughing if I were told, for example, that we have too few lines of code (I heard a story about that once, and I hope the person who told it was kidding). But you do hear quite often that according to the contract, our test coverage should be above 90%, or that the maximum value of MCC (McCabe cyclomatic complexity) should be 10. And that makes me sad.
I’m not going to argue here that none of the automated metrics reflect real code quality, and that the only metric you can trust is WTFPM: this argument would be long and technical, and would require too many code samples. I’m also not saying that you should never collect code metrics automatically. In fact, as a technical leader I like them very much. My main point is that managers should look at trends, not absolute values. Let me explain.
Let’s start with absolute numbers. At First Line, we like values like 80% for minimum unit/integration test coverage, and 20 for maximum cyclomatic complexity. Still, we don’t panic if a project has worse metrics, nor do we throw an office party every time a project beats these numbers. OK, so assume a project has 70% code coverage. If this value has been the same for the past 3 months, there is nothing in particular to worry about in my book. A technical leader could easily explain why it is not a good idea to expend the effort of covering the last 30% of the code with unit tests. However, if code coverage was at 90% last week, then I would guess something went wrong, and it needs to be investigated.
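To make the "trend, not absolute value" idea concrete, here is a minimal sketch of a check that ignores the level of coverage and only reacts to a sudden drop. The function name `coverage_alerts`, the 5-point tolerance and the sample readings are all assumptions made for illustration, not something any particular tool provides.

```python
def coverage_alerts(history, max_drop=5.0):
    """Flag readings that fell sharply compared with the previous one.

    history  -- coverage percentages, oldest first (hypothetical weekly data)
    max_drop -- tolerated drop in percentage points (an assumed value)

    Returns a list of (index, previous, current) tuples.
    """
    alerts = []
    for i in range(1, len(history)):
        drop = history[i - 1] - history[i]
        if drop > max_drop:
            alerts.append((i, history[i - 1], history[i]))
    return alerts


# A project sitting at roughly 70% for weeks raises no alert...
stable = [70.0, 70.5, 69.8, 70.0]
# ...while a fall from 90% to 70% within one week does.
dropped = [90.0, 90.0, 70.0]

print(coverage_alerts(stable))
print(coverage_alerts(dropped))
```

Note that both projects end at the same 70%; only the second one gives a reason to go and ask what happened.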
The same applies to code complexity. In my experience, measured across a large number of projects, the average code complexity is somewhere between 1.2 and 1.6. Should you be upset if your project has a higher number? Definitely not – most probably there’s a good explanation. What you should be wary of is steady growth of the average code complexity. [As a small aside – nobody has been able to explain the meaning of this metric to me, so I don’t take it too much to heart.] Regarding the maximum value, sometimes it has to do with a method that sets up a test environment for integration testing or creates some god-object mock for unit tests; sometimes it’s a class factory; sometimes it’s a very complex algorithm copy-pasted directly from Wikipedia. Certainly, it is usually possible to refactor such code to decrease its complexity, but in most cases that exercise doesn’t make much sense. A technical leader should go through all ‘complex’ methods and decide which ones should be simplified. In fact, I would rather track ‘number of methods with code complexity higher than N’, where N is not the highest acceptable value (like 20) but a certain ‘threshold of suspicion’ (like 10).
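The two signals described above can be sketched in a few lines. Everything here is hypothetical: the method names, the complexity values and the helper names `suspicious_methods` and `steadily_growing` are made up for illustration, and the per-method numbers are assumed to come from whatever complexity tool the team already runs.

```python
def suspicious_methods(complexity_by_method, threshold=10):
    """Names of methods whose complexity is above the 'threshold of suspicion'.

    The result is a review list for the technical leader, not a verdict.
    """
    return sorted(
        name
        for name, complexity in complexity_by_method.items()
        if complexity > threshold
    )


def steadily_growing(averages, min_points=3):
    """True if every reading of the average is higher than the one before."""
    if len(averages) < min_points:
        return False
    return all(later > earlier for earlier, later in zip(averages, averages[1:]))


# Hypothetical per-method complexity values.
methods = {"parse": 4, "build_test_env": 23, "make_widget": 12, "save": 10}
print(suspicious_methods(methods))

# Hypothetical project-wide averages, oldest first.
print(steadily_growing([1.3, 1.4, 1.5]))
print(steadily_growing([1.8, 1.8, 1.8]))
```

In this sketch `build_test_env` shows up on the list even though it may well be a legitimate test-setup method; deciding that is the reviewer's job. Likewise, a flat average of 1.8 is not flagged, while a lower but rising one is.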
Back to contractual thresholds for metrics: what should the team do if they cannot meet the requirements for objective reasons? Nobody likes to make changes to contracts, and so developers will need to either cheat, or waste the customer’s money building a more complex solution than necessary, solely to satisfy the metrics requirement. When that happens, everything looks great on paper, but in fact you have just billed your customer for unnecessary work, or made your solution less maintainable, which is far worse.
To conclude, if you are a customer, try to find a team you believe you can work with, develop a relationship with them based on professional trust, and let the team do their work. If you are a vendor, and your customer wants a contractual guarantee for code quality, try to convince them that “I cross my heart and hope to die” is a many times more effective guarantee than metrics-related clauses in the statement of work.