There’s a great blog post doing the rounds today, titled “Every attempt to manage academia makes it worse”. Working through a number of examples of metric-based assessment, it concludes that standard management practice applied to academic work results in obviously worse outcomes.

At the heart of the argument is an interesting contradiction – that it is possible to assess academic work and show that the results are worse under a specific regime, while it is simultaneously impossible to assess academic work in a way that improves it. A slightly weaker form of the argument is easier to accept – that measuring while the science is being done negatively affects the work in a way that appraising the results after the fact doesn’t. I’m not in a position to know whether this is genuinely the case for academic work, but I’m seeing people apply the same argument to software development, and I truly believe it doesn’t apply there.

There is an interesting parallel here with the practice of applying estimation and project management to software development. As I noted previously, many people believe that software development is substantially more complex than standard approaches can manage. This is, in a way, true – software is incredibly complex, and as an engineering discipline we are still very much in the early days.

It is absolutely the case that when you incentivise people in a specific way, you have to expect that it will change their behaviour – that is the entire point. Measure people by lines of code, and you’ll get a lot of code. Nobody will deny that bad goals and poor incentives produce perverse behaviour.

The quote from Tim Harford is interesting:

The basic principle for any incentive scheme is this: can you measure everything that matters? If you can’t, then high-powered financial incentives will simply produce short-sightedness, narrow-mindedness or outright fraud. If a job is complex, multifaceted and involves subtle trade-offs, the best approach is to hire good people, pay them the going rate and tell them to do the job to the best of their ability.

Unfortunately I think the idea here is vulnerable to the No True Scotsman fallacy – can we define what “everything that matters” is? Probably not, and therefore we can never meet this test – so we should just throw our hands up, hire good people and pay them the going rate. And even if we could measure everything that matters, it would still be difficult to relate these different factors together in a way that allows us to find an optimal solution (assuming such an optimum even exists!).

It’s clear that this wouldn’t apply to certain jobs. Simply hiring good people and paying them the going rate – or even over the odds – is not going to win football matches or produce world-record marathon performances, yet it’s difficult for me to see how we could measure everything that matters in either scenario – weather and climate alone play a significant role, for example.

And on the face of the argument, the choice of metric itself doesn’t matter. Measuring grant attainment, for example, produces a second-order effect: time spent writing grants goes up – and time spent doing science presumably goes down. Rewarding in the inverse direction would instead ensure that time spent writing grants goes down, and all grant-based sources of funding would quickly dry up.

In fact, it would be interesting to know whether attempting to keep the rate of grant attainment static would also produce a negative consequence – presumably so.

Fundamentally, observation absolutely changes the work being done, and it’s likely that measuring but doing nothing would still have a perverse impact compared to simply not measuring – the people on a team would need to be entirely unaware of measurements in order not to react to them.

But I personally think we should be able to expect better results than this. For one, the various metrics discussed in the post are almost all lagging indicators, and relying consistently on lagging indicators is known to lead to problems.

More importantly, though, the argument doesn’t address team dynamics. Metrics can be used to attempt to optimise the work of individuals, but they can also be used to align the members of a team.

I don’t believe it’s possible to simply pull people together and rely on them to “do their best” within a team; I think you have to set goals and direction, and measuring the output of the team is a critical part of that. You need to be able to answer questions like:

  1. are the people on the team successfully doing the job they were hired for?
  2. is the team making progress on the project at the expected rate?
  3. does the team have the required resources to best complete the project, or do we need to change the make-up?

One of the interesting things about agile methods is that they give teams – and individuals – a large amount of latitude, but also significant constraints, usually a time box. This combination of responsibility and accountability tends to produce very good results – and while agile teams are often self-organising, that is not to say there is no management in place (holacracy does exist, but is still extremely rare in practice).

Which really brings me to my point. Metrics, particularly lagging rather than leading ones, are often closely tied to valuable goals within an organisation, but while powerful they can be blunt instruments. Rather than attempt to exhaustively enumerate every metric relevant to a piece of work, it is much more important to test and experiment to find the factors that make the biggest difference, and to use those metrics in the context of a broader approach to performance.