Haystacks Without Needles?

I was speaking at a CIO conference last week, feeling a bit nervous that I was about to suggest that their collective Big Data efforts were delivering zero value. I needn’t have worried. They already knew.

Considering the Big Data industry invoiced $20B last year, that’s quite an admission of failure. Almost enough, one might guess, to merit some kind of investigation. On that point, however, I suddenly felt very much alone again. Everyone I spoke to seemed too reluctant to dig deeper.

Take a current UK example. liveppm.com is a website that allows the British public to see a moment-by-moment update on the punctuality of trains on each of the country’s networks. From a purely technical standpoint – every time a train departs from or arrives at a station anywhere in the country, its performance is instantly updated on the site – it’s inconceivably impressive. From a ‘does it do anything useful?’ perspective, on the other hand, it can only be seen as a pointless waste of taxpayers’ money.

Now I’m not totally blaming the IT professionals who built the system: they were given a brief and they executed it in spades. A bigger portion of the blame, one suspects, has to head in the direction of the Rail Regulators – the people tasked with making sure the taxpayer’s money is being wisely spent. On one level, checking how punctual our trains are might serve some kind of useful purpose. At first blush, I have no argument with the collection of punctuality data. It’s difficult to know if you’ve improved a system if you can’t measure what’s happening. It’s only when the information is used to set targets that the problems start. Setting arbitrary targets in order to impose penalties and generally beat people up destroys value and becomes an impediment to improvement. That’s because all the targets ultimately do is encourage operators to get better at cheating reality so that the figures reflect well on them.

Meanwhile, the poor old commuter is paying for the whole downward-spiral sham. Directly, in terms of the millions it cost to set up and maintain the system, and – far worse – indirectly, in that it provides them with absolutely nothing that allows them to make any meaningful travel decision. The system is an expensive needle-less haystack. Learning that an operator has a 92% punctuality record today only means something if I have the choice to use another, higher-performing operator. And even then I have no idea whether the performance of my operator will be any better or worse tomorrow. Not that the majority of us have any choice either way. If I live in rural Devon, I can’t elect to catch a London Overground train. Tangibly, the punctuality data is useless to the commuter. Intangibly, it is far, far worse: it frustrates and annoys anyone who cares to look at the data, because it leaves us with a feeling of utter powerlessness. That (twist the knife, why don’t you) we’re paying for.

In all these respects, it’s a highly typical Big Data ‘solution’. The data is only ever of any use if it reveals actionable insight about a situation. Insight is the needle in the haystack. Insight in the case of train punctuality is enabling the commuter to make decisions about whether they should bother to leave the house today. Or to take the car or bus instead. Or stay in bed five minutes longer because the specific train they intend to travel on is running late. Or, should the train operators also wish to do something vaguely useful with the information, enabling them to put in place actions to improve the performance of the network. Those are needles. Needles are difficult to find because they require seekers to go beyond what is merely easy to measure. Measure the wrong things (train punctuality) for the wrong reasons (to punish operators with a poor punctuality record) and, while you might end up with a multi-million-pound haystack, it sadly contains no needles at all.