Truth, time, context, and computation

A reference to Jeanne Ross’ exhortation to companies to find one agreed – or at least declared – source of truth got me thinking this morning. Jeanne’s point is that in order to get organizations to start discussing solutions rather than bickering over descriptions, it is better to declare one version of the truth to be the real one. If there are inaccuracies in the source of the data, then people can do something about making it more precise, an exercise that in most cases is much more fruitful than trying to suggest alternative numbers.

I very much agree with Jeanne in the main of this statement (probably a smart move, given that I am her guest at MIT CISR this year), as well as the need for it in many organizations. But it got me thinking – what is the truth, and how has what we consider to be the truth been influenced by advances in computation? With Big Data increasingly available, we can now analyze our way to most things. How does this change our concept of what is truth? Moreover, at what level should a CIO declare the one source of truth?

Truth as a function of time and context

I remember a conversation sometime in the nineties with colleagues Richard Pawson and Paul Turton at CSC – the discussion was on how object orientation changed the nature of systems, from being a computationally limited representation (a function, if you will) to being a simulation of the organization. We saw three stages in this evolution:

First, truth as a stored value. The example we thought of was inventory level – what is the inventory level for a certain product? In a world with limited computer resources, the simplest way to have this number would be to periodically calculate it, and then store it so people can have access to it. When you go to IKEA’s web site to search for a nice and cheap office chair (such as the Verner swivel chair), for instance, they will give you an estimated number in the store closest to you. I don’t know how IKEA calculates that number, but I doubt that they dip into the local POS system of each store to precisely check it each time you query. (If they do, more power to them.) If this number is calculated on an intermittent basis, it will of course be rather imprecise – but it is computationally easy to get to. Similarly, if you ask Google about the distance to the moon, they will come back with documents which have that number in them, generally agreeing on an average of 384,403 km (238,857 miles). However, that is an approximation, since the moon can be as near as 363,104 km (225,622 miles) and as far as 405,696 km (252,088 miles) depending on where it is in its elliptical trajectory.

I suspect much of the discussion in most corporations over which numbers are right is about these kinds of numbers – calculated after the fact, subject to interpretation because we just don’t know what the precise situation is, and very often without anyone knowing how we got to that number.

However, computation comes to the rescue – with more powerful computers, sensors and faster networks, we can actually move to the second stage: Truth as a calculated number.

For the distance to the moon example, the simple answer is Wolfram Alpha, the mathematical search engine, which will give you the calculated distance to the moon at the time of the query. For the IKEA example, this would mean calculating the number of Verner chairs in the store each time a customer asks on the web. This can be done at varying levels of precision. The simplest way would be to get it from the POS system, which records when a chair is purchased and can subtract it from the inventory. A more precise method, given the length of IKEA’s checkout lines, would be to have a sensor on the chair and track when it is taken off the shelf and placed on the customer’s cart. Precision is largely a question of how much you are willing to spend. For a physical store, tracking cart volumes is expensive; for an online store, it is, in theory, cheap, since a customer moving an item from inventory to cart is done digitally.
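The difference between the first two stages can be sketched in a few lines of Python. This is only an illustration: the class StoreInventory, its fields and the numbers are hypothetical, and it says nothing about how IKEA actually does it. The stored level is whatever the last batch run wrote down, while the calculated level is derived from the recorded sales at the moment of the query.

```python
from dataclasses import dataclass, field


@dataclass
class StoreInventory:
    """Hypothetical model of one product in one store."""
    received: int                               # units delivered to the store
    sales: list = field(default_factory=list)   # quantities sold, as recorded by the POS
    snapshot: int = 0                           # stage one: the last stored count

    def refresh_snapshot(self) -> None:
        """Periodic batch job: calculate the level and store it."""
        self.snapshot = self.calculated_level()

    def record_sale(self, quantity: int) -> None:
        """Called each time the POS registers a purchase."""
        self.sales.append(quantity)

    def stored_level(self) -> int:
        """Stage one: cheap to read, but only as fresh as the last batch run."""
        return self.snapshot

    def calculated_level(self) -> int:
        """Stage two: derived from the source records at query time."""
        return self.received - sum(self.sales)


verner = StoreInventory(received=40)
verner.refresh_snapshot()      # the nightly batch stores 40
verner.record_sale(3)
verner.record_sale(2)

print(verner.stored_level())      # still the value from the last batch run
print(verner.calculated_level())  # reflects the two sales since then
```

The two numbers drift apart between batch runs, which is exactly the kind of gap people end up arguing about.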

This kind of number is much closer to the truth, and much more operationally useful – and the job of the CIO is to declare how this number should be found, tracked and displayed. It may seem somewhat simple to say this, but this is where there should be no question of the source of the truth – every company should have one and only one, and much of the work of CIOs and their organizations in the last 10-15 years has been in moving companies along until they are capable of calculating the one true number.

Then, we move to the next (and so far last) stage: Truth as a calculated number in context. Context gets more difficult as the need for precision goes up (which, I suppose, blatantly ignoring the quantum mechanical context, is a sort of business version of the Heisenberg uncertainty principle).

For the distance to the moon example there is little room for context. You could argue that it would be different based on where on earth you are, or on what you are going to do with the information (launching a satellite or calibrating your telescope, for instance), but for most uses there is little need for contextual customization.

For the IKEA example, the situation is rather different for different parts of the organization, and for different types of customer. If I am a customer looking up the number from my smartphone while close to the store, the POS number might be OK, since I would get to the product in a short time and the consequences of imprecision would be small. If the nearest IKEA store is several hours’ drive away, then I might want a different number, one that incorporates not just the current situation but also the likelihood that the number would be zero before I get there. Or, I might want a reservation function, either setting the product aside or at least allowing me to report that I aim to buy one within the next x hours and thus would like the number shown as available to be reduced until I can make it to the store. In an online store, the problem is the diametrical opposite – there, customers can have carts sitting for days and it becomes an operational necessity to have some policy declaring at what point the products in the cart will have to be made available to other customers.
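To make the "likelihood that the number would be zero before I get there" concrete, here is a minimal sketch. It rests on an assumption that is mine and not IKEA's: sales arrive independently at a steady average rate, so the number sold during the drive follows a Poisson distribution. The function names and the example figures are made up for illustration.

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for X ~ Poisson(mean)."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)      # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean / i        # P(X = i) from P(X = i - 1)
        total += term
    return total


def chance_still_in_stock(on_hand: int, sales_per_hour: float,
                          travel_hours: float) -> float:
    """Probability that at least one unit is left on arrival.

    Assumes Poisson-distributed sales (an illustrative assumption):
    stock survives if fewer than `on_hand` units sell during the trip.
    """
    expected_sales = sales_per_hour * travel_hours
    return poisson_cdf(on_hand - 1, expected_sales)


# Same shelf, same moment, two different customers (hypothetical figures).
nearby = chance_still_in_stock(on_hand=5, sales_per_hour=1.5, travel_hours=0.25)
far_away = chance_still_in_stock(on_hand=5, sales_per_hour=1.5, travel_hours=3.0)

print(f"15 minutes away: {nearby:.3f}")
print(f"3 hours away:    {far_away:.3f}")
```

The calculated number (five chairs) is identical for both customers; only the context, here the travel time, changes what it means to them.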

Similarly, the very concept of inventory level itself means different things to different parts of the organization. For a store manager, it is a cost concept, something to be optimized in a balancing act between capital costs and stock-outs. For a supply chain manager, it is also a flow concept, something to be optimized between stores. For someone managing the physical space of the warehouse, it is a physical concept – goods that have been sold to a customer but not yet picked up are very much something you need to manage. And for a sales person, inventory level is an availability concept, often subject to negotiations and transfers within the organization.

So, what is a CIO to do?

I think the declaration of a source of truth is a question of hitting the right level, navigating between the simplicity of stored numbers and the complexity of inferred context. In most cases, I suspect, the optimum lies in providing the ability to find the truth, giving customers (i.e., of the IT organization) their numbers at the source – which should be the one declared source – but also giving them the tools to interpret them in light of their own context.

The key here is not to try to move from the first stage to the third while skipping the second. Unfortunately, in my view, many IT organizations have done just that, by responding to requests for customized reports, systems and views built on archival rather than current, operational data. As each number becomes institutionalized through use within its context, transitioning to a declared truth can become an exercise in power rather than rationalism. Better to promise context after speed and precision have been provided – and even better, provide the context in a format the end consumer can relate to within their own situation.

For IKEA, that might be giving me the number of chairs available plus a prediction (based on history and, say, number of cars in IKEA’s parking lot) as to how many chairs are likely to be sold, with variance, within the next x hours. For the rest of the organization, well – it depends. But once you provide real-time access to well defined operational data, you can safely leave the question of what it depends on to the person wanting to use it.
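As a rough sketch of what such a prediction could look like, assuming nothing more than a list of past hourly sales: the function forecast_sales, the traffic_ratio parameter (cars in the lot now relative to a typical hour) and all the figures are hypothetical, and treating hours as independent is a deliberate simplification.

```python
import math
import statistics


def forecast_sales(hourly_history, hours, traffic_ratio=1.0):
    """Return (expected units sold, variance) over the next `hours` hours.

    Simplifying assumptions for illustration only: hours are independent,
    and sales scale linearly with parking-lot traffic.
    """
    mean_per_hour = statistics.mean(hourly_history) * traffic_ratio
    var_per_hour = statistics.variance(hourly_history) * traffic_ratio ** 2
    return mean_per_hour * hours, var_per_hour * hours


history = [2, 4, 3, 5, 1, 3]   # chairs sold in comparable past hours (made up)
on_hand = 12                   # the calculated, real-time number

expected, variance = forecast_sales(history, hours=3)
print(f"available now: {on_hand}")
print(f"expected to sell in 3 hours: {expected:.1f} "
      f"(std dev {math.sqrt(variance):.1f})")
```

The operational number stays the declared truth; the forecast is layered on top of it, and a different consumer could swap in a different model without touching the source.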