
Big Data in practice

(This is a translation of an earlier post in my Norwegian blog. This translation was done by Ragnvald Sannes using Google Translate with a few amendments. This technology malarky is getting better and better, isn’t it?).
I have just finished teaching four days of data analytics – proper programming and data collection. We (Chandler, Alessandra and the undersigned) have managed to trick over 30 executives and middle managers in Norway into attending a programming and statistics course (more or less, this is what analytics basically is), while sort of wondering how we did that. The students are motivated and hard-working and ask many smart questions – in a course taught in English. It is almost enough to make me stop complaining about the state of the world and education and other things.
Anyway – what are these students going to do with this course? We are working on real projects, in the sense that we require people to come up with a problem they want to figure out in their own job – preferably something that is actually important and where deep data analysis can make a difference. This has worked for almost all the groups: They work on real issues in real organizations – and that is incredibly fun for the teacher. Here is a list of the projects, so judge for yourself. (I do not identify any students here, but believe me – these people face these issues every day.) Well worth spending time on:
  • What is the correct price for newly built homes? A group is working to figure out how to price homes that are not built yet, for a large residential building company.
  • What is the tax effect of the sharing economy? This group (where one student works for the Tax Administration) tries to figure out how to identify people who cheat on their taxes as Uber drivers – while making suggestions on how tax rules can be adapted to make it easy to follow the law.
  • What characterizes successful consulting proposals? A major consulting firm wants to use data from their CRM system (which documents the bidding process) to understand what kind of projects they will win or lose.
  • How to recognize money laundering transactions? A bank wants to find out if any of their customers are laundering money through online gaming companies.
  • How to offer benefits to customers with automated analysis? A company that supplies stock trading terminals wants to use data analysis to create a competitive edge.
  • How to segment Norwegian shareholders? A company that offers online trading of shares wants to identify segments of its customers to pinpoint and improve its marketing strategy.
  • How to lower costs and reduce the risk of production stoppages in a process business? A hydropower company wants to better understand when and why its power stations need repairs or maintenance.
  • How to identify customers who are in the process of leaving? A TV company wants to understand what characterizes “churn” – how can they identify customers who are about to leave them?
  • Why are some wines more popular than others? A group will work with search data from a wine site to find out what makes some wines more sought after than others.
  • Which customers will buy a new product? A group is working on data from a large bank that wants to offer its existing customers more services.
  • How to increase the recycling rate for waste in Oslo? REN – Oslo’s municipal trash service – wants to find out if you can organize routes and routines differently to better utilize trash trucks and recycling plants.
  • How to avoid selling out of promotional items? One of Norway’s largest grocery chains wishes to improve their ordering routines so that customers do not get to the store and find out that there is nothing left of the offer they wanted.
  • How to model fraud risk in maritime insurance? An insurance company wants to build a model to understand how to find customers attempting to defraud companies or authorities.
  • Which customers are about to leave us? A large transport company wants to find out which customers are about to go to a competitor so that they can take action before it happens.
  • What characterizes students who drop out? BI enrolls 3,500 new students each year, but some of them quit after the first year. How can we find evidence that a student is about to drop out?
Common to all the projects – as with all the student projects I have advised since I started in this industry – is that you start with a big question and reduce it to something that can actually be answered. Then you look for data and find that you need to reduce the question even more. Then you run into the problem that the data is missing, unreliable or inadequate – and you have to figure out what to do about it. Finally, after about 90% of the time and money budget is gone, you can begin to think about analysis. And then there is a risk that you find nothing…
And that is an important lesson of this course: The goal is that the students should know enough about actual data analysis to ask the right questions and have a realistic expectation of what kind of answer they can actually get.
There is a great demand for this course – so we have set up an additional course this fall. See you there!

Key myths about analytics

My excellent colleagues Alessandra Luzzi and Chandler Johnson have pointed me to this video, a keynote speech from 2015 by Ken Rudin, head of analytics at Facebook:

This is a really good speech, and almost an advertisement for our course Analytics for Strategic Management, which starts in two days (and, well, sorry, it is full, but will be arranged again next year.)

In the talk (starting about 1:30 in), Ken breaks down four common myths surrounding Big Data:

  1. Big Data does not necessarily imply use of certain tools, in particular Hadoop. Hadoop can sift through mountains of data, but other tools, such as relational databases, are better at ad hoc analysis once you have structured the data and determined which of the data is interesting and worth analyzing.
  2. Big Data does not always provide better answers. Big Data will give you more answers, but, as Rudin says, can give you “brilliant answers to questions that no one cares about.” He stated that the best way to get better answers is to formulate better questions, which requires hiring smart people with “business savvy” who will ask how to solve real business problems. Also, you need to place the data analysts out in the organization, so they understand how the business runs and what is important. He advocates an embedded model – centrally organized analysts sitting geographically with the people they are helping.
  3. Data Science is not all science. A lot of data science has an “art” to it, and you have to have a balance. Having a common language between business and analytics is important here – and Facebook sends its people to a two-week “Data Camp” to learn that. You need to avoid the “hippo” problem – the highest paid person’s opinion – essentially, not enough science. The other side is the “groundhog” issue – based on the movie Groundhog Day – where the main character tries to win the girl by gradual experimentation. Data is like sandpaper – it cannot create a good idea, but it can shape it after it has been created.
  4. The goal of analytics is not insights, but results. To that end, data scientists have to help make sure that people act on the analysis, not just inform them. “An actionable insight that nobody acts on has no value.”

To the students we’ll meet on Tuesday: This is not a bad way of gearing up for the course. To anyone else interested in analytics and Big Data: This video is recommended.

(And if you think, like I do, that this sounds like the discussion of what IT should be in an organization 20 years ago – well, fantastic, then we know what problems to expect and how to act on them.)

Analytics for Strategic Management

I am starting a new executive course, Analytics for Strategic Management, with my young and very talented colleagues Alessandra Luzzi and Chandler Johnson (both with the Center for Digitization at BI Norwegian Business School).



The course (over five modules) is aimed at managers who want to become sophisticated consumers of analytics (be it Big Data or the more regular kind). The idea is to learn just enough analytics that you know what to ask for, where the pressure points are (so you do not ask for things that cannot be done or will be prohibitively expensive). The participants will learn from cases, discussions, live examples and assignments.

Central to the course is an analytics project, where the participants will seek out data from their own company (or, since it will be group work, someone else’s), figure out what they can do with the data, and end up, if not with a finished analysis (that might happen), at least with a well-developed project specification.

The course will contain quite a bit of analytics – including a spot of Python and R programming – again, so that the executives taking it will know what they are asking for and what is being done.

We were a bit nervous about offering this course – a technically oriented course with a February startup date. The response, however, has been excellent, with more than 20 students signed up already. In fact, we will probably be capping the course at 30 participants: since this is the first time we are teaching it, 30 is more than enough, and we will undoubtedly change many things as we go along.

If you can’t do the course this year – here are a few starting pointers to whet your appetite:

  • Big Data is difficult to define. This is always the case with fashionable monikers – for instance, how big is “big”? – but good ol’ Wikipedia comes to the rescue, with an excellent introductory article on the concept. For me, Big Data has always been about having the entire data set instead of a sample (i.e., n = p), but I can certainly see the other dimensions of delineation suggested here.
  • Data analytics can be very profitable (PDF), but few companies manage to really mine their data for insights and actions. That’s great – more upside for those who really want to do it!
  • Data may be big but often is bad, causing data scientists to spend most of their time fixing errors, cleaning things up and, in general, preparing for analytics rather than the analysis itself. Sometimes you can almost smell that the data is bad – I recommend The Quartz guide to bad data as a great list of indicators that something is amiss.
  • Data scientists are few, far between and expensive. There is a severe shortage of people with data analysis skills in Norway and elsewhere, and the educational systems (yours truly excepted, of course) are not responding. Good analysts are expensive. Cheap analysts – well, you get what you pay for. And, quite possibly, some analytics you may like, but not what you ought to get.
  • There is lots of data, but a shortage of models. Though you may have the data and the data scientists, that does not mean that you have good models. It is actually a problem that as soon as you have numbers – even though they are bad – they become a focal point for decision makers, who show a marked reluctance to ask where the data is coming from, what it actually means, and how the constructed models have materialised.
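
To make the point about bad data a bit more concrete, here is a minimal sketch in Python of the kind of first-pass checks an analyst might run before any analysis starts. Everything in it is an assumption made for illustration: the sample data is invented, the column names (customer_id, age, monthly_fee, signup_date) are hypothetical, and the thresholds are arbitrary. It simply counts missing values, duplicate rows, implausible ages and impossible dates.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample data, invented purely for illustration.
SAMPLE_CSV = """customer_id,age,monthly_fee,signup_date
1001,34,299,2015-03-01
1002,,299,2015-03-04
1003,999,349,2015-13-40
1001,34,299,2015-03-01
1004,41,,2015-04-11
"""


def profile(rows):
    """Count a few simple warning signs in a list of dict rows."""
    missing = Counter()      # empty cells per column
    seen = set()             # rows already encountered
    duplicates = 0
    suspicious_ages = 0
    invalid_dates = 0

    for row in rows:
        # Missing values: empty or whitespace-only cells.
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] += 1

        # Exact duplicate rows.
        key = tuple(row.items())
        if key in seen:
            duplicates += 1
        seen.add(key)

        # Implausible ages (the 0-120 range is an arbitrary assumption).
        age = (row.get("age") or "").strip()
        if age.isdigit() and not 0 < int(age) < 120:
            suspicious_ages += 1

        # Dates that cannot exist, such as month 13.
        date_text = (row.get("signup_date") or "").strip()
        if date_text:
            try:
                datetime.strptime(date_text, "%Y-%m-%d")
            except ValueError:
                invalid_dates += 1

    return {
        "missing": dict(missing),
        "duplicates": duplicates,
        "suspicious_ages": suspicious_ages,
        "invalid_dates": invalid_dates,
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = profile(rows)
print(report)
```

None of this is analysis yet, which is rather the point: checks like these are where much of the time goes, and the questions they raise (why is the age missing, is the duplicate a real customer or a system glitch?) usually have to be answered by someone who knows the business.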

And with that – if you are a participant, I look forward to seeing you in February. If you are not – well, you better boogie over to BI’s web pages and sign up.