Category Archives: Analytics

From notepad: The power and limits of deep learning – Yann LeCun

Warning: These are my notes from an ACM webcast. Misunderstandings, skips, jumps and errors (probably) abound. Caveat emptor.

Notes from “The Power and Limits of Deep Learning,” presented on Thursday, July 11 at 1 PM ET/10 AM PT by Yann LeCun, VP & Chief AI Scientist at Facebook, Silver Professor at NYU, and 2018 ACM A.M. Turing Award Laureate.

Abstract:
Deep Learning (DL) has enabled significant progress in computer perception, natural language understanding, and control. Almost all these successes rely on supervised learning, where the machine is required to predict human-provided annotations, or model-free reinforcement learning, where the machine learns policies that maximize rewards. Supervised learning paradigms have been extremely successful for an increasingly large number of practical applications such as medical image analysis, autonomous driving, virtual assistants, information filtering, ranking, search and retrieval, language translation, and many more. Today, DL systems are at the core of search engines and social networks. DL is also used increasingly widely in the physical and social sciences to analyze data in astrophysics, particle physics, and biology, or to build phenomenological models of complex systems. An interesting example is the use of convolutional networks as computational models of human and animal perception. But while supervised DL excels at perceptual tasks, there are two major challenges to the next quantum leap in AI: (1) getting DL systems to learn tasks without requiring large amounts of human-labeled data; (2) getting them to learn to reason and to act. These challenges motivate some of the most interesting research directions in AI.

Notes:

  • supervised learning works, but requires too many samples
  • convolutional networks: using layers to tease out compositional hierarchy
  • other approaches: reinforcement learning
    • uses convolutional networks and a few other architectural concepts, but requires a huge number of interactions with a clearly defined universe – takes 80 hours to reach the performance a human reaches in 15 minutes. In the end, it does better than the human, but it takes a long time
    • impractical for non-electronic settings (a self-driving car would need to crash thousands of times)
  • better approach: (deep) multi-layer neural nets
    • alternates linear/non-linear layers
  • supervised machine learning, trained with methods such as stochastic gradient descent
  • figure out tweaking by computing gradients by back-propagation (automatic differentiation)
  • architecture of neural networks – figure out sparse networks, not using all connections, based on research on visual cortex
    • first using simple cells, then combining them
  • convolutional neural network builds on this idea, but introduces back propagation
    • turn on/off each neuron based on the portion it sees, then combine
  • shows examples through the nineties, such as recognising numbers (for checks)
  • neural networks out of fashion with AI researchers, realized that they could recognize multiple objects
  • research on moving robots, did not need training data
  • moving on to autonomous driving by classifying pixels
  • 2010: Deep learning revolution, driven by speech recognition community
    • largely responsible for lowering of errors in SR
  • 2012: (AlexNet) Krizhevsky et al., NIPS 2012, other nets, large networks
  • better and better performance, dramatic increase in number of layers
    • current record: 84% image recognition
    • trying to find the minimal architecture that gives performance
    • Facebook: billions of pictures, each goes through 6 convnets
  • Mask R-CNN: instance segmentation, two-stage detection system, identifies areas of interest and sends them to new networks
  • RetinaNet: One-pass object recognition
  • other works, recognizing background
  • Applications:
    • image recognition, such as finding femurs (for hip ops) by taking in the whole 3D picture rather than using layers
    • autonomous driving
    • everyone uses convnets
  • Limitation:
    • good for perception, not for reasoning
    • for this: introducing working memory (differentiable associative memory), need to maintain a number of facts, “memory network”, a neural net with an attached network for memory, essentially soft RAM
    • transformer networks, every unit is itself a neural network, works with translation (dynamic convolution)
    • Facebook; dynamic neural nets: networks that put out networks
  • Challenge: How can humans and animals learn so quickly?
    • children learn largely by observation
      • learn about gravity between 6 and 9 months, just by observation
    • solution(?) self-supervised networks
      • not task-directed, comprises most of our own learning (cake example)
      • very large networks (see slide on process)
      • works for speech recognition and text, filling in 15-20% of blanks in text
      • does not work for filling in missing parts of images (yet)
      • works partly for speech recognition
      • summary: works with discrete data (text, partly speech), much more difficult with continuous data, because we do not have good ways of parameterization
        • predicts the average of all possible futures, results in blurry images…
    • Adversarial training: prediction under uncertainty:
      • generator that makes prediction, discriminator that determines whether it is good or not
      • works well for generating images of people that don’t exist, clothes that have not been designed yet
      • important with video prediction for self-driving cars, that is where the demand is
    • Self-supervised forward models: training self-driving cars to predict their environment by adding latent variables, randomly sampled
    • Final slide: Theory follows invention, will deep learning result in a theory of intelligence?
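
The point in the notes about predictions becoming the average of all possible futures can be illustrated with a small sketch. It assumes a toy setup that is not from the talk: a one-dimensional "future" that is either -1 or +1 with equal probability, and a predictor that outputs a single number and is scored by squared error. The names `futures`, `candidates`, `mean_squared_error` and `best_guess` are made up for the illustration.

```python
# Toy illustration (assumed setup, not from the talk): when the future is
# uncertain, the single prediction that minimizes squared error is the
# average of the possible outcomes, which may match none of them.

def mean_squared_error(guess, outcomes):
    """Average squared distance between one guess and all outcomes."""
    return sum((guess - y) ** 2 for y in outcomes) / len(outcomes)

# Two equally likely "futures": the object moves left (-1) or right (+1).
futures = [-1.0, 1.0]

# Try a grid of candidate predictions from -1.0 to 1.0 in steps of 0.1.
candidates = [i / 10 for i in range(-10, 11)]
best_guess = min(candidates, key=lambda g: mean_squared_error(g, futures))

print(best_guess)                                # 0.0, the "in between" answer
print(mean_squared_error(best_guess, futures))   # 1.0
print(mean_squared_error(1.0, futures))          # 2.0, committing to one future costs more
```

For images, the analogue of that in-between answer is a blur of all plausible next frames, which is one way to read the motivation for the adversarial and latent-variable approaches in the notes.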

(did not take notes during question session, should have done (might add them later); talk available at learning.acm.com)

Analytics III: Projects

Together with Chandler Johnson and Alessandra Luzzi, I currently teach a course called Analytics for Strategic Management. In this course (now in its third iteration), executive students work on real projects for real companies, applying various forms of machine learning (big data, analytics, whatever you want to call it) to business problems. We have just finished the second of five modules, and the projects are now defined.

Here is a (mostly anonymised, except for publicly owned companies) list:

  • An IT service company that provides data and analytics wants to predict customer use of their online products, in order to provide better products and tailor them more to the most active customers
  • A gas station chain company wants to predict churn in their business customers, to find ways to keep them (or, if necessary, scale down some of their offerings)
  • An electricity distribution network company wants to identify which of their (recently installed) smart meters are not working properly, to reduce the cost of inspection and increase the quality of
  • A hairdressing chain wants to predict which customers will book a new appointment when they have had their hair done, in order to increase repeat business and build a group of loyal customers
  • A large financial institution wants to identify employees that misuse company information (such as looking at celebrities’ information), in order to increase privacy and data confidentiality
  • NAV IT wants to predict which employees are likely to leave the company, in order to better plan for recruitment and retraining
  • OSL Gardermoen wants to find out which airline passengers are more likely to use the taxfree shop, in order to increase sales (and not bother those who will not use the taxfree shop too much)
  • A bank wants to find out which of their younger customers will need a house loan soon, to increase their market share
  • A TV media company wants to find out which customers are likely to cancel their subscription within a certain time frame, to better tailor their program offering and their marketing
  • A provider of managed data centers wants to predict their customers’ energy needs, to increase the precision of their own and their customers’ energy budgets
  • Ruter (the public transportation umbrella company for the Oslo area) wants to build a model to better predict crowding on buses, to, well, avoid overcrowding
  • Barnevernet wants to build a model to better predict which families are most likely to be approved as foster parents, in order to speed up the qualification process
  • An electrical energy production company wants to build a model to better predict electricity usage in their market, in order to plan their production process better

All in all, a fairly typical set of examples of the use of machine learning and analytics in business – and I certainly like to work with practical examples with very clearly defined benefits. Over the next three modules (to be finished in the Spring) we will take these projects closer to fruition, some to a stage of a completed proposal, some probably all the way to a finished model and perhaps even an implementation.

Interesting interview with Rodney Brooks

Boingboing, which is a fantastic source of interesting stuff to do during Easter vacation, has a long and fascinating interview by Rob Reid with Rodney Brooks, AI and robotics researcher and entrepreneur extraordinaire. Among the things I learned:

  • What the Baxter robot really does well – interacting with humans and not requiring 1/10 mm precision, especially when learning
  • There are not enough workers in manufacturing (even in China), most of the ones working there spend their time waiting for some expensive capital equipment to finish
  • The automation infrastructure is really old, still using PLCs that refresh and develop really slowly
  • Robots will be important in health care – preserving people’s dignity by allowing them to drive and stay at home longer by having robots that understand force and softness and can do things such as help people out of bed.
  • He has written an excellent 2018 list of dated predictions on the evolution of robotic and AI technologies, highly readable, especially his discussions on how to predict technologies and that we tend to forget the starting points. (And I will add his blog to my Newsblur list.)
  • He certainly doesn’t think much of the trolley problem, but has a great example to understand the issue of what AI can do, based on what Isaac Newton would think if he were transported to our time and given a smartphone – he would assume that it would be able to light a candle, for instance.

Worth a listen.

Neural networks – explained

As mentioned here a few times, I teach an executive course called Analytics for Strategic Management, as well as a short program (three days) called Decisions from Data: Driving an Organization on Analytics. We have just finished the first version of both of these courses, and it has been a very enjoyable experience. The students (in both courses) have been interested and keen to learn, bringing relevant and interesting problems to the table, and we have managed to do what it said on the tin (I think) – make them better consumers of analytics, capable of having a conversation with the analytics team, employing the right vocabulary and being able to ask more intelligent questions.

Of course, programs of this type do not allow you to dive deep into how things work, though we have been able to demonstrate MySQL, Python and DataRobot, and also give the students an understanding of how rapidly these things are evolving. We have talked about deep learning, for instance, but not how it works.

But that is easy to fix – almost everything about machine learning is available on YouTube and in other web channels, once you are into a little bit of the language. For instance, to understand how deep learning works, you can check out a series of videos from Grant Sanderson, who produces very good educational videos on the web site 3Blue1Brown.

(There are follow-up videos: Chapter 2, Chapter 3, and Chapter 3 (formal calculus appendix). This YouTube channel has a lot of other math-related videos, too, including a great explanation of how Bitcoin works, which I’ll have to get into at some point, since I keep being asked why I don’t invest in Bitcoin all the time.)

Of course, you have to be rather interested to dive into this, and it certainly is not required reading for an executive who only wants to be able to talk intelligently to the analytics team. But it is important (and a bit reassuring) to note the mechanisms employed: Breaking a very complex problem up into smaller problems, breaking those up into even smaller problems, solving the small problems by programming, then stepping back up. For those of you with high school math: It really isn’t that complicated. Just complicated in layers.

And it is good to know that all this advanced AI stuff really is rather basic math. Just applied in an increasingly complex way, really fast.
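
To make the "complicated in layers" point concrete, here is a minimal sketch of a two-layer network. It is not taken from the videos: the weights are set by hand rather than learned, and the names `step`, `neuron` and `xor_network` are made up for the illustration. The network computes "either a or b, but not both" (XOR), which a single neuron cannot do, by first solving two smaller problems and then combining the answers.

```python
# Minimal sketch (weights chosen by hand for illustration, not learned):
# a two-layer network that computes XOR by combining two simpler problems.

def step(x):
    """Non-linear part: the neuron fires (1) if its input is positive."""
    return 1 if x > 0 else 0

def neuron(inputs, weights, bias):
    """Linear part (weighted sum plus bias) followed by the non-linearity."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(total)

def xor_network(a, b):
    # Layer 1: two small problems.
    either = neuron([a, b], [1, 1], -0.5)   # fires if a OR b
    both = neuron([a, b], [1, 1], -1.5)     # fires if a AND b
    # Layer 2: combine them into "either, but not both".
    return neuron([either, both], [1, -1], -0.5)

# Print the full truth table: inputs a, b and the network's output.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

Every operation here is a multiplication, an addition or a comparison. In a real deep learning system the weights are not set by hand but adjusted by gradient descent using backpropagation, which requires a smooth non-linearity instead of the hard step used in this sketch.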

Analytics projects

Together with Chandler Johnson and Alessandra Luzzi, I currently teach a course called Analytics for Strategic Management. In this course (now in its second iteration), executive students work on real projects for real companies, applying various forms of machine learning (big data, analytics, whatever you want to call it) to business problems. We have just finished the second of five modules, and the projects are now defined.

Here is a (mostly anonymised) list:

  • The Agency for Public Management and eGovernment (Difi) wants to understand and predict which citizens are likely to opt out of electronic communications from the government. The presumption is that these people may be mostly old, not on electronic media, or in other ways digitally unsophisticated – but that may not be true, so they want to find out.
  • An electric power distribution company wants to investigate power imbalances in the electric grid: In the electric grid, production has to match consumption at all times, or you will get (sometimes rather large) price fluctuations. Can they predict when imbalances (more consumption than production, for instance) will occur, so that they can adjust accordingly?
  • A company in the food and beverage industry wants to offer recommendations to their (business) customers: When you order products from them, how can they suggest other products that may either sell well or differentiate the customer from the competition?
  • A petroleum producing company wants to predict unintended shutdowns and slowdowns in their production infrastructure. Such problems are costly and risky, but predictions are difficult because they are rather rare – and that creates difficulties with unbalanced data sets.
  • A major bank wants to look into the security profiles of their online customers and investigate whether some customers are less likely to be exposed to security risks (and therefore may be able to use less cumbersome security procedures than others).
  • An insurance company wants to investigate which of their new customers are likely to leave them (churn analysis) – and why. They want to find them early, while there is still time to do something to make them stay.
  • A ship management company wants to investigate the use of certain types of oil and optimise the delivery and use of it. (Though the oil is rather specialised, the ships are large and the expense significant.)
  • Norsk Tipping runs a service helping people who are in danger of becoming addicted to gaming, an important part of their societal responsibility which they take very seriously. They want to identify which of their customers are most likely to benefit from intervention. This is a rather tricky and interesting problem – you need to identify not only those who are likely to become addicted, but also make a judgement as to whether the intervention (of which there is limited capacity) is likely to help.
  • A major health club chain wants to identify customers who are not happy with their services, and they want to find them early, so they can make offers to activate them and make them stay.
  • A regional bank wants to identify customers who are about to leave them, particularly those who want to move their mortgage somewhere else. (This is also a problem of unbalanced data sets, since most customers stay.)
  • A major electronic goods retailer wants to do market basket analysis to be able to recommend and stock products that customers are likely to buy together with others.
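
The market basket analysis mentioned in the last project can be sketched in a few lines. This is an illustration on made-up transactions, not the retailer's data or method; the basket contents and the helper name `rule_stats` are hypothetical. For a rule such as "customers who buy a laptop also buy a mouse", it computes support (share of baskets with both items), confidence (share of laptop baskets that also contain a mouse) and lift (confidence relative to how common the mouse is overall).

```python
# Sketch of market basket analysis on made-up transactions (hypothetical data).
from collections import Counter
from itertools import combinations

baskets = [
    {"tv", "hdmi cable", "wall mount"},
    {"tv", "hdmi cable"},
    {"laptop", "mouse"},
    {"tv", "wall mount"},
    {"laptop", "mouse", "hdmi cable"},
]

# How often each item, and each pair of items, appears across all baskets.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)

def rule_stats(antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    n = len(baskets)
    pair = tuple(sorted((antecedent, consequent)))
    support = pair_counts[pair] / n
    confidence = pair_counts[pair] / item_counts[antecedent]
    lift = confidence / (item_counts[consequent] / n)
    return support, confidence, lift

print(rule_stats("laptop", "mouse"))
print(rule_stats("tv", "hdmi cable"))
```

A lift above 1 suggests the two products are bought together more often than their individual popularity would predict, which is the kind of signal that could feed recommendations or stocking decisions.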

All in all, a fairly typical set of examples of the use of machine learning and analytics in business – and I certainly like to work with practical examples with very clearly defined benefits. Now – a small matter of implementation!
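
As a taste of the implementation issues, here is a small sketch of why the unbalanced data sets mentioned in the list are tricky. The numbers are invented for illustration (1000 customers, 20 of whom leave), and the function names `accuracy` and `recall` are just local helpers, not a particular library's API.

```python
# Sketch with made-up numbers: 1000 customers, 20 of whom leave (label 1).
actual = [1] * 20 + [0] * 980

# A "model" that simply predicts that nobody leaves.
predicted = [0] * 1000

def accuracy(actual, predicted):
    """Share of all customers for which the prediction was correct."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def recall(actual, predicted):
    """Share of the customers who actually left that the model caught."""
    caught = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
    return caught / sum(actual)

print(accuracy(actual, predicted))  # 0.98
print(recall(actual, predicted))    # 0.0
```

A model that looks 98% accurate finds none of the customers the company actually cares about, so projects with rare events need to be judged on measures such as recall rather than accuracy alone.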

Big Data and analytics – briefly

Data and data analytics are becoming more and more important for companies and organizations. Are you wondering what data and data science might do for your company? Welcome to a three-day ESP (Executive Short Program) called Decisions from Data: Driving an Organization with Analytics. It will take place at BI Norwegian Business School from December 5-7 this year. The short course is an offshoot from our very popular executive programs Analytics for Strategic Management, which are fully booked. (Check this list (Norwegian) for a sense of what those students are doing.)

Decisions from Data is aimed at managers who are curious about Big Data and data science and want an introduction and an overview, without having to take a full course. We will talk about and show various forms of data analysis, discuss the most important obstacles to becoming a data-driven organization and how to deal with data scientists, and, of course, give lots of examples of how to compete with analytics. The course will not be tech heavy, but we will look at and touch a few tools, just to get an idea of what we are asking those data scientists to do.

The whole thing will be in English, because, well, the (in my humble opinion) best people we have on this (Chandler Johnson and Alessandra Luzzi) are from the USA and Italy, respectively. As for myself, I tag along as best I can…

Welcome to the data revolution – it starts here!

Big Data in practice

(This is a translation of an earlier post in my Norwegian blog. This translation was done by Ragnvald Sannes using Google Translate with a few amendments. This technology malarky is getting better and better, isn’t it?).
I have just finished teaching four days of data analytics – proper programming and data collection. We (Chandler, Alessandra and the undersigned) have managed to trick over 30 executives and middle managers in Norway into attending a programming and statistics course (more or less, this is actually what analytics basically is), while sort of wondering how we did that. The students are motivated and hard-working and have many smart questions – in a course taught in English. It is almost enough to make me stop complaining about the state of the world and education and other things.
Anyway – what are these students going to do with this course? We are working on real projects, in the sense that we require people to come up with a problem from their own job – preferably something that is actually important and where deep data analysis can make a difference. This has worked for almost all the groups: They work on real issues in real organizations – and that is incredibly fun for the teacher. Here is a list of the projects, so judge for yourself. (I do not identify any students here, but believe me – these people face these issues every day.) Well worth spending time on:
  • What is the correct price for newly built homes? A group is working to figure out how to price homes that are not built yet, for a large residential building company.
  • What is the tax effect of the sharing economy? This group (where one student works for the Tax Administration) tries to figure out how to identify people who cheat on their taxes as Uber drivers – while making suggestions on how tax rules can be adapted to make it easy to follow the law.
  • What characterizes successful consulting proposals? A major consulting firm wants to use data from their CRM system (which documents the bidding process) to understand what kind of projects they will win or lose.
  • How to recognize money laundering transactions? A bank wants to find out if any of their customers are laundering money through online gaming companies.
  • How to offer benefits to customers with automated analysis? A company that supplies stock trading terminals wants to use data analysis to create a competitive edge.
  • How to segment Norwegian shareholders? A company that offers online trading of shares wants to identify segments of its customers to pinpoint and improve its marketing strategy.
  • How to lower costs and reduce the risk of production stoppages in a process business? A hydropower company wants to better understand when and why their power stations need repairs or maintenance.
  • How to identify customers who are in the process of terminating? A TV company wants to understand what characterizes “churn” – how can they identify customers who are about to leave them?
  • Why are some wines more popular than others? A group will work with search data from a wine site to find out what makes some wines more sought after than others.
  • Which customers will buy a new product? A group is working on data from a large bank that wants to offer its existing customers more services.
  • How to increase the recycling rate for waste in Oslo? REN – Oslo’s municipal trash service – wants to find out if you can organize routes and routines differently to better utilize trash trucks and recycling plants.
  • How to avoid selling out of promotional items? One of Norway’s largest grocery chains wishes to improve their ordering routines so that customers do not get to the store and find out that there is no more left of the offer they wanted.
  • How to model fraud risk in maritime insurance? An insurance company wants to build a model to understand how to find customers attempting to defraud companies or authorities.
  • Which customers are about to leave us? A large transport company wants to find out which customers are about to go to a competitor so that they can take action before it happens.
  • What characterizes students who drop out? BI enrolls 3500 new students each year, but some of them leave after the first year. How can we find evidence that a student is about to drop out?
Common to all the projects – and so it is with all the student projects I have advised since I started in this industry – is that you start with a big question and reduce it to something that can actually be answered. Then you look for data and find that you need to reduce it even more. Then you run into the problem that the data is either missing, unreliable or inadequate – and you have to figure out what to do about it. Finally, after about 90% of the time and money budget is gone, you can begin to think about analysis. And then there is a risk that you find nothing…
And that is an important lesson of this course: The goal is that the students should know enough about actual data analysis to ask the right questions and have a realistic expectation of what kind of answers they can actually get.
There is a great demand for this course – so we have set up an additional course this fall. See you there!