The EU has recently released a proposal for regulating the use of AI in companies and organizations. As far as I can see, it is modelled on the GDPR regulations: assigning responsibility to the board and top management, sanctions expressed in terms of percentages of revenue, and (hopefully) some sort of “safe harbor” rules so you can be somewhat confident that you are not doing anything wrong.
An interesting aspect here is that the EU is early when it comes to regulating AI (yes, I know “AI” is a really diffuse concept, but leave that be for a moment) and is again taking the lead in regulation where Silicon Valley (and China) leads in implementation.
This means that managers, board members and researchers will need to learn more. I plan to do this by attending a webinar at the Applied Artificial Intelligence Conference 2021. This webinar (May 27, at CDT 1430-1600) is open to everyone who registers. It will be facilitated by Elin Hauge, who is a member of one of the EGN networks I lead.
Yesterday began with a message from a business executive who was concerned with the security of Zoom, the video conferencing platform that many companies (and universities) have landed on. The reason was a newspaper article regurgitating several internet articles, partly about functionality that has been adequately documented by Zoom, partly about security holes that have been fixed a long time ago.
So is there any reason to be concerned about Zoom or Whereby or Teams or Hangouts or all the other platforms?
My answer is “probably not” – at least not for the security holes discussed here, and for ordinary users (and that includes most small- to medium sized companies I know about).
It is true that video conferencing introduces some security and privacy issues, but if we look at it realistically, the biggest problem is not the technology but the people using it (something we nerds refer to as PEBKAC – Problem Exists Between Keyboard And Chair).
When a naked man sneaks into an elementary school class via Whereby, as happened a few days ago here in Norway, it is not due to technology problems, but because the teacher had left the door wide open, i.e., had not turned on the function that makes it necessary to “knock” and ask for permission to enter.
When anyone can record (and have the dialogue automatically transcribed) from Zoom, it is because the host has not turned off the recording feature. By the way, anyone can record a video conference with screen capture software (such as Camtasia), a sound recorder or for that matter a cell phone, and no (realistic) security system in the world can do anything about it.
When the boss can monitor that people are not using other software while sitting in a meeting (a feature that can be completely legitimate in a classroom; it is equivalent to the teacher looking out over the class to see if the students are awake), well, I don’t think the system is to blame for that either. Any leader who holds meetings so irrelevant that people do not bother to pay attention should rethink their communications strategy. No executive I know would have either the time or the interest to activate this feature – because if you need technology to force people to wake up, you don’t have a problem technology can solve.
The risk of a new tool should not be measured against some perfect solution, but against the alternative if you don’t have it. Right now, video conferencing is the easiest and best tool for many – so that is why we use it. But we have to take the trouble to learn how it works. The best security system in the world is helpless against people writing their password on a Post-it, visible when they are in a videoconference.
At BI Norwegian Business School, we are (naturally and way overdue, but a virus crisis helps) moving all exams to digital. This means a lot of changes for people who have not done that before. One particular anxiety is cheating – normally not a problem in the subjects I teach (case- and problem oriented, master/executive, small classes) but certainly is an issue in large classes at the bachelor level, where many answers are easily found online, the students are many, and the subjects introductory in nature.
Here are some strategies to deal with this:
Have an academic honesty policy and have the students sign it as part of the exam. This makes them aware of the risk they take if they cheat.
Keep the exam time short – three hours at the most – and deliberately ask more questions than usual. This leaves less time for cheating (by collaborating), because collaboration takes time. It also introduces more differentiation between the students – if just a few students manage to answer all questions, those are the A candidates. Obviously, you need to adjust the grade scale somewhat (you can’t expect everyone to answer everything), and there is an issue of rewarding students who are good at taking exams at the expense of deep learning, but that is the way of all exams.
Don’t ask the obvious questions, especially not those asked on previous exams. Sorry, no reuse. Or perhaps a little bit (it is a tiring time.)
Tell the students that all answers will be subjected to an automated plagiarism check. Whether this is true or not does not matter – plagiarism checkers are somewhat unreliable, have many false positives, and require a lot of follow-up work – but just the threat will eliminate much cheating. (Personally, I look for cleverly crafted answers and Google them; it is amazing what shows up…)
Tell the students that after the written exam, they can be called in for an oral exam where they will need to show how they got their answers (if it is a single-answer, mathematically oriented course) or answer more detailed questions (if it is a more analysis- or literature oriented course). Who gets called in (via videoconference) will be partially random and partially based on suspicion. Failing the orals results in failing the course.
When you write the questions: If applicable, Google them, look at the most common results, and deliberately reshape the questions so that the answer is not one of those.
Use an example for the students to discuss/calculate, preferably one that is fresh from a news source or from a deliberately obscure academic article they have not seen before.
Consider giving sub-groups of students different numbers to work from – either automatically (different questions allocated through the exam system) or by having questions like “If your student ID ends in an even number (0, 2, 4, 6, 8), answer question 2a; otherwise answer question 2b” (use the student ID, not “birthday in January, February, March…”, as this will be the only marker you have). The questions may pose the same problem, but with small, unimportant differences in names, coefficients or other details. This makes it much harder for the students to collaborate. (If you use multiple-choice questions in an electronic context, I assume a number of the tools will have functionality for changing the order of the questions – it would, frankly, astonish me if they did not – but I don’t use multiple choice myself, so I don’t know.)
Consider telling the students they will all get different problems (as discussed above) but not doing it. It still will prevent a lot of cheating simply because the students believe they all have different problems and act accordingly.
If you have essay questions, ask the students to pick a portion of them and answer those. I do this on all my exams anyway – give the students 6 questions with short (150-word) answers and ask them to pick 4 and answer only those, and give them 2 or 3 longer questions (400 words or so) and ask them to answer only one. (Make it clear that answering them all will result in only the first answers being considered.) Again, this makes cheating harder.
Lastly: You can’t eliminate cheating in regular, physical exams, so don’t think you can do it in online exams. But you certainly can increase the disincentives to do so, and that is the most you can hope for.
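The “different problems for different students” idea above can be implemented as a deterministic mapping from student ID to variant. A minimal sketch (function names and IDs are my own invention): hashing the ID spreads any ID format evenly across variants, yet the grader can always recompute which variant a given student was assigned.

```python
import hashlib

def assign_variant(student_id: str, n_variants: int = 2) -> int:
    """Deterministically map a student ID to a question variant.

    Hashing beats "even/odd last digit" because it spreads any ID
    format evenly across variants, while staying reproducible for
    the grader.
    """
    digest = hashlib.sha256(student_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_variants

# Example: split a (hypothetical) class into two groups for 2a / 2b
students = ["s1234567", "s1234568", "s7654321"]
groups = {sid: assign_variant(sid) for sid in students}
```

The same function works for any number of variants – pass `n_variants=4` to split the class four ways.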
Department for future ideas
I have always wanted to use machine learning for grading exams. At BI, we have some exams with 6000 candidates writing textual answers. Grading this surely must constitute cruel and unusual punishment. With my eminent colleague Chandler Johnson I tried to start a project where we would have graders grade 1000 of these exams, then use text recognition and other tools, build an ML model and use that to grade the rest. Worth an experiment, surely. The project (like many other ideas) never took off, largely because of difficulties of getting the data, but perhaps this situation will make it possible.
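The grading experiment we never ran could start from something as humble as nearest-neighbour regression over bag-of-words vectors: grade a subset by hand, then score each new answer by the grades of the most similar hand-graded ones. A toy sketch with invented data (this is emphatically not the model we proposed, just an illustration of the shape of the problem):

```python
from collections import Counter
from math import sqrt

def vectorize(text: str) -> Counter:
    """Bag-of-words: word counts, ignoring case and order."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict_grade(answer, graded, k=3):
    """Average the grades of the k most similar hand-graded answers."""
    v = vectorize(answer)
    sims = sorted(((cosine(v, vectorize(t)), g) for t, g in graded),
                  reverse=True)
    top = sims[:k]
    return sum(g for _, g in top) / len(top)

# Invented mini training set: (answer text, human grade 0-100)
graded = [
    ("supply and demand set the market price", 85),
    ("price is set by supply and demand in equilibrium", 90),
    ("the price is whatever the seller wants", 30),
]
predicted = predict_grade(
    "equilibrium of supply and demand sets price", graded, k=2)
```

With 1000 hand-graded scripts as the training set, even this crude approach would give a baseline to beat with proper text models.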
Warning: These are my notes from an ACM webcast. Misunderstandings, skips, jumps and errors (probably) abound. Caveat emptor.
Notes from “The Power and Limits of Deep Learning,” presented on Thursday, July 11 at 1 PM ET/10 AM PT by Yann LeCun, VP & Chief AI Scientist at Facebook, Silver Professor at NYU, and 2018 ACM A.M. Turing Award Laureate.
Deep Learning (DL) has enabled significant progress in computer perception, natural language understanding, and control. Almost all these successes rely on supervised learning, where the machine is required to predict human-provided annotations, or model-free reinforcement learning, where the machine learns policies that maximize rewards. Supervised learning paradigms have been extremely successful for an increasingly large number of practical applications such as medical image analysis, autonomous driving, virtual assistants, information filtering, ranking, search and retrieval, language translation, and many more. Today, DL systems are at the core of search engines and social networks. DL is also used increasingly widely in the physical and social sciences to analyze data in astrophysics, particle physics, and biology, or to build phenomenological models of complex systems. An interesting example is the use of convolutional networks as computational models of human and animal perception. But while supervised DL excels at perceptual tasks, there are two major challenges to the next quantum leap in AI: (1) getting DL systems to learn tasks without requiring large amounts of human-labeled data; (2) getting them to learn to reason and to act. These challenges motivate some of the most interesting research directions in AI.
supervised learning works, but requires too many samples
convolutional networks: using layers to tease out compositional hierarchy
other approaches: reinforcement learning,
uses convolutional networks and a few other architectural concepts; requires a huge number of interactions with a clearly defined universe – takes 80 hours to reach the performance a human reaches in 15 minutes. In the end, it does better than the human, but it takes a long time
impractical for non-electronic settings (a self-driving car would need to crash thousands of times)
better approach: (deep) multi-layer neural nets
alternates linear/non-linear layers
supervised machine learning, such as stochastic gradient descent
figure out tweaking by computing gradients by back-propagation (automatic differentiation)
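LeCun’s “tweaking by computing gradients” is, at its smallest, just this: for a one-parameter model the backward pass is a single derivative, and stochastic gradient descent nudges the weight against it. A toy sketch with invented data (real networks do exactly this, chained through many layers by automatic differentiation):

```python
# Fit y = w * x by stochastic gradient descent.
# loss = (w*x - y)^2, so dloss/dw = 2 * (w*x - y) * x -- the backward pass.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # invented; true w is 2
w, lr = 0.0, 0.05
for epoch in range(100):
    for x, y in data:
        grad = 2 * (w * x - y) * x   # gradient of squared error w.r.t. w
        w -= lr * grad               # take a small step downhill
```

After a hundred passes over the data, `w` has settled at the value that makes the loss smallest.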
architecture of neural networks – figure out sparse networks, not using all connections, based on research on visual cortex
first using simple cells, then combining them
convolutional neural network builds on this idea, but introduces back propagation
turn on/off each neuron based on the portion it sees, then combine
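What “the portion it sees” means in practice is a convolution: slide a small kernel over the input and let each output cell depend only on its local patch. A minimal 2-D sketch in plain Python (the tiny image and kernel are invented):

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as in convnets)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            # Each output cell sees only a kh x kw patch of the input.
            out[i][j] = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw)
            )
    return out

# A vertical-edge detector on a tiny image: bright left half, dark right
image = [[1, 1, 0, 0]] * 3
kernel = [[1, -1]] * 3  # responds where a left pixel exceeds its right one
result = conv2d(image, kernel)
```

The output lights up exactly at the edge between the bright and dark halves – the “simple cell, then combine” idea in miniature.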
shows examples through the nineties, such as recognising numbers (for checks)
neural networks out of fashion with AI researchers, realized that they could recognize multiple objects
research on moving robots, did not need training data
moving on to autonomous driving by classifying pixels
2010: Deep learning revolution, driven by speech recognition community
largely responsible for lowering of errors in SR
2012: AlexNet (Krizhevsky et al., NIPS 2012), other nets, large networks
better and better performance, dramatic increase in number of layers
current record: 84% image recognition
trying to find the minimal architecture that gives performance
Facebook: billions of pictures, each goes through 6 convnets
Mask R-CNN: instance segmentation, two-stage detection system, identifies areas of interest and sends them to new networks
RetinaNet: One-pass object recognition
other works, recognizing background,
image recognition, such as finding femurs (for hip ops) by taking in the whole 3D picture rather than using layers
everyone uses convnets
good for perception, not for reasoning
for this: introducing working memory (differentiable associative memory), need to maintain a number of facts, “memory network”, a neural net with an attached network for memory, essentially soft RAM
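The “soft RAM” is essentially a softmax-weighted lookup: instead of fetching one slot by address, the query scores every key and reads back a blend of all the stored values, weighted by those scores – which makes the read differentiable. A toy sketch with invented keys and values:

```python
from math import exp

def soft_read(query, keys, values):
    """Differentiable memory read: softmax over query-key scores,
    then a weighted average of the stored values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)
    weights = [exp(s - m) for s in scores]      # numerically stable softmax
    total = sum(weights)
    weights = [w / total for w in weights]
    return sum(w * v for w, v in zip(weights, values))

# Three memory slots; this query matches the second key most strongly,
# so the read comes back close to (but not exactly) the second value.
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [10.0, 20.0, 30.0]
read = soft_read([0.0, 5.0], keys, values)
```

Because every slot contributes a little, gradients flow to all of them – that is what makes the memory trainable.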
transformer networks, every unit is itself a neural network, works with translation (dynamic convolution)
Facebook: dynamic neural nets – networks that put out networks
Challenge: How can humans and animals learn so quickly?
children learn largely by observation
learn about gravity between 6 and 9 months, just by observation
solution(?) self-supervised networks
not task-directed, comprises most of our own learning (cake example)
very large networks (see slide on process)
works for speech recognition and text, filling in 15-20% of blanks in text
does not work for filling in missing parts of images (yet)
works partly for speech recognition
summary: works with discrete data (text, partly speech), much more difficult with continuous data, because we do not have good ways of parameterization
predicts the average of all possible futures, results in blurry images…
Adversarial training: prediction under uncertainty:
generator that makes prediction, discriminator that determines whether it is good or not
works well for generating images of people who don’t exist, clothes that have not been designed yet
important with video prediction for self-driving cars, that is where the demand is
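The generator/discriminator dance can be shrunk to a few lines: here the “real data” is just the number 3.0, the generator is a single scalar it can move, and the discriminator is a logistic classifier. All numbers are invented; this is a cartoon of adversarial training, not a real GAN.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

# Toy adversarial setup: real data is 3.0, the generator's output is the
# scalar g, and the discriminator D(x) = sigmoid(w*x + b).
real, g = 3.0, 0.0
w, b, lr = 0.1, 0.0, 0.1

for _ in range(500):
    # Discriminator step: push D(real) toward 1 and D(g) toward 0
    # (gradients of the standard log-loss, written out by hand).
    d_real, d_fake = sigmoid(w * real + b), sigmoid(w * g + b)
    w -= lr * (-(1 - d_real) * real + d_fake * g)
    b -= lr * (-(1 - d_real) + d_fake)
    # Generator step: move g toward where the discriminator says "real".
    d_fake = sigmoid(w * g + b)
    g += lr * (1 - d_fake) * w
```

After training, `g` has drifted from 0 toward the real data – the generator learned to imitate it by fooling the discriminator, never by seeing a label.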
Self-supervised forward models: training self-driving cars to predict their environment by adding latent variables, randomly sampled
Final slide: Theory follows invention, will deep learning result in a theory of intelligence?
(did not take notes during the question session, should have done (might add them later); talk available at learning.acm.com)
Together with Chandler Johnson and Alessandra Luzzi, I currently teach a course called Analytics for Strategic Management. In this course (now in its third iteration), executive students work on real projects for real companies, applying various forms of machine learning (big data, analytics, whatever you want to call it) to business problems. We have just finished the second of five modules, and the projects are now defined.
Here is a (mostly anonymised, except for publicly owned companies) list:
An IT service company that provides data and analytics wants to predict customer use of their online products, in order to provide better products and tailor them more to the most active customers
A gas station chain company wants to predict churn in their business customers, to find ways to keep them (or, if necessary, scale down some of their offerings)
An electricity distribution network company wants to identify which of their (recently installed) smart meters are not working properly, to reduce the cost of inspection and increase the quality of
A hairdressing chain wants to predict which customers will book a new appointment when they have had their hair done, in order to increase repeat business and build a group of loyal customers
A large financial institution wants to identify employees that misuse company information (such as looking at celebrities’ information), in order to increase privacy and data confidentiality
NAV IT wants to predict which employees are likely to leave the company, in order to better plan for recruitment and retraining
OSL Gardermoen wants to find out which airline passengers are more likely to use the tax-free shop, in order to increase sales (and not bother those who will not use the tax-free shop too much)
A bank wants to find out which of their younger customers will need a house loan soon, to increase their market share
A TV media company wants to find out which customers are likely to cancel their subscription within a certain time frame, to better tailor their program offering and their marketing
A provider of managed data centers wants to predict their customers’ energy needs, to increase the precision of their own and their customers’ energy budgets
Ruter (the public transportation umbrella company for the Oslo area) wants to build a model to better predict crowding on buses, to, well, avoid overcrowding
Barnevernet wants to build a model to better predict which families are most likely to be approved as foster parents, in order to speed up the qualification process
An electrical energy production company wants to build a model to better predict electricity usage in their market, in order to plan their production process better
All in all, a fairly typical set of examples of the use of machine learning and analytics in business – and I certainly like to work with practical examples with very clearly defined benefits. Over the next three modules (to be finished in the Spring) we will take these projects closer to fruition, some to a stage of a completed proposal, some probably all the way to a finished model and perhaps even an implementation.
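Most of the churn projects in the list share one shape: a table of customer features, a yes/no label, and a model that turns features into a churn probability. A minimal logistic-regression sketch in plain Python – all feature names and numbers are invented, and any real project would of course use a proper library:

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_logistic(rows, labels, lr=0.1, epochs=200):
    """Logistic regression by gradient descent; returns weights and bias."""
    n = len(rows[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y                       # gradient of the log-loss
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Invented features: (months since last purchase, complaints filed)
rows = [(0.1, 0), (0.2, 0), (0.9, 2), (0.8, 3), (0.3, 1), (1.0, 2)]
labels = [0, 0, 1, 1, 0, 1]                   # 1 = customer churned
w, b = train_logistic(rows, labels)

# Churn risk for a new, inactive and unhappy customer
risk = sigmoid(sum(wi * xi for wi, xi in zip(w, (0.9, 3))) + b)
```

The hard part in the student projects is never this loop – it is getting the feature table and the label right in the first place.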
As mentioned here a few times, I teach an executive course called Analytics for strategic management, as well as a short program (three days) called Decisions from Data: Driving an Organization on Analytics. We have just finished the first version of both of these courses, and it has been a very enjoyable experience. The students (in both courses) have been interested and keen to learn, bringing relevant and interesting problems to the table, and we have managed to do what it said on the tin (I think) – make them better consumers of analytics, capable of having a conversation with the analytics team, employing the right vocabulary and being able to ask more intelligent questions.
Of course, programs of this type do not allow you to dive deep into how things work, though we have been able to demonstrate MySQL, Python and DataRobot, and also give the students an understanding of how rapidly these things are evolving. We have talked about deep learning, for instance, but not how it works.
But that is easy to fix – almost everything about machine learning is available on Youtube and in other web channels, once you know a little bit of the language. For instance, to understand how deep learning works, you can check out a series of videos from Grant Sanderson, who produces very good educational videos on the web site 3Blue1Brown.
Of course, you have to be rather interested to dive into this, and it certainly is not required reading for an executive who only wants to be able to talk intelligently to the analytics team. But it is important (and a bit reassuring) to note the mechanisms employed: breaking a very complex problem up into smaller problems, breaking those up into even smaller problems, solving the small problems by programming, then stepping back up. For those of you with high school math: it really isn’t that complicated. Just complicated in layers.
And it is good to know that all this advanced AI stuff really is rather basic math. Just applied in an increasingly complex way, really fast.
Together with Chandler Johnson and Alessandra Luzzi, I currently teach a course called Analytics for Strategic Management. In this course (now in its second iteration), executive students work on real projects for real companies, applying various forms of machine learning (big data, analytics, whatever you want to call it) to business problems. We have just finished the second of five modules, and the projects are now defined.
Here is a (mostly anonymised) list:
The Agency for Public Management and eGovernment (Difi) wants to understand and predict which citizens are likely to reserve themselves against electronic communications from the government. The presumption is that these people may be mostly old, not on electronic media, or in other ways digitally unsophisticated – but that may not be true, so they want to find out.
An electric power distribution company wants to investigate power imbalances in the electric grid: in the electric grid, production has to match consumption at all times, or you will get (sometimes rather large) price fluctuations. Can they predict when imbalances (more consumption than production, for instance) will occur, so that they can adjust accordingly?
A company in the food and beverage industry wants to offer recommendations to their (business) customers: when a customer orders products from them, how can they suggest other products that may either sell well or differentiate the customer from the competition?
A petroleum producing company wants to predict unintended shutdowns and slowdowns in their production infrastructure. Such problems are costly and risky, but predictions are difficult because they are rather rare – and that creates difficulties with unbalanced data sets.
A major bank wants to look into the security profiles of their online customers and investigate whether some customers are less likely to be exposed to security risks (and therefore may be able to use less cumbersome security procedures than others).
An insurance company wants to investigate which of their new customers are likely to leave them (churn analysis) – and why. They want to find them early, while there is still time to do something to make them stay.
A ship management company wants to investigate the use of certain types of oil and optimise the delivery and use of it. (Though the oil is rather specialised, the ships are large and the expense significant.)
Norsk Tipping runs a service helping people who are in danger of becoming addicted to gaming, an important part of their societal responsibility which they take very seriously. They want to identify which of their customers are most likely to benefit from intervention. This is a rather tricky and interesting problem – you need to identify not only those who are likely to become addicted, but also make a judgement as to whether the intervention (of which there is limited capacity) is likely to help.
A major health club chain wants to identify customers who are not happy with their services, and they want to find them early, so they can make offers to activate them and make them stay.
A regional bank wants to identify customers who are about to leave them, particularly those who want to move their mortgage somewhere else. (This is also a problem of unbalanced data sets, since most customers stay.)
A major electronic goods retailer wants to do market basket analysis to be able to recommend and stock products that customers are likely to buy together with others.
All in all, a fairly typical set of examples of the use of machine learning and analytics in business – and I certainly like to work with practical examples with very clearly defined benefits. Now – a small matter of implementation!
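The market-basket project above boils down to counting: how often do two products appear in the same basket, and how often does buying A predict buying B (the classic “confidence” measure)? A toy sketch with invented baskets – real retailers would use a proper algorithm like Apriori on millions of transactions:

```python
from collections import Counter
from itertools import combinations

# Invented transaction data: each basket is a set of products
baskets = [
    {"tv", "hdmi cable", "soundbar"},
    {"tv", "hdmi cable"},
    {"laptop", "mouse"},
    {"tv", "soundbar"},
    {"laptop", "mouse", "usb hub"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    pair_counts.update(frozenset(p) for p in combinations(sorted(basket), 2))

def confidence(a, b):
    """P(b in basket | a in basket) -- the association-rule measure."""
    return pair_counts[frozenset((a, b))] / item_counts[a]
```

Here, two of the three TV buyers also bought an HDMI cable – which is exactly the kind of pattern that drives a “customers also bought” recommendation.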
There are many things to say about Stephen Fry, but it is enough to show this video, filmed at Nokia Bell Labs, explaining, amongst other things, the origin of microchips, the power of exponential growth, and the adventure and consequences of the evolution of performance and functionality. I am beginning to think that “the apogee, the acme, the summit of human intelligence” might actually be Stephen himself:
(Of course, the most impressive feat is his easy banter on hard questions after the talk itself. Quotes like: “[and] who is to program any kind of moral [into computers ]… If [the computer] dives into the data lake and learns to swim, which is essentially what machine learning is, it’s just diving in and learning to swim, it may pick up some very unpleasant sewage.”)
(This is a translation of an earlier post in my Norwegian blog. This translation was done by Ragnvald Sannes using Google Translate with a few amendments. This technology malarky is getting better and better, isn’t it?).
I have just finished teaching four days of data analytics – proper programming and data collection. We (Chandler, Alessandra and the undersigned) have managed to trick over 30 executives and middle managers in Norway into attending a programming and statistics course (more or less; this is actually what analytics basically is), while sort of wondering how we did that. The students are motivated and hard-working and have many smart questions – in a course taught in English. It is almost enough to make me stop complaining about the state of the world and education and other things.
Anyway – what are these students going to do with this course? We are working on real projects, in the sense that we require people to come up with a problem from their own job – preferably something that is actually important and where deep data analysis can make a difference. This has worked for almost all the groups: they work on real issues in real organizations – and that is incredibly fun for the teacher. Here is a list of the projects, so judge for yourself. (I do not identify any students here, but believe me – these people face these issues every day.) Well worth spending time on:
What is the correct price for newly built homes? A group is working to figure out how to price homes that are not built yet, for a large residential building company.
What is the tax effect of the sharing economy? This group (where one student works for the Tax Administration) tries to figure out how to identify people who cheat on the tax as Uber drivers – while making suggestions on how tax rules can be adapted to make it easy to follow the law.
What characterizes successful consulting proposals? A major consulting firm wants to use data from their CRM system (which documents the bidding process) to understand what kind of projects they will win or lose.
How to recognize money laundering transactions? A bank wants to find out if any of their customers are doing money laundering through online gaming companies.
How to offer benefits to customers with automated analysis? A company that supplies stock trading terminals wants to use data analysis to create a competitive edge.
How to segment Norwegian shareholders? A company that offers online trading of shares wants to identify segments of its customers to pinpoint and improve its marketing strategy.
How to lower costs and reduce the risk of production stoppages in a process business? A hydropower company wants to better understand when and why their power stations need repairs or maintenance.
How to identify customers who are in the process of terminating? A TV company wants to understand what characterizes “churn” – how can they identify customers who are about to leave them?
Why are some wines more popular than others? A group will work with search data from a wine site to find out what makes some wines more sought after than others.
Which customers will buy a new product? A group is working on data from a large bank that wants to offer its existing customers more services.
How to increase the recycling rate for waste in Oslo? REN – Oslo’s municipal trash service – wants to find out if they can organize routes and routines differently to better utilize trash trucks and recycling plants.
How to avoid being sold out for promotional items? One of Norway’s largest grocery chains wishes to improve their ordering routines so that customers do not get to the store and find out that there is no more left of the offer they wanted.
How to model fraud risk in maritime insurance? An insurance company wants to build a model to understand how to find customers attempting to defraud companies or authorities.
Which customers are about to leave us? A large transport company wants to find out which customers are about to go to a competitor so that they can take action before it happens.
What characterizes students who drop out? BI admits 3,500 new students each year, but some of them leave after the first year. How can we find evidence that a student is about to drop out?
Common to all the projects – and so it is with all the student projects I have advised since I started in this industry – is that you start with a big question and reduce it to something that can actually be answered. Then you look for data and find that you need to reduce it even more. Then you run into problems where the data either cannot be found, is unreliable, or is inadequate – and you have to figure out what to do about it. Finally, after about 90% of the time and money budget is gone, you can begin to think about analysis. And then there is a risk that you find nothing…
And that is an important lesson of this course: the goal is that the students should know enough about actual data analysis to ask the right questions and have realistic expectations of what kind of answers they can actually get.
My excellent colleagues Alessandra Luzzi and Chandler Johnson have pointed me to this video, a keynote speech from 2015 by Ken Rudin, head of analytics at Facebook:
This is a really good speech, and almost an advertisement for our course Analytics for Strategic Management, which starts in two days (and, well, sorry, it is full, but will be arranged again next year.)
In the talk (starting about 1:30 in), Ken breaks down four common myths surrounding Big Data:
Big Data does not necessarily imply the use of certain tools, in particular Hadoop. Hadoop can sift through mountains of data, but other tools, such as relational databases, are better at ad hoc analysis once you have structured the data and determined which parts of it are interesting and worth analyzing.
Big Data does not always provide better answers. Big Data will give you more answers, but, as Rudin says, can give you “brilliant answers to questions that no one cares about.” He stated that the best way to better answers is to formulate better questions, which requires hiring smart people with “business savvy” who will ask how to solve real business problems. Also, you need to place the data analysts out in the organization, so they understand how the business runs and what is important. He advocates an embedded model – centrally organized analysts sitting geographically with the people they are helping.
Data Science is not all science. A lot of data science has an “art” to it, and you have to have a balance. Having a common language between business and analytics is important here – and Facebook sends its people to a two-week “Data Camp” to learn that. You need to avoid the “HiPPO” problem – the highest paid person’s opinion – essentially, not enough science. The other side is the “groundhog” issue – based on the movie, where the main character tries to win the girl by gradual experimentation – endless trial and error without a guiding idea. Data is like sandpaper – it cannot create a good idea, but it can shape it after it has been created.
The goal of analytics is not insights, but results. To that end, data scientists have to help make sure that people act on the analysis, not just inform them. “An actionable insight that nobody acts on has no value.”
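Rudin’s point about relational databases being better for ad hoc analysis once the data is structured is easy to demonstrate: after the heavy batch work has reduced raw logs to a clean table, each new question becomes one query. A sketch using Python’s built-in SQLite (the table and numbers are invented):

```python
import sqlite3

# Pretend a Hadoop-style batch job has already reduced raw logs to a
# clean (user, feature, uses) table; ad hoc questions are now cheap.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE usage (user TEXT, feature TEXT, uses INTEGER)")
con.executemany(
    "INSERT INTO usage VALUES (?, ?, ?)",
    [("ann", "search", 40), ("ann", "share", 2),
     ("bob", "search", 5), ("bob", "share", 30)],
)
# Ad hoc question: which features get the most use overall?
rows = con.execute(
    "SELECT feature, SUM(uses) FROM usage "
    "GROUP BY feature ORDER BY SUM(uses) DESC"
).fetchall()
```

The batch tool does the sifting; the relational database answers the follow-up questions in seconds.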
To the students we’ll meet on Tuesday: This is not a bad way of gearing up for the course. To anyone else interested in analytics and Big Data: This video is recommended.
(And if you think, like I do, that this sounds like the discussion of what IT should be in an organization 20 years ago – well, fantastic, then we know what problems to expect and how to act on them.)
The course (over five modules) is aimed at managers who want to become sophisticated consumers of analytics (be it Big Data or the more regular kind). The idea is to learn just enough analytics that you know what to ask for, where the pressure points are (so you do not ask for things that cannot be done or will be prohibitively expensive). The participants will learn from cases, discussions, live examples and assignments.
Central to the course is a course analytics project, where the participants will seek out data from their own company (or, since it will be group work, someone else’s), figure out what they can do with the data, and end up, if not with a finished analysis (that might happen), at least with a well-developed project specification.
The course will contain quite a bit of analytics – including a spot of Python and R programming – again, so that the executives taking it will know what they are asking for and what is being done.
We were a bit nervous about offering this course – a technically oriented course with a February startup date. The response, however, has been excellent, with more than 20 students signed up already. In fact, we will probably be capping the course at 30 participants, simply because this is the first time we are teaching it: we will be doing everything for the first time and will undoubtedly change many things as we go along, so 30 is more than enough.
If you can’t do the course this year – here are a few starting pointers to whet your appetite:
Big Data is difficult to define. This is always the case with fashionable monikers – for instance, how big is “big”? – but good ol’ Wikipedia comes to the rescue, with an excellent introductory article on the concept. For me, Big Data has always been about having the entire data set instead of a sample (i.e., n = p), but I can certainly see the other dimensions of delineation suggested here.
Data may be big but often is bad, causing data scientists to spend most of their time fixing errors, cleaning things up and, in general, preparing for analytics rather than the analysis itself. Sometimes you can almost smell that the data is bad – I recommend The Quartz guide to bad data as a great list of indicators that something is amiss.
Data scientists are few, far between and expensive. There is a severe shortage of people with data analysis skills in Norway and elsewhere, and the educational systems (yours truly excepted, of course) are not responding. Good analysts are expensive. Cheap analysts – well, you get what you pay for. And, quite possibly, some analytics you may like, but not what you ought to get.
There is lots of data, but a shortage of models. Though you may have the data and the data scientists, that does not mean that you have good models. It is actually a problem that as soon as you have numbers – even bad ones – they become a focal point for decision makers, who show a marked reluctance to ask where the data is coming from, what it actually means, and how the models behind it were constructed.
And with that – if you are a participant, I look forward to seeing you in February. If you are not – well, you had better boogie over to BI’s web pages and sign up.