If the only tool you have is Artificial Intelligence, treat everything as if it were a prediction!

Probably no adage has been responsible for more business practices, good and bad, than the law of the instrument, attributed to Abraham Maslow, which reads: “I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.”

Hammer

The phrase is attributed to Maslow in The Psychology of Science, 1966, and his earlier book, Toward a Psychology of Being (1962). Sharlyn Lauby has drawn the following lesson from the law: “We need to choose the tools we work with carefully.” Some tools are adaptable, while others should be employed “only for their intended purpose”.

However, this may not be the way economists see the use of a hammer. From an economist’s point of view, the more nails you can find, the lower the unit cost of manufacturing your hammers. So, paraphrasing Shakespeare, economists are most likely to say, “the world’s mine nail, Which I with hammer will pummel.”

Regardless of its origins, the simple adage is having an impact in, of all places, the avant-garde world of Artificial Intelligence (AI). Great advances in AI have been made within the last few years, primarily because the approach to performing AI changed significantly. Initially, AI researchers pursued “logic-based” AI, but that proved intractable, so they switched to “statistical” AI. Since then advances have been dramatic and widely adopted. Manufacturers of AI products are now on the hunt for as many “statistical problems” as they can find.

Machine learning, a prominent type of statistical AI, is employed in a range of computing tasks where designing and programming explicit algorithms with good performance is difficult or infeasible. Examples of machine learning include email filtering, detection of network intruders or malicious insiders working towards a data breach,[6] optical character recognition (OCR), learning to rank, and computer vision.
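Email filtering is the most familiar of these. A toy naive Bayes spam filter, one common statistical approach, can be sketched in plain Python; the training messages and word counts below are invented purely for illustration:

```python
from collections import Counter
import math

# Toy training corpus: (message, label) pairs. All data here is invented.
TRAIN = [
    ("win money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda attached", "ham"),
    ("lunch tomorrow at noon", "ham"),
]

def train(examples):
    """Count word frequencies per label for a naive Bayes classifier."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in examples:
        for word in text.split():
            counts[label][word] += 1
            totals[label] += 1
    return counts, totals

def classify(text, counts, totals):
    """Pick the label with the higher log-probability (Laplace smoothing)."""
    vocab = set(w for c in counts.values() for w in c)
    best_label, best_score = None, float("-inf")
    for label in counts:
        score = math.log(totals[label] / sum(totals.values()))
        for word in text.split():
            p = (counts[label][word] + 1) / (totals[label] + len(vocab))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

counts, totals = train(TRAIN)
print(classify("claim your free money", counts, totals))  # prints: spam
```

Nothing here is "intelligent" in the logic-based sense; the classifier is just counting which words correlate with which label, which is exactly the statistical turn the paragraph above describes.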

The objective of a machine-learning model is the identification of statistically reliable relationships between certain features of “input data” and a “target variable” or outcome. Newton’s principle of induction is the first unwritten rule of machine learning: “We induce the most widely applicable rules we can and reduce their scope only when the data forces us to.”

Now the authors of “The Simple Economics of Machine Intelligence” have been even more informative about the scope to which machine learning can be applied. According to Ajay Agrawal, Joshua Gans, and Avi Goldfarb, all professors at the University of Toronto’s Rotman School, machine intelligence technology is, in essence, a “prediction technology”, relying on statistics and probability. So, like those hammers and nails and “swords and oysters”, the more things we need to predict, the more valuable AI and machine learning applications become.

Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence, coined the term “machine learning” in 1959 while at IBM. Within the field of data analytics, machine learning is a method used to devise models and algorithms that lend themselves to prediction; in commercial use, it’s known as “predictive analytics”. These analytical models allow researchers, data scientists, engineers, and analysts to “produce reliable, repeatable decisions and results” and uncover “hidden insights” by learning from historical relationships.
The calculation of probabilities has been around since the 17th century, when Blaise Pascal was asked to resolve a gambling dispute among some French noblemen; it has since spawned many variations and has been widely applied to enable most artificial intelligence applications. The following are generally accepted machine learning techniques. Their roots in general statistical/prediction technology should be apparent:

  • Regression Analysis
  • Clustering Analysis
  • Dimensionality Reduction
  • Support Vector Machines
  • Artificial Neural Networks
  • Decision Trees
  • Association Analysis
  • Recommender Systems
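As a taste of the first technique on the list, a simple linear regression can be fitted with the closed-form least-squares formulas in a few lines of plain Python. The data points are made up for illustration:

```python
# Fit y = intercept + slope * x by ordinary least squares.
# Invented data, roughly following y ≈ 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 8.1, 9.9]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = covariance(x, y) / variance(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

def predict(x):
    """Use the fitted line to predict an outcome for a new input."""
    return intercept + slope * x

print(round(slope, 2), round(predict(6.0), 2))  # prints: 1.97 11.97
```

Every technique on the list, however elaborate, shares this shape: estimate a statistical relationship from historical data, then use it to predict the outcome for new inputs.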

According to the authors of “The Simple Economics of Machine Intelligence”, an economic shift related to the application of AI will center on a drop in the cost of prediction, which will in turn yield two key implications:

  • As machine intelligence lowers the cost of prediction, we’ll begin to use it as an input for things for which we never previously did (i.e. hammers and nails, swords and oysters, etc.).
  • The value of other inputs related to prediction, like judgment and creativity, will change based on the degree to which they complement or substitute for prediction: complements will increase in value and substitutes will decrease in value.
( https://hbr.org/2016/11/the-simple-economics-of-machine-intelligence )

Human beings are really bad at predicting the future, and that includes experts. In the largest and best-known test of the accuracy of expert predictions, a study reported in Philip Tetlock’s book Expert Political Judgment: How Good Is It? How Can We Know?, “the average expert was found to be only slightly more accurate than a dart-throwing chimpanzee. Many experts would have done better if they had made random guesses. And even the best forecasters were beaten by arbitrary rules such as ‘always predict no change’.”

Regardless of the technique used to produce intelligence, logic-based or statistical, the scope of AI is focused on achieving an objective or target state.

A simple example of computers developing actions to achieve a target state is the often maligned but widely used “credit scoring” model. Credit scoring has almost all of the features of an AI system that evaluates courses of action to achieve a target. This is how it works.

Credit score

The first step in credit scoring is the establishment of an objective or target state. In the case of credit scoring, the target state is the “probability of default” (PD). Credit reporting agencies are aficionados of “Big Data”, which is really nothing more than databases with extraordinarily large amounts of data, structured to enable the identification of relationships, patterns and trends.

In the case of credit scoring, big data is used as the basis for calculating the probability of default for different groups of consumers. While “Big Data” may be relatively new, the calculation of probabilities has been around since the 17th century.
The second, and maybe the most important, step in the calculation of credit scores is the determination of variables correlated with the PD. Groups of consumers are then stratified into ranges from highest to lowest probability of default, based on how many of the variables correlated with the PD appear in their credit records. Points are assigned to the different strata. For most credit scoring models, the strata along with their point values look something like the following:
• Excellent 815–850
• Very Good 755–814
• Good 666–754
• Fair 562–665
• Poor 504–561
• Very Poor 300–503
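Those bands amount to a simple lookup table. A minimal sketch, using the cut points listed above:

```python
import bisect

# Lower bound of each band, taken from the list in the text,
# paired with its label.
BOUNDS = [300, 504, 562, 666, 755, 815]
LABELS = ["Very Poor", "Poor", "Fair", "Good", "Very Good", "Excellent"]

def band(score):
    """Map a numeric credit score (300-850) to its band label."""
    if not 300 <= score <= 850:
        raise ValueError("score out of range")
    # Find the rightmost lower bound that is <= score.
    i = bisect.bisect_right(BOUNDS, score) - 1
    return LABELS[i]

print(band(700), band(815))  # prints: Good Excellent
```

The interesting work, of course, is not the lookup but how the points feeding into the score are assigned, which is what the next step describes.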

Once the strata are in place, it’s a simple matter of assigning credit applicants to one stratum or another based on the degree to which applicant data match the variables correlated with the PD. The more an applicant’s data match data correlated with the probability of default, the worse the machine considers the applicant and the fewer points are assigned.
So the general rules of the credit scoring application are:

  • Identify the objective state (e.g. probability of default, PD)
  • Identify variables correlated with the objective state (variables associated with the probability of default)
  • Identify variables associated with the input (i.e. the credit applicant)
  • Find the correlation of variables associated with the input (credit applicant) to variables of the objective state (PD)
  • If the correlation is high, then the PD for the credit applicant is high
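The rules above can be sketched end to end. The variables and weights below are invented for illustration; real scoring models estimate their weights from historical default data, but a logistic function is a common way to turn correlated variables into a probability:

```python
import math

# Hypothetical variables correlated with default, with invented weights.
# A positive weight pushes the predicted probability of default (PD) up.
WEIGHTS = {
    "missed_payments": 0.9,
    "utilization": 1.5,        # fraction of credit limit in use
    "years_of_history": -0.2,  # longer history lowers predicted PD
}
BIAS = -2.0

def probability_of_default(applicant):
    """Combine an applicant's variables into a PD via a logistic model."""
    z = BIAS + sum(WEIGHTS[k] * applicant.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

applicant = {"missed_payments": 3, "utilization": 0.8, "years_of_history": 10}
pd = probability_of_default(applicant)
print(round(pd, 3), "high risk" if pd > 0.5 else "low risk")
```

The resulting PD would then be mapped to a points band like the strata shown earlier; the machine never "understands" the applicant, it only measures how closely the applicant's data match data that preceded past defaults.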

Let’s look at another example that’s a bit more abstract: “beauty”. Paralleldots Inc. recently announced they’ve developed a deep convolutional neural network (CNN) that can be trained to recognize an image’s “aesthetic quality”, not unlike identifying the probability of default. They’ve provided a demo of how their CNN for visual analysis can be applied at this website: ( https://www.paralleldots.com/visual-analytics ).
According to the Paralleldots researchers, “Visual aesthetic analysis is the task of classifying images into being perceived as attractive or unattractive by humans.” In this regard a variety of standard CNN architectures have been pre-trained on the ImageNet dataset and are readily available as open source for use. ImageNet is an image database organized according to the WordNet hierarchy, in which each node of the hierarchy is depicted by hundreds and thousands of images.
Following is a sample of the Google AVA dataset. Researchers at Paralleldots used these objects along with hundreds of others in the Google AVA dataset to enable their CNN to perform “visual aesthetic analysis” by identifying AVA images that “went viral on the internet” (i.e. what the Paralleldots team defined as “visually aesthetic” ).

Google Visuals

The Virality Detection API was built by training a very deep neural network on a huge corpus of images (i.e. the Google AVA dataset) and their scores crawled from the open web. The score you get as output is the virality score of the input photo, out of a maximum score of 100. The Paralleldots team says their in-house experiments suggest that the accuracy of their algorithm in predicting image virality is as high as 85%.

For example, when I ran the following picture of an Arizona sunset through the Virality Detection API, it received a score of 92%. The picture of the Charles Bridge in Prague received a score of 34%.

Arizona

Charles Bridge

So the Paralleldots Virality Detection API is “predicting” that humans will find the Arizona sunset more attractive than the Charles Bridge, based on the probability that either will “go viral” on the Internet.
While the virality detection application may appear to be much more sophisticated than a mundane credit scoring application, the prediction principles on which both are based are the same:
  • Identify the objective state (e.g. probability of going viral on the Internet)
  • Identify variables correlated with the objective state (variables associated with photos that went viral on the Internet)
  • Identify variables associated with the input (i.e. the candidate images)
  • Find the correlation of variables associated with the input (candidate images) to variables of the objective state (virality on the Internet)
  • If the correlation is high, then the “beauty” (i.e. probability of going viral on the Internet) of the image is high (i.e. the Arizona sunset is “more beautiful” than the Charles Bridge)
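That recipe is generic enough to sketch in a few lines. Nothing below resembles Paralleldots’ actual model: the feature vectors are invented, and a plain Pearson correlation stands in for whatever the CNN actually learns. The shape of the computation, though, is the same: compare an input’s features to the features of things that achieved the objective state.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented feature vectors (imagine brightness, contrast, saturation, ...)
# for the average image that went viral, and for two candidate images.
viral_profile = [0.9, 0.7, 0.8, 0.6]
sunset = [0.85, 0.75, 0.8, 0.55]   # close to the viral profile
bridge = [0.3, 0.9, 0.4, 0.8]     # far from the viral profile

for name, features in [("sunset", sunset), ("bridge", bridge)]:
    r = pearson(viral_profile, features)
    # The sunset correlates strongly with the viral profile; the bridge doesn't.
    print(name, round(r, 2), "predict viral" if r > 0.5 else "predict not viral")
```

A real CNN learns far richer features than this, but the prediction logic at the end, high correlation with the target state means a high predicted “beauty”, is the same five-step recipe as credit scoring.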

All of this may be unsettling for art critics, and many are probably saying AI people are simply seeing all evaluations of art as “nails” because they now have an AI hammer. However, according to the University of Toronto researchers, the value of other inputs related to predicting the beauty of pieces of art, like judgment and creativity, will change based on the degree to which they complement or substitute for that prediction: complements will increase in value and substitutes will decline in value.
Art critics may want to think about developing their skills on the tasks the University of Toronto researchers found where humans excel and machines struggle, like:

  • Learning to learn – Humans are good at learning to learn — they can learn a new skill completely unrelated to their current skill set, can decide what to learn and find and gather data accordingly, can learn implicitly/subconsciously, can learn from a variety of instruction formats, and can ask relevant questions to enhance their learning. Machines today are only beginning to learn to learn.
  • Common Sense – Humans are good at exercising “common sense”, i.e. judgment in universal ways without thinking expansively or requiring large data sets.
  • Intuition & “zeroing-in” – The human brain is good at exercising intuition and zeroing in, i.e. finding a fact, idea or course of action from a very large, complex, ambiguous set of options.
  • Creativity – True creativity would entail the ability to generate novel solutions to problems not previously seen, or to create truly innovative works of art.
  • Empathy – Empathy remains a uniquely human skill.
  • Versatility – Machines and robots are still purpose-built for specific tasks.

While it’s true that as machine intelligence increases, the value of human prediction skills will decrease, just as the value of human arithmetic declined when machines took it over, that should not spell doom for human art critics. That’s because the value of the judgment skills of art critics will increase.

Using the language of economists, the judgment of art critics can be a complement to the automation of predicting beauty; therefore, when the cost of predicting beauty falls, the demand for art critics with good judgment will rise. And it will keep rising as AI manufacturers turn pieces of art into nails, or oysters, or anything else they can predict!

____________________________________________________________________________________________

Notes:

  1. William Shakespeare’s Pistol in “The Merry Wives of Windsor”, Act II, Scene II.
  2. Domingos, Pedro. The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World (p. 66). Basic Books. Kindle Edition.
  3. William Shakespeare’s Pistol in “The Merry Wives of Windsor”, Act II, Scene II.
  4. Oliver Theobald, Machine Learning for Beginners, 2017
  5. Philip E. Tetlock, Expert Political Judgment: How Good Is It? How Can We Know?, Princeton University Press, 2005
  6. Artificial Intelligence — Human Augmentation is what’s here and now
