Defining Machine Learning

Am I the only one who finds that most definitions of Machine Learning are highly misleading? Google the exact phrase “without being explicitly programmed” and you’ll find thousands of variants that everyone seems to embrace. At the risk of being a contrarian, I find that this phrase conjures up images of computers learning like puppies or babies, something I believe we are still years (or decades) from achieving.

Yes, I know that the phrase is attributed to Arthur Samuel (a topic worthy of a little internet searching too) and that it has a glimmer of truth to it, but I think we are all going to have to suffer through executive briefings where we try to explain why our supervised learning models have actually degraded over time, not magically improved.

I came across one particularly striking definition on Wikipedia: Machine learning (ML) is the study of computer algorithms that improve automatically through experience. It is attributed to Tom Mitchell’s Machine Learning (1997). I can’t claim to have read the whole book, so I don’t want to offend by taking the definition out of context, but that sounds almost magical to me.

For the time being, supervised learning still dominates. We still need learning algorithms. See the 9th Law of Data Mining for more about how our models still degrade. Most techniques, other than Deep Learning, still need feature engineering to be effective. And since Deep Learning is a black box, many practitioners can’t use it. We need to explain to our colleagues what machine learning is without appealing to magic. There are even t-shirts with phrases like “the future is unsupervised,” which I find rather charming, but at client sites I’m still solving problems with supervised learning.

Not surprisingly, Russell & Norvig do an excellent job with their definition:

An agent is learning if it improves its performance after making observations about the world. When the agent is a computer, we call it machine learning: a computer observes some data, builds a model based on the data, and uses the model as both a hypothesis about the world and a piece of software that can solve problems.

Lately, I’ve been using the following definitions in my presentations and courses.

Machine Learning:

A broad term that generally refers to presenting carefully curated data to computer algorithms that find patterns and systematically generate models (formulas and rule sets).

Now, Supervised Machine Learning:

Given a dataset with a “target variable” and “input variables,” a modeling algorithm automatically generates a model (a formula or a rule set) that establishes a relationship between the target and some or all of the input variables.

So, while the algorithms are explicitly programmed, the models are not.
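
To make that distinction concrete, here is a minimal sketch, assuming scikit-learn is available and using a tiny made-up dataset with hypothetical feature names. The fitting algorithm (a decision tree learner) is explicit code that someone programmed; the rule set it produces is generated from the data.

    # A toy, hypothetical example: the algorithm is explicitly programmed,
    # but the model (a rule set) is generated from the data, not written by a person.
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Made-up dataset: two input variables and a binary target variable.
    X = [[25, 0], [47, 1], [52, 1], [23, 0], [58, 1], [31, 0]]  # input variables
    y = [0, 1, 1, 0, 1, 0]                                       # target variable

    # The explicitly programmed part: the learning algorithm.
    model = DecisionTreeClassifier(max_depth=2).fit(X, y)

    # The generated part: the model, printed as human-readable rules.
    print(export_text(model, feature_names=["age", "owns_home"]))

The printed rules are exactly the kind of “formula or rule set” the definition above refers to: nobody typed them in; the algorithm derived them from the relationship between the target and the input variables.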

I could be accused of spending a lot of time reading definitions and crafting my own, and I’d be guilty as charged. But I believe definitions are important because, carefully written, they clarify what we are trying to do. In a work setting, they also have the very practical impact of clarifying “who should do what.”

How do we know what assignments to give the data science team if we haven’t decided what they do?