The Value of Going "Old School" in Data Science
By Dr. Tim Oates, Chief Data Scientist
When I entered graduate school to work on a Ph.D. in Artificial Intelligence, the outlook for the field was uncertain, with dwindling attendance at the main national conferences and at best a lukewarm job market. Things could not be more different today! There’s genuine excitement in academia, in industry, and even in the popular press about recent advances in AI and Machine Learning. We’ve got machines that beat the world’s best human players at Jeopardy! and Go, and that learn to play Atari games the way people do, by watching the video screen and experimenting.
That may be fun and games, but underneath these successes are two driving forces, one more important than the other: new learning algorithms and more data. Which do you think is the more important one? Sometimes we solve new problems because we didn’t have the right algorithms before. More often than not, especially in the business world, we solve new problems because we didn’t have the data before or, more likely, we didn’t have the time to look carefully enough at the data we had.
I’ve seen this play out over and over again in talking to businesses about their data. Companies of all sizes know their own data well and typically have great ideas about how they’d like to use it in new and interesting ways, but they just don’t know how to get started. They may have heard about deep learning or deep neural networks and wonder if these new, powerful tools could be the right thing. But 9 times out of 10, it’s better to go old school and use tools that have been around in one form or another since I was a grad student. And this isn’t a simple case of nostalgia. The first International Conference on Machine Learning was held in 1980, and powerful, well-understood learning algorithms that have been around for decades are often simpler, faster, and easier to explain than the latest and greatest 100-layer Convolutional Neural Network, and they frequently produce better results that are easier to interpret.
Sometimes you have to break out the algorithmic big guns, but more often than not there’s a better path to extracting value from data, where “better” means faster, cheaper, and more impactful for the business. The key is knowing the toolbox well and choosing the right tool for the job at hand.