Feb 3, 2016 · post
What History Teaches Us About Data Science
The FFL team at the New York Historical Society’s Silicon City exhibit
Study the past if you would define the future. — Confucius
Until April 17, 2016, the New York Historical Society is featuring an exhibition called Silicon City: Computer History Made in New York. The Fast Forward Labs team took a field trip to the museum back in December to augment our perspective on our current machine intelligence research (and, of course, to geek out and have fun).
One of the highlights was seeing Fast Forward Labs itself appear at the end of the tour! The Society closes the exhibition with an interactive map showing where innovative NYC startups are located today across Manhattan and the boroughs. We’re honored to be featured in this company spotlight, where we talk about why New York is an ideal home for the tech industry and what trends we envision for the city in the future.
The exhibition led us to reflect on why we believe it’s important to remember our history as we build our future. Here are a few things we loved:
1. Claude Shannon, the father of UX design?
We tend to think that machine learning is just taking off, but the foundations of many contemporary techniques date back to the 1950s (Frank Rosenblatt built his hardware Perceptron, the ancestor of the modern neural network, in 1957).
One of our favorite items in Silicon City was a model of Theseus, the maze-solving mouse (as in rodent, not input device) Claude Shannon built using machine learning techniques. (Theseus was the Greek hero who navigated the labyrinth to kill the Minotaur, finding his way back out thanks to the ball of thread his lover Ariadne gave him to mark his path.)
In 1952, Bell Labs produced a video where Shannon himself describes how Theseus can solve a “certain class of problems with trial and error and then remember the solution; in other words, he can learn from experience.” One of the most interesting aspects of the device is that the computing power does not reside in the mouse, but rather in a vast amount of hardware underneath the maze display. Theseus, therefore, is really just a UX feature, the metaphorical interface humans engage with to understand machine learning in their own sensory and intellectual terms.
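To make "trial and error, then remember the solution" concrete, here is a minimal Python sketch of the idea. It is not Shannon's relay design: the maze layout, the depth-first search strategy, and every name in it (`MAZE`, `explore`, `replay`) are our own assumptions, chosen only for illustration. The first run blunders into a dead end and backtracks; the second run simply follows the directions stored in memory.

```python
# Illustrative sketch only: a hypothetical maze and a simple depth-first
# search stand in for Theseus's relay logic, which worked differently.
MOVES = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}

MAZE = [
    "#####",
    "#S..#",
    "#.###",
    "#..G#",
    "#####",
]


def find(maze, ch):
    """Return the (row, col) of the first cell containing ch."""
    for r, row in enumerate(maze):
        c = row.find(ch)
        if c != -1:
            return (r, c)
    raise ValueError(f"{ch!r} not found in maze")


def explore(maze, start, goal):
    """Trial and error: try each direction, back out of dead ends, and
    remember the direction that worked for each cell on the route."""
    memory = {}      # cell -> direction that eventually led to the goal
    visited = set()
    moves = 0        # every step taken, including steps back

    def visit(cell):
        nonlocal moves
        if cell == goal:
            return True
        visited.add(cell)
        for name, (dr, dc) in MOVES.items():
            nxt = (cell[0] + dr, cell[1] + dc)
            if maze[nxt[0]][nxt[1]] == "#" or nxt in visited:
                continue
            moves += 1               # step forward
            if visit(nxt):
                memory[cell] = name  # this direction worked: remember it
                return True
            moves += 1               # dead end: step back
        return False

    visit(start)
    return memory, moves


def replay(memory, start, goal):
    """Second run: follow the remembered directions, no searching."""
    cell, path = start, [start]
    while cell != goal:
        dr, dc = MOVES[memory[cell]]
        cell = (cell[0] + dr, cell[1] + dc)
        path.append(cell)
    return path


START, GOAL = find(MAZE, "S"), find(MAZE, "G")
memory, trial_moves = explore(MAZE, START, GOAL)
path = replay(memory, START, GOAL)
print(f"first run: {trial_moves} moves, second run: {len(path) - 1} moves")
```

In this toy maze the first run wastes moves on the corridor to the east before finding the way south, while the replay goes straight to the goal. The "learning" lives entirely in the `memory` table, much as Theseus's lived in the hardware under the maze rather than in the mouse.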
If we think about it, chatbots and other new linguistic interfaces are also a form of UX design. Thanks to advances in language generation technologies, designers can now present the output of complex data models as friendly conversations, not just as simple, elegant buttons or visuals.
2. Products (and models) have their moment of maturity, but we must be wary of trends
The Greeks had three forms of time: aeon (eternity), chronos (linear time), and kairos (the right or opportune moment). Products, like data science techniques, follow the laws of kairos. Some products fail because they are released to market prematurely, only for the same idea to soar in later generations (e.g., Apple Newton). Some algorithms fail because they don’t yet have adequate data to realize their computational potential, only to lead to amazing breakthroughs when more data becomes available (e.g., artificial neural networks).
That said, data scientists and engineers have to retain perspective when selecting the right model to solve a given problem. The fact that everyone’s talking about reinforcement learning doesn’t mean it’s always the right approach. Simple, transparent statistical models still do a great job on certain classes of problems or under practical constraints (e.g., training time).
We got a kick out of this failed Western Electric video conferencing system from 1968. Apparently adoption was low because people didn’t want to have to get out of their pajamas and comb their hair to talk on the phone! I must say I can empathize…
3. Statistics, computer science, and data science
We see many companies struggling to expand data efforts outside business intelligence/corporate financial analytics and into new product development. All the ink spilled of late on defining just what a data scientist is, and what skills they should have, is in part a symptom of the tectonic shift taking place in organizations.
But again, it’s worth putting the discipline of data science in perspective as an evolution of statistics and software engineering. In his recent essay, 50 Years of Data Science, Stanford Statistics Professor David Donoho contextualizes data science in a 50-year history of statistics. Sean Owen (from Cloudera) writes a good response emphasizing the importance of software engineering in practice.
4. IBM had really cool marketing in the 1930s-1950s
- Kathryn