Boston-based NextView Ventures runs a podcast series called Traction that features interviews with exciting new startups. Their latest podcast features an interview with our own Hilary Mason. Some highlights:
In essence, data science is the practice of extracting insights from a data set and building a product based on those insights.
The Fast Forward Labs team starts every data advisory engagement, whether with a small startup or a large enterprise client, with a data census comprising three questions:
What data do we have? This is often harder to answer than it seems, especially in large enterprises with siloed departments. We have our clients review their products to record what data they collect, and inspect their servers to inventory what kinds of data they store.
What data should we have? Here, we think about how the business runs and what types of questions are important for growth and success. We then ask clients to think about where they would store new data they may collect, be that in a Hadoop cluster, in Amazon’s S3, or on internal servers.
What assumptions have we made that we can now validate with data? Many companies have intuitions about market penetration and product opportunities. Analysis can verify or challenge these assumptions, opening new avenues for growth.
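One way to make the three census questions concrete is to record the answers in a small data structure. The sketch below is only an illustration, not something described in the podcast: the `DataCensus` class, its field names, and the example datasets and assumptions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataCensus:
    """Hypothetical record of a data census; all names here are illustrative."""
    # Question 1: dataset name -> where it is currently stored
    have: Dict[str, str] = field(default_factory=dict)
    # Question 2: dataset name -> where it would be stored once collected
    want: Dict[str, str] = field(default_factory=dict)
    # Question 3: business assumption -> datasets needed to check it
    assumptions: Dict[str, List[str]] = field(default_factory=dict)

    def testable_assumptions(self) -> List[str]:
        """Assumptions whose required datasets are all already collected."""
        return [
            claim
            for claim, needed in self.assumptions.items()
            if all(name in self.have for name in needed)
        ]

    def blocked_assumptions(self) -> Dict[str, List[str]]:
        """Assumptions that cannot be checked yet, mapped to the missing datasets."""
        blocked = {}
        for claim, needed in self.assumptions.items():
            missing = [name for name in needed if name not in self.have]
            if missing:
                blocked[claim] = missing
        return blocked


# Made-up example entries, purely for illustration.
census = DataCensus(
    have={"signup_events": "internal servers", "purchase_logs": "Amazon S3"},
    want={"support_tickets": "Hadoop cluster"},
    assumptions={
        "Most purchases come from recent signups": ["signup_events", "purchase_logs"],
        "Support load predicts churn": ["support_tickets", "purchase_logs"],
    },
)

print(census.testable_assumptions())
print(census.blocked_assumptions())
```

Keeping the three answers side by side shows which assumptions can be checked with data already on hand and which depend on data that still needs to be collected.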
Fast Forward Labs was founded to provide a new vehicle for applied research, connecting the startup community with large enterprises to test and amplify machine learning technologies that will have an impact in the coming years.
The most important lesson Hilary has learned of late is the value of understanding the history and evolution of different technologies. Contemporary programming environments are abstractions upon abstractions that can generate bizarre behavior when they hit edge cases. Grasping some of the historical quirks makes it easier to resolve coding challenges.
Listen to the entire podcast here: https://soundcloud.com/nextview/15-skype-side-chat-on-data-science-inventing-the-future-hilary-mason-fast-forward-labs