* Plus a few papers, videos, and other fun things.
We love books, and it feels fitting to end 2015 with a round-up of a few that we thought were particularly excellent this year.
Our favorite book for data science beginners: Data Science from Scratch: First Principles with Python by Joel Grus (and if you order from O’Reilly directly, use discount code PCBW and get 50% off most ebooks/videos and 40% off most print books).
Our favorite machine learning resources for developers and data scientists: There wasn’t one machine learning book we loved more than these papers and videos: Yoav Goldberg’s Primer on Neural Network Models for Natural Language Processing (PDF); Yann LeCun, Yoshua Bengio & Geoffrey Hinton’s paper on Deep Learning; Chris Olah’s neural network blog; this YouTube channel with two-minute paper descriptions; and Tom Scott’s video Automated Weapons and the Battlefield of 2050. And if you’re a Python developer, we liked Fluent Python for brushing up on idiomatic Python 3.
Science fiction that helps us imagine the future: We loved Neal Stephenson’s Seveneves for the near-future space hacking and Ann Leckie’s Ancillary Justice for the unusual artificially intelligent characters in a far-future society (the third book in the trilogy was published this year). For a far, far future space criminal caper with lots of imaginative future technology, give Hannu Rajaniemi’s The Quantum Thief a try.
Resources for managing data projects: We recommend this paper on Technical Debt in Machine Learning Systems (PDF) for some perspective on the long-term maintenance of machine learning models.
This is really not a book: …but we love the Talking Machines podcast, which features lively conversations with guests from across the machine learning community.