Dec 8, 2016 · post

Dimensionality Reduction and Intuition


“I call our world Flatland, not because we call it so, but to make its nature clearer to you, my happy readers, who are privileged to live in Space.”

So reads the first sentence of Edwin Abbott Abbott’s 1884 work of science fiction and social satire, Flatland: A Romance of Many Dimensions. Abbott drew on the geometry and topology of his day (he was a contemporary of Poincaré) to illustrate the rigid social hierarchies of Victorian England. More than a century later, with machine learning algorithms playing an increasingly prominent role in our daily lives, Abbott’s play on the conceptual leaps required to cross dimensions is relevant again. This time, however, the dimensionality shifts lie not between two human social classes, but between the domain of human reasoning and intuition and the domain of machine reasoning and computation.

Much of the recent excitement around artificial intelligence stems from the fact that computers are newly able to process data historically too complex to analyze. At Fast Forward Labs, we’ve been excited by new capabilities to use computers to perceive objects in images, extract the most important sentences from long bodies of text, and translate between languages. But making complex data like images or text tractable for machines involves representing the data in high-dimensional vectors, long strings of numbers that encode the complexity of pixel clusters or relationships between words. The problem is these vectors become so large that it’s hard for humans to make sense of them: plotting them often requires a space of way more than the three dimensions we live in and perceive!
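To make "long strings of numbers" concrete, here is a minimal sketch of two simple encodings: flattening an image into a pixel vector, and counting words against a vocabulary. The pixel values and sentences are made up for illustration, and real systems typically use learned representations rather than these raw encodings.

```python
# A tiny grayscale "image": 4 x 4 pixel intensities (hypothetical values).
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]

# Flatten row by row: every pixel becomes one dimension of the vector.
pixel_vector = [pixel for row in image for pixel in row]

# Two toy documents (hypothetical) encoded as bag-of-words counts.
docs = ["the cat sat", "the dog sat down"]
vocab = sorted({word for doc in docs for word in doc.split()})

# One dimension per vocabulary word; the value is how often it occurs.
text_vectors = [[doc.split().count(word) for word in vocab] for doc in docs]

print(len(pixel_vector))  # dimensions of the image vector
print(vocab)
print(text_vectors)
```

Even this toy image needs 16 dimensions. A 28 x 28 grayscale image already needs 784, and a bag-of-words vector needs one dimension for every word in the vocabulary.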

On the other hand, machine learning techniques that entirely remove humans from the loop, like automatic machine learning and unsupervised learning, are still active areas of research. For now, machines perform best when nudged by humans. And that means we need a way to bring the high-dimensional vectors machines compute in back down to the two- and three-dimensional spaces our visual systems have evolved to make sense of.

What follows is a brief survey of some tools available to reduce and visualize high-dimensional data. Send us a note if you know of others!

Google’s Embedding Projector

Yesterday, Google open-sourced the Embedding Projector, a web application for interactive visualization and analysis of high-dimensional data that is part of TensorFlow. The release highlights how the tool helps researchers navigate embeddings, or mathematical vector representations of data, which have proved useful for tasks like natural language processing. A popular example is to use embeddings to do “algebra” on words, using the space between vectors as a proxy for semantic relationships like man:king::woman:queen. Embedding Projector includes a few dimensionality reduction techniques like Principal Component Analysis (PCA) and t-SNE. Here’s an example of using PCA on an image data set (done before Google’s release).
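As a rough illustration of both ideas, the sketch below does word "algebra" on hand-made vectors and then uses PCA (computed with a singular value decomposition) to project them down to two dimensions. Everything here is an assumption made for the example: the four-dimensional vectors, their axis meanings, and the helper names `analogy` and `pca_project` are hypothetical, and none of this is the Embedding Projector's own code. Real embeddings are learned from data and have hundreds of dimensions.

```python
import numpy as np

# Hypothetical 4-D "embeddings" with hand-picked axes:
# royalty, male, female, food. Real embeddings are learned, not designed.
words = ["man", "woman", "king", "queen", "apple"]
vectors = np.array([
    [0.0, 1.0, 0.0, 0.0],  # man
    [0.0, 0.0, 1.0, 0.0],  # woman
    [1.0, 1.0, 0.0, 0.0],  # king
    [1.0, 0.0, 1.0, 0.0],  # queen
    [0.0, 0.0, 0.0, 1.0],  # apple
])


def analogy(a, b, c):
    """Answer "a is to b as c is to ?" by cosine similarity to b - a + c."""
    index = {word: i for i, word in enumerate(words)}
    target = vectors[index[b]] - vectors[index[a]] + vectors[index[c]]
    norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(target)
    sims = vectors @ target / norms
    for word in (a, b, c):
        sims[index[word]] = -np.inf  # never return one of the query words
    return words[int(np.argmax(sims))]


def pca_project(X, n_components=2):
    """Project the rows of X onto their top principal components."""
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


answer = analogy("man", "king", "woman")
coords_2d = pca_project(vectors)
print(answer)
print(coords_2d.shape)  # one (x, y) pair per word, ready to plot
```

With these toy vectors the analogy resolves to "queen" because king - man + woman lands exactly on the queen vector. Learned embeddings only approximate this.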


t-Distributed Stochastic Neighbor Embedding (t-SNE) is an increasingly popular non-linear dimensionality reduction technique useful for exploring local neighborhoods and finding clusters in data. As explained in this post, t-SNE algorithms adapt transformations to the structure of the input data they work on, and have a tunable parameter called “perplexity” that “says (loosely) how to balance attention between local and global aspects of your data.” While the algorithms are powerful, their output representations must be read with care, as different perplexity settings can produce very different pictures of the same data.

Visualization of how distances between clusters vary widely under different parameter settings of a t-SNE algorithm.
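For intuition about what perplexity measures, here is a small sketch in plain Python. In t-SNE, each point gets a Gaussian-weighted probability distribution over its neighbors, and perplexity is 2 raised to the Shannon entropy of that distribution, which reads loosely as an effective number of neighbors. In practice you choose the perplexity and the algorithm searches for a matching Gaussian width per point. The sketch runs the other direction (fix the width, report the perplexity) to keep it short. The points and the function names are hypothetical.

```python
import math


def conditional_probs(points, i, sigma):
    """Gaussian similarity of every other point to point i, normalized."""
    weights = []
    for j, other in enumerate(points):
        if j == i:
            weights.append(0.0)  # a point is not its own neighbor
            continue
        sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], other))
        weights.append(math.exp(-sq_dist / (2 * sigma ** 2)))
    total = sum(weights)
    return [w / total for w in weights]


def perplexity(probs):
    """2 ** (Shannon entropy in bits) of a probability distribution."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy


# Hypothetical data: two points near the origin, two far away.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]

# A narrow Gaussian only "sees" the two nearby points.
narrow = perplexity(conditional_probs(points, 0, sigma=0.5))
# A wide Gaussian spreads attention over all four other points.
wide = perplexity(conditional_probs(points, 0, sigma=10.0))
print(narrow, wide)
```

The narrow setting gives a perplexity of about 2 and the wide one just under 4. That is the local-versus-global trade-off the parameter controls.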

Mike Tyka, a machine learning artist, has used t-SNE to cluster images by similarity in Deep Dream’s neural network architecture. The resulting “map” yields some interesting observations, showing, for example, that Deep Dream clusters violins near trombones. As the shapes of these two instruments differ to our eyes, their proximity in the neural network space may mean that Deep Dream uses the context of “people playing instruments” as a discriminating feature for classification.

Topological Data Analysis

Palo Alto-based Ayasdi uses theory from topology, the study of geometrical properties that stay constant even when shapes are transformed, to help humans find patterns in large data sets. As CEO Gurjeet Singh explains in this O’Reilly interview, the two key benefits of using topology for machine learning are:

  • The ability to combine results from different machine learning algorithms, while still maintaining guarantees about the underlying shapes or distributions
  • The ability to discover the underlying shape of data so you don’t have to assume it and, thereby, bias the parameters for an optimization problem

Ayasdi’s product visualizes relationships in data as graphs, enabling users to visually perceive relationships that would be hard to uncover in the language of formal equations. We love the parallel insight that we, as humans, excel at what topologists call “deformation invariance,” the property that the letter A is still the letter A in different fonts. 

Machines using an autoencoder to reconstruct digits with moderate deformation invariance, as we explained in this blog post.
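To give a flavor of what a property that survives deformation looks like in code, the sketch below links nearby points into a graph and counts its connected components, then shows the count is unchanged after the point cloud is rotated and stretched. This is a deliberately simple stand-in and not Ayasdi's algorithm. The data, the distance threshold, and the function names are all assumptions made for the example.

```python
import math


def neighborhood_graph(points, eps):
    """Link every pair of points closer than eps (adjacency lists)."""
    graph = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < eps:
                graph[i].append(j)
                graph[j].append(i)
    return graph


def count_components(graph):
    """Count connected components with an iterative depth-first search."""
    seen, components = set(), 0
    for start in graph:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph[node])
    return components


def rotate_and_scale(points, angle, factor):
    """Rotate the cloud about the origin, then stretch it uniformly."""
    c, s = math.cos(angle), math.sin(angle)
    return [(factor * (c * x - s * y), factor * (s * x + c * y))
            for x, y in points]


# Hypothetical data: two well-separated clumps of points.
cloud = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11)]
before = count_components(neighborhood_graph(cloud, eps=2.0))

# Deform the cloud; the threshold is scaled by the same factor as the data.
warped = rotate_and_scale(cloud, angle=0.7, factor=3.0)
after = count_components(neighborhood_graph(warped, eps=2.0 * 3.0))
print(before, after)
```

The coordinates change completely, but the graph still reports two clumps. That kind of shape-level summary is what stays stable under deformation.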

Data Visualization for the 3-D Web

Finally, Datavized is working on a data analytics tool fit for the 3-D web. While they’ve yet to work on dimensionality reduction, they have embarked on projects to give consumers of data a more empathic, first-person interpretation of statistics and conclusions. We look forward to the release of their product in 2017!


Representing rich, complex data, like images and text, as the numbers that mathematical functions on computers require is a Mephistophelean bargain: the resulting high-dimensional vectors are all but impossible for humans to understand and interpret directly. But there’s been great progress in dimensionality reduction and visualization tools that enable us, in our Flatland, to make sense of the strange, cold world of machine intelligence.

- Kathryn

Cloudera Fast Forward is an applied machine learning research group.