Aug 26, 2016 · guest post

Exploring Deep Learning on Satellite Data

This is a guest post featuring a project Patrick Doupe, now a Senior Data Analyst at Icahn School of Medicine at Mount Sinai, completed as a fellow in the Insight Data Science program. In our partnership with Insight, we occasionally advise fellows on month-long projects and how to build a career in data science.

Machines are getting better at identifying objects in images. These technologies are used to do more than organise your photos or send your family and friends snappy augmented pictures and movies. Some companies are using them to better understand how the world works. Whether by improving forecasts of Chinese economic growth from satellite images of construction sites or by estimating deforestation, algorithms and data can help provide useful information about the current and future states of society.

In early 2016, I developed a prototype of a model to predict population from satellite images. This extends existing classification tasks, which ask whether something exists in an image. In my prototype, I ask: how much of something not directly visible is in an image? The regression task is difficult; current advice is to turn any regression problem into a classification task. But I wanted to aim higher. After all, satellite images appear different across populated and non-populated areas.

##### Populated region

##### Empty region

The prototype was developed in conjunction with Fast Forward Labs, as my project in the Insight Data Science program. I trained convolutional neural networks on LANDSAT satellite imagery to predict Census population estimates. I also learned all of this, from understanding what a convolutional neural network is, to dealing with satellite images, to building a website, within four weeks at Insight. If I can do this in a few weeks, your data scientists too can take your project from idea to prototype in a short amount of time.

LANDSAT-landstats

Counting people is an important task. We need to know where people are to provide government services like health care and to develop infrastructure like school buildings. There are also constitutional reasons for a Census, which I’ll leave to Sam Seaborn.

We typically get this information from a Census or other government surveys like the American Community Survey. These are not perfect measures. For example, the inaccuracies are biased against those who are likely to use government services.

If we could develop a model that could estimate the population well at the community level, we could help government services better target those in need. The model could also help governments facing resource constraints that prevent them from running a census. Also, if it works for counting humans, then maybe it could work for estimating other socio-economic statistics. Maybe even help provide universal internet access. So much promise!

So much reality

Satellite images are huge. To keep the project manageable I chose two US states that are similar in their environmental and human landscape: one state for model training and another for model testing. Oregon and Washington seemed to fit the bill. Since these states were chosen based on their similarity, I thought I would stretch the model by choosing a very different state as a tougher test. I’m from Victoria, Australia, so I chose this glorious region.

Satellite images are also messy and full of interference. To minimise this issue and focus on the model, I chose the LANDSAT Top Of Atmosphere (TOA) annual composite satellite image for 2010. This image is already stitched together from satellite images with minimal interference. I obtained the satellite images from the Google Earth Engine. I began with low resolution images (1km pixels) and moved to a finer resolution (smaller pixels) in each iteration of the model.

For the Census estimates, I wanted the highest spatial resolution, which is the Census block. A typical Census block contains between 600 and 3000 people, or about a city block. To combine these datasets I assigned each pixel its geographic coordinates and merged each pixel with its Census population estimate using various Python geospatial tools. This took enough time that I dropped the bigger plans. Better to get something complete than a half-baked idea.
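The pixel-to-Census merge can be sketched in plain Python. This is only an illustration of the idea, not the tooling used in the project: the function names, the square toy polygons, the density values, and the assumption of a north-up raster with square pixels are all hypothetical.

```python
# Hypothetical sketch: give every raster pixel a coordinate, then look up
# which Census polygon contains that coordinate.

def pixel_center(row, col, origin_x, origin_y, pixel_size):
    """Return the (x, y) coordinate of a pixel's centre for a north-up raster."""
    x = origin_x + (col + 0.5) * pixel_size
    y = origin_y - (row + 0.5) * pixel_size  # rows increase southwards
    return x, y


def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as [(x, y), ...]?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_pixels(n_rows, n_cols, origin_x, origin_y, pixel_size, blocks):
    """Assign each pixel the density of the block containing its centre.

    `blocks` is a list of (polygon, density) pairs; pixels that fall outside
    every block are labelled None.
    """
    labels = []
    for row in range(n_rows):
        row_labels = []
        for col in range(n_cols):
            x, y = pixel_center(row, col, origin_x, origin_y, pixel_size)
            match = None
            for polygon, density in blocks:
                if point_in_polygon(x, y, polygon):
                    match = density
                    break
            row_labels.append(match)
        labels.append(row_labels)
    return labels


# Toy data: a 2 x 3 raster with 1 km pixels and two made-up square blocks.
blocks = [
    ([(0, 0), (2, 0), (2, 2), (0, 2)], 600.0),  # populated block
    ([(2, 0), (3, 0), (3, 2), (2, 2)], 0.0),    # empty block
]
labels = label_pixels(2, 3, origin_x=0.0, origin_y=2.0, pixel_size=1.0, blocks=blocks)
print(labels)
```

A real pipeline would lean on a geospatial library for projections and spatial indexing; the nested loop here is only meant to show what the merge does.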

A very high level overview of training Convolutional Neural Networks

The problem I faced is a classic supervised learning problem: train a model on satellite images to predict census data. Then I could use standard methods, like linear regression or neural networks. For every pixel there is a number corresponding to the intensity of each light band. The number of features is then the number of bands multiplied by the number of pixels. Sure, we could do some more complicated feature engineering but the basic idea could work, right?
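As a rough illustration of that naive setup, the sketch below flattens a tiny image into one feature per pixel per band and feeds it to a linear model. The image values, the three-band layout, and the function names are made up for the example.

```python
def flatten_image(image):
    """Flatten an image stored as rows x cols x bands into one feature vector."""
    return [value for row in image for pixel in row for value in pixel]


def linear_predict(features, weights, bias):
    """Linear regression prediction: bias plus a weighted sum of the features."""
    return bias + sum(w * f for w, f in zip(weights, features))


# Toy 2 x 2 image with 3 bands per pixel (values are made up).
image = [
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    [[0.7, 0.8, 0.9], [1.0, 1.1, 1.2]],
]
features = flatten_image(image)

n_rows, n_cols, n_bands = len(image), len(image[0]), len(image[0][0])
print(len(features), n_rows * n_cols * n_bands)  # both are pixels x bands

# One weight per feature; each pixel is treated independently of its neighbours.
weights = [1.0] * len(features)
print(linear_predict(features, weights, bias=0.0))
```

Note that shuffling the pixel order (and the weights with it) would give the same prediction: the flattened model has no notion of which pixels sit next to each other.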

Not really. You see, a satellite image is not a collection of independent pixels. Each pixel is connected to other pixels and this connection has meaning. A mountain range is connected across pixels and human built infrastructure is connected across pixels. We want to retain this information. Instead of modelling pixels independently, we need to model pixels in connection with their neighbours.
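A small convolution makes the point about neighbours concrete. In the sketch below the kernel is set by hand to respond to vertical edges, whereas a convnet would learn its kernels from data; the image and kernel values are invented for illustration.

```python
def convolve2d(image, kernel):
    """'Valid' 2D cross-correlation of a single-band image with a kernel."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        out_row = []
        for c in range(out_cols):
            total = 0.0
            # Each output value depends on a whole neighbourhood of pixels.
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


# Toy 4 x 5 single-band image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
# Hand-set kernel that responds where brightness increases from left to right.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
response = convolve2d(image, kernel)
print(response)
```

The flat dark region produces zero response, while windows that overlap the boundary produce a strong one, which is information a pixel-by-pixel model cannot see.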

Convolutional neural networks (hereafter, “convnets”) do exactly this. These networks are super powerful at image classification, with many models reporting better accuracy than humans. What we can do is swap the loss function and run a regression.
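To show what swapping the loss function means, here is a minimal sketch of the two losses side by side. It assumes the usual pairing of softmax cross-entropy for classification and mean squared error for regression; the post does not say which regression loss was actually used, and the numbers are made up.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to one."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_class):
    """Classification loss: negative log-probability of the correct class."""
    return -math.log(softmax(logits)[true_class])


def mean_squared_error(predictions, targets):
    """Regression loss: average squared gap between prediction and target."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Classification head: one score per class ("populated" vs "empty").
class_loss = cross_entropy([2.0, 0.5], true_class=0)

# Regression head: one number per image (for example, a population density).
regression_loss = mean_squared_error([3.1, 0.2], [3.5, 0.0])

print(class_loss, regression_loss)
```

The convolutional layers stay the same in both cases; only the final layer (several class scores versus a single number) and the loss it is trained against change.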

##### Diagram of a simple convolutional neural network processing an input image of a horse. From the Fast Forward Labs report on Deep Learning: Image Analysis

Training the model

Unfortunately, convnets can be hard to train. First, there are a lot of parameters to set in a convnet: how many convolutional layers? Max-pooling or average-pooling? How do I initialise my weights? Which activations? It’s super easy to get overwhelmed. Micha suggested I use the well-known VGGNet as a starting base for a model. For other parameters, I based the network on what seemed to be the current best practices. I learned these by following this winter’s convolutional neural network course at Stanford.

Second, they take a lot of time and data to train. This results in training periods of hours to weeks, while we want fast results for a prototype. One option is to use pre-trained models, like those available at the Caffe model zoo. I was writing my model using the Keras Python library, which at present doesn’t have as large a zoo of models. Instead, I chose to use a smaller model and see if the results pointed in a promising direction.

Results

To validate the model, I used data from Washington and Victoria, Australia. I show the model’s accuracy on the following scatter plot of the model’s predictions against reality. The unit of observation is the small image-observation used by the network, and I estimate the population density in an image. Since each image is the same size, this is the same as estimating population. Last, the data is quasi log-normalised[6]. Let’s start with Washington.
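The exact transform behind "quasi log-normalised" is not spelled out here, so the sketch below is only an assumption: it uses log(1 + x), a common choice when the data contains zeros, and also shows why density and population are interchangeable when every image covers the same area. The tile area and density values are hypothetical.

```python
import math

TILE_AREA_KM2 = 1.0  # hypothetical: every image tile covers the same area


def to_log_scale(density):
    """Assumed transform: log(1 + x), which is defined at zero density."""
    return math.log1p(density)


def from_log_scale(value):
    """Inverse of the assumed transform."""
    return math.expm1(value)


def population_from_density(density, area_km2=TILE_AREA_KM2):
    """With a fixed tile area, population is density times a constant."""
    return density * area_km2


densities = [0.0, 9.0, 999.0]  # made-up people per square kilometre
transformed = [to_log_scale(d) for d in densities]
recovered = [from_log_scale(t) for t in transformed]

print(transformed)
print(recovered)
print([population_from_density(d) for d in densities])
```

A plain logarithm would be undefined for the many tiles with zero population, which is one reason a shifted variant like this is often preferred.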

##### Washington State

We see that the model is picking up the signal. Higher actual population densities are associated with higher model predictions. Also noticeable is that the model struggles to estimate regions of zero population density. The R² of the model is 0.74. That is, the model explains about 74 percent of the spatial variation in population. This is up from the 26 percent achieved during the four weeks at Insight.

##### Victoria

A harder test is a region like Victoria, with a different natural and built environment. The scatter plot of model performance shows the reduced performance. The model’s inability to pick regions of low population is more apparent here. Not only does the model struggle with areas of zero population, it predicts higher population for low population areas. Nevertheless, with an R² of 0.63, the overall fit is good for a harder test.

An interesting outcome is that the regression estimates are quite similar for both Washington and Victoria: the model consistently underestimates reality. In sample, we still have a model that underestimates population. Given that the images are unlikely to have enough information to identify human settlements at current resolution, it’s understandable that the model struggles to estimate population in these regions.

| Variable | A perfect model | Washington | Victoria | Oregon (in sample) |
|---|---|---|---|---|
| Intercept | 0 | -0.43 | -0.37 | -0.04 |
| Slope | 1 | 0.6 | 0.6 | 0.86 |
| R² | 1 | 0.74 | 0.63 | 0.96 |
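For readers who want to reproduce this kind of summary, the sketch below fits a line through (actual, predicted) pairs and reports intercept, slope, and R². Which variable went on which axis and how R² was computed are assumptions here, and the data points are invented so that the fit is exact.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up log-scale values: predictions that are systematically too low.
actual = [0.0, 1.0, 2.0, 3.0, 4.0]
predicted = [-0.4 + 0.6 * a for a in actual]

intercept, slope = fit_line(actual, predicted)
fit_quality = r_squared(actual, predicted, intercept, slope)
print(intercept, slope, fit_quality)
```

A perfect model would give an intercept of 0 and a slope of 1; under this assumed setup, a slope below 1 means predictions grow more slowly than the actual values.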

Conclusion

LANDSAT-landstats was an experiment to see if convnets could estimate objects they couldn’t ‘see.’ Given project complexity, the timeframe, and my limited understanding of the algorithms at the outset, the results are promising. We’re not at a stage to provide precise estimates of a region’s population, but with improved image resolution and advances in our understanding of convnets, we may not be far away.

-Patrick Doupe
