Satellite yield estimation: how good is it, and what can we learn?

StanfordFoodSecurity
Jan 5, 2021
Annual maize yields in the US Corn Belt produced with the satellite-based Scalable Crop Yield Mapper (SCYM)

With today’s computing power and free satellite imagery, generating a crop yield map from satellite data is relatively easy. Knowing whether that map is accurate, however, can be surprisingly difficult.

When ground data are sparse, researchers often evaluate yield maps by comparing regional means against government statistics. Sometimes they can combine this with some location-specific ground data. These are useful measures when they’re all we have available, but ideally satellite products should be evaluated at the scale of the satellite pixels, which are now often many times smaller than a typical field in commercial agricultural systems.

This leaves a key question unsatisfactorily answered: just how good are satellite yield maps? Are we justified in using them to assess agricultural trends or yield impacts of management practices at field and sub-field scales?

A million kernels of truth

Through a partnership with Corteva AgriScience, we had a golden opportunity: ground truth data for 1,002,958 field observations from 2008 to 2018 in the US Corn Belt. No, that’s not a typo; it’s a lot of data. The data come from combine harvesters equipped with on-board yield monitors: as farmers harvest their fields, grain weight and location are recorded in real time. Our partners clean and format these data to produce high-resolution (5 m) yield maps for each field. That gave us over a million field maps spanning eleven years of growing conditions.

In a recent study (open access) published in Remote Sensing of Environment, we used these data to evaluate our satellite yield mapping approach, the Scalable Crop Yield Mapper (SCYM, first described in this 2015 paper by Lobell et al.). Our main goal was to see how sensitive SCYM was to management, soil, and weather impacts on yields. Along the way, we tested a few improvements to SCYM that shed light on the best ways to translate satellite data to yield estimates. If you’re into the nitty-gritty methods, I’ll come back to that below; keep scrolling.

Example of ground truth field yield maps (left, 5 m resolution) and Landsat satellite-based SCYM yield estimates (right, 30 m resolution) for maize fields in 2018

Satellites can help yield gap analyses

SCYM is designed to be scalable and operate over large regions and many years. This means that it needs to work without any difficult-to-acquire information about a particular farm’s management or soil properties. In other words, the only signals we have for yield outcomes based on soil quality or management choices are the changes in light reflectance picked up by the satellite. Is this enough to support yield gap analyses?

The answer seems to be a solid yes, in most cases. The figure below compares SCYM yield estimates (blue) with ground data (orange) across variation in management, soil quality, and annual weather.

Satellite yield estimates from SCYM (blue) track well with ground-based yield observations from combine yield monitors (orange) across variation in management, soils, and weather.

Despite SCYM having no idea about how a field was planted, it picked up a positive yield response to planting density and a yield penalty for later planting dates. SCYM even detects some nonlinear responses to water availability, with peak yields at intermediate levels, and its responses to soil quality and heat stress track the ground data similarly well. SCYM wasn’t great at picking up yield impacts from late harvest dates, which makes sense since it’s based mostly on information from around the middle of the growing season. This probably holds for a lot of satellite yield maps: they’re likely to miss end-of-season impacts unless specifically designed to account for them.

To be honest, I didn’t expect the relationships to look this good. It’s really encouraging and means that we can use satellite-derived yield maps to provide consistent data across space and time with sub-field granularity, capturing the scale of soil variation and management decisions.

Yield map accuracy is scale dependent

We found SCYM captured 40% of yield variation at the 30 m pixel level, 45% at the field scale, and 67% at the county level. These scale-dependent measures give an unsurprising but important result: when high-resolution yield maps are aggregated to political boundaries for accuracy assessment, the resulting accuracy doesn’t apply to smaller field-level or pixel-level scales. We believe that agreement at aggregated scales is still informative and useful when it is the best available comparison, but researchers should strive to assess high-resolution products at the scale of the product itself.
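To make the scale dependence concrete, here is a minimal Python sketch that scores the same estimates at the pixel level and again after averaging to fields. It is an illustration only: the numbers are made up, the function names are hypothetical, and it assumes that "captured X% of yield variation" is measured as a squared Pearson correlation.

```python
from collections import defaultdict
from statistics import mean


def r_squared(observed, estimated):
    """Squared Pearson correlation between two equal-length sequences."""
    mo, me = mean(observed), mean(estimated)
    cov = sum((o - mo) * (e - me) for o, e in zip(observed, estimated))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_e = sum((e - me) ** 2 for e in estimated)
    return cov * cov / (var_o * var_e)


def aggregate(values, group_ids):
    """Average values within each group; means are returned in sorted group order."""
    groups = defaultdict(list)
    for value, group in zip(values, group_ids):
        groups[group].append(value)
    return [mean(groups[g]) for g in sorted(groups)]


# Made-up example: 3 fields with 4 pixels each (yields in t/ha).
field_ids = [0] * 4 + [1] * 4 + [2] * 4
observed = [8, 9, 10, 9, 11, 12, 11, 10, 13, 12, 14, 13]

# Hypothetical pixel-level error, chosen so it cancels within each field.
pixel_error = [1.5, -1.5, 1.5, -1.5] * 3
estimated = [o + e for o, e in zip(observed, pixel_error)]

pixel_r2 = r_squared(observed, estimated)
field_r2 = r_squared(aggregate(observed, field_ids),
                     aggregate(estimated, field_ids))
print(f"pixel r2 = {pixel_r2:.2f}, field r2 = {field_r2:.2f}")
```

Because the pixel errors here average out within each field, agreement looks much better after aggregation than it does pixel by pixel. Real errors will not cancel this neatly, but the direction of the effect is the same one reported above.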

The importance of a scalable approach

I’ve mentioned a few times that SCYM is a scalable approach. What does that mean exactly, and why does it matter? In order to estimate yields from satellite data, we need a model that links what the satellites see (differences in reflected light that depend on how dense, green, and healthy the crop is) with yield outcomes. If there’s good ground data, we can empirically calibrate this model to ground observations. Many studies take this approach, typically using ground data from tens to several hundred fields.

These ground-calibrated models, however, are limited to where we have good data. Often, the empirical relationships don’t generalize well outside of the training data, so it may be risky to use them in different years or new places. And importantly, ground data tend to be scarcest where we need them most: smallholder agricultural systems outside of big commercial agriculture.

SCYM was designed for these needs. Rather than relying on ground data, SCYM builds the yield model from crop simulations tailored to the study region. We then make a statistical model that can be applied at (relatively) low computational cost across regions and years.
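The general idea can be sketched in a few lines of Python: fit a regression on simulated pairs of crop greenness and yield, then apply it to satellite pixels without using any ground data. This is a deliberately simplified illustration and not the SCYM implementation. The single predictor, the simulated values, and the function names are all hypothetical; the actual method is described in the papers linked above.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical crop-simulation output for one region:
# peak-season greenness index and the simulated yield (t/ha).
sim_greenness = [0.4, 0.5, 0.6, 0.7, 0.8]
sim_yield = [6.0, 8.0, 10.0, 12.0, 14.0]

# Step 1: train the statistical model on simulations only.
slope, intercept = fit_line(sim_greenness, sim_yield)


def estimate_yield(pixel_greenness):
    """Step 2: apply the simulation-trained model to a satellite pixel."""
    return slope * pixel_greenness + intercept


# Hypothetical satellite-observed greenness for two pixels.
pixel_estimates = [estimate_yield(g) for g in [0.55, 0.75]]
print(pixel_estimates)
```

The design point is that the expensive step (running crop simulations) happens once per region, while applying the fitted regression to millions of pixels is cheap.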

But how well does SCYM perform in comparison to an extensively ground-calibrated model? Since we had an unusual dataset with even coverage across a large region and eleven years, we could answer this question. We used a standard machine learning approach (random forests) to train yield models with different numbers of ground samples. The accuracy for all models was compared against the same test dataset of ~400,000 pixels evenly distributed across the region and study years.

As the figure below shows, you would need about 1,000 ground samples to train a model as good as SCYM. But the distribution of those 1,000 ground samples greatly affected accuracy. If we limited the 1,000 samples to be randomly drawn from a single state, or a single year, the performance of the ground-calibrated models tanked. This was especially true for models trained on a single year; 2014 was the worst, which is odd because we aren’t aware of anything particularly unusual about that year.

Comparison of SCYM’s accuracy with multiple ground-calibrated machine learning yield models based on training sample size (black dots). Restricting the training data to a single year or state reduces performance.

And here’s a really important thing to keep in mind: having this much data to train ground-calibrated models on is rare. While it’s challenging to keep tabs on all the advances and publications coming out in this field, the ones we know of that have used yield monitor data have between 39 and a couple hundred fields, sometimes from just one year, and often in highly localized settings. There’s a lot of work being done to figure out how to make models more transferable outside of the training data, and we’re hopeful for improvements. But in the meantime, we found that scalable approaches like SCYM can perform about 70% as well as a ground model trained on 200,000 fields, and at a fraction of the computational cost. Pretty cool.

The nuts and bolts of effective yield mapping

We also used the ground data for some fairly nuts-and-boltsy methods comparisons. For example, there are several different ways to summarize the information in satellite observations over the course of the year. In our study region, maize typically “greens up” shortly after planting, reaches peak greenness in late July, and then dries down before harvest. The following figure gives a graphical overview of the different ways we tested to quantify crop growth for each satellite pixel.

A graphical overview of different ways we tested to quantify crop growth from Landsat satellite observations (blue dots)

We found that smoothing the annual satellite time series with harmonic regressions improved accuracy, and the “partial integral” method did best. Smoothing also allowed us to apply SCYM to more maize area each year than the prior implementation of SCYM in the Corn Belt, which could not produce an estimate when clouds prevented good satellite observations during either crop window (gray boxes).
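For readers who want to see the mechanics, here is a minimal sketch of harmonic smoothing: fit a constant plus one cosine and one sine term to a pixel’s greenness observations by least squares, then integrate the fitted curve over a date window. It is an illustration under assumptions, not the study’s code. The single harmonic pair, the window bounds (days 150 to 250), the function names, and the noise-free synthetic observations are all hypothetical, and the exact definition of the “partial integral” metric is given in the paper.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_harmonic(days, values, period=365.0):
    """Least-squares fit of a + b*cos(w*t) + c*sin(w*t); returns [a, b, c]."""
    w = 2 * math.pi / period
    rows = [[1.0, math.cos(w * d), math.sin(w * d)] for d in days]
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(3)]
    return solve_linear(AtA, Atb)


def partial_integral(coefs, start_day, end_day, period=365.0):
    """Exact integral of the fitted curve between two days of the year."""
    a, b, c = coefs
    w = 2 * math.pi / period
    return (a * (end_day - start_day)
            + b / w * (math.sin(w * end_day) - math.sin(w * start_day))
            - c / w * (math.cos(w * end_day) - math.cos(w * start_day)))


# Synthetic, irregularly spaced "cloud-free" observations (day of year),
# generated from a known curve that peaks in early July.
obs_days = [20, 60, 110, 150, 190, 230, 280, 330]
obs_greenness = [0.5 - 0.3 * math.cos(2 * math.pi * d / 365.0) for d in obs_days]

coefs = fit_harmonic(obs_days, obs_greenness)
area = partial_integral(coefs, 150, 250)  # hypothetical window
print(coefs, area)
```

Because the fitted curve is defined for every day of the year, the integral can be computed even when no single observation falls inside the window, which is one way to see why smoothing lets more pixels receive an estimate.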

We were also interested in how much data we needed to make good yield estimates. Sometimes, due to clouds or satellite malfunctions, certain pixels can lack data for key moments of the growing season. We found that having ~8 good satellite observations throughout the year, including at least 1 in a critical window during the peak growing season, was important for accuracy. The Landsat satellite family was able to meet this requirement for ~97% of maize area during the study.
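A rule like this is easy to express as a per-pixel check. The sketch below is a hypothetical illustration: the function name and parameters are invented here, the threshold of 8 treats the approximate figure above as exact, and the critical window (days 190 to 220 of the year) is a placeholder, since this post does not give the actual dates.

```python
def has_enough_observations(obs_days, min_obs=8, window=(190, 220),
                            min_in_window=1):
    """Return True if a pixel has enough clear observations for the year.

    obs_days: days of year with a good (for example, cloud-free) observation.
    window:   placeholder critical window around peak season, inclusive.
    """
    in_window = sum(window[0] <= day <= window[1] for day in obs_days)
    return len(obs_days) >= min_obs and in_window >= min_in_window


# Eight observations, two of them inside the window.
good_pixel = [100, 120, 140, 160, 180, 200, 220, 240]
# Eight observations, but none near peak season.
no_peak_pixel = [100, 110, 120, 130, 140, 150, 160, 170]
# An observation at peak season, but too few overall.
sparse_pixel = [150, 200, 250]

print(has_enough_observations(good_pixel))
print(has_enough_observations(no_peak_pixel))
print(has_enough_observations(sparse_pixel))
```

Applying a check like this to every maize pixel and year is how one would arrive at a coverage figure such as the ~97% quoted above.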

Final Thoughts

Satellite yield mapping provides the opportunity to understand trends and drivers of crop yields over time, a valuable tool for informing management practices and guiding future adaptation efforts. For example, we’ve used these datasets to find that conservation agriculture techniques such as reduced tillage can be used with minimal and typically positive yield impacts. More recently, we looked at changes in drought sensitivity in US maize, which David wrote about here. And because SCYM can be applied in data-poor regions as well, our lab group was able to quantify yield variation in smallholder systems in Kenya and Tanzania and identify soil nitrogen as a primary driver. Our study demonstrates that satellite-derived yield maps can inform yield gap analyses, management interventions, and food security efforts.

About the Author:

Jillian Deines is a postdoctoral scholar with the Center on Food Security and the Environment at Stanford University.

StanfordFoodSecurity

Stanford's Center on Food Security and the Environment (FSE) leads cutting-edge research on global issues of food, hunger, poverty and the environment.