A Machine Learning Approach to Forecasting Remotely Sensed Vegetation Health
📝 Abstract
Drought threatens food and water security around the world, and this threat is likely to become more severe under climate change. High resolution predictive information can help farmers, water managers, and others to manage the effects of drought. We have created an open source tool to produce short-term forecasts of vegetation health at high spatial resolution, using data that are global in coverage. The tool automates downloading and processing Moderate Resolution Imaging Spectroradiometer (MODIS) datasets, and training gradient-boosted machine models on hundreds of millions of observations to predict future values of the Enhanced Vegetation Index. We compared the predictive power of different sets of variables (raw spectral MODIS data and Level-3 MODIS products) in two regions with distinct agro-ecological systems, climates, and cloud coverage: Sri Lanka and California. Our tool provides considerably greater predictive power on held-out datasets than simpler baseline models.
📄 Content
John Nay 1,*, Emily Burchfield 2 and Jonathan Gilligan 3
1 School of Engineering, Vanderbilt University; * Correspondence: john.j.nay@gmail.com
2 Department of Civil & Environmental Engineering, Vanderbilt University; emily.k.burchfield@vanderbilt.edu
3 Department of Earth & Environmental Sciences, Vanderbilt University; jonathan.gilligan@vanderbilt.edu
Keywords: Forecasting; Predictive Modeling; Machine Learning; Vegetation Health
1. Introduction
Drought significantly reduces agricultural production, destabilizing food systems and threatening
food security [1]. Remotely sensed measures of vegetation health, such as the Normalized
Difference Vegetation Index (NDVI) or the Enhanced Vegetation Index (EVI), are widely used to
monitor spatiotemporal variations in agricultural responses to drought [2, 3]. Providing managers
and farmers with accurate information about vegetation health increases system-wide capacity to
prepare for and adapt to water scarcity [4, 5]. These indices can be used to identify vulnerable
agricultural systems, to understand past agricultural responses to drought, and to guide efforts to
increase resilience to future drought.
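For reference, both indices are simple functions of surface reflectance. The sketch below uses the standard published definitions, with the EVI coefficients adopted for the MODIS vegetation index product (gain 2.5, C1 = 6, C2 = 7.5, L = 1). The function names and reflectance values are hypothetical and chosen only for illustration.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from surface reflectances (0-1)."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, canopy=1.0):
    """Enhanced Vegetation Index with the standard MODIS coefficients."""
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + canopy)


# Hypothetical reflectances for a densely vegetated pixel.
dense = {"nir": 0.50, "red": 0.10, "blue": 0.05}
# Hypothetical reflectances for sparse, dry vegetation.
sparse = {"nir": 0.30, "red": 0.20, "blue": 0.10}

for label, pixel in (("dense", dense), ("sparse", sparse)):
    print(label,
          "NDVI:", round(ndvi(pixel["nir"], pixel["red"]), 3),
          "EVI:", round(evi(**pixel), 3))
```

The blue-band term in the EVI denominator is what distinguishes it from NDVI: it is intended to reduce the influence of atmospheric aerosols and canopy background.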
Agricultural systems often exhibit nonlinear responses to sudden changes in water availability or
human activity. However, many agricultural prediction tools rely on linear models to predict
future vegetation health [2, 6, 7, 8]. Although more complex nonlinear models have been used to
predict rainfall in agricultural systems [9, 10], metrics of agricultural drought such as vegetation
health capture changes in farmer livelihoods better than the coarse-resolution meteorological
drought metrics used in these studies. Coarse-resolution models cannot examine fine-grained
intra-system dynamics or justify resource transfers. Higher resolution models tend to
rely on datasets only available in data-rich regions of the world [7, 11, 12, 13]. Furthermore,
data-scarce regions tend to lack the economic resources required to buffer against the effects of drought.
Our objective was to create a user-friendly predictive software tool that increases the capacity of data-scarce agricultural systems to prepare for and respond to drought. We have created a tool that (1) predicts future vegetation health values (2) at high spatial resolution using (3) open source tools and data that are (4) global in coverage. All scripts and documentation can be downloaded from http://johnjnay.com/forecastVeg/ and https://github.com/JohnNay/forecastVeg. With simple user inputs, our software downloads and processes the data, fits the model, and forecasts vegetation health at 16-day intervals at a 250-meter resolution
anywhere in the world. The tool applies a gradient-boosted machine model to Moderate
Resolution Imaging Spectroradiometer (MODIS) datasets openly available on NASA’s LP DAAC
server. The model learns potentially complex relationships between past remotely sensed
variables (and their interactions) and future vegetation health as measured by the Enhanced
Vegetation Index (EVI).
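The modeling idea can be illustrated with a deliberately small sketch. It is not the forecastVeg implementation: it assumes a synthetic seasonal EVI series with 23 sixteen-day composites per year, uses only lagged EVI values as predictors, and boosts one-split regression stumps in plain Python. All function names are hypothetical.

```python
import math


def make_lagged(series, n_lags):
    """Build (features, target) pairs: the previous n_lags values predict the next one."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])
        y.append(series[t])
    return X, y


def fit_stump(X, residuals):
    """Find the single (feature, threshold) split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, threshold, lm, rm)
    return best[1:] if best else None


def predict_stump(stump, row):
    j, threshold, left_mean, right_mean = stump
    return left_mean if row[j] <= threshold else right_mean


def fit_gbm(X, y, n_rounds=60, learning_rate=0.1):
    """Gradient boosting for squared error: each stump is fit to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no valid split left
            break
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict_gbm(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Synthetic EVI for one pixel: 4 years x 23 composites, seasonal cycle only (assumption).
series = [round(0.4 + 0.2 * math.sin(2 * math.pi * t / 23), 4) for t in range(92)]
X, y = make_lagged(series, n_lags=2)
model = fit_gbm(X, y)

mean_y = sum(y) / len(y)
mse_mean = sum((yi - mean_y) ** 2 for yi in y) / len(y)
mse_model = sum((yi - predict_gbm(model, row)) ** 2 for row, yi in zip(X, y)) / len(y)
forecast = predict_gbm(model, series[-2:])  # one 16-day step beyond the series
print("training MSE, mean only:", mse_mean)
print("training MSE, boosted stumps:", mse_model)
print("next-step EVI forecast:", forecast)
```

A production model would use deeper trees, many more predictors, and far more observations. The sketch only shows how each boosting round fits the residuals left by the previous rounds.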
In this paper, we apply the tool in two locations: Sri Lanka and California. We selected these
regions based on their distinct agro-ecological systems, climates, and levels of cloud cover. We
compared the predictive performance of the model using past values of raw spectral MODIS data
(MOD09A1) and Level-3 MODIS products (MOD11A2, MOD13Q1, MOD15A2, MOD17A2) as
predictor variables. In what follows, we describe our approach, compare predictor variable sets,
demonstrate strong out-of-sample forecasting performance, and analyze the performance of the
model across time periods and land cover in both locations.
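Out-of-sample comparisons of this kind depend on holding out the most recent time steps, so that no future information reaches the model during training. The sketch below shows the mechanics on a synthetic series. The persistence baseline and the seasonal-naive forecaster are stand-ins chosen for illustration; neither is claimed to be a baseline or model used in this paper, and all names are hypothetical.

```python
import math


def temporal_split(series, n_test):
    """Hold out the final n_test time steps as the test set."""
    return list(series[:-n_test]), list(series[-n_test:])


def rmse(observed, predicted):
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def persistence_forecast(train, test):
    """Baseline: each value is predicted to equal the value one step earlier."""
    return [train[-1]] + test[:-1]


def seasonal_naive_forecast(train, test, period):
    """Stand-in forecaster: predict the value observed one full season earlier."""
    full = train + test
    start = len(train)
    return [full[start + i - period] for i in range(len(test))]


PERIOD = 23  # sixteen-day composites per year
# Synthetic EVI: seasonal cycle plus a small deterministic perturbation (assumption).
series = [0.4 + 0.2 * math.sin(2 * math.pi * t / PERIOD) + 0.005 * ((2 * t) % 5 - 2)
          for t in range(4 * PERIOD)]

train, test = temporal_split(series, n_test=PERIOD)
rmse_persistence = rmse(test, persistence_forecast(train, test))
rmse_seasonal = rmse(test, seasonal_naive_forecast(train, test, PERIOD))
skill = 1 - rmse_seasonal / rmse_persistence  # > 0 means better than the baseline

print("held-out RMSE, persistence:", rmse_persistence)
print("held-out RMSE, seasonal naive:", rmse_seasonal)
print("skill relative to persistence:", skill)
```

Reporting error relative to a simple baseline, rather than in isolation, makes results comparable across regions whose vegetation dynamics differ in variability.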
2. Materials and Methods
We designed an experiment across location and data dimensions to assess how well our process
performs under different conditi