2009 AMS Artificial Intelligence Conference A Data Mining Approach to Soil Temperature and Moisture Prediction 14 January 2009 Bill Myers Seth Linden, Gerry Wiener

2009 AMS Artificial Intelligence Conference

A Data Mining Approach to Soil Temperature and Moisture Prediction

14 January 2009

Bill Myers, Seth Linden, Gerry Wiener


Project Overview and Goals

• Improve soil temperature and moisture prediction
• Integrate and evaluate NASA-MODIS data sets
– Leaf Area Index (LAI)
– Green Vegetation Fraction (i.e. FPAR)
– Albedo
• Deliver tailored products to end users
– Soil forecasts will drive agriculture-specific models (e.g. pest models)
– RAL partnered with DTN/Meteorlogix
– DTN DSS delivers Ag-specific forecasts to 80,000 users


Soil State Prediction

[Figure: soil column diagram labeled "Solar Energy", "Weather", "Subsurface Nodes", and "Fixed Node"]

• Current soil state is modified by atmospheric forcing conditions
• Heat and moisture are transferred between adjacent nodes
• Typically done with a physical model, called a Land-Surface Model (LSM)
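To make the idea of heat moving between adjacent nodes concrete, here is a minimal sketch of one explicit finite-difference step for heat conduction in a soil column. It is not the Noah LSM and is not taken from the slides. The function name, the uniform node spacing, the single diffusivity value, and the surface temperature used as forcing are all assumptions made for illustration.

```python
def step_soil_temperature(temps, surface_temp, diffusivity, dz, dt):
    """Advance soil node temperatures by one explicit time step.

    temps[0] is the shallowest node; temps[-1] is the fixed node and
    never changes. All arguments are hypothetical illustration values:
    diffusivity in m^2/s, dz (uniform node spacing) in m, dt in s.
    """
    r = diffusivity * dt / dz ** 2
    if r > 0.5:
        raise ValueError("time step too large for a stable explicit scheme")

    new_temps = list(temps)
    for i in range(len(temps) - 1):  # the last (fixed) node is skipped
        above = surface_temp if i == 0 else temps[i - 1]
        below = temps[i + 1]
        # Each node gains or loses heat in proportion to the
        # temperature difference with its two neighbours.
        new_temps[i] = temps[i] + r * (above - 2 * temps[i] + below)
    return new_temps


# A uniform 10 degC column with a 20 degC surface: only the top node warms.
column = step_soil_temperature(
    [10.0, 10.0, 10.0, 10.0], surface_temp=20.0,
    diffusivity=5e-7, dz=0.1, dt=3600.0,
)
```

A real LSM also transfers moisture, uses non-uniform layers, and derives the surface forcing from a full energy balance, which is where the many soil and land-surface parameters come in.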


Physical Model

• This project uses the High Resolution Land Data Assimilation System (HRLDAS) and the Noah LSM
– Used by NCEP as part of the NAM (WRF model)
• Many parameters are necessary to model soil type and land surface characteristics
– Affect incident solar energy, heat transfer, etc.
– Parameters must be generalized
  • "Sandy loam" will have the same parameterization at all sites
  • Chemical compositions of "sandy loam" differ between sites
– Heat and moisture transfer will not be exact at ANY site
• Goal of this study: Determine if a data mining approach can produce results comparable to those of the physical model


Data Mining System

• Regression Tree (Cubist)
– Available from www.rulequest.com
– Looks for patterns in data
– Builds rule-based numerical models
– Rules are developed based on training data
– At each leaf node, a regression equation is developed that best fits that subset of the training data
– Effectively, linear approximations are being made when certain conditions are met
– Soil state forecasts are generated by applying the rule set to forecast data
• Training Data
– 29 Soil Climate Analysis Network (SCAN) sites
– Two years of observational history at each site used to develop rules
– NCAR scientists were consulted to determine the most important inputs to soil state evolution
– These were extracted or derived from the observed variable set


Regression Tree Model Generation

• 10 regression trees were developed for each site
– One regression tree each for soil temperature and soil moisture at each depth (5, 10, 20, 50, 100 cm)
• Input variables:
– Julian day
– Air temperature
– Delta air temperature (in current hr)
– Downward shortwave radiation
– Wind speed
– Dew point temperature
– Precip amt
– Previous soil state:
  • Previous hour's soil temperature and moisture at adjacent depths
• A target variable (e.g. current soil temp at 5 cm) was provided with each hour's data
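The count of 10 trees per site and the "adjacent depths" inputs can be sketched as follows. The variable names follow the 5 cm names file shown in this deck (ST5_prev, SM10_prev, and so on). The helper functions, and the reading of "adjacent" as the depth itself plus its immediate neighbours, are assumptions. They reproduce the documented 5 cm inputs but are unconfirmed for the deeper levels.

```python
DEPTHS_CM = [5, 10, 20, 50, 100]
# Weather inputs, named as in the 5 cm names file.
WEATHER_INPUTS = ["mon", "AirT", "deltaT", "dsw", "wspd", "TD", "qpf"]


def predictands():
    """One target per variable (ST = soil temp, SM = soil moisture) per depth."""
    return [f"{kind}{depth}_curr" for kind in ("ST", "SM") for depth in DEPTHS_CM]


def adjacent_depths(depth):
    """Assumed meaning of 'adjacent': the depth itself and its neighbours."""
    i = DEPTHS_CM.index(depth)
    return DEPTHS_CM[max(i - 1, 0): i + 2]


def inputs_for(depth):
    """Input variable names for the trees at one depth."""
    previous_state = [
        f"{kind}{d}_prev" for kind in ("ST", "SM") for d in adjacent_depths(depth)
    ]
    return WEATHER_INPUTS + previous_state


print(len(predictands()))   # number of trees per site
print(inputs_for(5))        # inputs for the 5 cm trees
```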


Example training data

| Names file for 5cm temperature prediction
ST5_curr | Predictand in list of variables below
siteID: ignore | SCAN site ID
date: ignore | YYYYMMDDHH
mon: continuous | fraction of Julian year
AirT: continuous | 2m air temp (avg over last hr)
deltaT: continuous | air temp change over last hour
dsw: continuous | avg downward shortwave radiation over last hr
wspd: continuous | avg wind speed over last hour
TD: continuous | avg dew point temp over last hour
qpf: continuous | precip amt over last hour
ST5_prev: continuous | 5 cm soil temp at previous hour
ST10_prev: continuous | 10 cm soil temp at previous hour
SM5_prev: continuous | 5 cm soil moisture at previous hour
SM10_prev: continuous | 10 cm soil moisture at previous hour
ST5_curr: continuous | 5 cm soil temp at current hour

Sample line of training data:
2001, 2007110211, 0.9167, 4.53, -0.89, 0.00, 2.81, -3.28, 0.00, 8.158, 9.847, 33.858, 39.616, 8.32

Annotations on the sample line:
• Time of year: 0.9167
• Air temp: 4.53
• Air temp falling in this hour: -0.89
• No downward radiation (night): 0.00
• Wind speed: 2.81
• Dewpoint temp: -3.28
• No precip: 0.00
• Previous hour's soil temperature at 5 cm and 10 cm: 8.158, 9.847
• Previous hour's soil moisture at 5 cm and 10 cm: 33.858, 39.616
• Current hour's 5 cm soil T (predictand): 8.32
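As a rough illustration of how the names file lines up with a data line, the sketch below parses the sample line into named fields and separates the ignored columns, the inputs, and the predictand. The column order is taken from the names file above. The `parse_case` helper is hypothetical and is not part of Cubist.

```python
# Column order as listed in the names file for the 5 cm temperature tree.
COLUMNS = [
    "siteID", "date", "mon", "AirT", "deltaT", "dsw", "wspd", "TD", "qpf",
    "ST5_prev", "ST10_prev", "SM5_prev", "SM10_prev", "ST5_curr",
]
IGNORED = {"siteID", "date"}   # marked "ignore" in the names file
TARGET = "ST5_curr"


def parse_case(line):
    """Split one comma-separated training line into a name -> value dict."""
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    # Ignored columns stay as strings; everything else is numeric.
    return {
        name: (value if name in IGNORED else float(value))
        for name, value in zip(COLUMNS, values)
    }


SAMPLE = ("2001, 2007110211, 0.9167, 4.53, -0.89, 0.00, 2.81, -3.28, 0.00, "
          "8.158, 9.847, 33.858, 39.616, 8.32")

case = parse_case(SAMPLE)
features = {k: v for k, v in case.items() if k not in IGNORED and k != TARGET}
target = case[TARGET]
```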


Rules Development and Application

• Regression trees generated for each predictand at each site
– Separate tree for soil temperature and moisture at each depth
– Two years of training data for most sites
– Example rule and associated regression:
    if dsw <= 0.09 and ST5_prev > 12.05
    ST5_curr = -0.211 + 0.3165 dsw + 0.83 ST5_prev + 0.13 ST10_prev + 0.02 AirT + 0.02 TD
• 48-hour forecasts were generated iteratively
– Starting with observed soil state and first hour's weather predictions
– Regression trees were applied for each predictand to generate forecast state at hour 1
– Using the forecast soil state and weather predictions, the next hours' forecasts were generated iteratively
• Soil forecasts generated for 2007 growing season (April-June)
– Data mining and HRLDAS forecasts were compared to observations
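The example rule and the iterative 48-hour procedure can be sketched together. Only the rule's conditions and coefficients come from the slide. Everything else is an assumption for illustration: the persistence fallback when the rule does not fire (a full Cubist rule set would supply other rules), the 10 cm temperature held constant in place of its own tree, and the constant night-time weather values.

```python
def st5_rule(x):
    """Example rule from the slide; returns None when its conditions fail."""
    if x["dsw"] <= 0.09 and x["ST5_prev"] > 12.05:
        return (-0.211 + 0.3165 * x["dsw"] + 0.83 * x["ST5_prev"]
                + 0.13 * x["ST10_prev"] + 0.02 * x["AirT"] + 0.02 * x["TD"])
    return None


def predict_st5(x):
    """Apply the rule, falling back to persistence (hypothetical stand-in)."""
    estimate = st5_rule(x)
    return estimate if estimate is not None else x["ST5_prev"]


def iterate_forecast(initial_state, weather_hours):
    """Feed each hour's forecast soil state back in as the next hour's input."""
    state = dict(initial_state)
    forecasts = []
    for weather in weather_hours:
        x = {**weather, "ST5_prev": state["ST5"], "ST10_prev": state["ST10"]}
        # ST10 would come from its own tree; it is held constant here.
        state = {"ST5": predict_st5(x), "ST10": state["ST10"]}
        forecasts.append(state["ST5"])
    return forecasts


# Hypothetical observed start state and 48 hours of constant night-time weather.
observed = {"ST5": 15.0, "ST10": 16.0}
weather = [{"dsw": 0.0, "AirT": 10.0, "TD": 5.0}] * 48
st5_forecast = iterate_forecast(observed, weather)
```

Because each hour's input is the previous hour's forecast rather than an observation, errors can accumulate over the 48 hours, which is why the comparison against observations matters.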


Results

• Statistically, data mining performed better than HRLDAS at nearly all of the 29 SCAN sites
• Median (and quartile) MAEs significantly lower for data mining
• Data mining errors generally 30%+ lower than HRLDAS errors
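For readers unfamiliar with the statistics quoted here, the following sketch shows how a per-site mean absolute error (MAE), the quartiles of MAE across sites, and a percent error reduction could be computed. The helper names are illustrative, and no numbers from the study are reproduced.

```python
from statistics import quantiles


def mae(forecasts, observations):
    """Mean absolute error between paired forecasts and observations."""
    if not forecasts or len(forecasts) != len(observations):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(f - o) for f, o in zip(forecasts, observations)) / len(forecasts)


def mae_quartiles(site_maes):
    """First quartile, median, and third quartile of per-site MAEs."""
    q1, median, q3 = quantiles(site_maes, n=4)
    return q1, median, q3


def percent_reduction(candidate_mae, baseline_mae):
    """How much lower the candidate's error is, as a percent of the baseline."""
    return 100.0 * (baseline_mae - candidate_mae) / baseline_mae
```

With these helpers, a statement such as "errors 30%+ lower" corresponds to `percent_reduction(data_mining_mae, hrldas_mae) >= 30`, where both arguments are MAEs computed against the same observations.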


Soil Temperature Errors

[Figure: soil temperature errors, data mining (solid lines) vs. HRLDAS (dashed). x-axis: quartile (0, 1, M, 3, 4); y-axis: degC (0 to 4); series: 5 cm, 10 cm, 20 cm, 50 cm]


Summary

• Data mining with Cubist Regression Trees
– Reduces soil temperature and moisture errors
– Simple to develop rules
– Rules/regressions can be displayed easily
– Regression tree forecasts are tuned to the site
– HRLDAS forecast parameters are more generic
• Applicability to non-observing sites
– Rules, as developed, are site specific
– Not valid away from that location
– HRLDAS can generate forecasts at any location
– Observing sites do not begin to cover all land use and soil type combinations


Future Directions

• Add vegetation state (from NASA MODIS data) to data mining training sets to determine whether these results can be improved upon
• Train Cubist with all obs sites lumped together but include land use and soil type as input variables
• Investigate combining the data mining approach and the LSM to get the best of both


Acknowledgements

This research effort has been supported by a NASA-ROSES grant.

We appreciate the help provided by personnel at the USDA Natural Resources Conservation Service, and various NASA labs.

Soil forecast web site: www.rap.ucar.edu/projects/nasa-ag/hrldas/display_hrldas_animation.html

Cubist is available at www.rulequest.com