Data Driven Model Development David LeBauer, Mike Dietze, Deepak Jaiswal, Rob Kooper, Stephen P. Long, Shawn Serbin, Dan Wang
Page 1: Le Bauer:  Data Driven Model Development

Data Driven Model Development

David LeBauer, Mike Dietze, Deepak Jaiswal, Rob Kooper, Stephen P. Long, Shawn Serbin, Dan Wang

Page 2: Le Bauer:  Data Driven Model Development

Objective: Useful Predictions

Clark et al. 2001. Ecological Forecasts: An Emerging Imperative. Science

[Figure: axis labels "Precision, Accuracy" and "Information"]

Page 4: Le Bauer:  Data Driven Model Development

An error has occurred. To continue:

Press Enter to return to Windows, or

Press CTRL+ALT+DEL to restart your computer. If you do this, you will lose any unsaved information in all open applications

Error: 0E : 016F : BFF9B3D4

Press any key to continue _

Windows

Page 5: Le Bauer:  Data Driven Model Development

Technical Uncertainty: A Cautionary Tale

[Figure: Yield. Series: Observed]

Page 6: Le Bauer:  Data Driven Model Development

Technical Uncertainty: A Cautionary Tale

[Figure: Yield. Series: Observed, Priors]

Page 7: Le Bauer:  Data Driven Model Development

Technical Uncertainty: A Cautionary Tale

[Figure: Yield. Series: Observed, Priors, + Trait Data]

Page 8: Le Bauer:  Data Driven Model Development

Technical Uncertainty: A Cautionary Tale

[Figure: Yield. Series: Observed, Priors, + Trait Data, + Flux Data]

Page 9: Le Bauer:  Data Driven Model Development

Technical Uncertainty: A Cautionary Tale

[Figure: Yield. Series: Observed, Priors, + Trait Data, + Flux Data. Annotation: Annual Merge]

Page 10: Le Bauer:  Data Driven Model Development

Technical Uncertainty: A Cautionary Tale

[Figure: Yield. Series: Observed, Priors, + Trait Data, + Flux Data, + Latest Version. Annotation: Annual Merge]

Page 11: Le Bauer:  Data Driven Model Development

Best Practices

Write programs for people, not computers

Automate repetitive tasks

Use the computer to record history

Make incremental changes

Use version control

Don't repeat yourself (or others)

Plan for mistakes

Optimize software only after it works correctly

Document the design and purpose of code

Conduct code reviews

Wilson et al. 2012. Best Practices for Scientific Computing. arXiv:1210.0530v3

Page 13: Le Bauer:  Data Driven Model Development

Best Practices 1: Automation

Altintas et al 2004. Kepler: an extensible system for design and execution of scientific workflows. Proc 16th ICSSDM

Page 14: Le Bauer:  Data Driven Model Development

Parameter Uncertainty: Test Case

Single Analysis:

Contribution of parameter uncertainty to uncertainty in Switchgrass Yield prediction.

LeBauer, Wang, Richter, Davidson, and Dietze 2013. Facilitating Feedbacks between ecological models and data. Ecological Monographs
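
One way to read "contribution of parameter uncertainty" is as a one-at-a-time variance decomposition: vary each parameter over its distribution while holding the others at their medians, then compare the resulting output variances. The sketch below is a minimal illustration under that assumption. The toy yield function, the parameter names, and their distributions are hypothetical and are not taken from the cited analysis.

```python
import random
import statistics

# Hypothetical toy "model": yield as a function of three parameters.
def toy_yield(params):
    return params["sla"] * params["vcmax"] / (1.0 + params["resp"])

# Hypothetical parameter distributions, each a normal given as (mean, sd).
PARAM_DISTS = {
    "sla": (15.0, 3.0),
    "vcmax": (40.0, 4.0),
    "resp": (0.5, 0.05),
}

def variance_contributions(model, dists, n=2000, seed=1):
    """Fraction of summed one-at-a-time output variance due to each parameter."""
    rng = random.Random(seed)
    # For a normal distribution the median equals the mean.
    medians = {name: mean for name, (mean, _sd) in dists.items()}
    variances = {}
    for name, (mean, sd) in dists.items():
        outputs = []
        for _ in range(n):
            params = dict(medians)              # all other parameters fixed
            params[name] = rng.gauss(mean, sd)  # vary only this parameter
            outputs.append(model(params))
        variances[name] = statistics.pvariance(outputs)
    total = sum(variances.values())
    return {name: v / total for name, v in variances.items()}

fractions = variance_contributions(toy_yield, PARAM_DISTS)
for name, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {frac:.1%} of summed one-at-a-time variance")
```

Because each parameter is varied alone, this sketch ignores interactions between parameters; it only ranks parameters by how much of the output spread each one produces on its own.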

Page 15: Le Bauer:  Data Driven Model Development

Parameter Uncertainty: Automated

Contribution of parameter uncertainty to model uncertainty.

* 17 Plant functional types

* 6 biomes

* 8 scientists

* 6 Months

Dietze, Serbin, Davidson, Desai, Feng, Kelly, Kooper, LeBauer, Mantooth, McHenry, and Wang. Submitted. A quantitative assessment of a terrestrial biosphere model's data needs across North American biomes. JGR

[Figure: axis label "% SD Explained"]

Page 16: Le Bauer:  Data Driven Model Development

Best Practices 2: Iteration with Testing

Wilson et al 2012. Best Practices for Scientific Computing. arXiv:1210.0530v3

Page 17: Le Bauer:  Data Driven Model Development

Case Study: C4 Crop, Coppice Willow

C3 Photosynthesis

Perennial Stem

Leaf Senescence

Page 18: Le Bauer:  Data Driven Model Development

Benchmark Data

Aboveground Biomass

23 Calibration Sites

72 Observations

[Figure: Observed (Mg/ha), scale 0.0 to 60.0]

Page 19: Le Bauer:  Data Driven Model Development

[Figure: axes RMSE*, Correlation, Standard Deviation*. *Scaled to sd(data) = 1]

Results:

Start (C4 Grass)

+ C3 Photosynthesis

+ Perennial Stem

+ Fixed Respiration

+ Leaf Senescence

Page 20: Le Bauer:  Data Driven Model Development

[Figure: axes RMSE*, Correlation, Standard Deviation*. *Scaled to sd(data) = 1. Values shown: 0.74, 0.67, 0.20]

Results:

Start (C4 Grass)

+ C3 Photosynthesis

+ Perennial Stem

+ Fixed Respiration

+ Leaf Senescence

Page 22: Le Bauer:  Data Driven Model Development

[Figure: axes RMSE*, Correlation, Standard Deviation*. *Scaled to sd(data) = 1. Values shown: 0.74, 0.67, 0.20, 1.46]

Results:

Start (C4 Grass)

+ C3 Photosynthesis

+ Perennial Stem

+ Fixed Respiration

+ Leaf Senescence

Page 24: Le Bauer:  Data Driven Model Development

[Figure: axes RMSE*, Correlation, Standard Deviation*. *Scaled to sd(data) = 1. Values shown: 0.74, 0.67, 0.20, 0.30, 0.87, 0.84, 1.46]

Results:

Start (C4 Grass)

+ C3 Photosynthesis

+ Perennial Stem

+ Fixed Respiration

+ Leaf Senescence
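
The three statistics named on the Results slides (RMSE, correlation, and standard deviation, with the starred ones scaled so that the data have a standard deviation of 1) can be computed from paired observed and predicted values. The sketch below is one possible reading, not the authors' code: it assumes the RMSE is centered (bias removed), as in a Taylor diagram, which the slides do not state. The function name and the sample numbers are made up for illustration.

```python
import math
import statistics

def taylor_statistics(observed, predicted):
    """Return (scaled SD, correlation, scaled centered RMSE).

    Scaling divides by the standard deviation of the observations,
    so the observations themselves have a scaled SD of 1.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(observed)
    mean_o = statistics.fmean(observed)
    mean_p = statistics.fmean(predicted)
    sd_o = statistics.pstdev(observed)
    sd_p = statistics.pstdev(predicted)
    if sd_o == 0 or sd_p == 0:
        raise ValueError("both series must vary")
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    correlation = cov / (sd_o * sd_p)
    # Centered RMSE: remove each series' mean before differencing.
    centered_rmse = math.sqrt(
        sum(((p - mean_p) - (o - mean_o)) ** 2
            for o, p in zip(observed, predicted)) / n
    )
    return sd_p / sd_o, correlation, centered_rmse / sd_o

# Made-up biomass values (Mg/ha), for illustration only.
observed = [5.0, 12.0, 20.0, 31.0, 44.0]
predicted = [8.0, 10.0, 25.0, 28.0, 50.0]
sd_ratio, corr, rmse_scaled = taylor_statistics(observed, predicted)
print(f"scaled SD={sd_ratio:.2f} correlation={corr:.2f} scaled RMSE={rmse_scaled:.2f}")
```

Under this convention the three numbers are tied together by the relation rmse² = sd² + 1 − 2·sd·correlation, which is why a single point on such a diagram can show all of them at once.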

Page 25: Le Bauer:  Data Driven Model Development

[Figure: Predicted vs. Observed Aboveground Biomass (Mg/ha)]

Page 26: Le Bauer:  Data Driven Model Development

Conclusions

* Best practices lead to more effective and efficient modeling

* Applied integration tests to support model development

* Controlling technical error produces more robust and accurate inference

Page 27: Le Bauer:  Data Driven Model Development

Future Directions

* Track benchmark metrics for specific model runs

* Maintain ability to reproduce published results

* Automated testing with each code commit or major release

* Current Metrics to define limits of model credibility
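
The directions above (tracking a benchmark metric per model run and checking it automatically on each commit) could be sketched roughly as follows. This is an illustration under stated assumptions rather than the project's actual test setup: `check_benchmark`, the run identifiers, the threshold, and all numbers are hypothetical.

```python
import json
import math

def rmse(observed, predicted):
    """Root mean squared error between paired values."""
    return math.sqrt(
        sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

def check_benchmark(run_id, observed, predicted, history, max_rmse):
    """Record the RMSE for this run; return True if it is within the limit."""
    score = rmse(observed, predicted)
    history[run_id] = score  # keeps a per-run record of the metric
    return score <= max_rmse

# Made-up benchmark data and model outputs, for illustration only.
observed = [10.0, 20.0, 30.0]
history = {}
ok_old = check_benchmark("commit-aaa", observed, [11.0, 19.0, 31.0],
                         history, max_rmse=2.0)
ok_new = check_benchmark("commit-bbb", observed, [15.0, 26.0, 38.0],
                         history, max_rmse=2.0)

print(json.dumps(history, indent=2))
print("commit-aaa passes:", ok_old)
print("commit-bbb passes:", ok_new)
```

A check like this could be called from a test runner on each commit, with the stored history kept alongside the code so that a published result can be traced back to the run that produced it.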

Page 28: Le Bauer:  Data Driven Model Development

More Information

Email: [email protected]

Web: pecanproject.org

Development: github.com/pecanproject