Data Warehousing/Data Mining: A Crop Insurance Application. Presented by Ashley Lovell, Director of Agricultural Programs & Professor of Agricultural Economics, at the National Risk Management Conference, D-FW Airport, March 26, 2003
Transcript
Page 1: Data Warehousing/data Mining: A Crop Insurance Application

1

Data Warehousing/Data Mining: A Crop Insurance Application

Presented By

Ashley Lovell
Director of Agricultural Programs &
Professor of Agricultural Economics

At the National Risk Management Conference

D-FW Airport

March 26, 2003

Page 2: Data Warehousing/data Mining: A Crop Insurance Application

2

Data Warehousing/Data Mining: A Crop Insurance Application

Presented By

Ashley Lovell
Director of Agricultural Programs &
Professor of Agricultural Economics

CAE Staff
Center for Agribusiness Excellence

Tarleton State University
The Texas A&M University System

[email protected]

Page 3: Data Warehousing/data Mining: A Crop Insurance Application

3

Overview

• ARPA 2000
• Data Warehouse
• Data Mining Research
• CAE Participants
• Overview of Activities & Research
• Cost-Benefit Analysis
• Conclusion

Page 4: Data Warehousing/data Mining: A Crop Insurance Application

4

Agricultural Risk Protection Act of 2000

• Crop Insurance Coverage

• Program Integrity

• Research and Pilot Programs

• Education and Risk Management Assistance

Page 5: Data Warehousing/data Mining: A Crop Insurance Application

5

RMA Customer Distribution 1999

Page 6: Data Warehousing/data Mining: A Crop Insurance Application

6

RMA Customer Distribution 2001

Page 7: Data Warehousing/data Mining: A Crop Insurance Application

7

Agricultural Risk Protection Act of 2000

• Improving Program Integrity – By Reducing Fraud, Waste and Abuse to Build a Stronger Crop Insurance Program and Lower Producer Costs
– RMA and FSA to reconcile producer information*
– RMA to establish methods to identify agents and adjusters who may be abusing the program
– Center for Agribusiness Excellence (CAE) established

*Spot Check Lists

Page 8: Data Warehousing/data Mining: A Crop Insurance Application

8

Agricultural Risk Protection Act of 2000

• Center for Agribusiness Excellence (CAE)
– Mission: To conduct research using a single data warehouse and associated data mining tools for enhancing the integrity of the Federal Crop Insurance Program, thus improving the program integrity
– Began operating in January 2001

Page 9: Data Warehousing/data Mining: A Crop Insurance Application

9

Data Warehouse

Description & Contents

Page 10: Data Warehousing/data Mining: A Crop Insurance Application

10

Data Warehouse

• Massively Large Relational Database (Multi Gigabyte - Terabytes)

• Generally Many Variables (Columns)

• Usually > 1 Million Observations (Rows)

• Multiple Tables (e.g., Data Tables)

• Consistent Representation (Dates, Units, Etc.)
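
As a minimal sketch of what "consistent representation" can mean when loading a warehouse, the snippet below converts dates in mixed formats to ISO form and areas to a single unit. The record layout, the list of date formats, and the helper names `normalize_date` and `normalize_area` are hypothetical and are not taken from the CAE warehouse.

```python
from datetime import datetime

# Hypothetical source formats; a real feed would need its own list.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y")
# Conversion factors to acres (1 hectare is about 2.47105 acres).
TO_ACRES = {"acres": 1.0, "hectares": 2.47105}

def normalize_date(text):
    """Return the date as an ISO string, trying each known format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")

def normalize_area(value, unit):
    """Return the area in acres, rounded to two decimals."""
    return round(value * TO_ACRES[unit.lower()], 2)

# Hypothetical raw rows arriving from three different sources.
raw_rows = [
    {"planted": "05/14/2002", "area": 120.0, "unit": "acres"},
    {"planted": "2002-05-14", "area": 40.0, "unit": "hectares"},
    {"planted": "14 May 2002", "area": 75.5, "unit": "Acres"},
]

clean_rows = [
    {"planted": normalize_date(r["planted"]),
     "area_acres": normalize_area(r["area"], r["unit"])}
    for r in raw_rows
]
```

After this step every row carries the same date format and the same unit, so rows from different sources can be joined and compared directly.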

Page 11: Data Warehousing/data Mining: A Crop Insurance Application

11

CAE Data Warehouse Contents (Completed Tasks)

• > 800 Million Records

• Includes:
– RMA Insurance Data 1991-2003
– NOAA Weather Data

Page 12: Data Warehousing/data Mining: A Crop Insurance Application

12

CAE Data Warehouse Contents (Tasks in Progress)

• GIS Linkage of Weather Station Data

• Integration of Soil Data

Page 13: Data Warehousing/data Mining: A Crop Insurance Application

13

CAE Data Warehouse (Other Databases to Be Loaded)

• Remote Sensing Data

-Collaboratively with Spatial Sciences Lab (SSL), Texas A&M University

• Climatological Data

-Collaboratively with University of Nebraska-Lincoln, USDA National Drought Mitigation Center (NDMC/UN-L)

Page 14: Data Warehousing/data Mining: A Crop Insurance Application

14

CAE Data Warehouse (Other Databases to Be Loaded)

• Economic (e.g., Cash and Futures Market Data)

• Soil Series Data

-Collaboratively with USDA NRCS National Cartography Laboratory, SSL/TAMU & NDMC/UN-L

Page 15: Data Warehousing/data Mining: A Crop Insurance Application

15

Data Mining Research

Page 16: Data Warehousing/data Mining: A Crop Insurance Application

16

Overview of Data Mining

[Diagram labels: Data Mining; Discovery; Predictive Modeling; Forensic Analysis; Conditional Logic; Affinities and Associations; Trends and Variations; Outcome Prediction; Forecasting; Deviation Detection; Link Analysis; Graphical]

Page 17: Data Warehousing/data Mining: A Crop Insurance Application

17

Modeling Methodology

• Linear Regression

• Logistic Regression

• Neural Networks

• Cluster Analysis

• Classification Trees

• Link Analysis

• Genetic Algorithms

Page 18: Data Warehousing/data Mining: A Crop Insurance Application

18

Center for Agribusiness Excellence

Page 19: Data Warehousing/data Mining: A Crop Insurance Application

19

CAE’s Partners

• Tarleton State University and Planning Systems Inc. (PSI) Are Partners in the Data Warehouse and Data Mining

• USDA Risk Management Agency Research Project
– Cooperative Agreement Signed on December 14, 2000
– Competitive Contract Awarded July 24, 2002, Effective September 1, 2002

Page 20: Data Warehousing/data Mining: A Crop Insurance Application

20

CAE’s Partner Contributions

• PSI has Expertise in Data Warehouse Development and Implementation

• RMA provides the data base and program operational experience

• Tarleton has Expertise in Agriculture and CIS and is the Project Contractor & Coordinator

Page 21: Data Warehousing/data Mining: A Crop Insurance Application

21

Overview of CAE Activities

January 2001-March 2003

Page 22: Data Warehousing/data Mining: A Crop Insurance Application

22

CAE Activities

• University Personnel Assigned: Jan 2001
• Data Model Finished: Jun 2001
• RY 2000 Data “Readied”: Jun 2001
• Producer Watch List: Jun 2001
• Data Warehouse Loaded 1991-2000: Sep 2001
• ARPA 150% Delivered: Oct 2001
• Updated 1998-2000 Data, 2001 Data: Nov 2001 – Dec 2001
• Growing Season Spot Check Lists: Mar 2002
• NASS Data Integrated: May 2002
• Web Interface Operational: Aug 2002
• 49 Completed Projects: Sept 2002 – Jan 2003
• RMA Spot Check List 2002 Delivered: Feb 2003
• Last Delivery & Loading of Data: Mar 2003

Page 23: Data Warehousing/data Mining: A Crop Insurance Application

23

CAE Research Drivers

• Legislation

• Work Orders

• Scenarios

Page 24: Data Warehousing/data Mining: A Crop Insurance Application

24

CAE Research Drivers

Legislation, specifically

• ARPA of 2000 “…The Secretary shall establish procedures under which the Corporation will be able to identify the following: …

Page 25: Data Warehousing/data Mining: A Crop Insurance Application

25

CAE Research Drivers

• Any person performing loss adjustment services relative to coverage offered under this title where such loss adjustments performed by the person result in accepted or denied claims equal to or greater than 150 percent … of the mean for accepted or denied claims (as applicable) for all other persons performing loss adjustment services in the same area, as determined by the Corporation….”

• In addition to crop adjusters, ARPA included crop insurance agents.
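
The 150 percent test quoted above can be sketched in a few lines. This is an illustration under stated assumptions, not the Corporation's procedure: the record layout, the use of one claim total per person, and the function name `flag_150_percent` are hypothetical, and the statute leaves the definition of "area" to the Corporation.

```python
from collections import defaultdict

def flag_150_percent(claims, threshold=1.5):
    """Flag persons whose claim total is at least `threshold` times the
    mean claim total of all OTHER persons in the same area.

    claims: iterable of (area, person_id, claim_total) tuples (assumed layout).
    Returns a list of (area, person_id, total, mean_of_others).
    """
    by_area = defaultdict(dict)
    for area, person, total in claims:
        by_area[area][person] = by_area[area].get(person, 0.0) + total

    flagged = []
    for area, totals in by_area.items():
        if len(totals) < 2:
            continue  # no "other persons" to compare against
        area_sum = sum(totals.values())
        for person, total in totals.items():
            # Exclude the person being tested from the comparison mean.
            mean_others = (area_sum - total) / (len(totals) - 1)
            if mean_others > 0 and total >= threshold * mean_others:
                flagged.append((area, person, total, mean_others))
    return flagged

# Hypothetical adjusters in two areas.
claims = [
    ("A", "adj1", 100.0), ("A", "adj2", 110.0),
    ("A", "adj3", 90.0), ("A", "adj4", 300.0),
    ("B", "adj5", 50.0), ("B", "adj6", 60.0),
]
flagged = flag_150_percent(claims)
```

In this made-up data only `adj4` is flagged: the other three adjusters in area A average 100, and 300 is well above 150 percent of that. The same function could be run on agent records, since ARPA covers agents as well as adjusters.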

Page 26: Data Warehousing/data Mining: A Crop Insurance Application

26

CAE Research Drivers

• Work Orders - RMA Personnel Routinely Submit Requests (That Result in Work Orders) Which Focus the Research Resources of CAE

• Scenarios*
– Over sixty scenarios/sub-scenarios
– Initiated scenario development early in 2001

*Indicators of Fraud, Waste, and Abuse

Page 27: Data Warehousing/data Mining: A Crop Insurance Application

27

Spot Check List: 2002 Data for ARPA Requirement

Scenarios for Spot Check:

• Triplets
• Frequent Filers
• Yield Switching
• Prevented Planting Frequent Filers
• Producers Associated With All or Nothing Agents
• Crop Units With Excessive Yields
• Under Reported Harvested Production
• Rare Big Losers

Page 28: Data Warehousing/data Mining: A Crop Insurance Application

28

Rare Big Losers

• Identify Rare Multi-year Losers, Using the Probability of Loss

• Local Yield Variability Considered

• Cluster and Factor Analysis Show the Importance of Local Conditions

• A Producer’s Loss Ratios Strongly Related to Insurance Plan and Coverage Level
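
The slides do not say how the probability of loss is turned into a flag. One simple stand-in is a binomial tail test: given a local per-year probability of loss, how likely is it that a producer would lose in at least this many of the insured years? The sketch below assumes years are independent, and the names `prob_at_least`, `is_rare_loser`, and the default `alpha` are illustrative choices rather than CAE's method.

```python
from math import comb

def prob_at_least(k, n, p):
    """Binomial tail: probability of k or more losses in n independent
    years when each year has loss probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def is_rare_loser(loss_years, insured_years, local_loss_prob, alpha=0.0001):
    """Flag a producer whose loss history would be this extreme with
    probability below alpha under local conditions (assumed cutoff)."""
    return prob_at_least(loss_years, insured_years, local_loss_prob) < alpha

# Same loss history, two hypothetical areas with different local risk.
low_risk_area = is_rare_loser(6, 6, 0.2)   # six losses in six years, p = 0.2
high_risk_area = is_rare_loser(6, 6, 0.5)  # six losses in six years, p = 0.5
```

The two calls show why local conditions matter: six losses in six years has probability 0.2^6 = 0.000064 where losses are uncommon, so that producer is flagged, while the same record has probability 0.5^6 (about 0.0156) in a high-risk area and is not.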

Spot Check 2002

Page 29: Data Warehousing/data Mining: A Crop Insurance Application

29

Iowa & Oklahoma Are Different!
Claims by Insurance Plan and Coverage Level

[Chart: Policies with Claims (vertical axis, 0.00 to 0.70) by Coverage Levels (horizontal axis: CAT, 50, 55, 60, 65, 70, 75, 80, 85). Series: Iowa APH claims, Iowa Revenue claims, Oklahoma APH claims, Oklahoma Revenue claims. Label: Region, Insurance Plan, & Coverage Level]

Page 30: Data Warehousing/data Mining: A Crop Insurance Application

30

Rare Big Losers

Average Indemnity of $98,664

Page 31: Data Warehousing/data Mining: A Crop Insurance Application

31

Rare Big Losers Results*

• 350 Unique Producers Accounting for $34,532,565 of Indemnity in 2002

• They Were Flagged at the 0.0001 Level

• Average Indemnity of $98,664

• 72.5 Percent of Their Policies Resulted in a Significant Loss

*Indicators of Fraud, Waste, and Abuse

Page 32: Data Warehousing/data Mining: A Crop Insurance Application

32

All or Nothing

• Producers Who Are Associated With All or Nothing Agents

• All or Nothing Agents Are Those Agents Who Have Disproportionate Numbers of Crop Policies With Total Losses Compared to Other Agents Within the Same Area

• Associated Producers Have Total Loss Claims

• Associated Producers Who Were Indemnified in More Than One Year
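
The scenario above can be sketched as two steps: find agents whose share of total-loss policies is out of line with other agents in the same area, then list their producers with total-loss claims in more than one year. The record layout, the `ratio=2.0` cutoff for "disproportionate", and both function names are assumptions made for illustration; the slides do not give the actual criteria.

```python
from collections import defaultdict

def all_or_nothing_agents(policies, ratio=2.0):
    """Return (area, agent) pairs whose total-loss rate exceeds `ratio`
    times the rate of all other agents in the same area."""
    agent_counts = defaultdict(lambda: [0, 0])  # [total-loss policies, all policies]
    area_counts = defaultdict(lambda: [0, 0])
    for area, agent, _producer, _year, total_loss in policies:
        for counts in (agent_counts[(area, agent)], area_counts[area]):
            counts[0] += int(total_loss)
            counts[1] += 1

    flagged = set()
    for (area, agent), (losses, n) in agent_counts.items():
        other_losses = area_counts[area][0] - losses
        other_n = area_counts[area][1] - n
        if other_n == 0 or losses == 0:
            continue  # nothing to compare against, or no total losses
        if losses / n > ratio * (other_losses / other_n):
            flagged.add((area, agent))
    return flagged

def associated_producers(policies, flagged):
    """Producers of flagged agents with total-loss claims in more than one year."""
    years = defaultdict(set)
    for area, agent, producer, year, total_loss in policies:
        if total_loss and (area, agent) in flagged:
            years[producer].add(year)
    return sorted(p for p, ys in years.items() if len(ys) > 1)

# Hypothetical records: (area, agent, producer, year, total_loss)
policies = [
    ("X", "a1", "p1", 2001, True), ("X", "a1", "p1", 2002, True),
    ("X", "a1", "p2", 2002, True), ("X", "a1", "p3", 2002, False),
    ("X", "a2", "p4", 2001, False), ("X", "a2", "p4", 2002, False),
    ("X", "a2", "p5", 2001, True), ("X", "a2", "p5", 2002, False),
    ("X", "a2", "p6", 2002, False),
]
flagged_agents = all_or_nothing_agents(policies)
spot_check = associated_producers(policies, flagged_agents)
```

In this made-up data agent `a1` has total losses on three of four policies against one of five for the other agent, so `a1` is flagged and producer `p1`, with total losses in both 2001 and 2002, lands on the list.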

Page 33: Data Warehousing/data Mining: A Crop Insurance Application

33

All or Nothing Producers $12,150,707 Indemnity for 236 Producers

Page 34: Data Warehousing/data Mining: A Crop Insurance Application

34

2002 Spot Check List Summary

Scenario*                      Indemnity       Producers

Triplets                       $4,332,310      99
Frequent Filers                $21,718,632     328
Yield Switching                $15,486,631     285
Prevented Planting FF          $7,011,644      60
All or Nothing                 $12,150,707     236
Excessive Yield                $36,201,574     389
Under Reported Harvest Prod    $23,502,812     225
Rare Big Losers                $32,817,867     323

Unduplicated Totals            $137,678,258    1,808

*Indicators of Fraud, Waste, and Abuse
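
The unduplicated totals are smaller than the column sums because a producer can be flagged under more than one scenario. The short check below uses the figures from the summary table; the producer IDs in the second half are hypothetical and only illustrate how a set union removes the double counting.

```python
# Figures copied from the 2002 Spot Check List summary: (indemnity, producers).
scenarios = {
    "Triplets": (4_332_310, 99),
    "Frequent Filers": (21_718_632, 328),
    "Yield Switching": (15_486_631, 285),
    "Prevented Planting FF": (7_011_644, 60),
    "All or Nothing": (12_150_707, 236),
    "Excessive Yield": (36_201_574, 389),
    "Under Reported Harvest Prod": (23_502_812, 225),
    "Rare Big Losers": (32_817_867, 323),
}
reported_indemnity, reported_producers = 137_678_258, 1_808

# Simple column sums count a producer once per scenario.
sum_indemnity = sum(ind for ind, _ in scenarios.values())
sum_producers = sum(n for _, n in scenarios.values())

# The gap is the amount of double counting removed by deduplication.
duplicate_listings = sum_producers - reported_producers
duplicate_indemnity = sum_indemnity - reported_indemnity

# Hypothetical producer IDs: P2 appears under two scenarios.
flagged_ids = {
    "Triplets": {"P1", "P2"},
    "Yield Switching": {"P2", "P3"},
}
unique_ids = set().union(*flagged_ids.values())
```

The column sums come to $153,222,177 and 1,945 listings, so deduplication removes 137 repeat listings and $15,543,919 of indemnity that would otherwise be counted more than once.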

Page 35: Data Warehousing/data Mining: A Crop Insurance Application

35

Total 2002 Spot Check List
$137,678,258 Indemnity for 1,808 Insureds

Page 36: Data Warehousing/data Mining: A Crop Insurance Application

36

Data Mining Activities Publicized in Weekly Newsletter

Newsletter Volume 1, No. 1, Week of February 7, 2003

This week, the development of the 2003 Spot Check List is a continuing major research activity and includes all CAE staff members. The following scenarios are the basis for the Spot Check List (SCL) that will be finalized for delivery to RMA early in March.

Page 37: Data Warehousing/data Mining: A Crop Insurance Application

37

Cost-Benefit Analysis

Data Mining Pays Off

Page 38: Data Warehousing/data Mining: A Crop Insurance Application

38

Cost-Benefit Analysis Examples

• Data Mining in Texas, Similar to CAE’s, Identified Areas of Tax Underpayment

• In FY 2000, The State of Texas Comptroller Collected An Additional $43 Million in Taxes From Areas of Underpayment Identified Through Data Mining

Page 39: Data Warehousing/data Mining: A Crop Insurance Application

39

Cost-Benefit Analysis

• Texas Blue Cross-Blue Shield Developed a Medical Insurance Data Warehouse

• In the First Three Months, Data Mining Identified Enough Medical Fraud to Pay for the Data Warehouse & Mining

Page 40: Data Warehousing/data Mining: A Crop Insurance Application

40

Conclusions

• Data Mining Can Detect Patterns of Waste, Fraud, and Abuse

• Millions of Taxpayer and Insurance Provider Dollars Can Be Saved Through Data Mining Using Forensic Analysis Techniques

• This Research Provides USDA with Analysis Tools Previously Unavailable

Page 41: Data Warehousing/data Mining: A Crop Insurance Application

41

Conclusions

• Crop Insurance Is Vulnerable to Multiple Methods of Fraud, Waste and Abuse

• A Small Number of Agents, Adjusters and Producers Are Linked to Anomalous Behavior