Data Mining on the Farm Accelerating the search for a better pesticide John B. Kinney, Senior Research Associate DuPont Biosolutions Enterprise Spotfire.

Post on 12-Jan-2016


Data Mining on the Farm

Accelerating the search for a better pesticide

John B. Kinney, Senior Research Associate

DuPont Biosolutions Enterprise

Spotfire User’s Conference, May 3-4, 2001

© 2001, DuPont, Inc. - All Rights Reserved.

DuPont Biosolutions Enterprise

Crop Protection Products*: Control weeds, insects, and plant diseases

Pioneer Hi-Bred: High-performance seeds

Protein Technologies: Soy protein isolates used in the food industry

Qualicon: Food safety

* Focus of today’s talk

CPP Goals

Control pests:

Efficaciously

Safely

Environmentally

Cost effectively

CPP Research & Data Styles

in vitro

in vivo

Field

in vivo CPP Research

“Treating the bed to cure the patient”

Plants in pots

Length of test is a factor

“Extra” data

Herbicide Test Unit

[Figure: herbicide test unit. Pot labels: CRL, BYG, FTI, MOG, PWX, VEL, BGC. Panels: Test Substance, Control.]

Field Tests

Same as in vivo, but with less control!

“Extra” data

Degradation and movement in the environment are major issues

Data Issues

Biological variability

(Highly) Multivariate data

EC50 results are uncommon

Historical data is valuable

Successful Applications of Data Visualization

Sourcing: Preformatted data sets for sample acquisition analysis

Hit Followup: R-group visualization and analysis

Lead Optimization: Color-coded reports for rapid, high-dimensional comparisons

Browsing Acquisition Analysis Data

Challenge: Characterize and evaluate offerings from compound brokers and collaborators

Solution: External system to characterize offerings and build tables for browsing in Spotfire

Minimal interface...

User selection from existing “evaluation tables”

Spotfire for browsing

Parallel Synthesis Hit Followup

Visualization and analysis of combinatorial library

Row and Column layout useful, but not chemically relevant!

Merging synthetic schemes combined with biology

Hansch-style characterization often helpful for identifying trends and features

Fragment properties and whole molecule data can provide insights

[Figure: chemical structure with substituent positions R1 and R2.]

R1 == methyl, ethyl, propyl, etc.

R2 == -F, -Cl, -Br, -I
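The point about plate layout versus fragment data can be sketched in a few lines of Python. This is a minimal illustration, not the workflow used in the talk: the function name `plate_to_fragment_table`, the row/column assignments, and the choice of the Hansch π constant as the only descriptor are assumptions made for the example. The π values are approximate, commonly tabulated substituent constants and are included only to show the shape of the table.

```python
from itertools import product

# Approximate Hansch pi (hydrophobicity) substituent constants.
# Illustrative values only; a real project would use its own descriptor tables.
R1_PI = {"methyl": 0.56, "ethyl": 1.02, "propyl": 1.55}
R2_PI = {"F": 0.14, "Cl": 0.71, "Br": 0.86, "I": 1.12}


def plate_to_fragment_table(r1_by_row, r2_by_col, response_by_well=None):
    """Return one record per well, keyed by chemistry instead of plate position.

    r1_by_row:        hypothetical mapping of plate row label -> R1 group
    r2_by_col:        hypothetical mapping of plate column number -> R2 group
    response_by_well: optional mapping of well id -> measured response
    """
    response_by_well = response_by_well or {}
    records = []
    for (row, r1), (col, r2) in product(r1_by_row.items(), r2_by_col.items()):
        well = f"{row}{col}"
        records.append({
            "well": well,
            "R1": r1,
            "R2": r2,
            "pi_R1": R1_PI[r1],
            "pi_R2": R2_PI[r2],
            "pi_sum": round(R1_PI[r1] + R2_PI[r2], 2),
            # None when no biology has been merged in for this well yet.
            "response": response_by_well.get(well),
        })
    return records


if __name__ == "__main__":
    rows = {"A": "methyl", "B": "ethyl", "C": "propyl"}
    cols = {1: "F", 2: "Cl", 3: "Br", 4: "I"}
    for record in plate_to_fragment_table(rows, cols):
        print(record)
```

Plotting `pi_R1` against `pi_R2` (rather than row against column) puts chemically similar compounds next to each other, which is the sense in which fragment properties can expose trends that the plate grid hides.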

Plate layout vs. Fragment Data

Lead Optimization

Numerous test and characterization values for each compound

History of complex, printed data reports

PRIMARY PLANT RESPONSE (WEEDS)

INCODE = CPD1   DEPT = 8   DATE = 891127
SUBMITTER =
N.B = 056898   N.B.PAGE = 021
AMT = .21G   % = 100   FORM =
LEAD AREA =
#/MOLNM
#/Info= CHEMICAL NAME AVAILABLE UPON REQUEST

Columns: YY/MM/DD, TYPE TEST, RATE, UNITS, MORN GLORY, COCKL BUR, VELV LEAF, PIG WEED, CRAB GRASS, GIANT FOXTL, FOXTL MILLT, B Y GRASS, CHEAT GRASS, DOWNY BROME, WILD OATS, SOR GUM, COMMENT

90/01/02   POST   1.0 KG/HA   90H 70H 70H 10C 10C 0 40G 0 30C   HERBICIDE

90/01/02   PRE   2.0 KG/HA   0 10H 30H 0 20H 0 0 0 30C

HeLo

Project Overview w/Heat Maps
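As a rough illustration of how entries from a printed report could feed a color-coded view, the sketch below parses rating tokens such as `90H` or `0` and assigns each a color bin. It rests on assumptions not stated in the slides: that each token is a 0-100 rating followed by an optional one-letter symptom code, and that the thresholds and color names in `color_bin` are reasonable. The names `parse_rating`, `color_bin`, and `heat_row` are hypothetical.

```python
import re

# A token is assumed to be a 0-100 rating plus an optional one-letter code.
RATING = re.compile(r"^(\d{1,3})([A-Z]?)$")


def parse_rating(token):
    """Split a token like '90H' into (90, 'H'); a bare '0' gives (0, '')."""
    match = RATING.match(token)
    if match is None:
        raise ValueError(f"unrecognized rating token: {token!r}")
    percent = int(match.group(1))
    if percent > 100:
        raise ValueError(f"rating out of range: {token!r}")
    return percent, match.group(2)


def color_bin(percent):
    """Map a rating to a color name. Thresholds are arbitrary placeholders."""
    if percent >= 70:
        return "red"
    if percent >= 30:
        return "orange"
    if percent > 0:
        return "yellow"
    return "white"


def heat_row(tokens):
    """Return the color for each rating token in one report row."""
    return [color_bin(parse_rating(token)[0]) for token in tokens]


if __name__ == "__main__":
    # Rating tokens taken from the POST row of the printed report.
    post_row = "90H 70H 70H 10C 10C 0 40G 0 30C".split()
    print(heat_row(post_row))
```

Keeping the symptom letter separate from the numeric rating means the number can drive the cell color while the letter remains available as a label or tooltip.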

Future Challenges

Better data extraction/formatting techniques

Expanding data warehouse to include non-traditional data sources

Computer screen real estate!

Acknowledgements

At the risk of missing someone...

Kevin Kranis (retired)

Laurie Christianson

Dan Kleier

The entire Discovery Organization -- They generated the data!
