Making workflows Work Enterprise KNIME deployment at Lilly James A. Lumley (Research IT UK) ChemAxon UGM Budapest 2014

EUGM 2014 - James Lumley (Eli Lilly and Co.): Making Workflows Work: Enterprise deployment of KNIME at Lilly

Jul 14, 2015

Page 1: EUGM 2014 - James Lumley (Eli Lilly and Co.): Making Workflows Work: Enterprise deployment of KNIME at Lilly

Making workflows Work

Enterprise KNIME deployment at Lilly

James A. Lumley (Research IT UK)

ChemAxon UGM Budapest 2014

Page 2

Making workflows Work!

1. Why KNIME?

2. Old meets New

3. Don’t mention structures

4. Better conversions

20/05/2014

Page 3

Making workflows Work!

1. Why KNIME?

Page 4

Page 5

KNIME@Lilly ‘Freemium’ turned ‘Premium’

• 2010: Strong usage, including open-source contributions, by Mike Bodkin’s UK CompChem group

• 2012: Research IT consolidated workflow tools via a KNIME.com Enterprise license and built an infrastructure to develop and deploy the tool globally

20/05/2014 Company Confidential © 2014 Eli Lilly and Company

Page 6

• Java/Eclipse platform allows easy creation of custom extensions (including security model)

• Server helps drive Sci/IT collaboration and knowledge capture

• Integration with existing legacy systems & data (esp. via SOA)

• Strong precedent for workflow software in Pharma

Page 7

Infrastructure to support the deployment

Page 8

Open-source nodes:

• Due a ‘refresh’

• ChemAxon dependency in many nodes, including chemical structure handling (conversions), sketcher (Marvin), Molecule Difference check (testing) and rendering (views)

Page 9

Example Lilly node using ChemAxon:

• Multi-molecule sketcher extension based on Marvin

• Configure to sketch and edit multiple structures or reactions

• Output multiple structures (port_0) or reactions (port_1) on node execution

• Internally reuse code for sketcher applet in webportal

Page 10

2 years on, significant usage*:

• CompChem/MedChem, ADME Reporting, Analytical Technologies Automation, Sample Management, Automating Data ETL & Data Exploration…

* http://www.knime.com/files/004_kuduk2013-jamesalumley-lilly.pdf

http://www.knime.com/knime-user-day-uk-2013-news

Page 11

Making workflows Work!

2. Old Meets New:

KNIME working alongside legacy systems

Page 12

Many nodes link legacy systems:

1. Retain ‘trusted’ status of internal data access tools (e.g.: internal system for integrated data access, Mobius)*

2. Retain power of in-house legacy predictive modelling code, e.g.: SVM models (Unix code)

3. Interface with new systems, e.g.: AT Structure Verification tools

50% of >100 internal nodes use SOA or similar to serve analytics tools and data to KNIME

* http://www.triconference.com/11/ird

Page 13

Making workflows Work!

3. Don’t Mention Structures:

Getting KNIME to work with different data security models

Page 14

Huge reliance on SOA to provide Tools and Data to KNIME:

+ moves data security issues to web service layer

+ reduces CPU load on ‘office’ laptops

- Services need constant monitoring

- Large work effort adding NTLM Auth to Webservice nodes

Page 15

• In-application support page/tab

• Status of Webservices (separates node errors from service layer errors)

• Links to Webpages

• Known Bugs/Issues from Redmine
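
As a rough illustration of how a status page can separate node errors from service layer errors, the sketch below probes each backing service and then attributes a node failure to the service layer only if that node's service is down. The service names, the probe functions and the classification rule are all assumptions made for the example; the slides do not describe how the Lilly dashboard is implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceStatus:
    name: str
    up: bool
    detail: str


def check_services(probes: Dict[str, Callable[[], None]]) -> List[ServiceStatus]:
    """Run each probe; a probe signals 'service down' by raising an exception."""
    statuses = []
    for name, probe in probes.items():
        try:
            probe()
            statuses.append(ServiceStatus(name, True, "OK"))
        except Exception as exc:
            statuses.append(ServiceStatus(name, False, f"{type(exc).__name__}: {exc}"))
    return statuses


def classify_failure(service_name: str, statuses: List[ServiceStatus]) -> str:
    """Blame the service layer if the node's backing service is down, else the node."""
    for status in statuses:
        if status.name == service_name and not status.up:
            return "service layer"
    return "node"


# Hypothetical probes standing in for real web service health checks.
def _probe_ok() -> None:
    return None


def _probe_down() -> None:
    raise ConnectionError("timed out")


STATUSES = check_services({"property-service": _probe_ok, "data-service": _probe_down})
for status in STATUSES:
    print(status.name, "UP" if status.up else "DOWN", status.detail)
```

In a real deployment the probes would call the actual web service endpoints; they are injected as plain callables here so the sketch runs without a network.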

Page 16

Making workflows Work!

4. Better conversions

Ensuring good interplay between the many chemical data types in KNIME without users feeling the pain

Page 17

• Converter nodes are in the top 20 most commonly used nodes in an analysis of >2000 workflows on the Lilly KNIME server

• Some workflows contain around 50% converter nodes

• New users are confused by multiple molecule types and conversions

(Analysis from Summer 2012)
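
The kind of counting behind these figures can be sketched as follows. This is only an illustration under stated assumptions: the workflow contents, the node names and the set of nodes treated as converters are invented for the example and are not the data or method of the 2012 analysis.

```python
from collections import Counter

# Hypothetical node lists per workflow; the real analysis covered >2000 workflows.
WORKFLOWS = {
    "wf_a": ["Reader", "Converter A to B", "Converter B to C", "Row Filter"],
    "wf_b": ["Reader", "Converter A to B", "Column Filter", "Sorter", "Writer"],
}

# Hypothetical set of node names counted as converters.
CONVERTER_NODES = {"Converter A to B", "Converter B to C"}


def node_usage(workflows):
    """Count how often each node name appears across all workflows."""
    counts = Counter()
    for nodes in workflows.values():
        counts.update(nodes)
    return counts


def converter_fraction(nodes, converters):
    """Fraction of a workflow's nodes that are converter nodes."""
    if not nodes:
        return 0.0
    return sum(1 for node in nodes if node in converters) / len(nodes)


usage = node_usage(WORKFLOWS)
print(usage.most_common(2))
for name, nodes in WORKFLOWS.items():
    print(name, converter_fraction(nodes, CONVERTER_NODES))
```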

Page 18

[Diagram: example workflow in which each tool expects a different molecule type, linked by four converter nodes]

• Internal data retrieval system serves data and molecule as Chime type

• Lilly Matched Pairs node requires RDKit type

• Internal Unix code (service layer) requires SMILES value

• Property calculator needs CDK type

Page 19

• Different chemical types don’t work well together

• Users constantly convert chemical data from one ‘type’ to another

• Worse for Lilly nodes, which use many formats with no ‘standard’ vendor-like representation

Page 20

Aim:

• Remove the need for users to manually add chemical converter nodes

• Ensure nodes that use different chemical formats work together better

Page 21

KNIME.com introduced “Adaptor Cell” in 2.9

• Container with several representations of same entity

• Node can add additional representations that can be re-used by downstream nodes

• Avoids multiple conversions

• Original representation still present

• Vendor specific! No pseudo-standards such as SDF

[Diagram labels: SDF, RDKit, CDK, Indigo]
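
The idea of a container holding several representations of the same entity can be sketched in a few lines. This is a language-neutral illustration, not the KNIME API: the class name, the `get` method and the stand-in converter are all hypothetical, and the converter returns a placeholder dictionary rather than a real toolkit molecule.

```python
class MultiRepresentationCell:
    """One entity, several cached representations; the original is always kept."""

    def __init__(self, fmt, value):
        self.original_format = fmt
        self._representations = {fmt: value}
        self.conversions = 0  # number of conversions actually performed

    def get(self, fmt, convert=None):
        """Return the representation, converting from the original at most once."""
        if fmt not in self._representations:
            if convert is None:
                raise KeyError(f"no {fmt} representation and no converter given")
            original = self._representations[self.original_format]
            self._representations[fmt] = convert(original)
            self.conversions += 1
        return self._representations[fmt]

    def formats(self):
        return sorted(self._representations)


def fake_to_rdkit(smiles):
    # Placeholder for a real toolkit conversion (assumption for illustration).
    return {"toolkit": "rdkit", "source": smiles}


cell = MultiRepresentationCell("smiles", "CCO")
first = cell.get("rdkit", fake_to_rdkit)   # converts and caches
second = cell.get("rdkit", fake_to_rdkit)  # served from the cache
print(cell.formats(), cell.conversions)
```

A downstream node asking for the same representation gets the cached object, which is the sense in which repeated conversions are avoided.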

Page 22

Lilly solution for (pseudo) standards:

• Extension point for handling molecule type conversions

• Depends on Marvin library for molecule conversions

• In development!

• Will be released as open source

Page 23

Before

Page 24

• Extension point moves conversions into Node configuration

• Workflow still documents explicit type conversions

• Still retains support for Converter nodes if/when appropriate
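
One way to picture conversions moving into node configuration: the node declares the molecule type it requires, and a shared registry converts the incoming value before the node runs. The sketch below is an assumption-laden illustration, not the Lilly extension point. `ConversionRegistry`, `NodeSketch` and the toy SDF-to-SMILES converter are hypothetical, and the converter returns a fixed string instead of calling Marvin.

```python
from typing import Callable, Dict, Tuple

Converter = Callable[[str], str]


class ConversionRegistry:
    """Maps (source type, target type) to a converter function."""

    def __init__(self) -> None:
        self._converters: Dict[Tuple[str, str], Converter] = {}

    def register(self, source: str, target: str, func: Converter) -> None:
        self._converters[(source, target)] = func

    def convert(self, value: str, source: str, target: str) -> str:
        if source == target:
            return value
        key = (source, target)
        if key not in self._converters:
            raise LookupError(f"no converter registered for {source} -> {target}")
        return self._converters[key](value)


class NodeSketch:
    """A node whose configuration states the molecule type it needs."""

    def __init__(self, name: str, required_type: str, registry: ConversionRegistry) -> None:
        self.name = name
        self.required_type = required_type
        self.registry = registry

    def execute(self, value: str, value_type: str) -> str:
        # The conversion happens here, not in a separate converter node.
        converted = self.registry.convert(value, value_type, self.required_type)
        return f"{self.name} received {converted}"


registry = ConversionRegistry()
# Toy converter (assumption): a real one would delegate to a chemistry library.
registry.register("sdf", "smiles", lambda sdf_block: "CCO")

node = NodeSketch("Property calculator", "smiles", registry)
print(node.execute("<sdf block>", "sdf"))
```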

Page 25

Converters could be ‘chained’ if a direct conversion is not available (e.g.: InChI or Chime). Example shown in dialogue:

[Screenshots: Before / After; node requires SMILES]
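
Chaining can be treated as a shortest-path search over the available direct conversions. The sketch below uses a breadth-first search. The conversion graph is hypothetical and does not reflect which converters the Lilly framework actually provides, and the slides do not say which search strategy is used.

```python
from collections import deque

# Hypothetical direct conversions: source type -> set of reachable target types.
DIRECT = {
    "chime": {"mol"},
    "inchi": {"mol"},
    "mol": {"smiles", "sdf"},
    "sdf": {"mol"},
}


def find_chain(direct, source, target):
    """Return the shortest list of types from source to target, or None."""
    if source == target:
        return [source]
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        # sorted() keeps the search order deterministic.
        for nxt in sorted(direct.get(path[-1], ())):
            if nxt in seen:
                continue
            if nxt == target:
                return path + [nxt]
            seen.add(nxt)
            queue.append(path + [nxt])
    return None


print(find_chain(DIRECT, "chime", "smiles"))
print(find_chain(DIRECT, "smiles", "chime"))
```

With this graph, a node that requires SMILES but receives Chime would be served by the two-step chain through the intermediate type, while a request with no available route returns `None` so the caller can report the missing converter.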

Page 26

Making Workflows Work:

• Added many legacy tools and data services into KNIME via custom nodes and SOA

• Aided usability by adding a dashboard for service layer monitoring

• Added authentication handling via NTLM Auth to provide data authentication at source

• Adding a molecule handling framework to reduce the number of molecule conversions users need

Page 27

Acknowledgements


Java Coding & Infrastructure (Lilly): Luke Bullard, Tom Wilkin

Project Management, End User support, Expert Users (& Testers), Previous Developers etc.: Derek Marren, Marnie Williams, Pip Turner, Matt Hirst, Dave Thorner, Dave Evans, Mike Bodkin, Niko Fechner, Roger Robinson, Jibo Wang, Christos Nicolaou, Beth Wright, Gary Sharman, Simon Richards, Stuart Morton, Jason Ochoada, Jim Hughes (in no particular order!)

KNIME.com: Bernd, Thorsten, Thomas, Aaron ++