3D ligand-based virtual screening in the cloud

(Blaze training)

Cresset European User Group Meeting – Workshops

June 2016

© Cresset

Files for this workshop

> The files used in this workshop are available for download on request

> Please send an email to enquiries@cresset-group.com stating the name of the workshop that you wish to get the files for

Comparing structurally disparate molecules

PDB:2ogz, PDB:3g0g

Bioisosteres

Bioisosteric groups

Effective ligand-based virtual screening

Virtual screening with Blaze

> Search a database for new structures

> Uses a Linux CPU or GPU cluster

> Software, service or rental (SaaS)

> Diverse new structures

> Complementary to other techniques

> New uses for existing drugs

Complementarity of Blaze

Search Blaze

1. Standard method

2. Shape-only setting

3. Compare results

Shape hits

Standard hits

Where does it perform poorly?

> Start from ‘Ala-Glu-Asp-Phe-Gly-Trp’

> Not enough 3D ligand info

> Start from an empty protein

> Search for a new metal-binding warhead

> Not chemically intelligent

> Search for covalent inhibitors

> All known actives have MW > 1000

> Conformation space too big

About Blaze

> Full virtual screening system

> Compound and collection management

> User and project permissions

> Integrates with standard queueing systems (e.g., SGE or LSF)

> Search history and archiving

> Choice of interface

> Web browser

> Command line

> RESTful API (aka ‘REST API’ or ‘web service’)

> Access from Pipeline Pilot, KNIME, etc.

> Access from Cresset’s Forge

Blaze operation

Search Retrieve

Search molecules

Search molecule choice

> Field points describe potential binding

> Choose the smallest (in field terms), most active compound (ligand efficiency)

> More active = more interactions

> More efficient = fewer extraneous groups

> In field space, charged groups are BIG – e.g. CO2⁻ >>>>>> iPr

> Remove extraneous groups (e.g., solubilisers)
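The "smallest, most active" guideline can be made concrete with ligand efficiency. This is a minimal sketch, assuming the common approximation LE ≈ 1.37 × pIC50 / heavy-atom count (kcal/mol per heavy atom). The candidate names and values are hypothetical, and heavy-atom count is only a stand-in for size "in field terms", which it does not capture for charged groups.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    pic50: float       # -log10(IC50 in mol/L)
    heavy_atoms: int   # non-hydrogen atom count (ASSUMPTION: proxy for field size)


def ligand_efficiency(c: Candidate) -> float:
    """Approximate LE in kcal/mol per heavy atom (1.37 ~ 2.303*R*T near 300 K)."""
    return 1.37 * c.pic50 / c.heavy_atoms


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("A", pic50=8.0, heavy_atoms=40),
    Candidate("B", pic50=7.5, heavy_atoms=25),
    Candidate("C", pic50=6.0, heavy_atoms=35),
]

# Highest ligand efficiency first: the preferred search molecule leads the list.
ranked = sorted(candidates, key=ligand_efficiency, reverse=True)
best = ranked[0]
print(best.name, round(ligand_efficiency(best), 3))
```

Here the most potent compound (A) is not the first choice: B is slightly less active but much smaller, so it carries fewer extraneous groups into the search.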

Multiple references are good for:

> Alignment (pose) prediction

> Extra references to add information

> Electrostatic

> Shape

> Scaffold hopping (Spark)

> Optimising a less active compound by scoring against a highly active one

> Virtual screening?

> Require molecules to look like both references

> Theoretically should be good

> Practically not always true

> Separate searches with data fusion perform better (?)

We prefer multiple searches to multiple references

Data fusion for multiple searches

> We use ‘combine on rank’

> E.g., run 3 searches and take the top 1,000 from each search

> We don’t recommend combining on score

> Score is local not global

> Value is dependent on size of search query

> Currently investigating the use of Z-scores and other approaches

> Find ways to evaluate ‘quality’ of scores

> Identify ‘frequent hitters’

> Score normalisation
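As an illustration of "combine on rank", the sketch below merges the top N hits from several searches and keeps each molecule's best rank. It is an assumption about how such a fusion could be written, not Blaze's implementation. The search names and molecule IDs are made up, and `z_scores` only shows the kind of per-search normalisation mentioned as under investigation.

```python
from statistics import mean, pstdev


def combine_on_rank(searches, top_n):
    """Merge ranked hit lists, keeping each molecule's best (lowest) rank.

    `searches` maps a search name to a list of molecule IDs, best first.
    """
    best_rank = {}
    for hits in searches.values():
        for rank, mol_id in enumerate(hits[:top_n], start=1):
            if mol_id not in best_rank or rank < best_rank[mol_id]:
                best_rank[mol_id] = rank
    # Sort by best rank, then by ID so that ties are deterministic.
    return sorted(best_rank, key=lambda m: (best_rank[m], m))


def z_scores(scores):
    """Normalise one search's scores (needs at least two distinct values)."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(s - mu) / sigma for s in scores]


# Hypothetical results from three separate searches.
searches = {
    "query_1": ["m1", "m2", "m3", "m4"],
    "query_2": ["m3", "m5", "m1", "m6"],
    "query_3": ["m7", "m2", "m8", "m9"],
}
fused = combine_on_rank(searches, top_n=3)
print(fused)
```

Raw scores never cross between searches here, which matches the point that a score is only meaningful within the search that produced it.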

Database molecules

Numbers

> 10M molecules is standard

> Uses compute cluster: scales to at least 500 CPU cores, tens of thousands of GPU cores

> Pre-populated conformations

> Molecules in collections split by heavy atom ranges

> Duplicates across collections get filtered during searching

> GPU or CPU
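The two bookkeeping ideas above, splitting collections by heavy atom range and skipping duplicates across collections, can be sketched as follows. The bin width of 10 (giving ranges such as 11-20), the use of a plain ID as the duplicate key and all names are assumptions for illustration. How Blaze decides that two molecules are the same is not described here.

```python
def heavy_atom_range(n_heavy, width=10):
    """Return the inclusive (low, high) range holding n_heavy, e.g. (11, 20)."""
    low = ((n_heavy - 1) // width) * width + 1
    return (low, low + width - 1)


def split_by_heavy_atoms(molecules, width=10):
    """Group (mol_id, heavy_atom_count) pairs into heavy atom ranges."""
    ranges = {}
    for mol_id, n_heavy in molecules:
        ranges.setdefault(heavy_atom_range(n_heavy, width), []).append(mol_id)
    return ranges


def search_without_duplicates(*collections):
    """Yield each molecule ID once, even if several collections hold it."""
    seen = set()
    for collection in collections:
        for mol_id in collection:
            if mol_id not in seen:
                seen.add(mol_id)
                yield mol_id


# Hypothetical collections: (ID, heavy atom count).
vendor_a = [("mol1", 12), ("mol2", 20), ("mol3", 21)]
vendor_b = [("mol2", 20), ("mol4", 9)]
ranges_a = split_by_heavy_atoms(vendor_a)
ranges_b = split_by_heavy_atoms(vendor_b)

# Search only the 11-20 heavy atom slice of both collections.
to_search = list(search_without_duplicates(ranges_a[(11, 20)], ranges_b[(11, 20)]))
print(to_search)
```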

Other issues

> Tautomers

> Enumerate outside of Cresset and load as separate molecules

> Protomers

> Enumerate externally

or

> Let Cresset choose

> Rules based on pH 7

> Good for amines, carboxylates

> Less good for e.g. aminopyridines

> Flat chirality

> Fully explored

> Up to 3 flat centres

> Speed penalty
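The speed penalty for flat chirality follows from the arithmetic: fully exploring n undefined centres means 2^n stereoisomers per molecule. A small sketch of that enumeration is below. The R/S labels, the centre names and the choice to raise an error beyond three centres are assumptions for illustration, since nothing here says how Blaze treats molecules with more.

```python
from itertools import product

MAX_FLAT_CENTRES = 3  # limit quoted above


def enumerate_flat_centres(centres):
    """Return every R/S assignment for the given undefined stereocentres."""
    if len(centres) > MAX_FLAT_CENTRES:
        # ASSUMPTION: this sketch simply refuses; real behaviour is not stated.
        raise ValueError(f"more than {MAX_FLAT_CENTRES} flat centres")
    return [dict(zip(centres, labels))
            for labels in product("RS", repeat=len(centres))]


# Hypothetical atom labels for three undefined centres.
isomers = enumerate_flat_centres(["C3", "C7", "C9"])
print(len(isomers))  # 2**3 assignments, each searched as its own molecule
```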

Searching

Searching

> 4 levels

> FieldPrint

> Nasty, very fast

> Clique (Fast mode)

> Alignment by matching field points, single-point true score

> Fast

> Good

> Simplex

> Alignment by matching field points, optimised score

> Protein excluded volumes

> Slower – best poses and scores

> Filter

Blaze: Workflow

New Search

Upload Molecule

Check Conversion

Setup Experiment

Name

Collections

Refinements

clique 50%+

simplex 10%+

Get results

Repeat or perfect
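The refinement funnel in this workflow can be sized with simple arithmetic. This is a rough sketch, assuming each percentage is the share of the previous stage's ranked results passed on to the next refinement, which the slides do not spell out. `cascade_sizes` is a hypothetical helper, not part of Blaze, and the 10M starting size is the "standard" figure quoted earlier.

```python
def cascade_sizes(n_molecules, refinements):
    """Return (stage, molecule_count) pairs for a refinement funnel.

    `refinements` is a list of (stage_name, percent_of_previous_stage).
    ASSUMPTION: each percentage applies to the previous stage's output.
    """
    sizes = [("fieldprint", n_molecules)]
    current = n_molecules
    for name, percent in refinements:
        current = current * percent // 100  # integer maths avoids rounding drift
        sizes.append((name, current))
    return sizes


# 10M molecules, clique on the top 50%, simplex on the top 10% of those.
for stage, count in cascade_sizes(10_000_000, [("clique", 50), ("simplex", 10)]):
    print(f"{stage:10s} {count:>10,d}")
```

Under that assumption only the final, slowest stage sees a small fraction of the collection, which is the point of running the cheap methods first.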

Let’s go

> Searching ChEMBL fragment-like molecules

> Upload a search molecule

> SDF format (Mol2 also)

> Check the search molecule in Forge

> Run the search

> Marvel at the enrichment!

Connect to Blaze

Cresset Demo: http://blaze.cresset-group.com/blaze/ui/

Username: sign up at http://blaze.cresset-group.com

User Preferences

Email when results available

Turn on/off automatic help

Every page has context-sensitive help

Start a new search

Choose New Search

Upload the search molecule

Choose Browse

Choose Blaze/A2C_blaze_training.sdf

Choose Submit

Field addition and constraints

Download with field points

Need a viewer that displays field points

Use to:

> check file conversion

> find field points to constrain

Open search molecule in Forge

Finding field points to constrain

> Click on the molecule

> Press Shift-i

The molecule should be displayed with the index of each field point

> Press Shift-f

Field points should have a number next to them (the size)

> These are needed to identify a field point to constrain

Moving on…

Simple Search Page – 3 sections, part 1

Unique Name

A2C_<Uname>

Choose “EUGM2016”

Simple Search Page – 3 sections, part 2

Heavy atom ranges

“Collections”

Choose “Chembl20_filtered”

Choose “11-20 atoms”

Simple Search Page – 3 sections, part 3

50-100%

Always!

10% is good

Find the ID, Enter the size

Used in Simplex refinement

Excluded volume only

Submit the search

> And wait ......

> Any Questions?

Download pre-calculated results

> View results

> Select search: EUGM_SEARCH_1

> You should be taken to the parent result page

Caution!

> These are the worst results to download!

> The links at the top show all refinements of this parent

> We want the last (rightmost) refinement, which should be green

> Click on ‘Simplex’

Download results 3D

> Check that you have a ‘Download Overlays’ button

NO? Not on the right page

YES? Download the top 100 results as an SDF file

Always use SDF files

> Open in Forge
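After downloading, it can be worth confirming that the file holds the expected number of results before loading it into Forge. This is a minimal sketch that relies only on the SDF convention that every record ends with a `$$$$` line and starts with the molecule's title line. `split_sdf` is a hypothetical helper, and the sample text is a toy stand-in, not real Blaze output.

```python
def split_sdf(sdf_text):
    """Split SDF text into records; each record ends with a '$$$$' line."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records


# Toy stand-in for a downloaded file (real records hold full mol blocks).
toy_sdf = "hit_001\n...mol block...\n$$$$\nhit_002\n...mol block...\n$$$$\n"
records = split_sdf(toy_sdf)
titles = [r.splitlines()[0] for r in records]
print(len(records), titles)
```

For a real download, read the file with `open(path).read()` and compare `len(records)` with the number of results requested (100 in this exercise).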

Opening results in Forge

Result assessment

Some good results

But many lack the positive charge

Filter results to give only positive compounds

From the initial search

From the completed results

Filters

Must contain any one of these

AND

Must contain any one of these

AND

Must NOT contain any one of these

AND

Must obey each one of these
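The four filter groups combine as a single boolean test: two "any one of" groups, one "none of" group and one "obey each" group, all joined with AND. The sketch below shows that logic only. The feature names, the `positive` property and the treatment of an empty "any one of" group as unrestricted are assumptions, not Blaze's filter syntax.

```python
def passes_filters(mol, any_of_a, any_of_b, none_of, must_obey):
    """Apply the four AND-ed filter groups to one molecule.

    `mol` is a dict with a 'features' set plus numeric properties.
    ASSUMPTION: an empty "any one of" group places no restriction.
    """
    feats = mol["features"]
    return (
        (not any_of_a or bool(feats & any_of_a))      # contains any one of group A
        and (not any_of_b or bool(feats & any_of_b))  # contains any one of group B
        and not (feats & none_of)                     # contains none of these
        and all(rule(mol) for rule in must_obey)      # obeys every rule
    )


# Hypothetical molecules and feature names, for illustration only.
mols = [
    {"id": "m1", "features": {"amine", "aryl"}, "positive": 1},
    {"id": "m2", "features": {"amine", "nitro"}, "positive": 1},
    {"id": "m3", "features": {"aryl"}, "positive": 0},
]
kept = [
    m["id"] for m in mols
    if passes_filters(
        m,
        any_of_a={"amine", "amidine"},
        any_of_b=set(),
        none_of={"nitro"},
        must_obey=[lambda mol: mol["positive"] == 1],
    )
]
print(kept)
```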

Positive filter

Unique Name

‘positive’

= 1

Filter early > Filter later

Filter early: ~10,700 molecules

Filter late: ~3,300 molecules

Download results

Download top 100 results

Download Results 1D

> The quickest download is the 1D text file of results

Questions welcomed

support@cresset-group.com

Example files available from

enquiries@cresset-group.com

Contact us for our tailored training courses
