Scalable 3D molecular graphics on the web

Alexander S. Rose, Anthony R. Bradley, Yana Valasatava, Jose M. Duarte, Andreas Prlić, Peter W. Rose

• The PDB archive is growing: structures are getting larger and more complex
• Here we present approaches for scalable 3D molecular graphics on the web
• The MMTF format provides highly compressed structure files
• The NGL Viewer efficiently stores & renders millions of atoms
• Even the largest structures can be rapidly downloaded & displayed in about 1 second to 1 minute depending on device and connection speed

Growth of the PDB archive

PDB a billion atom archive: > 1 Billion atoms in the asymmetric units
• Largest structure: HIV-1 Capsid (PDB ID 3J3Q)
  • ~2.4M unique atoms
  • gzipped mmCIF file: 48.7MB
• 68 of the 100 largest structures deposited in past 3 years
• Advances in experimental techniques fuel the growth

Steps to display a structure

1. Download File
2. Decompress & Parse
3. Populate Data Model
4. Create Geometry
5. Render

Steps 1 + 2: Speeding-up download & parsing

MMTF
• MacroMolecular Transmission Format (MMTF)
  • New file format, optimized for transmission of macromolecules
  • Binary - for fast parsing
  • MessagePack "binary JSON" as an extensible container
  • Bespoke compression strategies - for small file size
• Comparison with mmCIF (whole PDB archive, gzipped)
  • Size reduced by a factor of >4 (30GB to 7GB)
  • Parsing time reduced by a factor of ~12 (205 min to 17 min) using JavaScript libraries

Step 3: Efficient storage & access

NGL Viewer
• Columnar stores
  • Single TypedArray per property
  • Parsed data can be copied in blocks
  • Convenient access via proxy objects

Step 4: Molecular representations geometry

Instancing: Create geometry once, send to GPU once, then transform position & render multiple times.
• Green surface is reused 59 times for highly symmetric virus capsid

Impostors: For each pixel GPU tests intersection of sphere and camera ray to reduce triangle count.
• Quality resolution independent as more pixels are tested
• Impostors also used for cylinders.

[Figure: Impostor Geometry, 2 Triangles; Normal Geometry, 320 Triangles]

Step 5: Rendering without plugins

• Java Applets have provided fast execution and GPU access
  • Removed from Google Chrome in version 45 (Sep 2015)
  • Oracle to deprecate Java plugin in upcoming JDK 9
• Browsers don’t need plugins anymore
  • JavaScript approaches native speed
  • WebGL offers plugin-free access to the graphics card

Does it scale?

[Figures: PDB ID 1RB8, PDB ID 3J3Q, PDB ID 5J7V]
• HIV-1 capsid at three scales: 216 hexameric and 12 pentameric subunits, ~2.4M unique atoms
• Faustovirus major capsid: 2760 instances of 14478 unique atoms, ~40M overall atoms

Software availability

• MMTF
  • Libraries & Specification: http://mmtf.rcsb.org
• NGL Viewer
  • Supports MMTF, uses columnar stores & WebGL
  • Openly developed: https://github.com/arose/ngl

AS Rose & PW Hildebrand. NGL Viewer: a web application for molecular visualization. Nucl. Acids Res. (1 July 2015) 43 (W1). doi:10.1093/nar/gkv402

Funding and acknowledgements

BD2K Grant: U01 CA198942
RCSB PDB Team
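
The columnar store idea from Step 3 (one typed array per property, block copies, proxy access) can be illustrated with a minimal sketch. This is not NGL Viewer code: the class names `AtomStore` and `AtomProxy`, the method `copy_block`, and the chosen properties are hypothetical, and Python's `array` module stands in for JavaScript TypedArrays.

```python
from array import array


class AtomStore:
    """Hypothetical columnar store: one typed array per atom property."""

    def __init__(self, count):
        self.count = count
        # Each property lives in its own contiguous, fixed-type column.
        self.x = array("f", [0.0]) * count
        self.y = array("f", [0.0]) * count
        self.z = array("f", [0.0]) * count
        self.serial = array("i", [0]) * count

    def copy_block(self, name, start, values):
        """Copy a run of parsed values into one column in a single slice assignment."""
        column = getattr(self, name)
        block = array(column.typecode, values)
        if start < 0 or start + len(block) > self.count:
            raise IndexError("block does not fit into the column")
        column[start:start + len(block)] = block


class AtomProxy:
    """Hypothetical proxy: holds only a store reference and a row index."""

    def __init__(self, store, index=0):
        self.store = store
        self.index = index

    @property
    def x(self):
        return self.store.x[self.index]

    @property
    def serial(self):
        return self.store.serial[self.index]


store = AtomStore(3)
store.copy_block("x", 0, [1.5, 2.0, 3.25])
store.copy_block("serial", 0, [1, 2, 3])

atom = AtomProxy(store)
atom.index = 2  # re-point the same proxy instead of allocating one object per atom
print(atom.serial, atom.x)
```

The design choice sketched here is that no per-atom objects are created: memory stays in a few flat arrays, and a single reusable proxy gives object-like access to any row.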
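
The per-pixel test behind sphere impostors (Step 4) is a ray-sphere intersection. In a WebGL renderer it would run in a fragment shader on the GPU; the sketch below shows only the underlying arithmetic on the CPU. The function name `ray_sphere_hit` and its parameters are illustrative assumptions, not part of NGL Viewer.

```python
import math


def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance along the ray to the nearest hit, or None on a miss.

    Assumes the ray origin (camera) lies outside the sphere.
    """
    # Vector from the sphere center to the ray origin.
    oc = [o - c for o, c in zip(origin, center)]
    # Coefficients of the quadratic |oc + t * direction|^2 = radius^2.
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # the ray misses the sphere: this pixel would be discarded
    t = (-b - math.sqrt(disc)) / (2.0 * a)  # nearer of the two roots
    return t if t >= 0.0 else None


# A ray looking down the -z axis at a unit sphere centered at the origin.
print(ray_sphere_hit((0.0, 0.0, 5.0), (0.0, 0.0, -1.0), (0.0, 0.0, 0.0), 1.0))
# A parallel ray shifted sideways by 2 units misses the sphere.
print(ray_sphere_hit((2.0, 0.0, 5.0), (0.0, 0.0, -1.0), (0.0, 0.0, 0.0), 1.0))
```

Because the test is evaluated once per covered pixel, the sphere outline stays smooth at any zoom level, while the geometry sent to the GPU is just the two-triangle quad.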