Nick Nystrom · Sr. Director of Research · [email protected] · SC16 · NVIDIA Theater · November 16, 2016
A Converged HPC & Big Data System for Nontraditional and HPC Research
The Pittsburgh Supercomputing Center (PSC) is delivering Bridges
All trademarks, service marks, trade names, trade dress, product names, and logos appearing herein are the property of their respective owners.
Acquisition and operation of Bridges are made possible by the National Science Foundation through award #ACI-1445606 ($17.2M):
Bridges: From Communities and Data to Workflows and Insight
Gaining Access to Bridges
Bridges is allocated through XSEDE: https://www.xsede.org/allocations
– Startup Allocation
• Can request anytime; requires only a brief proposal/abstract
• Up to 50,000 core-hours on RSM and GPU (128GB) nodes, 1,000 TB-hours on LSM (3TB) and ESM (12TB) nodes, and a user-specified (in GB) amount of storage on Pylon
• Can request XSEDE ECSS (Extended Collaborative Support Service)
– Research Allocation (XRAC)
• Appropriate for larger requests; can request ECSS
• Community allocations can support gateways
• Can be up to millions to tens of millions of SUs
• Quarterly submission windows: March 15–April 15, June 15–July 15, September 15–October 15, December 15–January 15
– Educational Allocations
• To support use of Bridges for courses
Up to 10% of Bridges’ SUs are available on a discretionary basis to industrial affiliates, Pennsylvania-based researchers, and others to foster discovery and innovation and broaden participation in data-intensive computing.
The Shift to Big Data and New Users
Pan-STARRS telescope: http://pan-starrs.ifa.hawaii.edu/public/
Genome sequencers (Wikipedia Commons)
NOAA climate modeling: http://www.ornl.gov/info/ornlreview/v42_3_09/article02.shtml
Collections, Horniman Museum: http://www.horniman.ac.uk/get_involved/blog/bioblitz-insects-reviewed
Legacy documents (Wikipedia Commons)
Environmental sensors: water temperature profiles from tagged hooded seals: http://www.arctic.noaa.gov/report11/biodiv_whales_walrus.html
Library of Congress stacks: https://www.flickr.com/photos/danlem2001/6922113091/
Video (Wikipedia Commons)
Social networks and the Internet
New Emphases: Structured, regular, homogeneous → Unstructured, irregular, heterogeneous
Motivating Use Cases
• Data-intensive applications & workflows
• Gateways – the power of HPC without the programming
• Shared data collections & related analysis tools
• Cross-domain analytics
• Deep learning
• Graph analytics, machine learning, genome sequence assembly, and other large-memory applications
• Scaling beyond the laptop
• Scaling research to teams and collaborations
• In-memory databases
• Optimization & parameter sweeps
• Distributed & service-oriented architectures
• Data assimilation from large instruments and Internet data
• Leveraging an extensive collection of interoperating software

Together, these use cases span:
• Research areas that haven’t used HPC
• New approaches to “traditional HPC” fields (e.g., machine learning)
• Coupling applications in novel ways
• Leveraging large memory, GPUs, and high bandwidth
Objectives and Approach
• Bring HPC to nontraditional users and research communities.
• Apply HPC effectively to big data.
• Bridge to campuses to streamline access and provide cloud-like burst capability.
• Leveraging PSC’s expertise with shared memory, Bridges will feature 3 tiers of large, coherent shared-memory nodes: 12TB, 3TB, and 128GB.
• Bridges implements a uniquely flexible environment featuring interactivity, gateways, databases, distributed (web) services, high-productivity programming languages and frameworks, virtualization, and campus bridging.
Interactivity
• Interactivity is the feature most frequently requested by nontraditional HPC communities.
• Interactivity provides immediate feedback for doing exploratory data analytics and testing hypotheses.
• Bridges offers interactivity through a combination of shared and dedicated resources to maximize availability while accommodating different needs.
High-Productivity Programming
Supporting languages that communities already use is vital for them to apply HPC to their research questions.
Spark, Hadoop & Related Approaches

Bridges’ large memory is great for Spark!

Bridges enables workflows that integrate Spark/Hadoop, HPC, GPU, and/or large shared-memory components.
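For illustration, a minimal PySpark sketch of caching a large dataset in memory, the pattern that large-memory nodes favor; the file path and memory setting are hypothetical, not a Bridges-specific recipe.

```python
from pyspark import SparkConf, SparkContext

# Illustrative settings: large-memory nodes allow generous driver/executor heaps.
conf = SparkConf().setAppName("large-memory-demo").set("spark.driver.memory", "64g")
sc = SparkContext(conf=conf)

# Hypothetical dataset path on the Pylon filesystem.
records = sc.textFile("/pylon1/example/big_dataset.csv") \
            .map(lambda line: line.split(","))
records.cache()         # keep the parsed RDD resident in RAM
print(records.count())  # first action materializes the cache
sc.stop()
```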
Bridges system topology: purpose-built Intel® Omni-Path Architecture fabric for data-intensive HPC
– 800 HPE Apollo 2000 (128GB) compute nodes
– 16 RSM nodes, each with 2 NVIDIA Tesla K80 GPUs
– 32 RSM nodes, each with 2 NVIDIA Tesla P100 GPUs
– 42 HPE ProLiant DL580 (3TB) compute nodes
– 4 HPE Integrity Superdome X (12TB) compute nodes, each with 2 OPA↔IB gateway nodes
– 12 HPE ProLiant DL380 database nodes
– 6 HPE ProLiant DL360 web server nodes
– 4 MDS nodes, 2 front-end nodes, 2 boot nodes, 8 management nodes
– 20 Storage Building Blocks, implementing the parallel Pylon storage system (10 PB usable)
– 20 “leaf” Intel® OPA edge switches
– 6 “core” Intel® OPA edge switches: fully interconnected, 2 links per switch
– Intel® OPA cables

Bridges Virtual Tour: http://staff.psc.edu/nystrom/bvt
GPU Nodes
Bridges’ GPUs are accelerating both deep learning and simulation codes
Phase 1: 16 nodes, each with:
• 2 × NVIDIA Tesla K80 GPUs (32 total)
• 2 × Intel Xeon E5-2695 v3 (14c, 2.3/3.3 GHz)
• 128GB DDR4-2133 RAM

Phase 2: +32 nodes, each with:
• 2 × NVIDIA Tesla P100 GPUs (64 total)
• 2 × Intel Xeon E5-2683 v4 (16c, 2.1/3.0 GHz)
• 128GB DDR4-2400 RAM
NVIDIA Tesla K80
Kepler architecture
2496 CUDA cores per GPU (128/SM)
7.08B transistors on a 561 mm² die (28nm)
2 × 12 GB GDDR5; 2 × 240.6 GB/s
562 MHz base – 876 MHz boost
2.91 Tf/s (64b), 8.73 Tf/s (32b)
Bridges: 32 K80 GPUs → 279 Tf/s (32b)
NVIDIA Tesla P100
Pascal architecture
3584 CUDA cores (64/SM)
15.3B transistors on a 610 mm² die (16nm)
16 GB CoWoS® HBM2 at 720 GB/s with ECC
1126 MHz base – 1303 MHz boost
4.7 Tf/s (64b), 9.3 Tf/s (32b), 18.7 Tf/s (16b)
Page migration engine improves unified memory
Bridges: 64 P100 GPUs → 600 Tf/s (32b)
From http://www.nvidia.com/object/tesla-p100.html. Bridges results forthcoming.
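As a consistency check (author’s arithmetic, not from the slides), the aggregate 32-bit figures follow directly from the per-board peaks:

$$32 \times 8.73~\text{Tf/s} = 279.4~\text{Tf/s}, \qquad 64 \times 9.3~\text{Tf/s} = 595.2~\text{Tf/s} \approx 600~\text{Tf/s}.$$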
Applying Deep Learning to Connectomics

Goal: Explore the potential of deep learning to automate segmentation of high-resolution scanning electron microscope (SEM) images of brain tissue and the tracing of neurons through 3D volumes, to automate generation of the connectome, a comprehensive map of neural connections.

Motivation: This project builds substantially on an ongoing collaboration between PSC, Harvard University, and the Allen Brain Institute, through which we have access to high-quality raw and human-annotated data. The SEM data volume for mouse cortex imaging is ~3TB/day, and data processing is currently human-intensive. Forthcoming camera systems will increase data bandwidth by 65×.

Datasets: Zebrafish larva (1024×1024×4900), mouse, …
Collaborators: Ishtar Nyawĩra (Pitt), Iris Qian & Annie Zhang (CMU), John Urbanic, Joel Welling, and Nick Nystrom (PSC/CMU).
Connectomics
Connectomics is the building of complete mappings of the neural pathways in organisms (connectomes)
Can we use CNNs to automate connectomics?
… especially considering how big biomedical image data tends to be?
Most Likely!
Build on training data & pixel-wise classification
Pixel-wise image classification of cancer cells by Andrew Janowczyk
Extensive training data (1.4M points) from ongoing collaboration:
1. Bock, D. D. et al., Network anatomy and in vivo physiology of visual cortical neurons, Nature 471, 177–182 (10 March 2011), doi:10.1038/nature09802.
2. Lee, W.-C. A. et al., Anatomy and function of an excitatory network in the visual cortex, Nature 532, 370–374 (21 April 2016), doi:10.1038/nature17192.
Our Methods
Choosing a direction...
• Data Approach
• Neural Network
• Framework
Choosing A Data Approach
5-Slice Approach
• Use the 4 surrounding slices of a single slice as the channel dimension
• (H × W × C) → (1024 × 1024 × 5)
• Takes into account that surrounding slices provide minute (rather than drastic) variances
• Difficult to use the kind of click-point data that we have

Cube Approach
• Use image volumes → 41 × 41 × 41 cubes
• Provides a single neuron path through a small volume of the full fish
• Could help by providing the net with pre-existing paths
• Makes it easier to use the kind of click-point data we have

Something to work around: our annotated data consists of single click locations within each neuron in the 4900 image slices, rather than outlines of each neuron. (Both decompositions are sketched below.)
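A minimal NumPy sketch of the two decompositions; the array names and the edge handling are assumptions for illustration, not the project’s code.

```python
import numpy as np

# volume: the EM stack, shape (4900, 1024, 1024); (z, y, x) are click points.
# Both names are hypothetical stand-ins for the project's actual data.

def five_slice_sample(volume, z):
    """5-Slice Approach: a slice plus its 4 neighbors as channels -> (1024, 1024, 5)."""
    zs = np.clip(np.arange(z - 2, z + 3), 0, volume.shape[0] - 1)  # clamp at stack ends
    return np.stack([volume[i] for i in zs], axis=-1)

def cube_sample(volume, z, y, x, r=20):
    """Cube Approach: a 41x41x41 cube centered on a click point.

    Assumes the click is at least r voxels from every volume boundary.
    """
    return volume[z - r:z + r + 1, y - r:y + r + 1, x - r:x + r + 1]
```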
Spotting the Neurons
Images courtesy Florian Engert, David Hildebrand, and their students at the Center for Brain Science, Harvard
Approach
• Test two distinct data decompositions
• Select and optimize CNN architectures for the data under study
• Implement in Caffe (Caffe-SegNet) and TensorFlow
• Results planned for GTC 2017
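A toy TensorFlow (1.x-style) sketch of pixel-wise classification, in the spirit of the models named above; the layer shapes and two-class output are illustrative assumptions, not the project’s actual architecture.

```python
import tensorflow as tf

# Per-pixel (background / neuron) classifier over 5-channel inputs (TF 1.x style).
images = tf.placeholder(tf.float32, [None, 1024, 1024, 5])
labels = tf.placeholder(tf.int32, [None, 1024, 1024])

w1 = tf.Variable(tf.truncated_normal([3, 3, 5, 32], stddev=0.1))
h1 = tf.nn.relu(tf.nn.conv2d(images, w1, strides=[1, 1, 1, 1], padding="SAME"))
w2 = tf.Variable(tf.truncated_normal([3, 3, 32, 2], stddev=0.1))
logits = tf.nn.conv2d(h1, w2, strides=[1, 1, 1, 1], padding="SAME")  # per-pixel logits

loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)
```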
Deep Learning Projects Using Bridges (examples)
Florian Metze (CMU): Automatic Building of Speech Recognizers for Non-Experts
Peter Rossky (Rice U.): Development of a Hybrid Computational Approach for Macroscale Simulation of Exciton Diffusion in Polymer Thin Films, Based on Combined Machine Learning, Quantum-Classical Simulations and Master Equation Techniques
Junbo Xu (Toyota Technological Institute at Chicago): Developing Large-Scale Distributed Deep Learning Methods for Protein Bioinformatics
Deva Ramanan (CMU): Learning to Parse Images and Videos
Eric Xing (CMU): Petuum, a Distributed System for High-Performance Machine Learning
Mark Ohman (Scripps Institution of Oceanography): Quantifying California Current Plankton using Machine Learning
Xinghua Lu (U. of Pittsburgh): Deciphering Cellular Signaling System by Deep Mining a Comprehensive Genomic Compendium
Michael Reale (SUNY Polytechnic Institute): Automatic Pain Assessment
Dokyun Lee (CMU): Education Allocation for the Course Unstructured Data & Big Data: Acquisition to Analysis
Deep Learning Projects Using Bridges (examples)
Manuela Veloso (CMU): Deep Learning of Game Strategies for RoboCup
Diane Litman (U. of Pittsburgh): Automatic Evaluation of Scientific Writing
Param Singh (CMU): Image Classification Applied in Economic Studies
Matt Fredrikson (CMU): Exploring Stability, Cost, and Performance in Adversarial Deep Learning
Adriana Kovashka (U. of Pittsburgh): Enabling Robust Image Understanding Using Deep Learning
Gil Alterovitz (Harvard Medical School/Boston Children’s Hospital): Preparing Grounds to Launch All-US Students Kaggle Competition on Drug Prediction
Shaun Mahony (Penn State): Deep Learning the Gene Regulatory Code
Michael Lam (Oregon State): Deep Recurrent Models for Fine-Grained Recognition
Thank You
Questions?
Additional Content
For Additional Information
Website: www.psc.edu/bridges
Bridges PI: Nick Nystrom, [email protected]
Co-PIs: Michael J. Levine, Ralph Roskies, J Ray Scott
Project Manager: Robin Scibek
Data Infrastructure for the National Advanced Cyberinfrastructure Ecosystem

Bridges is a new resource on XSEDE, interoperating with other XSEDE resources, Advanced Cyberinfrastructure (ACI) projects, campuses, and instruments nationwide.

Examples:
– Data Infrastructure Building Blocks (DIBBs)
‒ Data Exacell (DXC)
‒ Integrating Geospatial Capabilities into HUBzero
‒ Building a Scalable Infrastructure for Data-Driven Discovery & Innovation in Education
‒ Other DIBBs projects
– Other ACI projects
– High-throughput genome sequencers
– Reconstructing brain circuits from high-resolution electron microscopy
– Temple University’s new Science, Education, and Research Center
– Carnegie Mellon University’s Gates Center for Computer Science
– Social networks and the Internet
Campus Bridging
Through a pilot project with Temple University, the Bridges project will explore new ways to transition data and computing seamlessly between campus and XSEDE resources.
Federated identity management will allow users to use their local credentials for single sign-on to remote resources, facilitating data transfers between Bridges and Temple’s local storage systems.
Burst offload will enable cloud-like offloading of jobs from Temple to Bridges and vice versa during periods of unusually heavy load.
http://www.temple.edu/medicine/research/RESEARCH_TUSM/
Gateways and Tools for Building Them
Gateways provide easy-to-use access to Bridges’ HPC and data resources, allowing users to launch jobs, orchestrate complex workflows, and manage data from their browsers.
– Provide “HPC Software-as-a-Service”
– Extensive use of VMs, databases, and distributed services

Examples:
– Interactive pipeline creation in GenePattern (Broad Institute)
– Galaxy: https://galaxyproject.org/
– Col*Fusion portal for the systematic accumulation, integration, and utilization of historical data: http://colfusion.exp.sis.pitt.edu/colfusion/
Virtualization and Containers
• Virtual Machines (VMs) enable flexibility, security, customization, reproducibility, ease of use, and interoperability with other services.
• User demand is for custom database and web server installations to develop data-intensive, distributed applications, and for containers that provide custom software stacks and portability.
• Bridges leverages OpenStack to provision resources among interactive, batch, Hadoop, and VM uses.
Database and Web Server Nodes
Dedicated database nodes power persistent relational and NoSQL databases
– Support data management and data-driven workflows
– SSDs for high IOPs; HDDs for high capacity

Dedicated web server nodes
– Enable distributed, service-oriented architectures
– High-bandwidth connections to XSEDE and the Internet
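To illustrate the role of the database nodes, a minimal sketch of a workflow step querying a persistent database; it assumes a PostgreSQL instance, and the host, database, and table names are hypothetical.

```python
import psycopg2  # assumes a PostgreSQL database hosted on a dedicated database node

# Hypothetical connection details and schema.
conn = psycopg2.connect(host="dbnode.example.edu", dbname="projectdb", user="demo")
cur = conn.cursor()
cur.execute("SELECT sample_id, score FROM results WHERE score > %s", (0.9,))
for sample_id, score in cur.fetchall():
    print(sample_id, score)
cur.close()
conn.close()
```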
Data Management
Pylon: A large, central, high-performance storage system
– 10 PB usable
– Visible to all compute nodes
– Currently implemented as two complementary file systems:
• /pylon1: Lustre, targeted for $SCRATCH use; non-wiped directories available by request
• /pylon2: SLASH2, targeted for large datasets, community repositories, and distributed clients

Distributed (node-local) storage (staging pattern sketched below)
– Enhances application portability
– Improves overall system performance
– Improves performance consistency compared to the shared filesystem
– Aggregate 7.2 PB (6 PB in non-GPU RSM nodes: Hadoop, Spark)
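A sketch of the staging pattern node-local storage supports: copy input from Pylon once, compute locally, write results back. $SCRATCH on /pylon1 is from the slide; the $LOCAL variable and the analysis step are hypothetical.

```python
import os
import shutil
import subprocess

scratch = os.environ["SCRATCH"]          # a /pylon1 directory (per the slide)
local = os.environ.get("LOCAL", "/tmp")  # hypothetical node-local path

# Stage input once from the shared filesystem to fast node-local disk.
local_input = os.path.join(local, "input.dat")
shutil.copy(os.path.join(scratch, "input.dat"), local_input)

# Hypothetical analysis step that writes output.dat next to its input.
subprocess.check_call(["./analyze", local_input])

# Copy results back to Pylon for sharing and persistence.
shutil.copy(os.path.join(local, "output.dat"), os.path.join(scratch, "output.dat"))
```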
Intel® Omni-Path Architecture (OPA)
Bridges is the first production deployment of Omni-Path. Omni-Path connects all nodes and the shared filesystem, providing Bridges and its users with:
– 100 Gbps line speed per port; 25 GB/s bidirectional bandwidth per port
– Measured 0.93 μs latency, 12.36 GB/s/dir (microbenchmark sketch below)
– 160M MPI messages per second
– 48-port edge switch reduces interconnect complexity and cost
– HPC performance, reliability, and QoS
– OFA-compliant applications supported without modification
– Early access to this new, important, forward-looking technology
Bridges deploys OPA in a two-tier island (leaf-spine) topology developed by PSC for cost-effective, data-intensive HPC
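A minimal mpi4py ping-pong sketch of the kind of microbenchmark behind the latency and bandwidth numbers above; run with 2 ranks. It is illustrative only, not PSC’s measurement code.

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

def pingpong(nbytes, iters=1000):
    """Time one-way transfers between ranks 0 and 1; returns (seconds, bytes/s)."""
    buf = np.zeros(nbytes, dtype=np.uint8)
    comm.Barrier()
    t0 = MPI.Wtime()
    for _ in range(iters):
        if rank == 0:
            comm.Send([buf, MPI.BYTE], dest=1)
            comm.Recv([buf, MPI.BYTE], source=1)
        elif rank == 1:
            comm.Recv([buf, MPI.BYTE], source=0)
            comm.Send([buf, MPI.BYTE], dest=0)
    one_way = (MPI.Wtime() - t0) / (2 * iters)
    return one_way, nbytes / one_way

lat, _ = pingpong(8)               # tiny message: dominated by latency
_, bw = pingpong(4 * 1024 * 1024)  # large message: approaches link bandwidth
if rank == 0:
    print("latency %.2f us, bandwidth %.2f GB/s" % (lat * 1e6, bw / 1e9))
```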
Omni-Path Architecture User Group (OPUG)
• Opportunity for OPA early adopters to share experiences
• Led by PSC
– Bridges is the first large-scale deployment of Omni-Path
– Initial conversations on advanced interconnect technology date back to 2007
• Sign up at http://www.psc.edu/index.php/opug
• PSC OPUG contacts
– Dave Moses, [email protected]
– J. Ray Scott, [email protected]
– Nick Nystrom, [email protected]

OPUG meeting at SC16: November 17, 1:30–3:00pm, Room 155-A
http://sc16.supercomputing.org/presentation/?id=bof160&sess=sess354
Bridges: Node Types

Type | RAM^a | Ph | n | CPU / GPU / other | Server
ESM | 12 TB | 1 | 2 | 16 × Intel Xeon E7-8880 v3 (18c, 2.3/3.1 GHz, 45MB LLC) | HPE Integrity Superdome X
ESM | 12 TB | 2 | 2 | 16 × Intel Xeon E7-8880 v4 (22c, 2.2/3.3 GHz, 55MB LLC) | HPE Integrity Superdome X
LSM | 3 TB | 1 | 8 | 4 × Intel Xeon E7-8860 v3 (16c, 2.2/3.2 GHz, 40MB LLC) | HPE ProLiant DL580
LSM | 3 TB | 2 | 34 | 4 × Intel Xeon E7-8870 v4 (20c, 2.1/3.0 GHz, 50MB LLC) | HPE ProLiant DL580
RSM | 128 GB | 1 | 752 | 2 × Intel Xeon E5-2695 v3 (14c, 2.3/3.3 GHz, 35MB LLC) | HPE Apollo 2000
RSM-GPU | 128 GB | 1 | 16 | 2 × Intel Xeon E5-2695 v3 + 2 × NVIDIA Tesla K80 | HPE Apollo 2000
RSM-GPU | 128 GB | 2 | 32 | 2 × Intel Xeon E5-2683 v4 (16c, 2.1/3.0 GHz, 40MB LLC) + 2 × NVIDIA Tesla P100 | HPE Apollo 2000
DB-s | 128 GB | 1 | 6 | 2 × Intel Xeon E5-2695 v3 + SSD | HPE ProLiant DL360
DB-h | 128 GB | 1 | 6 | 2 × Intel Xeon E5-2695 v3 + HDDs | HPE ProLiant DL380
Web | 128 GB | 1 | 6 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL360
Other^b | 128 GB | 1 | 16 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL360, DL380
Gateway | 128 GB | 1 | 4 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL380
Gateway | 128 GB | 2 | 4 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL380
Storage | 128 GB | 1 | 5 | 2 × Intel Xeon E5-2680 v3 (12c, 2.5/3.3 GHz, 30MB LLC) | Supermicro X10DRi
Storage | 128 GB | 2 | 15 | 2 × Intel Xeon E5-2680 v4 (14c, 2.4/3.3 GHz, 35MB LLC) | Supermicro X10DRi
Total | 282 TB | | 908 | |

a. RAM in Bridges’ Intel Xeon v3 nodes is DDR4-2133, and RAM in Bridges’ Intel Xeon v4 nodes is DDR4-2400.
b. Other nodes = front end (2) + management/log (8) + boot (4) + MDS (4)
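As a cross-check (author’s arithmetic, not from the slides), the 774.9 Tf/s Phase 1 RSM peak in the capacities table that follows is consistent with these node counts, assuming 16 double-precision flops per core-cycle (Haswell AVX2 with FMA) at the base clock:

$$752~\text{nodes} \times 2~\text{sockets} \times 14~\text{cores} \times 2.3~\text{GHz} \times 16~\tfrac{\text{flop}}{\text{cycle}} \approx 774.9~\text{Tf/s}.$$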
Bridges: System Capacities

Compute Nodes:

Metric | Ph1 RSM | Ph1 RSM-GPU | Ph1 LSM | Ph1 ESM | Ph1 Total | Ph2 RSM-GPU | Ph2 LSM | Ph2 ESM | Ph2 Total | Total†
Peak fp64 (Tf/s) | 774.9 | 80.5 | 18.0 | 21.2 | 894.6 | 333.3 | 91.4 | 24.8 | 449.5 | 1344
RAM (TB) | 94 | 2 | 24 | 24 | 144 | 4 | 102 | 24 | 130 | 274
HDD (TB) | 6016 | 128 | 128 | 128 | 6400 | 256 | 544 | 128 | 928 | 7328

(Phase 2 is a technical upgrade.)

Storage:

Storage | Phase 1 | Phase 2 | Total
Pylon (PB, raw) | 3.5 | 10.5 | 14
Pylon (PB, usable) | 2.5 | 7.5 | 10
Local† (PB, raw) | 6.4 | 0.9 | 7.3

† Excludes utility nodes (database, web server, front-end, management, storage, etc.)
Connectionist Temporal Classification
Florian Metze, Hari Parthasarathi, Yajie Miao, and Dong Yu, Carnegie Mellon University

A revived approach to sequence classification that works well with speech at the phoneme, syllable, character, or word level

Bridges is being used to train the Eesen Toolkit, part of the Speech Recognition Virtual Kitchen, a web-based resource that allows users to check out VMs with complete deep neural network (DNN) environments.
Causal Discovery Portal
Center for Causal Discovery, an NIH Big Data to Knowledge Center of Excellence

Architecture (from the slide’s diagram): a browser-based UI connects over the Internet to a web node running Apache Tomcat in a VM, which handles authentication, data management, and provenance. Users prepare and upload data, run causal discovery algorithms, and visualize results. The web node communicates via messaging with a database node (VM, plus other DBs). The analytics (FGS, IMaGES, and other algorithms, building on TETRAD) execute causal discovery against memory-resident datasets on LSM (3TB) and ESM (12TB) nodes, which reach the Pylon filesystem (TCGA, fMRI, …) over Omni-Path.
De Novo Assembly of the Sumatran Rhino Genome
James Denvir and Swanthana Rekulapa, Marshall University

First assembled the 1-gigabase Narcissus flycatcher (Ficedula narcissina) genome
– On the users’ local resources, they could assemble only ⅓ of the data due to memory limitations, taking 16 hours
– Assembly of all data on a 3TB node of Bridges required 1.5 TB of memory and only 6.6 hours, at least 3–4× faster than anticipated based on execution times elsewhere

Then assembled the 3-gigabase Sumatran rhino (Dicerorhinus sumatrensis) genome
– Required 1.9 TB of memory and completed in only 11 hours, again 3–4× faster than anticipated

Sumatran Rhinoceroses at the Cincinnati Zoo & Botanical Garden, Charles W. Hardin, CC BY 2.0. https://upload.wikimedia.org/wikipedia/commons/3/33/Sumatran_Rhino_2.jpg
Narcissus Flycatcher (Ficedula narcissina) in Osaka, Japan, Kuribo, CC BY-SA 2.0. https://commons.wikimedia.org/wiki/File:Narcissus_Flycatcher-cropped.jpg
Improved SNP Detection in Metagenomic Populations
Wenxuan Zhong, Xin Xing, and Ping Ma, Univ. of Georgia

Assembled 378 gigabase pairs (Gbp) of gut microbial DNA from normal and diabetic patients
– Massive metagenome assembly took only 16 hours using an MPI-based metagenome assembler, Ray, on 20 Bridges RM nodes connected by Omni-Path

Identified 2480 species-level clusters in the assembled sequence data
– Ran MetaGen clustering software across 10 RM nodes, clustering 500,000 contiguous sequences in only 14 hours
– Currently testing a new likelihood-based statistical method for fast and accurate SNP detection from these metagenomic sequencing data

The team is now using Bridges to test a new statistical method on the sequence data to identify critical differences in gut microbes associated with diabetes.

Environmental Shotgun Sequencing (ESS). (A) Sampling from habitat; (B) filtering particles, typically by size; (C) DNA extraction and lysis; (D) cloning and library; (E) sequence the clones; (F) sequence assembly. By John C. Wooley, Adam Godzik, Iddo Friedberg, http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000667, CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=17664682
Characterizing Diverse Microbial Ecosystems from Terabase-Scale Metagenomic Data
Brian Couger, Oklahoma State University
Assembled 11 metagenomes sampled from diverse sources, comprising over 3 trillion bases of sequence data
– Including a recent massive assembly of 1.6 Tbp of metagenomic data from an oil sands tailings pond, a bioremediation target
– Excellent performance of the MPI-based Ray assembler on 90 RM nodes completed the assembly in only 4.25 days
– Analysis of the assembled data is in progress to characterize organisms present in these diverse environments and identify new microbial phyla

Oil sands tailings pond. By NASA Earth Observatory, http://earthobservatory.nasa.gov/IOTD/view.php?id=40997, Public Domain, https://commons.wikimedia.org/w/index.php?curid=8346449
Investigating Economic Impacts of Images and Natural Language in E-commerce
Dokyun Lee, CMU Tepper School of Business

• Security and uncertain quality create challenges for sharing economies
– Lee et al. studied the impact of high-quality, verified photos for Airbnb hosts
– 17,000 properties over 4 months
– Used Bridges’ GPU nodes
• Difference-in-Difference (DD) analysis showed that, on average, rooms with verified photos are booked 9% more often (see the sketch after this list)
• Separating the effects of photo verification from photo quality and room reviews indicates that high photo quality results in $2,455 of additional yearly earnings
• They found asymmetric spillover effects: at the neighborhood level, there appears to be higher overall demand when more rooms have verified photos
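A minimal sketch of a difference-in-difference estimate like the one described; the panel file and column names are hypothetical, not the study’s data.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical panel: one row per property-period, with columns
#   booked (0/1), treated (has verified photos), post (after treatment date).
df = pd.read_csv("airbnb_panel.csv")

# DD: the treated:post interaction estimates the effect of photo verification.
model = smf.ols("booked ~ treated + post + treated:post", data=df).fit()
print(model.params["treated:post"])
```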
Thermal Hydraulics of Next-Generation Gas-Cooled Nuclear Reactors
PI Mark Kimber, Texas A&M University
Completed 3 Large Eddy Simulations of turbulent thermal mixing of helium coolant in the lower plenum of the Gen IV High-Temperature Gas-cooled Reactor (HTGR)
– Performed using OpenFOAM, an open-source CFD code
– Simulations are for scaled-down versions of representative sections of the HTGR lower plenum, each containing 7 support posts and 6 high-Reynolds-number jets in a cross-flow
– Each simulation involves a high-quality block-structured mesh with 16+ million cells and was run on 20 regular Bridges nodes (560 cores) for ~10 million time steps (~1 second of simulated time)
– Helps to understand the turbulent thermal mixing in great detail, as well as temperature distributions and hotspots on support posts
Anirban Jana (PSC) is co-PI and computational lead for this DOE project
Distribution of vorticity in a representative “unit cell” of the HTGR lower plenum containing 7 support posts and 6 jets in a crossflow (right to left). Turbulent mixing of core coolant jets in the lower plenum can cause temperature fluctuations and thermal fatigue of the support posts.

Acknowledgements:
• DOE-NEUP (Grant no. DE-NE0008414)
• NSF-XSEDE (Grant no. CTS160002)
Simulations of Nanodisc Systems
Jifey Qi and Wonpil Im, Lehigh University

A nanodisc system showing two helical proteins wrapping around a discoidal lipid bilayer.

Using NAMD on Bridges’ NVIDIA P100 GPU nodes to simulate a number of nanodisc systems to be used as examples for the Nanodisc Builder module in CHARMM-GUI
− For a system size of 263,888 atoms, the simulation speed on one P100 node (6.25 ns/day) is faster than on three dual-CPU nodes (4.09 ns/day)
− CHARMM-GUI provides a web-based graphical user interface to generate various molecular simulation systems and input files (for CHARMM, NAMD, GROMACS, AMBER, GENESIS, LAMMPS, Desmond, OpenMM, and CHARMM/OpenMM) to facilitate and standardize the usage of common and advanced simulation techniques
Mechanisms of Chromosomal Rearrangements
Jose Ranz and Edwin Solares, UC Irvine

Assembled the Drosophila willistoni genome four ways with various assemblers
– The mix of large-memory and regular-memory nodes on Bridges enables testing various assemblers and sequencing technologies to obtain the highest-quality genome
– High-quality assemblies are critical to understanding chromosomal rearrangements, which are pivotal in natural adaptation
– Bridges’ high-core, high-memory nodes saved the researchers at least a month compared to other resources

Assembled the Anopheles arabiensis genome and transcriptomes
– Again, using multiple sequencing data types and assembly software
– Bridges’ flexibility is key to supporting these diverse needs
Drosophila willistoni Adult Male.By Mrs. Sarah L. Martin - Plate XVI, J.T. Patterson and G.B. Mainland. The Drosophilidae of Mexico. University of Texas Publications, 4445:9-101, 1944., Public Domain, https://commons.wikimedia.org/w/index.php?curid=30607942
Anopheles gambiae mosquitoBy James D. Gathany - The Public Health Image Library , ID#444, Public Domain, https://commons.wikimedia.org/w/index.php?curid=198377
Integration of Membrane Proteins into the Cell Membrane
Michiel Niesen and Thomas Miller, Caltech

Calculated the integration efficiencies for protein sequences by means of their own coarse-grained (CG) model, running ensembles of ~400–1200 simulations in parallel (the ensemble pattern is sketched below)

As of 9/13/16, they’ve calculated membrane protein integration efficiencies for 16 proteins on Bridges

The calculated integration efficiencies correlate well with experimentally observed expression levels for a series of three homologs of the membrane protein TatC

Experimentally observed expression levels for a series of three homologs of the membrane protein TatC (left) correlate well with the calculated integration efficiency (right).

Schematic representation of integration. Ribosome (violet), nascent loops (cyan), nascent transmembrane domains (red), lateral gate (green), lipids (light gray).
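The ensembles described above are embarrassingly parallel; a minimal sketch of that pattern, where run_cg_simulation is a hypothetical stand-in for one coarse-grained run, not the group’s actual code.

```python
from concurrent.futures import ProcessPoolExecutor

def run_cg_simulation(protein_id):
    """Hypothetical stand-in for one CG integration-efficiency simulation."""
    # ... set up and integrate the CG model for this protein/sequence ...
    return 0.0  # placeholder integration efficiency

if __name__ == "__main__":
    # ~400-1200 independent runs map naturally onto parallel workers.
    with ProcessPoolExecutor(max_workers=28) as pool:
        efficiencies = list(pool.map(run_cg_simulation, range(400)))
```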