Spectrum Scale Performance Tools Deployment at Nuance
Bob Oesterlin, Sr Storage Engineer
Nuance HPC Grid
June 10th, 2016
[email protected]
© 2016 Nuance Communications, Inc. All rights reserved.

Spectrum Scale Performance Tools Deployment at Nuancefiles.gpfsug.org/presentations/2016/anl-june/SSUG_Nuance... · 2016-06-11 · © 2016 Nuance Communications, Inc. All rights reserved.

Jul 09, 2020


Topics
– Quick Overview of Nuance and HPC Grids
– Performance GUI – Experiences and Limitations
– Spectrum Scale Performance Tools – Deployment
– Collector Sizing/Federation
– Dashboards using the Zimon-Grafana Bridge
– What’s next

Reinventing the relationship between people and technology
– Defining the next generation of human-computer interaction: Intelligent Systems
– Deeply invested in creating effortless and natural user experiences
– Best known for rapidly advancing voice-recognition technology

Nuance Natural Language Framework
The engine that drives Intelligent Systems

“Anything with George Clooney on tonight?”

“Yes, I’ve found three shows, one of which is starting in just a few minutes.”

Nuance HPC Grids
– Supports the Worldwide Nuance R&D Community
– Approximately 2000 users
– ~7500 TB per day of data processed
  – 85% Read, 15% Write
– ~20,000,000 jobs processed per month
– Over 6 PB of Elastic Storage across multiple clusters
– On-premise Object storage, 4+ PB
– VMs for casual access/job submission

Performance data collection - legacy
– Large number of locally written tools
– Collectl for system stats (CPU, disk, network, etc.)
– Periodic mmpmon collections feed a local database
– Scripts to track RPC waiters
– Dashboards based around Grafana
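
As a rough illustration of the "periodic mmpmon collections feed a local database" pattern listed above, the sketch below parses one line of `mmpmon -p` style `fs_io_s` output and stores a few fields in SQLite. The sample line, the table name `fs_io`, and the choice of columns are assumptions made for illustration; the slides do not describe the actual scripts or database Nuance used.

```python
import sqlite3

# Illustrative sample only: one fs_io_s response in mmpmon's parseable (-p)
# style, where fields alternate between a _keyword_ and its value.
SAMPLE = (
    "_fs_io_s_ _n_ 10.0.0.1 _nn_ node1 _rc_ 0 _t_ 1465560000 _tu_ 0 "
    "_cl_ cluster1 _fs_ gpfs1 _d_ 2 _br_ 6291456 _bw_ 314572800 "
    "_oc_ 10 _cc_ 16 _rdc_ 101 _wc_ 300 _dir_ 7 _iu_ 2"
)


def parse_mmpmon_line(line):
    """Turn a '_key_ value _key_ value ...' line into a dict of strings."""
    tokens = line.split()
    record = {"type": tokens[0].strip("_")}
    # After the leading record type, tokens come in (keyword, value) pairs.
    for key, value in zip(tokens[1::2], tokens[2::2]):
        record[key.strip("_")] = value
    return record


def store(conn, record):
    """Insert timestamp, node, file system, and byte counters (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fs_io "
        "(ts INTEGER, node TEXT, fs TEXT, bytes_read INTEGER, bytes_written INTEGER)"
    )
    conn.execute(
        "INSERT INTO fs_io VALUES (?, ?, ?, ?, ?)",
        (
            int(record["t"]),
            record["nn"],
            record["fs"],
            int(record["br"]),
            int(record["bw"]),
        ),
    )


conn = sqlite3.connect(":memory:")
store(conn, parse_mmpmon_line(SAMPLE))
rows = conn.execute("SELECT node, fs, bytes_read FROM fs_io").fetchall()
print(rows)
```

A real collector would run this on a timer against live `mmpmon` output and compute deltas between samples, since the counters are cumulative.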

SS Performance Sensors (aka “zimon”)
– Part of all releases since 4.1.1
– Integrated with Spectrum Scale
– Wide variety of metrics, both system and GPFS
– New metrics being added (RPC waiters)
– Integrates with Spectrum Control

IBM Performance GUI
– Provides access to all key SS performance metrics
– Early beta participant
– “Fairly” easy deployment
– RH 7 dependency proved to be a challenge; current grids are all RH 6.6 based
– Tables provide a good overview of overall performance
– Better for my Ops team than Engineering
– Graphs – problematic in large clusters

IBM Performance GUI

Sensor Deployment - Problems
– Using the default sensor configuration in a large cluster is a bad idea
– Deployment with federated (multiple) collectors
– Which sensors drive the GUI?

Collector Sizing and deployment
– Default configuration is perfect for small environments
– Collector memory requirements grow quickly
– Difficult to retain large numbers of frequently collected metrics
– Keep an eye on scaling:
  – 500 NSDs * 500 nodes * 16 metrics = 4 million!
– Example:
  – 500 nodes, 500 NSDs, 16 file systems, 7 days of 1/min data = 66 GB collector memory
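
The scaling figure on this slide is plain multiplication, and it can help to see how quickly it grows. The sketch below reproduces the 4 million count and works out how many samples each series holds for 7 days at one sample per minute. It does not try to reproduce the 66 GB memory figure, because the slides do not give the per-sample storage cost; the function names are illustrative only.

```python
def metric_series(nsds, nodes, metrics):
    """Distinct time series if every node reports every metric for every NSD."""
    return nsds * nodes * metrics


def samples_per_series(days, samples_per_minute=1):
    """Samples retained per series over the given retention window."""
    return days * 24 * 60 * samples_per_minute


series = metric_series(nsds=500, nodes=500, metrics=16)
samples = samples_per_series(days=7)

print(f"{series:,} series")                # 4,000,000 series
print(f"{samples:,} samples per series")   # 10,080 samples per series
```

Halving the node count or the sampling rate halves the corresponding number, which is why restricting per-NSD sensors and lengthening collection periods matter most on large clusters.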

Grafana with Zimon
– IBM GUI is a great start – but limited, especially on larger clusters
– Zimon-grafana bridge code by Metin Feridun @ IBM ZRL
  – Provides OpenTSDB interface to IBM zimon data
  – Simple Python script, runs on collector node, lightweight
  – All collected zimon metrics are available
  – Easy to construct complex/custom dashboards
– Distribution…
  – IBM developerWorks?
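
Because the bridge presents an OpenTSDB-style interface, a client could in principle query it the same way Grafana's OpenTSDB data source does. The sketch below builds (but does not send) a request in the JSON shape used by OpenTSDB's `/api/query` endpoint. The host name, port 4242, the metric name `cpu_user`, and the `node` tag are assumptions for illustration; which metrics, tags, and aggregators the bridge actually accepts should be checked against the bridge itself.

```python
import json
import urllib.request


def build_query(metric, start="1h-ago", aggregator="avg", tags=None):
    """Build an OpenTSDB-style query body for a single metric."""
    return {
        "start": start,
        "queries": [
            {"aggregator": aggregator, "metric": metric, "tags": tags or {}}
        ],
    }


def build_request(base_url, query):
    """Wrap the query in a POST request object; nothing is sent here."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/query",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical collector host and metric, for illustration only.
query = build_query("cpu_user", tags={"node": "node1"})
request = build_request("http://collector.example.com:4242", query)

print(request.get_method(), request.full_url)
print(json.dumps(query, indent=2))
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out so the example runs without a live collector.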

Sample Grafana Dashboards

Sample Grafana Dashboards

Sample Grafana Dashboards

What’s Next
– SS 4.2.1 Upgrade
– RPC Waiter metrics in zimon
– Cloud Tiering

– Consolidation of Grids
– Combine Compute/NSD Clusters
– Consistent deployment architecture

– Move from CNFS to CES

Thank you