Cover your PaaS with the Cloud Foundry dashboard for operational metrics
Daniel Krook, Senior Certified IT Specialist, IBM
@danielkrook - krook.info
May 08, 2015
Your presenter
▪ Built the DevOps infrastructure to deploy and manage the first large scale Cloud Foundry clusters on OpenStack inside of IBM
▪ Helps customers understand the value of the Platform-as-a-Service cloud delivery model and adopt systems of engagement
▪ Enjoys meeting and sharing knowledge with the community that builds open cloud architectures (and founded NYC CF)
IBM runs Cloud Foundry on hundreds of SoftLayer VMs
BlueMix
In the past year, we’ve learned how to
• Manage hundreds of DEAs, service nodes, fabric nodes in the beta
• Several other development and staging environments before that
• Deployed first with Chef, then with BOSH over 18 months
• All environments have benefited from the Admin UI
• Keep Cloud Foundry running smoothly
• Discover and prevent impending problems
• Resolve unexpected issues quickly
Goals for this talk
1. Show the type and volume of data and why we want to monitor it
2. Show how we monitor that data with the Admin UI (the dashboard for operational metrics)
3. Show you a demo of the Admin UI and how to install it either standalone or via BOSH
4. Show you how to get involved with the GitHub incubator project and improve it
We are looking to get better at this, and help the community get better as well.
What’s the important data and how do we find it?
What metrics matter?
Data that can be tracked over time to see trends and behaviors
Data that can help us predict problems before they happen
DEAs and apps health
▪ Memory reserved as a proportion of the memory available
General health of all components
▪ Health of the virtual machines
▪ Status of the processes running on them
Database nodes and services
▪ Number of provisioned services against capacity available
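The capacity ratios listed above can be sketched in a few lines. This is a minimal illustration, not the Admin UI's own implementation: the `Dea` record, its field names, and the 80% threshold are assumptions, and "memory available" is taken here to mean the total memory a DEA can hand out to apps.

```python
from dataclasses import dataclass


@dataclass
class Dea:
    """Hypothetical snapshot of one DEA; field names are illustrative."""
    name: str
    reserved_mb: int   # memory already promised to app instances
    available_mb: int  # total memory the DEA can hand out (assumption)


def reserved_ratio(dea: Dea) -> float:
    """Memory reserved as a proportion of the memory available."""
    if dea.available_mb <= 0:
        return 1.0  # treat a DEA with no capacity as full
    return dea.reserved_mb / dea.available_mb


def nearing_capacity(deas, threshold=0.8):
    """Return the names of DEAs whose reserved ratio meets the threshold."""
    return [d.name for d in deas if reserved_ratio(d) >= threshold]


# Made-up sample values, for illustration only.
deas = [Dea("dea/0", 6144, 8192), Dea("dea/1", 7168, 8192)]
flagged = nearing_capacity(deas)
```

The same ratio applies to service nodes by substituting provisioned services for reserved memory and node capacity for available memory.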
Why do we need this data?
At the PaaS layer, that means:
▪ Deliver continuous availability in the cloud
▪ Proactively solve problems rather than react to them
▪ Understand the behavior of the system to automate it
Where can we find it?
▪ NATS message bus
  • Discover the components to interrogate
  • Query their varz endpoints
▪ Cloud Controller REST API
▪ UAA REST API
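The discover-then-query flow can be sketched as follows. This is a hedged illustration only: it assumes each component's discovery reply on NATS is a JSON document with a `host` field of the form `ip:port` and a `credentials` pair of user and password, which should be checked against the Cloud Foundry release in use. The function name `varz_request` is hypothetical, and the sketch only builds the HTTP request rather than sending it.

```python
import base64
import json
from urllib.request import Request


def varz_request(announcement_json: str) -> Request:
    """Build an authenticated GET request for a component's /varz endpoint.

    The payload shape (host, credentials) is an assumption, not a
    documented contract.
    """
    info = json.loads(announcement_json)
    user, password = info["credentials"]
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(f"http://{info['host']}/varz")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Made-up discovery reply, for illustration only.
reply = json.dumps({
    "type": "DEA",
    "host": "10.0.0.5:34567",
    "credentials": ["u", "p"],
})
req = varz_request(reply)
```

Passing `req` to `urllib.request.urlopen` would fetch the raw varz JSON from a reachable component.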
Enter the Admin UI
The Admin UI provides…
1. Views of component health
2. Resource usage details
3. Ongoing growth trends
4. Access to logs and raw varz
5. Email notifications
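To make the email notification item concrete, here is a minimal sketch of composing an alert for components that have stopped responding. It is not how the Admin UI itself sends mail: the function name `build_alert`, the subject format, and the addresses are all hypothetical, and actually sending the message (for example with `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_alert(down_components, sender, recipients):
    """Compose (but do not send) an alert listing unresponsive components."""
    msg = EmailMessage()
    msg["Subject"] = f"[Cloud Foundry] {len(down_components)} component(s) down"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    lines = "\n".join(f"- {name}" for name in down_components)
    msg.set_content("These components stopped responding:\n" + lines)
    return msg


# Made-up component names and addresses, for illustration only.
alert = build_alert(
    ["router/0", "dea/3"],
    "admin-ui@example.com",
    ["ops@example.com"],
)
```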
It helps us find (and fix) problems
▪ Components nearing capacity or failure
▪ Already failed components
▪ Out of control apps and noisy users
It helps us see patterns
▪ Active/inactive users and apps
▪ Growth trends and runtime/service adoption
Organizations
Link to the spaces, users, and apps tabs above, with search filter enabled
Clicking the icon will dump raw JSON data
Spaces
Link to the orgs, users, and apps tabs above, with search filter enabled
Clicking the icon will dump raw JSON data
Apps
Link to the spaces, orgs, and DEAs tabs above, with search filter enabled
App URL is linked and the bound services are listed
Clicking the icon will dump raw JSON data
Users
Link to the spaces and orgs tabs above, with search filter enabled
Clicking the icon will dump raw JSON data
DEAs
Link to the apps tab above, with search filter enabled
Clicking the varz link will dump raw JSON data
Cloud controllers
Clicking the varz link will dump raw JSON data
Health manager
Clicking the varz link will dump raw JSON data
Service gateways
Individual service node capacities are listed
Clicking the varz link will dump raw JSON data
Routers
Clicking the varz link will dump raw JSON data
Components
Clicking the varz link will dump raw JSON data
Logs
This only shows logs on the VM where the app is installed
Logs are vertically and horizontally scrollable
Nightly stats
This summary page is publicly viewable
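Growth trends can be derived from nightly snapshots like these. The sketch below assumes each snapshot reduces to a (day, count) pair, which is an illustrative shape and not the Admin UI's actual stats format; the function name `daily_growth` and the sample numbers are made up.

```python
def daily_growth(samples):
    """Compute day-over-day change from chronological (day, count) pairs.

    Returns one (day, delta, percent) tuple per day after the first;
    percent is None when the previous count is zero.
    """
    rows = []
    for (_, prev), (day, cur) in zip(samples, samples[1:]):
        delta = cur - prev
        percent = delta * 100 / prev if prev else None
        rows.append((day, delta, percent))
    return rows


# Made-up nightly app counts, for illustration only.
apps = [("2015-05-05", 400), ("2015-05-06", 420), ("2015-05-07", 441)]
trend = daily_growth(apps)
```

Tracking the percentage rather than the raw delta makes it easier to compare adoption across counters of different sizes, such as users versus apps.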
Running the Admin UI
Run the Admin UI as a standalone service
$ git clone https://github.com/cloudfoundry-incubator/admin-ui.git
$ cd admin-ui
$ ruby bin/admin
Then open http://localhost:8070
Run the Admin UI as a BOSH job
- name: admin_ui
  release: admin-ui
  template:
  - admin_ui_v2
  instances: 1
  resource_pool: logger
  persistent_disk: 10240
  networks:
  - name: default
    default: [dns, gateway]
Update config/default.yml (works as is for BOSH-lite)
Older screenshots of the Admin UI
Latest screenshots are on GitHub https://github.com/cloudfoundry-incubator/admin-ui
User and app trends
There is also one unauthenticated page for high level stats
DEA list
DEA details
Service node list
Service node details
User list
User details
App list
App details
Log list
Log details
Email notifications
ibm.com/cloud