NGINX High Availability and Monitoring
Post on 16-Jul-2015
Introduced by Andrew Alexeev
Presented by Owen Garrett
Nginx, Inc.
About this webinar
No one likes a broken website. Learn about some of the techniques that NGINX
users employ to ensure that server failures are detected and worked around, so that
you too can build large-scale, highly-available web services.
The cost of downtime
The causes of downtime
“Through 2015, 80% of outages impacting mission-critical services will be caused by people and process issues, and more than 50% of those outages will be caused by change/configuration/release integration and hand-off issues.”
Configuration Management for Virtual and Cloud Infrastructures
Ronni J. Colville and George Spafford, Gartner
Hardware failures, disasters
People and Process
INTRODUCING NGINX…
What is NGINX?
[Diagram: HTTP traffic from the Internet passes through NGINX, which acts as:]
Web Server: Serve content from disk
Application Server: FastCGI, uWSGI, Passenger…
Proxy: Caching, Load Balancing…
Application Acceleration
SSL and SPDY termination
Performance Monitoring
High Availability
Advanced Features:
Bandwidth Management
Content-based Routing
Request Manipulation
Response Rewriting
Authentication
Video Delivery
Mail Proxy
GeoLocation
NGINX Accelerates
143,000,000 websites
22% of the top 1 million websites
37% of the top 1,000 websites
NGINX and NGINX Plus
NGINX F/OSS
nginx.org
3rd party modules
Large community of >100 modules
NGINX Plus
Advanced load balancing features
Ease-of-management
Commercial support
IMPROVING AVAILABILITY WITH NGINX
Quick review of load balancing
server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
upstream backend {
    server webserver1:80;
    server webserver2:80;
    server webserver3:80;
    server webserver4:80;
}
Three NGINX Techniques for High Availability
1. NGINX: Basic Error Checks
2. NGINX Plus: Advanced Health Checks
3. Live software upgrades
1. Basic Error Checks
• Monitor transactions as they happen
– Retry transactions that ‘fail’ where possible
– Mark failed servers as dead
Basic Error Checks
server {
    listen 80;
    location / {
        proxy_pass http://backend;
        proxy_next_upstream error timeout; # other options: http_503..., off
    }
}
upstream backend {
    server webserver1:80 max_fails=1 fail_timeout=10s;
    server webserver2:80 max_fails=1 fail_timeout=10s;
    server webserver3:80 max_fails=1 fail_timeout=10s;
    server webserver4:80 max_fails=1 fail_timeout=10s;
}
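As a hedged variant of the configuration above, the sketch below loosens the failure threshold and widens the retry conditions. The specific values (three failures within 30 seconds, retrying on 502 and 503 responses, a cap of two tries) are illustrative assumptions, not settings recommended in the webinar; the server names are reused from the slide.

```nginx
upstream backend {
    # Assumed thresholds: after 3 failed attempts within 30s, a server
    # is considered unavailable for the following 30s.
    server webserver1:80 max_fails=3 fail_timeout=30s;
    server webserver2:80 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        # Treat connection errors, timeouts, and 502/503 responses as
        # failures and pass the request to the next server.
        proxy_next_upstream error timeout http_502 http_503;
        # Limit the number of tries for a single request (nginx 1.7.5+).
        proxy_next_upstream_tries 2;
    }
}
```

The conditions listed in proxy_next_upstream also define what counts as an unsuccessful attempt for max_fails, so widening them makes servers more likely to be marked as dead.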
More sophisticated retries
server {
    listen 80;
    location / {
        # On error/timeout, try the upstream group one more time
        error_page 502 504 = @fallback;
        proxy_pass http://backend;
        proxy_next_upstream off;
    }
    location @fallback {
        proxy_pass http://backend;
        proxy_next_upstream off;
    }
}
2. Advanced Health Checks
• “Synthetic Transactions”
– Probes server health
– Complex, custom tests are possible
– Available in NGINX Plus
Advanced Health Checks
server {
    listen 80;
    location / {
        proxy_pass http://backend;
        health_check;
    }
}
upstream backend {
    zone backend 64k;
    server webserver1:80;
    server webserver2:80;
    server webserver3:80;
    server webserver4:80;
}
health_check parameters:
– interval = period between checks
– fails = failure count before dead
– passes = pass count before alive
– uri = custom URI
Defaults: 5 seconds, 1 fail, 1 pass, uri = /
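A minimal sketch of setting these parameters explicitly, assuming NGINX Plus and an upstream group with a shared memory zone as shown above. The /healthz URI and the chosen numbers are hypothetical, picked only for illustration.

```nginx
location / {
    proxy_pass http://backend;
    # Probe /healthz every 10 seconds; mark a server dead after 3
    # consecutive failures and alive again after 2 consecutive passes.
    health_check interval=10 fails=3 passes=2 uri=/healthz;
}
```

Raising fails and passes above their defaults trades detection speed for stability: a single slow response no longer removes a server, and a single good response no longer restores it.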
Advanced usage
server {
    listen 80;
    location / {
        proxy_pass http://backend;
        health_check uri=/test.php match=statusok;
        proxy_set_header Host www.foo.com;
    }
}
match statusok {
    # Used for /test.php health check
    status 200;
    header Content-Type = text/html;
    body ~ "Server[0-9]+ is alive";
}
Health checks inherit all parameters from the enclosing location block.
match blocks define the success criteria for a health check.
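match blocks can also express status ranges and negative conditions. The sketch below is an assumption-laden example: the block name no_errors and the "Fatal error" marker text are hypothetical and would need to match whatever your application actually prints on failure.

```nginx
match no_errors {
    # Accept any 2xx or 3xx status.
    status 200-399;
    # Fail the check if the response body contains the assumed marker text.
    body !~ "Fatal error";
}
```

A health check would reference it with match=no_errors, in the same way the statusok block is referenced above.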
Edge cases – variables in configuration
server {
    location / {
        proxy_pass http://backend;
        health_check;
        proxy_set_header Host $host;
    }
}
This may not work as expected.
Remember – the health_check tests run in the context of the enclosing location. A health probe is not triggered by a client request, so request-derived variables such as $host may not hold the value you expect.
server {
    location /internal-check {
        internal;
        proxy_pass http://backend;
        health_check;
        proxy_set_header Host www.foo.com;
    }
}
This is the common alternative.
• Use a custom URI for the location.
• Tag the location as internal.
• Set headers manually.
• Useful for authentication.
Examples of using health checks
• Verify that pages don’t contain errors
• Run internal tests (e.g. test.php => DB connect)
• Managed removal of servers
    $ touch $DOCROOT/isactive.txt
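The managed-removal idea can be sketched as follows, assuming each backend serves $DOCROOT as static content so that /isactive.txt returns 200 while the file exists. The match block name isactive is hypothetical.

```nginx
match isactive {
    # Pass only while the marker file is served successfully.
    status 200;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        health_check uri=/isactive.txt match=isactive;
    }
}
```

Under these assumptions, deleting isactive.txt on a backend makes its probe fail, so NGINX Plus stops sending it new requests; running the touch command again brings the server back into rotation once the probe passes.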
Advantages of ‘Health Checks’
• Run tests asynchronously (find errors faster)
• Custom tests (not related to ‘real’ traffic)
• More flexibility to specify success/error
MORE NGINX PLUS FEATURES…
Slow start
• When a server recovers (as detected by basic error checks or advanced health checks), ramp its traffic up gradually:
upstream backends {
    zone backends 64k;
    server webserver1 slow_start=30s;
}
NGINX Plus status monitoring
http://demo.nginx.com/ (web) and http://demo.nginx.com/status (JSON)
• Total data and connections
• Current data and connections
• Split per ‘server zone’
• Cache statistics
• Upstream statistics: traffic, health and error status
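A rough sketch of how such a status endpoint was typically enabled in NGINX Plus releases of this period, using the status module's status and status_zone directives. The port and zone name are assumptions, and later NGINX Plus releases replaced this module with an API-based one, so check the documentation for your release before relying on it.

```nginx
server {
    listen 8080;
    # Tag this virtual server so its traffic is reported per 'server zone'.
    status_zone status-page;

    location = /status {
        # Emit the live activity data as JSON.
        status;
    }
}
```

Upstream statistics additionally require a zone directive in each upstream block, as in the health check examples above. In practice the endpoint would also be restricted, for example with allow and deny rules.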
3. Live software upgrades
• Upgrade your NGINX binary on-the-fly
– No downtime
– No dropped connections
No downtime – ever!
• Reload configuration with SIGHUP
    # nginx -s reload
• Re-exec binary with copy-and-signal
    http://nginx.org/en/docs/control.html#upgrade
[Diagram: NGINX parent process and its NGINX workers]
In summary...
NGINX F/OSS: Basic error checks and retry logic; on-the-fly upgrades
NGINX Plus: Advanced health checks + slow start; extended status monitoring
Compared to other load balancers and ADCs, NGINX Plus is uniquely well-suited to a devops-driven environment.
Closing thoughts
• 37% of the busiest websites use NGINX
– In most situations, it’s a drop-in extension
• Check out the blogs on nginx.com
• Future webinars: nginx.com/webinars
Try NGINX F/OSS (nginx.org) or NGINX Plus (nginx.com)