Varnish @ Opera v3 / DevOps Norway Meetup Oslo, 17th September 2014 Cosimo Streppone <[email protected]>

How we use and deploy Varnish at Opera

Jan 21, 2018

Page 1: How we use and deploy Varnish at Opera

Varnish @ Opera

v3 / DevOps Norway Meetup
Oslo, 17th September 2014

Cosimo Streppone <[email protected]>

Page 2: How we use and deploy Varnish at Opera

• October 2009
• 1 old recycled machine, 2 GB of disk allocated
• Started serving static pictures (1M+ req/day)
• Then more...
• Even more...
• ...
• ~15% of all My Opera requests were «varnished»
• Around 8M req/day

1st Varnish deployment: My Opera

Page 3: How we use and deploy Varnish at Opera

• Still using Debian Etch
  – First Varnish instance was running v1.x from Etch
  – Several years old, not good

• Experienced VIPs
  – "Very Interesting Problems"
  – User X getting User Y's session
  – Random users getting admin powers. Nightmare!

• Theory: Varnish was caching responses that contained
  Set-Cookie: opera_session=<session_id>

My Opera – The start

Page 4: How we use and deploy Varnish at Opera

if (req.url ~ "^/community/users/avatar\.pl/[0-9]+$"
    || req.url ~ "^/.+/avatar\.pl$"
    || req.url ~ "^/.+/picture\.pl\?xscale=100$"
    || req.url ~ "^/desktopteam/xml/atom/blog/?$"
    || req.url ~ "^/desktopteam/xml/rss/blog/?$"
    || req.url ~ "^/community/api/users/friends\.pl\?user=.+$"
    || req.url ~ "^/community/api/users/groups\.pl\?user=.+$"
) {
    unset req.http.Cookie;
    unset req.http.Authorization;
    lookup;
}

My Opera – The start

Page 5: How we use and deploy Varnish at Opera

...

# Check for cookie only after always-cache URLs
if (req.http.Cookie ~ "(opera_session|opera_persistent_)") {
    pass;
}

# DANGER, Will Robinson! Caching the front-page
# At this point, lots of Google Analytics cookies will go in.
# No problem. It's stuff used by Javascript
if (req.url ~ "^/community/$") {
    lookup;
}

pass;

}

My Opera – Pass logged in users
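The two VCL fragments above boil down to a small decision table. As a rough illustration only (this is not part of the original setup, and the function name `recv_decision` is made up here), the same ordering of checks can be sketched in Python. The URL patterns and cookie names are copied from the slides; whatever the "..." on the slide elides is ignored.

```python
import re

# URL patterns copied from the slide: always served from cache.
ALWAYS_CACHE = [
    r"^/community/users/avatar\.pl/[0-9]+$",
    r"^/.+/avatar\.pl$",
    r"^/.+/picture\.pl\?xscale=100$",
    r"^/desktopteam/xml/atom/blog/?$",
    r"^/desktopteam/xml/rss/blog/?$",
    r"^/community/api/users/friends\.pl\?user=.+$",
    r"^/community/api/users/groups\.pl\?user=.+$",
]
# Cookies that mark a logged-in user (from the slide).
SESSION_COOKIE = re.compile(r"(opera_session|opera_persistent_)")


def recv_decision(url, headers):
    """Return (action, headers sent on), with action "lookup" or "pass"."""
    # 1. Always-cache URLs: drop credentials so every user shares one object.
    if any(re.search(pattern, url) for pattern in ALWAYS_CACHE):
        stripped = {name: value for name, value in headers.items()
                    if name not in ("Cookie", "Authorization")}
        return "lookup", stripped
    # 2. Logged-in users bypass the cache.
    if SESSION_COOKIE.search(headers.get("Cookie", "")):
        return "pass", headers
    # 3. The front page is cached for everyone else.
    if re.search(r"^/community/$", url):
        return "lookup", headers
    # 4. Default: do not cache.
    return "pass", headers
```

The order matters: the session-cookie check comes after the always-cache URLs, so avatars and feeds stay cacheable even for logged-in users.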

Page 6: How we use and deploy Varnish at Opera

My Opera: testing Varnish setup

...
ok 289 - Got response from backend for /community/ (from ...)
ok 290 - Correct status line
# Adding header [Cookie] => [language=it]
# ----------
# GET http://cache01.my.opera.com:6081/community/
# Host: my.opera.com
# ------------
ok 291 - 2nd request: got response from backend for /community/ (from...)
ok 292 - Correct status line
# X-Varnish: 1211283813 1211283812
# X-Varnish-Status: hit
# X-Varnish-Cacheable: yes, language cookie
# X-Varnish-URL: /community/
ok 293 - URL '/community/' was handled correctly by varnish
# cookie_header:
ok 294 - URL '/community/' has correct cookies (or no cookies)
1..294

All tests successful.

X-Varnish: 1211283813 1211283812
X-Varnish-Status: hit
X-Varnish-Cacheable: yes, language cookie
X-Varnish-URL: /community/

Page 7: How we use and deploy Varnish at Opera

My Opera – Next steps

Page 8: How we use and deploy Varnish at Opera

My Opera – Next steps

● Front page caching
● Static assets and UGC
● On-the-fly thumbnails
● "Shields-up" configuration

Page 9: How we use and deploy Varnish at Opera

Problem

• Very dynamic, i18n
• Accept-Language header variation
• Vary: Accept-Language sub-optimal

Front page caching

Solution

• varnish-accept-language “extension”

Page 10: How we use and deploy Varnish at Opera

Client sends

Accept-Language: ru, uk;q=0.9

Accept-Language: es-ES, es;q=0.8

Accept-Language: fr, it;q=0.7

Accept-Language: fr

Front page caching - Accept-Language

Backend receives

Accept-Language: ru

Accept-Language: es

Accept-Language: it

Accept-Language: en

SUPPORTED_LANGUAGES = ":de:es:it:ru:"
DEFAULT_LANGUAGE = "en"
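The mapping on this slide can be reproduced with a few lines of Python. This is only a sketch of the idea, not the C code of varnish-accept-language: it assumes entries are ranked by their q value, that a regional tag such as es-ES may fall back to its primary language, and that anything unsupported becomes the default. The function name is made up for the example.

```python
SUPPORTED_LANGUAGES = ("de", "es", "it", "ru")
DEFAULT_LANGUAGE = "en"


def normalize_accept_language(header):
    """Pick the best supported language for an Accept-Language value."""
    candidates = []
    for position, item in enumerate(header.split(",")):
        parts = [part.strip() for part in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means q=1
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            candidates.append((-quality, position, tag))
    for _, _, tag in sorted(candidates):
        if tag in SUPPORTED_LANGUAGES:
            return tag
        primary = tag.split("-")[0]  # assumed fallback: es-ES -> es
        if primary in SUPPORTED_LANGUAGES:
            return primary
    return DEFAULT_LANGUAGE
```

With the backend seeing only a handful of normalized values, the cache holds one front page per supported language instead of one per distinct client header.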

Page 11: How we use and deploy Varnish at Opera

Front page caching

Page 12: How we use and deploy Varnish at Opera

Problem

• One central location
• SPOF
• High latency US -> NO

Static assets and UGC servers

Solution

• Decentralized varnish servers in multiple DC

• Talking to 1 backend
• Very long TTL
• Health probes
• Cache invalidation API
• Built our GeoDNS

Page 13: How we use and deploy Varnish at Opera

Problem

• Change of Design™ made our millions of pre-generated thumbnails useless

Thumbnail generation and caching

Solution

• Switch to on-the-fly generation model

• Used mod_dims (AOL)
• Varnish on :80
• 2 backends

300k objects
95% hit rate avg
800 req/s/backend peak

Page 14: How we use and deploy Varnish at Opera

Thumbnail generation and caching

How it works

http://localhost/dims/ crop/472x360/ contrast/+1/ quality/90/ /actual/picture/url.jpg (remote too!)

Using rewrite rules

http://localhost/tn/small/ /actual/picture/url.jpg
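A rewrite rule of this kind maps a short preset name to a full mod_dims command string. The sketch below shows the idea in Python. The preset table is hypothetical: only the "small" entry mirrors the example commands on this slide, and the slides do not state that this is the exact mapping used in production.

```python
# Hypothetical preset table; "small" reuses the commands shown on the slide.
PRESETS = {
    "small": "crop/472x360/contrast/+1/quality/90",
}


def rewrite_thumbnail_path(path):
    """Map /tn/<preset>/<picture path> to /dims/<commands>/<picture path>."""
    prefix = "/tn/"
    if not path.startswith(prefix):
        return None
    preset, separator, picture = path[len(prefix):].partition("/")
    if not separator or preset not in PRESETS:
        return None  # unknown preset: leave the request alone
    return "/dims/" + PRESETS[preset] + "/" + picture
```

Keeping the set of presets small also keeps the number of distinct cached thumbnails per picture small, which helps the hit rate.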

Page 15: How we use and deploy Varnish at Opera

Thumbnail generation and caching

● Recognize mobile/non-mobile

● Scale thumbnails on the fly

● Reduce JPEG quality
  Ex.: /thumb/small/quality/80/some/path/pic.jpg

Page 16: How we use and deploy Varnish at Opera

Problem

• Original setup too specific to My Opera

• Long tail of non-popular content “unprotected”

• Can we find some more generic setup?

Shields-up configuration

Solution

• DDoS
• Varnish in front, rather than after frontends
• Cache most logged out requests with lower TTL
• Compromise solution, but generic enough

Page 17: How we use and deploy Varnish at Opera

Other projects

Page 18: How we use and deploy Varnish at Opera

Many since then!

• Sitecheck
• Opera.com
• TV Store
• Speeddials
• Discover
• ...

Other projects

Page 19: How we use and deploy Varnish at Opera

Opera Discover

My current project

80M backend API requests/day
260M image requests/day

Page 20: How we use and deploy Varnish at Opera

GeoIP country check

Page 21: How we use and deploy Varnish at Opera

Country-level ban

• Contract mandates that TV Store shouldn't be available in specific countries

• Country check in the backend means no caching is possible

• Implemented with varnish-geoip
  https://github.com/cosimo/varnish-geoip

Page 22: How we use and deploy Varnish at Opera

https://github.com/cosimo/varnish-geoip

Page 23: How we use and deploy Varnish at Opera

Country-level ban

sub country_ban_list_check {

    # Allow testing of country ban
    if (req.http.Cookie ~ "x_geo_ip_forced\s*=\s*country:..") {
        set req.http.X-Geo-IP = regsuball(
            req.http.Cookie,
            "^.*x_geo_ip_forced\s*=\s*(country:..).*$", "\1"
        );
        log "Forced X-Geo-IP to '" req.http.X-Geo-IP "'";
    }

    # Block access to tvstore in these countries
    if (req.http.X-Geo-IP && req.http.X-Geo-IP ~ "^country:(C1|C2|C3|...)$") {
        log "Country ban";
        error 750 "tvstore is not available in your country";
    }
}

sub vcl_recv {
    C{
        vcl_geoip_country_set_header_xff(sp);
    }C
    call country_ban_list_check;
}
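The flow of the ban check is: take the country from X-Geo-IP, let a test cookie override it, then refuse banned countries. A Python approximation of that flow follows. The banned list on the slide is elided, so the caller passes its own set; the function name and the placeholder codes used in the usage note are assumptions, not real values.

```python
import re

# Test override cookie, as in the VCL: x_geo_ip_forced=country:XX
FORCED = re.compile(r"x_geo_ip_forced\s*=\s*(country:..)")


def country_ban_check(headers, banned):
    """Return (status, geo): status is 750 for a banned country, else None."""
    geo = headers.get("X-Geo-IP")
    forced = FORCED.search(headers.get("Cookie", ""))
    if forced:
        geo = forced.group(1)  # the cookie wins, to allow testing the ban
    if geo:
        match = re.fullmatch(r"country:(..)", geo)
        if match and match.group(1) in banned:
            return 750, geo
    return None, geo
```

For example, `country_ban_check({"X-Geo-IP": "country:XA"}, {"XA", "XB"})` yields status 750, where XA and XB are placeholder codes. Because the decision uses only request headers, it can run in Varnish and the backend responses stay cacheable.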

Page 24: How we use and deploy Varnish at Opera

VCL library

Page 25: How we use and deploy Varnish at Opera

accept-encoding.vcl (now obsolete)

# STD: Deal with different Accept-Encoding formats
sub accept_encoding_normalize {

    if (req.http.Accept-Encoding) {

        if (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        }
        elsif (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
        }
        else {
            unset req.http.Accept-Encoding;
        }

    }
}
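The point of this normalization is that many different client headers collapse to at most three variants (gzip, deflate, none). A tiny Python equivalent, written for illustration only, makes that easy to check:

```python
def normalize_accept_encoding(value):
    """Collapse an Accept-Encoding value to "gzip", "deflate" or None."""
    if not value:
        return None
    if "gzip" in value:      # gzip is preferred when both are offered
        return "gzip"
    if "deflate" in value:
        return "deflate"
    return None              # header is dropped entirely


# Different client spellings end up as the same cache variant.
samples = ["gzip, deflate", "gzip,deflate,sdch", "deflate, gzip;q=0.5", "identity"]
variants = {normalize_accept_encoding(sample) for sample in samples}
```

Here `variants` contains only `"gzip"` and `None`, so a response with Vary: Accept-Encoding is stored twice rather than once per distinct header string.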

Page 26: How we use and deploy Varnish at Opera

accept-language.vcl

C{

/*
 * Accept-language header normalization
 *
 * - Parses client Accept-Language HTTP header
 * - Tries to find the best match with the supported languages
 * - Writes the best match as req.http.X-Varnish-Accept-Language
 *
 * http://github.com/cosimo/varnish-accept-language
 */

#include <ctype.h>  /* isupper */
#include <stdio.h>
#include <stdlib.h> /* qsort */
#include <string.h>

#define DEFAULT_LANGUAGE "en"
#define SUPPORTED_LANGUAGES ":de:en:es-la:fr:fy:hu:ja:no:pl:pt-br:ru:sk:sq:sr:tr:uk:vn:xx-lol:zh-tw:"

Page 27: How we use and deploy Varnish at Opera

maintenance.vcl + {up,down}.sh

include "/etc/varnish/accept-encoding.vcl";

backend oopsy {
    .host = "10.20.21.22";
    .port = "80";
}

sub vcl_recv {

    set req.backend = oopsy;

    # Serve page from within Varnish. See vcl_error()
    if (req.url == "/ping.html") {
        error 700;
    }

    call accept_encoding_normalize;

    # Collapse URLs, so that we have just one cached object
    set req.url = "/maintenance-down";

    remove req.http.Cookie;
    remove req.http.Authorization;
    return (lookup);
}

Page 28: How we use and deploy Varnish at Opera

purge.vcl

acl purge { … }

sub vcl_recv {

    if (req.request == "PURGE") {
        if (! (client.ip ~ purge)) {
            error 405 "Not allowed.";
        }
        purge("req.url == " req.url);
        error 200 "Purged.";
    }

    else if (req.request == "PURGE_SUFFIX") {
        set req.http.X-URL = regsuball(req.url, "\[|\]|[\\^.$|()*+?{}]", "\\\0") "$";
        purge_url(req.http.X-URL);
        unset req.http.X-URL;
        error 200 "Purged suffix.";
    }

    else if (req.request == "PURGE_PREFIX") { … }

}

Ugly!
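The ugly part is the regsuball() call: it escapes every regex metacharacter in the requested URL and appends "$", so the result matches any cached URL that ends with that literal string. The same transformation in Python, as an illustration (the function name is invented for the example):

```python
import re


def suffix_ban_pattern(url):
    """Escape regex metacharacters in url and anchor it at the end."""
    # Same character set as the VCL: [ ] \ ^ . $ | ( ) * + ? { }
    escaped = re.sub(r"[\[\]\\^.$|()*+?{}]", r"\\\g<0>", url)
    return escaped + "$"


pattern = suffix_ban_pattern("/a.b?c=1")
matches_suffix = bool(re.search(pattern, "/x/a.b?c=1"))
matches_other = bool(re.search(pattern, "/x/aXb?c=1"))
```

Without the escaping, the "." and "?" in the URL would act as regex operators and the purge could hit unrelated objects.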

Page 29: How we use and deploy Varnish at Opera

X-forwarded-for.vcl

# See http://www.varnish-cache.org/trac/ticket/540
sub inject_forwarded_for {

    # Rename the incoming XFF header to work around a Varnish bug
    if (req.http.X-Forwarded-For) {
        # Append the client IP
        set req.http.X-Real-Forwarded-For =
            req.http.X-Forwarded-For ", " regsub(client.ip, ":.*", "");
    }
    else {
        # Simply use the client IP
        set req.http.X-Real-Forwarded-For = regsub(client.ip, ":.*", "");
    }
}

Wat!?

Page 30: How we use and deploy Varnish at Opera

Testing VCLs – http-cuke

Page 31: How we use and deploy Varnish at Opera

http-cuke – csrf.test

Feature: Site uses cookies to protect against CSRF attacks

  In order to protect the users from CSRF attacks
  As a web site developer
  I want to verify that some pages send out a CSRF cookie token
  to the browser or device

  Scenario: Accessing the Backgammon application URL

    Given a "OPR/24.0.1558.23 (Linux … Opera)" user agent
    When I go to "https://server/store/app/backgammon"
    Then the final HTTP status code should be "200"
    And the page should contain "A board game for one player"
    And the page should not be cached by varnish
    And the server should send a CSRF token

Page 32: How we use and deploy Varnish at Opera

http-cuke – prove-like output

$ http-cuke --test ./csrf.test

$ http-cuke --test-dir ./some-dir

Page 33: How we use and deploy Varnish at Opera

http-cuke – a sample test run

# ============================================================
# FEATURE: Web site uses cookies to protect against CSRF attacks
# ============================================================
# ------------------------------------------------------------
# SCENARIO: Accessing the Backgammon application URL
# ------------------------------------------------------------
ok 1 - Given a "OPR/24... (Linux...)" user agent
ok 2 - When I go to "https://server/app/backgammon"
ok 3 - Status code is 200 (expected 200)
ok 4 - Then the final HTTP status code should be "200"
ok 5 - String 'A board game for one player' was found in the page
ok 6 - Then the page should contain "A board game for one player"
ok 7 - X-Varnish header contains only current XID (523289525)
ok 8 - Age of cached resource is zero
ok 9 - Then the page should not be cached by varnish
ok 10 - CSRF token was found (49a0da1b2758bf62a028072e4f7f36dc)
ok 11 - Then the server should send a CSRF token
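Steps 7 and 8 in the run above hint at how "not cached by varnish" can be decided from response headers alone: a cache hit carries two transaction IDs in X-Varnish (as in the earlier My Opera test output) and usually a non-zero Age. A minimal Python sketch of that check follows. It is an assumption about the logic, not http-cuke's actual code, and the function name is made up.

```python
def served_from_cache(headers):
    """True when the response headers indicate a Varnish cache hit."""
    # On a hit, X-Varnish holds the current XID plus the XID of the
    # request that originally stored the object.
    xids = headers.get("X-Varnish", "").split()
    try:
        age = int(headers.get("Age", "0"))
    except ValueError:
        age = 0
    return len(xids) > 1 or age > 0
```

A page that sets a CSRF token must fail this check, otherwise every visitor would receive the same cached token.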

Page 34: How we use and deploy Varnish at Opera

https://github.com/cosimo/http-cuke

Page 35: How we use and deploy Varnish at Opera

Dumping varnishlog

Page 36: How we use and deploy Varnish at Opera

vlogdump

$ varnishlog | vlogdump

Page 37: How we use and deploy Varnish at Opera

vlogdump – a sample test run

$ varnishlog | vlogdump -v only_misses=1

172.22.0.15 => GET /assets/tn/m/mq/e85ed...6733a48802 HTTP/1.0 MISS <= 200 OK
172.22.0.18 => GET /assets/icons/.....-technology.png HTTP/1.0 MISS <= 304 Not Modified

$ varnishlog | vlogdump -v show_req_headers=1

172.22.0.15 => GET /assets/3a626ed......e168914568080 HTTP/1.0 MISS <= 200 OK 51.483 ms
    req.http.Host = discovery.opera.com
    req.http.User-Agent = Amazon CloudFront
    req.http.X-Forwarded-For = 11.12.34.56
    req.http.X-Amz-Cf-Id = ...0AZZaPkt87avA==
    req.http.Connection = keep-alive
    ...

Page 38: How we use and deploy Varnish at Opera

vlogdump – demo?

Page 39: How we use and deploy Varnish at Opera

https://github.com/cosimo/vlogdump

Page 40: How we use and deploy Varnish at Opera

vlogdump + rtail

Remote tailing made easy.
60 lines of Perl.

$ rtail --host=h1 --host=h2 --host=h3 ... \
    --command varnishlog \
    | vlogdump

Page 41: How we use and deploy Varnish at Opera

vlogdump + rtail

No github yet :-)

Page 42: How we use and deploy Varnish at Opera

Puppet module

Page 43: How we use and deploy Varnish at Opera

varnish/manifests/init.pp

class varnish {

    package { "varnish": ensure => "installed" }

    file { "/etc/init.d/varnish": … }

    file { "/etc/sysctl.conf": … }
    exec { "update-sysctl": … }

    file { "/usr/share/varnish/purge-cache": … }

    service { "varnish": ensure => "running", … }

    munin::plugin::custom { "varnish_": }
    munin::plugin { [ "varnish_backend_traffic", "varnish_expunge", … }
}

Page 44: How we use and deploy Varnish at Opera

Custom init script

# Lower stack limit demand for every Varnish thread
# http://projects.linpro.no/pipermail/varnish-misc/2009-August/002977.html
# Still relevant for Varnish 3 ??
ulimit -s 256

# Startup with custom cc_command fails
# http://stackoverflow.com/a/8333333
# Filed Debian bug #659005
if bash -c "start-stop-daemon \
    --start --quiet --pidfile ${PIDFILE} \
    --exec ${DAEMON} -- -P ${PIDFILE} \
    ${DAEMON_OPTS} > ${output} 2>&1"; then
    log_end_msg 0
else
    …

Page 45: How we use and deploy Varnish at Opera

Custom init script

# Optionally warm up the cache
#
# Drop a custom script into this path
# to have it picked up by the
# main init script.

if [ -x /usr/share/varnish/cache-warmup ]; then
    /usr/share/varnish/cache-warmup
fi

Page 46: How we use and deploy Varnish at Opera

Custom sysctl settings

# From http://varnish.projects.linpro.no/wiki/Performance
# + our own tweaking and tuning

net.ipv4.ip_local_port_range = 1024 65535
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_fin_timeout = 30
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_no_metrics_save = 1
net.core.somaxconn = 262144
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2

Page 47: How we use and deploy Varnish at Opera

Purge cache script

Modeled after Debian vcl-reload script

$ purge-cache -a
$ purge-cache -u http://some.url
$ purge-cache -r '^/(home|user)/'

Page 48: How we use and deploy Varnish at Opera

Cache warmup script

Drop-in script in

/usr/share/varnish/cache-warmup

Invoked right after startup

Page 49: How we use and deploy Varnish at Opera

varnish/manifests/init.pp – 2

define varnish::config (
    $vcl_conf="default.vcl",
    $listen_address="",
    $listen_port=6081,
    $thread_min=400,
    $thread_max=5000,
    $thread_timeout=30,
    $storage_type="malloc",
    $storage_size="12G",
    $ttl=60,
    $thread_pools=$processorcount,
    $sess_workspace=131072,
    $cc_command="",
    $sess_timeout=3
) {

    file { "/etc/default/varnish":
        ensure  => "present",
        owner   => "root",
        group   => "root",
        mode    => 644,
        content => template("varnish/debian-defaults.erb"),
        require => Package["varnish"],
        notify  => Service["varnish"],
    }

}

Page 50: How we use and deploy Varnish at Opera

Example of varnish::config

varnish::config { 'cache-varnish-config':
    vcl_conf       => 'cache.vcl',
    storage_type   => 'malloc',
    storage_size   => '20G',
    listen_port    => 80,
    sess_workspace => 131072,
    ttl            => 86400,
    thread_pools   => 4,
    thread_min     => 800,
    thread_max     => 2000,
    # Necessary for GeoIP
    cc_command     => 'exec cc -fpic -shared -Wl,-x \
        -L/usr/include/GeoIP.h -lGeoIP -o %o %s',
}
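The debian-defaults.erb template is not shown in the slides, so how these parameters end up on the varnishd command line is a guess. One plausible mapping, using standard varnishd options from the 2.x/3.x series (-a, -f, -t, -w, -s, -p), is sketched below in Python. The function name, the /etc/varnish/ prefix for the VCL file and the default of 1 for thread_pools are assumptions made for the example.

```python
def daemon_opts(vcl_conf="default.vcl", listen_address="", listen_port=6081,
                thread_min=400, thread_max=5000, thread_timeout=30,
                storage_type="malloc", storage_size="12G", ttl=60,
                thread_pools=1, sess_workspace=131072, sess_timeout=3,
                cc_command=""):
    """Build a varnishd argument list from varnish::config-style values."""
    opts = [
        "-a", f"{listen_address}:{listen_port}",        # listen address
        "-f", f"/etc/varnish/{vcl_conf}",               # VCL file (assumed path)
        "-t", str(ttl),                                 # default TTL, seconds
        "-w", f"{thread_min},{thread_max},{thread_timeout}",
        "-s", f"{storage_type},{storage_size}",         # storage backend
        "-p", f"thread_pools={thread_pools}",
        "-p", f"sess_workspace={sess_workspace}",
        "-p", f"sess_timeout={sess_timeout}",
    ]
    if cc_command:
        # Only override the compiler command when one is given.
        opts += ["-p", f"cc_command={cc_command}"]
    return opts


opts = daemon_opts(vcl_conf="cache.vcl", storage_size="20G", listen_port=80,
                   ttl=86400, thread_pools=4, thread_min=800, thread_max=2000)
```

With the values from the example above, this yields a listener on :80, 20G of malloc storage and a one-day default TTL.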

Page 51: How we use and deploy Varnish at Opera

varnish/manifests/init.pp – 3

define varnish::vcl ($source) {

    file { "/etc/varnish/${name}.vcl":
        ensure  => 'file',
        owner   => 'root',
        group   => 'root',
        mode    => '0644',
        source  => $source,
        require => Package['varnish'],
        notify  => Service['varnish'],
    }

}

Page 52: How we use and deploy Varnish at Opera

https://github.com/cosimo/puppet-modules

Page 53: How we use and deploy Varnish at Opera

Migration to Varnish 3

Page 54: How we use and deploy Varnish at Opera

Following Debian stable

• Wheezy now ships with 3.0.2

• 2.0 > 2.1 migration was painless

• 2.1 > 3.0 migration was painless too

Page 55: How we use and deploy Varnish at Opera

Next steps?

Page 56: How we use and deploy Varnish at Opera

Next Steps?

• Personalized (but cached) content

• A/B Testing

• ESI?

• VMods?

• Migration to V4

Page 57: How we use and deploy Varnish at Opera

Questions?

Page 58: How we use and deploy Varnish at Opera

Thanks!