Split testing using nginx proxy cache

Nov 11, 2013 · nginx, proxy, split testing · by benjamin

My company recently discovered the joys of using nginx as a reverse proxy cache server. This allowed us to significantly reduce the load on our application servers. Of course, as soon as we had this setup working nicely, a request for A/B testing came down the pipeline.

There are some obstacles to conducting A/B testing while using nginx as a reverse proxy cache server.

Obstacle 1: Lack of "sticky" sessions in the free nginx product. While session affinity is available as part of the nginx commercial subscription, that product didn't suit our needs. Without sticky sessions, each page load could potentially go to a different upstream server. This would render many tests unusable and would make the site feel disjointed.

Obstacle 2: Since pages are cached by nginx, all clients receive the same cached response. This meant we couldn't serve different versions of the same page to different clients.

Obstacle 3: To keep code complexity down, we didn't want to modify our application to be aware of the tests we were performing.

We were able to overcome these obstacles using only the default modules that were part of nginx 1.4.x.

The following are snippets of our server config. The file lives entirely in the nginx http context. I won't go into the configuration of nginx outside of this file, as that information is readily available elsewhere. I'm going to jump around a bit to ease the explanation. The complete file is shown at the bottom.

    upstream upstreamServerA {
        server upstreamServerA.net;
    }

    upstream upstreamServerB {
        server upstreamServerB.net;
    }

The first thing is to define our upstream server groups. In this setup we have defined two server groups (upstreamServerA and upstreamServerB), each with a single server. Each upstream server group represents a version of the site we are testing. We could increase the number of test variants by adding more upstream server groups. The server definition is shown with a placeholder .net domain name for ease of reading; in practice this should be the IP address or location of your application server.
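As a rough sketch (the IP addresses and ports below are placeholders, not from our actual setup), the same block pointed at real application servers might look like this, with more than one server per group if a variant is served by several machines:

    # Hypothetical addresses; substitute your own application servers.
    upstream upstreamServerA {
        server 10.0.0.11:8080;
        server 10.0.0.12:8080;
    }

    upstream upstreamServerB {
        server 10.0.0.21:8080;
    }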

    split_clients "seedString${remote_addr}${http_user_agent}${date_gmt}" $upstream_variant {
        50%               upstreamServerA;
        50%               upstreamServerB;
    }

Here we make use of one of nginx's default modules, ngx_http_split_clients_module. The idea is to set up the split percentages for our tests. What's actually happening is that nginx creates a string composed of the seed string "seedString" concatenated with the client IP address, the client's user agent, and the current time. Nginx then hashes this string into a number. The lower 50% of the number range gets assigned upstreamServerA and the upper 50% gets assigned upstreamServerB. The result is saved into the $upstream_variant variable. In practice this value is only used for each client's first request; after that, the sticky cookie described below takes over.
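The split doesn't have to be an even 50/50, and more variants can be added. A hypothetical sketch (upstreamServerC is an assumed third upstream group, not part of our actual setup) might look like this, where "*" catches whatever percentage is left over:

    # Hypothetical uneven split; assumes an "upstream upstreamServerC" block exists.
    split_clients "seedString${remote_addr}${http_user_agent}${date_gmt}" $upstream_variant {
        10%     upstreamServerC;    # small exposure for a new variant
        45%     upstreamServerA;
        *       upstreamServerB;    # everyone else
    }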

    map $cookie_sticky_upstream $upstream_group {
        default             $upstream_variant;
        upstreamServerA     upstreamServerA;
        upstreamServerB     upstreamServerB;
    }

With this segment we check for the presence of a cookie named "sticky_upstream" in the client request. The goal is to set the variable $upstream_group based on this cookie. If the value of the cookie is "upstreamServerA", we set $upstream_group to "upstreamServerA"; likewise if the value is "upstreamServerB". If the cookie has any other value, or is not present, we fall back to the $upstream_variant variable defined in the previous segment.
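A nice side effect, as noted in the comments in the full file below, is that you can manually force a particular variant by setting the sticky_upstream cookie yourself, which is handy for previewing a test. And if you added a third variant as in the hypothetical split above, the map would simply grow by one line; any unrecognized or missing cookie value still falls back to the split_clients result:

    # Hypothetical map extended for a third variant (upstreamServerC is assumed).
    map $cookie_sticky_upstream $upstream_group {
        default             $upstream_variant;
        upstreamServerA     upstreamServerA;
        upstreamServerB     upstreamServerB;
        upstreamServerC     upstreamServerC;
    }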

Now we can define our server context.

    server {
        listen       80;
        server_name  upstreamServer.com;

        location / {
            #Snipped for brevity
        }

        location /admin {
            #Snipped for brevity
        }

        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   /usr/share/nginx/html;
        }
    }

We define two locations here: "/" and "/admin". We treat "/admin" differently because we want all admin requests to go to a single upstream server. This may not be needed in all setups, but I thought I'd show how to accomplish it.

The first thing we want to do in the "location /" context is to set the "sticky_upstream" cookie.

    add_header Set-Cookie "sticky_upstream=$upstream_group;Path=/;";

This will make all subsequent requests from the client "stick" to the same upstream server group.
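As written this is a session cookie, so the assignment lasts until the browser is closed. If you want a client to stay in the same group across browser restarts, one variation (an assumption on my part, not something we needed) is to give the cookie a lifetime:

    # Variation (not in our setup): keep the assignment for a week.
    add_header Set-Cookie "sticky_upstream=$upstream_group;Path=/;Max-Age=604800;";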

    proxy_pass http://$upstream_group;

Now we tell nginx to use the value of the $upstream_group variable as the upstream server group. Because the variable's value matches the name of one of the upstream groups defined earlier, nginx routes the request to that group.

    proxy_cache_key "$scheme$host$request_uri$upstream_group";

This directive caches responses based on $scheme, $host, $request_uri and (the important bit for this post) $upstream_group, so that each test variant gets its own cache entries instead of every client sharing one cached page.
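Note that the proxy_cache one directive in the full file below assumes a cache zone named "one" has already been declared in the http context. That declaration isn't part of this file; a minimal sketch (the path and sizes are placeholders) might look like:

    # Hypothetical cache zone declaration; adjust path and sizes to taste.
    proxy_cache_path /var/cache/nginx/one levels=1:2 keys_zone=one:10m
                     max_size=1g inactive=60m;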

As I discussed briefly, what if we want to send all admin interactions to a single upstream server group? Let's look at the "location /admin" context:

    set $upstream_admin upstreamServerB;
    add_header Set-Cookie "sticky_upstream=$upstream_admin;Path=/;";

    proxy_pass http://$upstream_admin;

We define the variable $upstream_admin and set it to "upstreamServerB", then set the client's "sticky_upstream" cookie equal to it. The final bit tells nginx to use the value of $upstream_admin as the upstream server. Note that because this overwrites the sticky cookie, a client who visits "/admin" will also be served by upstreamServerB on subsequent requests to the rest of the site.

The file in its entirety can be found below:

    upstream upstreamServerA {
        server upstreamServerA.net;
    }

    upstream upstreamServerB {
        server upstreamServerB.net;
    }

    #split clients by the following percentages
    #  according to remote IP, user agent, and date
    split_clients "seedString${remote_addr}${http_user_agent}${date_gmt}" $upstream_variant {
        50%               upstreamServerA;
        50%               upstreamServerB;
    }

    #override if "sticky_upstream" cookie is present in request
    #  this assures client sessions are "sticky"
    #  this also allows us to manually set an upstream with a cookie
    map $cookie_sticky_upstream $upstream_group {
        default             $upstream_variant;    #no cookie present, use result of split_clients
        upstreamServerA     upstreamServerA;      #use cookie value
        upstreamServerB     upstreamServerB;      #use cookie value
    }

    server {
        listen       80;
        server_name  upstreamServer.com;

        location / {
            #Set the client cookie so they always get the same upstream server
            #  during this session
            add_header Set-Cookie "sticky_upstream=$upstream_group;Path=/;";

            #Set the upstream server group as defined in the above map
            proxy_pass http://$upstream_group;
            proxy_redirect          off;

            # Cache
            proxy_cache one; #use the "one" cache
            proxy_cache_valid  200 302  60m;
            proxy_cache_valid  404      1m;
            add_header X-Cache-Status $upstream_cache_status;
            proxy_ignore_headers X-Accel-Expires Expires Cache-Control;

            # Don't cache if our_auth cookie is present
            proxy_no_cache $cookie_our_auth;
            proxy_cache_bypass $cookie_our_auth;

            proxy_set_header        X-Real-IP       $remote_addr;
            proxy_set_header        Host            $host;
            proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;

            #Set cache key based on scheme, host, request uri and upstream group
            proxy_cache_key "$scheme$host$request_uri$upstream_group";
        }

        location /admin {
            set $upstream_admin upstreamServerB;
            add_header Set-Cookie "sticky_upstream=$upstream_admin;Path=/;";

            proxy_pass http://$upstream_admin;
            proxy_redirect  off;

            proxy_set_header        X-Real-IP       $remote_addr;
            proxy_set_header        Host            $host;
            proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        }

        # redirect server error pages to the static page /50x.html
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   /usr/share/nginx/html;
        }
    }
