Traffic Shadowing With Istio: Reducing the Risk of Code Release

We’ve been talking about Istio and service mesh recently (follow along @christianposta for the latest), but one aspect of Istio is easy to gloss over. One of the most important aspects of Istio is its ability to control the routing of traffic between services. With this fine-grained control of application-level traffic, we can do interesting things for resilience, like routing around failures or routing to different availability zones when necessary. IMHO, more importantly, we can also control the flow of traffic for our deployments so we can reduce the risk of change to the system.

With a services architecture, our goal is to increase our ability to go faster, so we do things like implement microservices, automated testing pipelines, CI/CD, etc. But what good is any of this if we have bottlenecks getting our code changes into production? Production is where we understand whether our changes have any positive impact on our KPIs, so we should reduce the bottlenecks of getting code into production.

At the typical enterprise customers that I visit regularly (financial services, insurance, retail, energy, etc.), risk is a big part of the equation. Risk is used as a reason to block changes to production. A big part of this risk is that a code “deployment” is all or nothing in these environments. What I mean is there is no separation of deployment and release. This is a hugely important distinction.

Deployment vs Release

A deployment brings new code to production but it takes no production traffic. Once in the production environment, service teams are free to run smoke tests, integration tests, etc without impacting any users. A service team should feel free to deploy as frequently as it wishes.

A release brings live traffic to a deployment but may require signoff from “the business stakeholders”. Ideally, bringing traffic to a deployment can be done in a controlled manner to reduce risk. For example, we may want to bring internal-user traffic to the deployment first. Or we may want to bring a small fraction, say 1%, of traffic to the deployment. If any of these release rollout strategies (internal, non-paying, 1% traffic, etc.) exhibit undesirable behavior (thus the need for strong observability), then we can roll back.

Please go read the two-part series titled “Deploy != Release” from the good folks at Turbine.io labs for a deeper treatment of this topic.

Dark traffic

One strategy we can use to reduce risk for our releases, before we even expose them to any type of user, is to shadow live traffic to our deployment. With traffic shadowing, we can take a fraction of live traffic, send a copy of it to our new deployment, and observe how it behaves. We can do things like test for exceptions, performance, and result parity. Projects such as Twitter Diffy can be used to do comparisons between different released versions and unreleased versions.
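To make “result parity” concrete, here is a crude sketch of the idea in shell. It is not how Diffy works; it only assumes you have saved one response body from each version into a file (for example, by calling each version directly in a test environment). The function names and the VOLATILE pattern are hypothetical and chosen for illustration.

```shell
# Crude result-parity check: compare two saved response bodies while
# ignoring lines that are expected to differ from request to request.
# VOLATILE is an assumption; adjust it to the fields your service varies.
VOLATILE='X-B3-|X-Ot-Span-Context|"Host"'

strip_volatile() {
  # Print the file without the volatile lines.
  grep -Ev "$VOLATILE" "$1" || true
}

responses_match() {
  # Exit status 0 when both files agree after filtering, 1 when they
  # differ, 2 when either file cannot be read.
  [ -r "$1" ] && [ -r "$2" ] || return 2
  a=$(strip_volatile "$1")
  b=$(strip_volatile "$2")
  [ "$a" = "$b" ]
}

# Example (hypothetical file names):
#   responses_match v1-response.json v2-response.json && echo "parity OK"
```

Filtering out trace headers first matters because every request gets fresh tracing IDs, so a naive diff would always report a mismatch.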

With Istio, we can do this kind of traffic control by mirroring traffic from one service to another. Let’s take a look at an example.

Traffic Mirroring with Istio

With the Istio 0.5.0 release we have the ability to mirror traffic from one service to another, or from one version to a newer version.

We’ll start by creating two deployments of an httpbin service.

$  cat httpbin-v1.yaml
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
spec:
  ports:
  - name: http
    port: 8080
  selector:
    app: httpbin
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: httpbin-v1
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        command: ["gunicorn", "--access-logfile", "-", "-b", "0.0.0.0:8080", "httpbin:app"]
        ports:
        - containerPort: 8080

We’ll inject the Istio sidecar with kube-inject like this:

$  kubectl create -f <(istioctl kube-inject -f httpbin-v1.yaml)

Version 2 of the httpbin service is similar except it has labels that denote that it’s version 2:

$  cat httpbin-v2.yaml
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: httpbin-v2
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: httpbin
        version: v2
    spec:
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        command: ["gunicorn", "--access-logfile", "-", "-b", "0.0.0.0:8080", "httpbin:app"]
        ports:
        - containerPort: 8080

Let’s deploy httpbin-v2 also:

$  kubectl create -f <(istioctl kube-inject -f httpbin-v2.yaml)

Lastly, let’s deploy the sleep demo from Istio samples so we can easily call into our httpbin service:

$  kubectl create -f <(istioctl kube-inject -f sleep.yaml)

You should see three pods like this:

$  kubectl get pod
NAME                          READY     STATUS    RESTARTS   AGE
httpbin-v1-2113278084-98whj   2/2       Running   0          1d
httpbin-v2-2839546783-2dvhq   2/2       Running   0          1d
sleep-1512692991-txrfn        2/2       Running   0          1d

If we start sending traffic to the httpbin service, we’ll see the default Kubernetes behavior, which is to load balance across both v1 and v2, since both pods match the selector for the httpbin Kubernetes Service. Let’s take a look at a default Istio route rule that routes all traffic to v1 of our service:

apiVersion: config.istio.io/v1alpha2
kind: RouteRule
metadata:
  name: httpbin-default-v1
spec:
  destination:
    name: httpbin
  precedence: 5
  route:
  - labels:
      version: v1

Let’s create this routerule:

$  istioctl create -f routerules/all-httpbin-v1.yaml

If we start sending traffic into our httpbin service, we should only see traffic for the httpbin-v1 deployment:

$  export SLEEP_POD=$(kubectl get pod -l app=sleep -o jsonpath={.items..metadata.name})
$  kubectl exec -it $SLEEP_POD -c sleep -- sh -c 'curl  http://httpbin:8080/headers'
{
  "headers": {
    "Accept": "*/*",
    "Content-Length": "0",
    "Host": "httpbin:8080",
    "User-Agent": "curl/7.35.0",
    "X-B3-Sampled": "1",
    "X-B3-Spanid": "eca3d7ed8f2e6a0a",
    "X-B3-Traceid": "eca3d7ed8f2e6a0a",
    "X-Ot-Span-Context": "eca3d7ed8f2e6a0a;eca3d7ed8f2e6a0a;0000000000000000"
  }
}

If we check the access logs for the httpbin-v1 service, we should see a single access-log statement:

$  kubectl logs -f httpbin-v1-2113278084-98whj -c httpbin
127.0.0.1 - - [07/Feb/2018:00:07:39 +0000] "GET /headers HTTP/1.1" 200 349 "-" "curl/7.35.0"

If we check the logs for the httpbin-v2 service, we should see NO access log statements.
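If you would rather get a number than eyeball the log output, a small helper along these lines can count the matching access-log lines per pod. This is a sketch: count_requests is a hypothetical name, and it assumes the httpbin container logs in the access-log format shown above.

```shell
# Count access-log lines for a given request path in a pod's httpbin container.
# Usage: count_requests <pod-name> <path>
count_requests() {
  # grep -c prints the number of matching lines; "|| true" keeps the
  # function from failing when that number is 0.
  kubectl logs "$1" -c httpbin | grep -c "\"GET $2 " || true
}

# Example, using the pod names from the listing above:
#   count_requests httpbin-v1-2113278084-98whj /headers   # 1 after one request
#   count_requests httpbin-v2-2839546783-2dvhq /headers   # 0 while all traffic goes to v1
```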

Let’s mirror traffic from v1 to v2. Here’s the Istio route rule we’ll use:

apiVersion: config.istio.io/v1alpha2
kind: RouteRule
metadata:
  name: httpbin-mirror-v2
spec:
  destination:
    name: httpbin
  precedence: 11
  route:
  - labels:
      version: v1
    weight: 100
  - labels: 
      version: v2
    weight: 0
  mirror:
    name: httpbin
    labels:
      version: v2

A few things to note:

  • We are explicitly telling Istio to weight the traffic between v1 (100%) and v2 (0%), so v2 serves no live requests
  • We are using labels to specify which version of the httpbin service we want to mirror to

Let’s create this routerule:

$  istioctl create -f routerules/mirror/mirror-traffic-to-httbin-v2.yaml

We should see routerules like this:

$  istioctl get routerules
NAME                    KIND                                    NAMESPACE
httpbin-default-v1      RouteRule.v1alpha2.config.istio.io      tutorial
httpbin-mirror-v2       RouteRule.v1alpha2.config.istio.io      tutorial

Now if we start sending traffic in, we should see requests go to v1 and requests shadowed to v2.
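One way to check this from the command line is to send a handful of requests from the sleep pod and then count the /headers entries in each version’s access log. The sketch below assumes the labels and container names used in the manifests above; pod_for, hits, and check_mirroring are hypothetical helper names.

```shell
# Print the name of the first pod matching a label selector.
pod_for() {
  kubectl get pod -l "$1" -o jsonpath='{.items[0].metadata.name}'
}

# Count /headers requests in a pod's httpbin access log.
hits() {
  kubectl logs "$1" -c httpbin | grep -c '"GET /headers ' || true
}

# Send N requests (default 5) through the sleep pod, then report the
# number of /headers requests logged by v1 and by v2.
check_mirroring() {
  n=${1:-5}
  sleep_pod=$(pod_for app=sleep)
  v1_pod=$(pod_for app=httpbin,version=v1)
  v2_pod=$(pod_for app=httpbin,version=v2)
  i=0
  while [ "$i" -lt "$n" ]; do
    kubectl exec "$sleep_pod" -c sleep -- \
      curl -s -o /dev/null http://httpbin:8080/headers
    i=$((i + 1))
  done
  echo "v1=$(hits "$v1_pod") v2=$(hits "$v2_pod")"
}

# Example:
#   check_mirroring 10
```

With the mirror rule in place, both counts should grow together. Mirrored requests are fire-and-forget, so the caller only ever sees v1’s response, and the v2 count can lag by a moment.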

Video demo

Here’s a video showing this:

Istio Mirroring Demo from Christian Posta on Vimeo.

Please see the official Istio docs for more details!

Published on Java Code Geeks with permission by Christian Posta, partner at our JCG program. See the original article here: Traffic Shadowing With Istio: Reducing the Risk of Code Release

Opinions expressed by Java Code Geeks contributors are their own.

Christian Posta

Christian is a Principal Consultant at FuseSource specializing in developing enterprise software applications with an emphasis on software integration and messaging. His strengths include helping clients build software using industry best practices, Test Driven Design, ActiveMQ, Apache Camel, ServiceMix, Spring Framework, and most importantly, modeling complex domains so that they can be realized in software. He works primarily using Java and its many frameworks, but his favorite programming language is Python. He's in the midst of learning Scala and hopes to contribute to the Apache Apollo project.