とんちゃんといっしょ

Writing (or not writing) about cloud technology and everyday life

Looking into pumba

Continuing from the previous post, this is another Chaos Engineering investigation.

This time I looked into pumba.

github.com

Chaos testing and network emulation tool for Docker

pumba appears to be a tool for chaos testing and network emulation for Docker.

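What it can do

On a per-container basis:

  • Destroy containers (kill/rm)
    • For kill, the signal can be specified
  • Stop containers (pause/stop)
    • For pause, the pause duration can be specified
  • Network emulation
    • Delay
    • Packet loss
      • There appear to be three different packet loss models (loss, loss-state, loss-gemodel)
    • Rate limiting of traffic
    • Packet duplication
    • Packet corruption
  • Selecting target containers (by name or by regular expression match)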

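What it cannot do

  • Per-namespace configuration or blacklisting
    • If you are not careful, containers under kube-system can get killed or delayed
  • Scheduling of runs
    • Unlike chaosmonkey or kube-monkey, it either runs once or keeps running at a fixed interval
    • The interval between runs can be adjusted with options such as --interval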

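Behavior

From the command line, either periodically or as a one-off, against random or specific containers, it can:

  • Destroy or stop them
  • Emulate network failures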
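
The command lines used in the manifests later in this post all follow the same shape: global options first, then the action with its own options, then the target. As a rough sketch of that structure, the helper below assembles such an argument list. The function `pumba_args` and its parameter names are hypothetical and exist only for illustration; the flags themselves are the ones that appear in the manifests in this post.

```python
from typing import List, Optional


def pumba_args(
    action: List[str],
    target: str,
    interval: Optional[str] = None,
    random_target: bool = False,
) -> List[str]:
    """Assemble a pumba argument list: global options, action, then target.

    Hypothetical helper for illustration; it only builds the list and
    does not run pumba.
    """
    args: List[str] = []
    if random_target:
        args.append("--random")  # act on one random container among the matches
    if interval is not None:
        args += ["--interval", interval]  # repeat at this interval; omit for a one-off run
    args += action  # e.g. ["kill", "--signal", "SIGKILL"]
    args.append(target)  # a container name, or "re2:<regex>" for a regex match
    return args


# Same arguments as the kill DaemonSet in this post.
kill_args = pumba_args(
    ["kill", "--signal", "SIGKILL"],
    "re2:pumba-nginx",
    interval="3m",
    random_target=True,
)
print(kill_args)
```

Leaving out `interval` and `random_target` gives the one-off, specific-container form.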

How to run it

Note: I pushed all the manifests in through Spinnaker, so if you do not have Spinnaker, create them with kubectl.

Creating the namespace

Create one just in case.

$ kubectl create namespace pumba
namespace/pumba created

Deploying nginx

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pumba-nginx
  namespace: pumba
  labels:
    app: pumba-nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pumba-nginx
  template:
    metadata:
      labels:
        app: pumba-nginx
    spec:
      containers:
        - image: nginx
          name: nginx
          ports:
            - containerPort: 80

Deploying nginx (Service)

apiVersion: v1
kind: Service
metadata:
  labels:
    app: pumba-nginx
  name: pumba-nginx
  namespace: pumba
spec:
  ports:
    - port: 80
      protocol: TCP
  selector:
    app: pumba-nginx

Deploying pumba

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pumba
  namespace: pumba
spec:
  selector:
    matchLabels:
      app: pumba
  template:
    metadata:
      labels:
        app: pumba
        com.gaiaadm.pumba: "true" # prevent pumba from killing itself
      name: pumba
    spec:
      containers:
      - image: gaiaadm/pumba:master
        imagePullPolicy: Always
        name: pumba
        # Pumba command: modify it to suit your needs
        # Currently: randomly try to kill some container every 3 minutes
        args:
          - --random
          - --interval
          - "3m"
          - kill
          - --signal
          - "SIGKILL"
          - re2:pumba-nginx
        resources:
          requests:
            cpu: 10m
            memory: 5M
          limits:
            cpu: 100m
            memory: 20M
        volumeMounts:
          - name: dockersocket
            mountPath: /var/run/docker.sock
      volumes:
        - hostPath:
            path: /var/run/docker.sock
          name: dockersocket

Verification (kill)

$ kubectl get pods -n pumba
NAME                           READY   STATUS             RESTARTS   AGE
pumba-d92ms                    1/1     Running            0          3h
pumba-m56gr                    1/1     Running            0          3h
pumba-m7szx                    1/1     Running            0          3h
pumba-nginx-78cbbc666d-lczjr   0/1     CrashLoopBackOff   37         1d
pumba-nginx-78cbbc666d-lfglj   1/1     Running            34         1d
pumba-nginx-78cbbc666d-nb2t5   0/1     CrashLoopBackOff   47         1d

I confirmed that the RESTARTS count keeps increasing.
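
To check this without reading the table by eye, the RESTARTS column can be pulled out of the text output. The sketch below parses a snapshot like the one above; the function name `restart_counts` is hypothetical, and it assumes the plain column layout shown here, where no field contains spaces.

```python
from typing import Dict

# Snapshot copied from the `kubectl get pods` output above.
SNAPSHOT = """\
NAME                           READY   STATUS             RESTARTS   AGE
pumba-d92ms                    1/1     Running            0          3h
pumba-m56gr                    1/1     Running            0          3h
pumba-m7szx                    1/1     Running            0          3h
pumba-nginx-78cbbc666d-lczjr   0/1     CrashLoopBackOff   37         1d
pumba-nginx-78cbbc666d-lfglj   1/1     Running            34         1d
pumba-nginx-78cbbc666d-nb2t5   0/1     CrashLoopBackOff   47         1d
"""


def restart_counts(table: str) -> Dict[str, int]:
    """Map each pod name to its RESTARTS value.

    Assumes whitespace-separated columns with no spaces inside a field.
    """
    lines = table.strip().splitlines()
    column = lines[0].split().index("RESTARTS")
    counts: Dict[str, int] = {}
    for line in lines[1:]:
        fields = line.split()
        counts[fields[0]] = int(fields[column])
    return counts


counts = restart_counts(SNAPSHOT)
restarted = {name: n for name, n in counts.items() if n > 0}
print(restarted)
```

Comparing two snapshots taken a few minutes apart would show which pods were hit in between.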

Deploying alpine (ping)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pumba-ping
  namespace: pumba
  labels:
    app: pumba-ping
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pumba-ping
  template:
    metadata:
      labels:
        app: pumba-ping
    spec:
      containers:
        - image: alpine
          name: pumba-ping
          command: ["/bin/ping"]
          args:
            - "10.0.3.182" # set this to the IP of some Pod

Deploying pumba (delay)

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pumba-delay
  namespace: pumba
spec:
  selector:
    matchLabels:
      app: pumba-delay
  template:
    metadata:
      labels:
        app: pumba-delay
        com.gaiaadm.pumba: "true" # prevent pumba from killing itself
      name: pumba-delay
    spec:
      containers:
      - image: gaiaadm/pumba:master
        imagePullPolicy: Always
        name: pumba-delay
        # Pumba command: modify it to suit your needs
        # Currently: every minute, add network delay to a random matching container for 20 seconds
        args:
          - --random
          - --interval
          - "1m"
          - netem
          - --tc-image
          - "gaiadocker/iproute2"
          - --duration
          - "20s"
          - delay
          - re2:pumba-ping
        volumeMounts:
          - name: dockersocket
            mountPath: /var/run/docker.sock
      volumes:
        - hostPath:
            path: /var/run/docker.sock
          name: dockersocket

Results (delay)

64 bytes from 8.8.8.8: seq=1864 ttl=121 time=94.832 ms
64 bytes from 8.8.8.8: seq=1865 ttl=121 time=93.230 ms
64 bytes from 8.8.8.8: seq=1866 ttl=121 time=94.726 ms
64 bytes from 8.8.8.8: seq=1867 ttl=121 time=96.826 ms
64 bytes from 8.8.8.8: seq=1868 ttl=121 time=109.091 ms
64 bytes from 8.8.8.8: seq=1869 ttl=121 time=100.854 ms
64 bytes from 8.8.8.8: seq=1870 ttl=121 time=97.063 ms
64 bytes from 8.8.8.8: seq=1871 ttl=121 time=109.645 ms
64 bytes from 8.8.8.8: seq=1872 ttl=121 time=107.117 ms
64 bytes from 8.8.8.8: seq=1873 ttl=121 time=101.009 ms
64 bytes from 8.8.8.8: seq=1874 ttl=121 time=103.168 ms
64 bytes from 8.8.8.8: seq=1875 ttl=121 time=98.855 ms
64 bytes from 8.8.8.8: seq=1876 ttl=121 time=97.942 ms
64 bytes from 8.8.8.8: seq=1877 ttl=121 time=97.129 ms
64 bytes from 8.8.8.8: seq=1878 ttl=121 time=108.756 ms
64 bytes from 8.8.8.8: seq=1879 ttl=121 time=95.386 ms
64 bytes from 8.8.8.8: seq=1880 ttl=121 time=104.386 ms
64 bytes from 8.8.8.8: seq=1881 ttl=121 time=95.275 ms
64 bytes from 8.8.8.8: seq=1882 ttl=121 time=110.080 ms
64 bytes from 8.8.8.8: seq=1883 ttl=121 time=94.726 ms
64 bytes from 8.8.8.8: seq=1884 ttl=121 time=96.021 ms
64 bytes from 8.8.8.8: seq=1885 ttl=121 time=97.216 ms
64 bytes from 8.8.8.8: seq=1886 ttl=121 time=100.285 ms
64 bytes from 8.8.8.8: seq=1887 ttl=121 time=0.598 ms
64 bytes from 8.8.8.8: seq=1888 ttl=121 time=0.545 ms
64 bytes from 8.8.8.8: seq=1889 ttl=121 time=0.555 ms
64 bytes from 8.8.8.8: seq=1890 ttl=121 time=0.598 ms
64 bytes from 8.8.8.8: seq=1891 ttl=121 time=1.746 ms
64 bytes from 8.8.8.8: seq=1892 ttl=121 time=1.066 ms
64 bytes from 8.8.8.8: seq=1893 ttl=121 time=1.235 ms
64 bytes from 8.8.8.8: seq=1894 ttl=121 time=0.581 ms

I confirmed that replies were delayed (by roughly 100 ms) for about 20 seconds.
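
One way to locate the delayed window in a longer log is to parse the round-trip times and pick out the replies above a threshold. This is only a sketch: the names `parse_ping` and `slow_seqs` are hypothetical, the 50 ms threshold is an arbitrary value chosen for this output, and the sample is a short excerpt of the lines above.

```python
import re
from typing import List, Tuple

# Matches the busybox ping reply format seen in the output above.
PING_LINE = re.compile(r"seq=(\d+) ttl=\d+ time=([\d.]+) ms")

# Short excerpt from the delay results.
SAMPLE = """\
64 bytes from 8.8.8.8: seq=1884 ttl=121 time=96.021 ms
64 bytes from 8.8.8.8: seq=1885 ttl=121 time=97.216 ms
64 bytes from 8.8.8.8: seq=1886 ttl=121 time=100.285 ms
64 bytes from 8.8.8.8: seq=1887 ttl=121 time=0.598 ms
64 bytes from 8.8.8.8: seq=1888 ttl=121 time=0.545 ms
"""


def parse_ping(output: str) -> List[Tuple[int, float]]:
    """Return (sequence number, round-trip time in ms) for each reply line."""
    return [(int(m.group(1)), float(m.group(2))) for m in PING_LINE.finditer(output)]


def slow_seqs(samples: List[Tuple[int, float]], threshold_ms: float) -> List[int]:
    """Return the sequence numbers whose round-trip time is at or above the threshold."""
    return [seq for seq, rtt in samples if rtt >= threshold_ms]


samples = parse_ping(SAMPLE)
slow = slow_seqs(samples, threshold_ms=50.0)
print(f"delayed replies: {slow}")
```

Since ping sends one request per second by default, the number of delayed replies approximates the length of the delay window in seconds.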

Deploying pumba (loss)

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pumba-loss
  namespace: pumba
spec:
  selector:
    matchLabels:
      app: pumba-loss
  template:
    metadata:
      labels:
        app: pumba-loss
        com.gaiaadm.pumba: "true" # prevent pumba from killing itself
      name: pumba-loss
    spec:
      containers:
      - image: gaiaadm/pumba:master
        imagePullPolicy: Always
        name: pumba-loss
        # Pumba command: modify it to suit your needs
        # Currently: every minute, drop 100% of packets on a random matching container for 20 seconds
        args:
          - --random
          - --interval
          - "1m"
          - netem
          - --tc-image
          - "gaiadocker/iproute2"
          - --duration
          - "20s"
          - loss
          - --percent
          - "100"
          - re2:pumba-ping
        volumeMounts:
          - name: dockersocket
            mountPath: /var/run/docker.sock
      volumes:
        - hostPath:
            path: /var/run/docker.sock
          name: dockersocket

Results (loss)

64 bytes from 8.8.8.8: seq=809 ttl=121 time=0.490 ms
64 bytes from 8.8.8.8: seq=810 ttl=121 time=0.587 ms
64 bytes from 8.8.8.8: seq=811 ttl=121 time=0.509 ms
64 bytes from 8.8.8.8: seq=832 ttl=121 time=1.199 ms
64 bytes from 8.8.8.8: seq=833 ttl=121 time=0.535 ms
64 bytes from 8.8.8.8: seq=834 ttl=121 time=0.717 ms
64 bytes from 8.8.8.8: seq=835 ttl=121 time=0.552 ms
64 bytes from 8.8.8.8: seq=836 ttl=121 time=0.530 ms
64 bytes from 8.8.8.8: seq=837 ttl=121 time=0.470 ms
64 bytes from 8.8.8.8: seq=838 ttl=121 time=0.481 ms
64 bytes from 8.8.8.8: seq=839 ttl=121 time=0.466 ms
64 bytes from 8.8.8.8: seq=840 ttl=121 time=0.504 ms

I confirmed that replies disappeared for 20 seconds between seq=811 and seq=832 (seq=812 through 831 are missing).
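
Gaps like this can also be found mechanically by looking for jumps in the sequence numbers. The sketch below does that on a short excerpt of the output above; `missing_ranges` is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# Short excerpt from the loss results, around the gap.
SAMPLE = """\
64 bytes from 8.8.8.8: seq=810 ttl=121 time=0.587 ms
64 bytes from 8.8.8.8: seq=811 ttl=121 time=0.509 ms
64 bytes from 8.8.8.8: seq=832 ttl=121 time=1.199 ms
64 bytes from 8.8.8.8: seq=833 ttl=121 time=0.535 ms
"""


def missing_ranges(output: str) -> List[Tuple[int, int]]:
    """Return (first, last) for each run of sequence numbers with no reply."""
    seqs = [int(s) for s in re.findall(r"seq=(\d+)", output)]
    gaps: List[Tuple[int, int]] = []
    for prev, cur in zip(seqs, seqs[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


gaps = missing_ranges(SAMPLE)
for first, last in gaps:
    print(f"no reply for seq={first}..{last} ({last - first + 1} packets)")
```

With one ping per second, the number of missing packets corresponds roughly to the --duration value passed to netem.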

Deploying pumba (rate)

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pumba-rate
  namespace: pumba
spec:
  selector:
    matchLabels:
      app: pumba-rate
  template:
    metadata:
      labels:
        app: pumba-rate
        com.gaiaadm.pumba: "true" # prevent pumba from killing itself
      name: pumba-rate
    spec:
      containers:
      - image: gaiaadm/pumba:master
        imagePullPolicy: Always
        name: pumba-rate
        # Pumba command: modify it to suit your needs
        # Currently: every minute, limit a random matching container to 100kbit for 20 seconds
        args:
          - --random
          - --interval
          - "1m"
          - netem
          - --tc-image
          - "gaiadocker/iproute2"
          - --duration
          - "20s"
          - rate
          - --rate
          - "100kbit"
          - re2:pumba-ping
        volumeMounts:
          - name: dockersocket
            mountPath: /var/run/docker.sock
      volumes:
        - hostPath:
            path: /var/run/docker.sock
          name: dockersocket

Results (rate)

64 bytes from 8.8.8.8: seq=1163 ttl=121 time=1.178 ms
64 bytes from 8.8.8.8: seq=1164 ttl=121 time=9.096 ms
64 bytes from 8.8.8.8: seq=1165 ttl=121 time=8.444 ms
64 bytes from 8.8.8.8: seq=1166 ttl=121 time=8.358 ms
64 bytes from 8.8.8.8: seq=1167 ttl=121 time=8.377 ms
64 bytes from 8.8.8.8: seq=1168 ttl=121 time=8.477 ms
64 bytes from 8.8.8.8: seq=1169 ttl=121 time=8.507 ms
64 bytes from 8.8.8.8: seq=1170 ttl=121 time=8.390 ms
64 bytes from 8.8.8.8: seq=1171 ttl=121 time=8.377 ms
64 bytes from 8.8.8.8: seq=1172 ttl=121 time=8.396 ms
64 bytes from 8.8.8.8: seq=1173 ttl=121 time=8.590 ms
64 bytes from 8.8.8.8: seq=1174 ttl=121 time=8.429 ms
64 bytes from 8.8.8.8: seq=1175 ttl=121 time=8.570 ms
64 bytes from 8.8.8.8: seq=1176 ttl=121 time=8.303 ms
64 bytes from 8.8.8.8: seq=1177 ttl=121 time=9.068 ms
64 bytes from 8.8.8.8: seq=1178 ttl=121 time=8.332 ms
64 bytes from 8.8.8.8: seq=1179 ttl=121 time=8.336 ms
64 bytes from 8.8.8.8: seq=1180 ttl=121 time=8.512 ms
64 bytes from 8.8.8.8: seq=1181 ttl=121 time=8.393 ms
64 bytes from 8.8.8.8: seq=1182 ttl=121 time=9.980 ms
64 bytes from 8.8.8.8: seq=1183 ttl=121 time=9.001 ms
64 bytes from 8.8.8.8: seq=1184 ttl=121 time=0.991 ms
64 bytes from 8.8.8.8: seq=1185 ttl=121 time=0.758 ms
64 bytes from 8.8.8.8: seq=1186 ttl=121 time=3.291 ms
64 bytes from 8.8.8.8: seq=1187 ttl=121 time=1.169 ms
64 bytes from 8.8.8.8: seq=1188 ttl=121 time=0.613 ms
64 bytes from 8.8.8.8: seq=1189 ttl=121 time=0.597 ms
64 bytes from 8.8.8.8: seq=1190 ttl=121 time=0.772 ms

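I confirmed that replies were delayed for about 20 seconds starting around seq=1164, with the round-trip time rising from about 1 ms to roughly 8 to 9 ms.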
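
The size of the increase looks plausible for a 100kbit limit: at that rate, just putting one ping packet on the wire takes several milliseconds. The back-of-the-envelope calculation below is a sketch under assumptions that are mine, not measured: an IPv4 header without options (20 bytes), an Ethernet header (14 bytes) that may or may not be counted by the rate limit, and the limit applying only to the outgoing request.

```python
def serialization_delay_ms(frame_bytes: int, rate_bits_per_s: int) -> float:
    """Time in milliseconds to transmit frame_bytes at the given bit rate."""
    return frame_bytes * 8 / rate_bits_per_s * 1000


ICMP_BYTES = 64        # the "64 bytes" in the ping output (ICMP header + payload)
IPV4_HEADER = 20       # assumption: IPv4 header without options
ETHERNET_HEADER = 14   # assumption: link-layer header is counted by the rate limit

RATE_BITS_PER_S = 100_000  # "100kbit" read as 100,000 bits per second

ip_packet = ICMP_BYTES + IPV4_HEADER
ethernet_frame = ip_packet + ETHERNET_HEADER

delay_ip = serialization_delay_ms(ip_packet, RATE_BITS_PER_S)
delay_frame = serialization_delay_ms(ethernet_frame, RATE_BITS_PER_S)

print(f"IP packet only ({ip_packet} bytes): {delay_ip:.2f} ms")
print(f"with Ethernet header ({ethernet_frame} bytes): {delay_frame:.2f} ms")
```

Both estimates fall in the 6 to 8 ms range, which is in the same ballpark as the observed jump from under 1 ms to around 8.4 ms.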

Deploying pumba (corrupt)

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pumba-corrupt
  namespace: pumba
spec:
  selector:
    matchLabels:
      app: pumba-corrupt
  template:
    metadata:
      labels:
        app: pumba-corrupt
        com.gaiaadm.pumba: "true" # prevent pumba from killing itself
      name: pumba-corrupt
    spec:
      containers:
      - image: gaiaadm/pumba:master
        imagePullPolicy: Always
        name: pumba-corrupt
        # Pumba command: modify it to suit your needs
        # Currently: every minute, corrupt 100% of packets on a random matching container for 20 seconds
        args:
          - --random
          - --interval
          - "1m"
          - netem
          - --tc-image
          - "gaiadocker/iproute2"
          - --duration
          - "20s"
          - corrupt
          - --percent
          - "100"
          - re2:pumba-ping
        volumeMounts:
          - name: dockersocket
            mountPath: /var/run/docker.sock
      volumes:
        - hostPath:
            path: /var/run/docker.sock
          name: dockersocket

Results (corrupt)

64 bytes from 10.0.3.182: seq=333 ttl=62 time=0.540 ms
64 bytes from 10.0.3.182: seq=334 ttl=62 time=0.372 ms
64 bytes from 10.0.3.182: seq=335 ttl=62 time=0.272 ms
64 bytes from 10.0.3.182: seq=336 ttl=62 time=0.300 ms
64 bytes from 10.0.3.182: seq=337 ttl=62 time=6.558 ms
64 bytes from 10.0.3.182: seq=357 ttl=62 time=1.927 ms
64 bytes from 10.0.3.182: seq=358 ttl=62 time=0.221 ms
64 bytes from 10.0.3.182: seq=359 ttl=62 time=0.360 ms
64 bytes from 10.0.3.182: seq=360 ttl=62 time=0.376 ms
64 bytes from 10.0.3.182: seq=361 ttl=62 time=0.238 ms
64 bytes from 10.0.3.182: seq=362 ttl=62 time=1.223 ms
64 bytes from 10.0.3.182: seq=363 ttl=62 time=0.519 ms

You can see that replies stop between seq=337 and seq=357 (seq=338 through 356 are missing), a span of about 20 sequence numbers.

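Skipped

  • duplicate: skipped this time, since it does not seem meaningful to test with ping
  • loss-state: skipped this time, since it looks much the same as loss
  • loss-gemodel: same as above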

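Side notes

  • Unlike kube-monkey, pumba takes down the container itself, so the Pod treats it as an Error and the Restarts count keeps increasing
  • /var/run/docker.sock is not a CRI socket but the socket for Docker clients, so pumba may be Docker-dependent?
  • If the container targeted by pumba's chaos test does not have tc, an error occurs (in that case you need to either install iproute2 in the target container or pass --tc-image gaiadocker/iproute2)
  • It is doubtful whether this works with any tag other than gaiaadm/pumba:master...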