Continuing from the previous post, this is more research into Chaos Engineering tools.
This time I looked into pumba.
Chaos testing and network emulation tool for Docker
pumba appears to be a tool for chaos testing and network emulation on Docker.
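What it can do
Per container:
- Break containers (kill/rm)
  - For kill, the signal can be specified
- Halt containers (pause/stop)
  - For pause, the pause duration can be specified
- Network emulation
  - Delay
  - Packet loss
    - There appear to be three different packet loss models (loss, loss-state, loss-gemodel)
  - Rate limiting of traffic
  - Packet duplication
  - Packet corruption
- Selecting the target containers (by name or by regular expression match)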
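What it cannot do
- Per-namespace configuration or blacklisting
  - If you are not careful, containers under kube-system get killed or delayed
- Scheduling of actions
  - Unlike chaosmonkey or kube-monkey, it either runs once or keeps running at a fixed interval
  - The interval between runs can be adjusted with options such as interval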
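Behavior
From the command line, either periodically or as a one-off, against random or specific containers, it:
- destroys or stops them
- emulates network failures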
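The target selection described above (by name or regular expression, optionally picking one match at random) can be sketched as follows. This is only an illustration of the idea, not pumba's actual implementation: the function name `select_targets` is hypothetical, Python's `re` module stands in for RE2, and the unanchored match is an assumption. The names in the list are the Pod names from this post, used as stand-ins for container names.

```python
import random
import re


def select_targets(names, pattern, pick_random=False, rng=random):
    """Return the names selected by `pattern` (hypothetical helper).

    A pattern prefixed with "re2:" is treated as a regular expression
    (unanchored search, assumed); any other pattern is an exact name.
    With pick_random=True, a single match is chosen at random.
    """
    if pattern.startswith("re2:"):
        regex = re.compile(pattern[len("re2:"):])
        matched = [n for n in names if regex.search(n)]
    else:
        matched = [n for n in names if n == pattern]
    if pick_random and matched:
        return [rng.choice(matched)]
    return matched


names = [
    "pumba-d92ms",
    "pumba-nginx-78cbbc666d-lczjr",
    "pumba-nginx-78cbbc666d-lfglj",
]
# All names containing "pumba-nginx" are candidates.
print(select_targets(names, "re2:pumba-nginx"))
# With pick_random=True, only one candidate is acted on per run.
print(select_targets(names, "re2:pumba-nginx", pick_random=True))
```

Note that a broad pattern such as re2:pumba would also match the pumba containers themselves, which is presumably why the manifests carry a label to keep pumba from killing itself.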
How to run it
Note: I pushed all the manifests in through Spinnaker, so if you do not have Spinnaker, create them with kubectl instead.
Creating the namespace
Creating one just in case.
$ kubectl create namespace pumba
namespace/pumba created
Deploying nginx
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pumba-nginx
  namespace: pumba
  labels:
    app: pumba-nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pumba-nginx
  template:
    metadata:
      labels:
        app: pumba-nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        ports:
        - containerPort: 80
Deploying nginx (service)
apiVersion: v1
kind: Service
metadata:
  labels:
    app: pumba-nginx
  name: pumba-nginx
  namespace: pumba
spec:
  ports:
  - port: 80
    protocol: TCP
  selector:
    app: pumba-nginx
Deploying pumba
- manifest.yml (reference: pumba/pumba_kube.yml at master · alexei-led/pumba · GitHub)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pumba
  namespace: pumba
spec:
  selector:
    matchLabels:
      app: pumba
  template:
    metadata:
      labels:
        app: pumba
        com.gaiaadm.pumba: "true" # prevent pumba from killing itself
      name: pumba
    spec:
      containers:
      - image: gaiaadm/pumba:master
        imagePullPolicy: Always
        name: pumba
        # Pumba command: modify it to suit your needs
        # Currently: randomly try to kill some container every 3 minutes
        args:
        - --random
        - --interval
        - "3m"
        - kill
        - --signal
        - "SIGKILL"
        - re2:pumba-nginx
        resources:
          requests:
            cpu: 10m
            memory: 5M
          limits:
            cpu: 100m
            memory: 20M
        volumeMounts:
        - name: dockersocket
          mountPath: /var/run/docker.sock
      volumes:
      - hostPath:
          path: /var/run/docker.sock
        name: dockersocket
Verification (kill)
$ kubectl get pods -n pumba
NAME                           READY   STATUS             RESTARTS   AGE
pumba-d92ms                    1/1     Running            0          3h
pumba-m56gr                    1/1     Running            0          3h
pumba-m7szx                    1/1     Running            0          3h
pumba-nginx-78cbbc666d-lczjr   0/1     CrashLoopBackOff   37         1d
pumba-nginx-78cbbc666d-lfglj   1/1     Running            34         1d
pumba-nginx-78cbbc666d-nb2t5   0/1     CrashLoopBackOff   47         1d
I confirmed that the RESTARTS count of the nginx Pods keeps increasing.
Deploying alpine (ping)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pumba-ping
  namespace: pumba
  labels:
    app: pumba-ping
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pumba-ping
  template:
    metadata:
      labels:
        app: pumba-ping
    spec:
      containers:
      - image: alpine
        name: pumba-ping
        command: ["/bin/ping"]
        args:
        - "10.0.3.182" # specify the IP of any suitable Pod
Deploying pumba (delay)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pumba-delay
  namespace: pumba
spec:
  selector:
    matchLabels:
      app: pumba-delay
  template:
    metadata:
      labels:
        app: pumba-delay
        com.gaiaadm.pumba: "true" # prevent pumba from killing itself
      name: pumba-delay
    spec:
      containers:
      - image: gaiaadm/pumba:master
        imagePullPolicy: Always
        name: pumba-delay
        # Pumba command: modify it to suit your needs
        # Currently: every 1 minute, apply netem delay to a random matching container for 20 seconds
        args:
        - --random
        - --interval
        - "1m"
        - netem
        - --tc-image
        - "gaiadocker/iproute2"
        - --duration
        - "20s"
        - delay
        - re2:pumba-ping
        volumeMounts:
        - name: dockersocket
          mountPath: /var/run/docker.sock
      volumes:
      - hostPath:
          path: /var/run/docker.sock
        name: dockersocket
Result (delay)
64 bytes from 8.8.8.8: seq=1864 ttl=121 time=94.832 ms
64 bytes from 8.8.8.8: seq=1865 ttl=121 time=93.230 ms
64 bytes from 8.8.8.8: seq=1866 ttl=121 time=94.726 ms
64 bytes from 8.8.8.8: seq=1867 ttl=121 time=96.826 ms
64 bytes from 8.8.8.8: seq=1868 ttl=121 time=109.091 ms
64 bytes from 8.8.8.8: seq=1869 ttl=121 time=100.854 ms
64 bytes from 8.8.8.8: seq=1870 ttl=121 time=97.063 ms
64 bytes from 8.8.8.8: seq=1871 ttl=121 time=109.645 ms
64 bytes from 8.8.8.8: seq=1872 ttl=121 time=107.117 ms
64 bytes from 8.8.8.8: seq=1873 ttl=121 time=101.009 ms
64 bytes from 8.8.8.8: seq=1874 ttl=121 time=103.168 ms
64 bytes from 8.8.8.8: seq=1875 ttl=121 time=98.855 ms
64 bytes from 8.8.8.8: seq=1876 ttl=121 time=97.942 ms
64 bytes from 8.8.8.8: seq=1877 ttl=121 time=97.129 ms
64 bytes from 8.8.8.8: seq=1878 ttl=121 time=108.756 ms
64 bytes from 8.8.8.8: seq=1879 ttl=121 time=95.386 ms
64 bytes from 8.8.8.8: seq=1880 ttl=121 time=104.386 ms
64 bytes from 8.8.8.8: seq=1881 ttl=121 time=95.275 ms
64 bytes from 8.8.8.8: seq=1882 ttl=121 time=110.080 ms
64 bytes from 8.8.8.8: seq=1883 ttl=121 time=94.726 ms
64 bytes from 8.8.8.8: seq=1884 ttl=121 time=96.021 ms
64 bytes from 8.8.8.8: seq=1885 ttl=121 time=97.216 ms
64 bytes from 8.8.8.8: seq=1886 ttl=121 time=100.285 ms
64 bytes from 8.8.8.8: seq=1887 ttl=121 time=0.598 ms
64 bytes from 8.8.8.8: seq=1888 ttl=121 time=0.545 ms
64 bytes from 8.8.8.8: seq=1889 ttl=121 time=0.555 ms
64 bytes from 8.8.8.8: seq=1890 ttl=121 time=0.598 ms
64 bytes from 8.8.8.8: seq=1891 ttl=121 time=1.746 ms
64 bytes from 8.8.8.8: seq=1892 ttl=121 time=1.066 ms
64 bytes from 8.8.8.8: seq=1893 ttl=121 time=1.235 ms
64 bytes from 8.8.8.8: seq=1894 ttl=121 time=0.581 ms
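I confirmed that the delay is applied for roughly 20 seconds: the RTT sits at about 93 to 110 ms while the delay is active, then drops back to about 0.5 to 1.7 ms.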
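Rather than eyeballing the log, the delayed and normal samples can be separated programmatically. The following is a minimal sketch that assumes the busybox-style ping line format shown above. The 50 ms threshold and the function names are arbitrary choices for illustration, and only four lines of the output are used as sample input.

```python
import re
from statistics import mean

# Matches e.g. "64 bytes from 8.8.8.8: seq=1886 ttl=121 time=100.285 ms"
PING_LINE = re.compile(r"seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def parse_rtts(text):
    """Extract (sequence number, RTT in ms) pairs from ping output."""
    return [(int(seq), float(rtt)) for seq, rtt in PING_LINE.findall(text)]


def split_by_threshold(samples, threshold_ms=50.0):
    """Split RTTs into those at or above the threshold and those below it."""
    delayed = [rtt for _, rtt in samples if rtt >= threshold_ms]
    normal = [rtt for _, rtt in samples if rtt < threshold_ms]
    return delayed, normal


sample = """\
64 bytes from 8.8.8.8: seq=1885 ttl=121 time=97.216 ms
64 bytes from 8.8.8.8: seq=1886 ttl=121 time=100.285 ms
64 bytes from 8.8.8.8: seq=1887 ttl=121 time=0.598 ms
64 bytes from 8.8.8.8: seq=1888 ttl=121 time=0.545 ms
"""

delayed, normal = split_by_threshold(parse_rtts(sample))
print(f"delayed: n={len(delayed)} mean={mean(delayed):.1f} ms")
print(f"normal:  n={len(normal)} mean={mean(normal):.3f} ms")
```

Since ping sends one request per second by default, the number of delayed samples roughly corresponds to the number of seconds the delay was active.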
Deploying pumba (loss)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pumba-loss
  namespace: pumba
spec:
  selector:
    matchLabels:
      app: pumba-loss
  template:
    metadata:
      labels:
        app: pumba-loss
        com.gaiaadm.pumba: "true" # prevent pumba from killing itself
      name: pumba-loss
    spec:
      containers:
      - image: gaiaadm/pumba:master
        imagePullPolicy: Always
        name: pumba-loss
        # Pumba command: modify it to suit your needs
        # Currently: every 1 minute, drop 100% of packets on a random matching container for 20 seconds
        args:
        - --random
        - --interval
        - "1m"
        - netem
        - --tc-image
        - "gaiadocker/iproute2"
        - --duration
        - "20s"
        - loss
        - --percent
        - "100"
        - re2:pumba-ping
        volumeMounts:
        - name: dockersocket
          mountPath: /var/run/docker.sock
      volumes:
      - hostPath:
          path: /var/run/docker.sock
        name: dockersocket
Result (loss)
64 bytes from 8.8.8.8: seq=809 ttl=121 time=0.490 ms
64 bytes from 8.8.8.8: seq=810 ttl=121 time=0.587 ms
64 bytes from 8.8.8.8: seq=811 ttl=121 time=0.509 ms
64 bytes from 8.8.8.8: seq=832 ttl=121 time=1.199 ms
64 bytes from 8.8.8.8: seq=833 ttl=121 time=0.535 ms
64 bytes from 8.8.8.8: seq=834 ttl=121 time=0.717 ms
64 bytes from 8.8.8.8: seq=835 ttl=121 time=0.552 ms
64 bytes from 8.8.8.8: seq=836 ttl=121 time=0.530 ms
64 bytes from 8.8.8.8: seq=837 ttl=121 time=0.470 ms
64 bytes from 8.8.8.8: seq=838 ttl=121 time=0.481 ms
64 bytes from 8.8.8.8: seq=839 ttl=121 time=0.466 ms
64 bytes from 8.8.8.8: seq=840 ttl=121 time=0.504 ms
I confirmed that the replies between seq=811 and seq=832 are missing, which is about 20 seconds' worth.
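The missing range can also be found mechanically by looking for jumps in the seq numbers. This is a minimal sketch under the assumption that the log contains one `seq=N` field per received reply, as in the output above. The function name `find_seq_gaps` is just for illustration, and the sample input is abbreviated to four of the lines.

```python
import re


def find_seq_gaps(ping_output):
    """Return (last_seen, next_seen, missing_count) for each jump in seq."""
    seqs = [int(s) for s in re.findall(r"seq=(\d+)", ping_output)]
    gaps = []
    for prev, cur in zip(seqs, seqs[1:]):
        if cur - prev > 1:
            # Everything strictly between prev and cur never got a reply.
            gaps.append((prev, cur, cur - prev - 1))
    return gaps


sample = """\
64 bytes from 8.8.8.8: seq=810 ttl=121 time=0.587 ms
64 bytes from 8.8.8.8: seq=811 ttl=121 time=0.509 ms
64 bytes from 8.8.8.8: seq=832 ttl=121 time=1.199 ms
64 bytes from 8.8.8.8: seq=833 ttl=121 time=0.535 ms
"""

print(find_seq_gaps(sample))  # [(811, 832, 20)]
```

With ping's default interval of one request per second, 20 missing replies line up with the --duration of 20s in the manifest.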
Deploying pumba (rate)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pumba-rate
  namespace: pumba
spec:
  selector:
    matchLabels:
      app: pumba-rate
  template:
    metadata:
      labels:
        app: pumba-rate
        com.gaiaadm.pumba: "true" # prevent pumba from killing itself
      name: pumba-rate
    spec:
      containers:
      - image: gaiaadm/pumba:master
        imagePullPolicy: Always
        name: pumba-rate
        # Pumba command: modify it to suit your needs
        # Currently: every 1 minute, limit a random matching container to 100kbit for 20 seconds
        args:
        - --random
        - --interval
        - "1m"
        - netem
        - --tc-image
        - "gaiadocker/iproute2"
        - --duration
        - "20s"
        - rate
        - --rate
        - "100kbit"
        - re2:pumba-ping
        volumeMounts:
        - name: dockersocket
          mountPath: /var/run/docker.sock
      volumes:
      - hostPath:
          path: /var/run/docker.sock
        name: dockersocket
Result (rate)
64 bytes from 8.8.8.8: seq=1163 ttl=121 time=1.178 ms
64 bytes from 8.8.8.8: seq=1164 ttl=121 time=9.096 ms
64 bytes from 8.8.8.8: seq=1165 ttl=121 time=8.444 ms
64 bytes from 8.8.8.8: seq=1166 ttl=121 time=8.358 ms
64 bytes from 8.8.8.8: seq=1167 ttl=121 time=8.377 ms
64 bytes from 8.8.8.8: seq=1168 ttl=121 time=8.477 ms
64 bytes from 8.8.8.8: seq=1169 ttl=121 time=8.507 ms
64 bytes from 8.8.8.8: seq=1170 ttl=121 time=8.390 ms
64 bytes from 8.8.8.8: seq=1171 ttl=121 time=8.377 ms
64 bytes from 8.8.8.8: seq=1172 ttl=121 time=8.396 ms
64 bytes from 8.8.8.8: seq=1173 ttl=121 time=8.590 ms
64 bytes from 8.8.8.8: seq=1174 ttl=121 time=8.429 ms
64 bytes from 8.8.8.8: seq=1175 ttl=121 time=8.570 ms
64 bytes from 8.8.8.8: seq=1176 ttl=121 time=8.303 ms
64 bytes from 8.8.8.8: seq=1177 ttl=121 time=9.068 ms
64 bytes from 8.8.8.8: seq=1178 ttl=121 time=8.332 ms
64 bytes from 8.8.8.8: seq=1179 ttl=121 time=8.336 ms
64 bytes from 8.8.8.8: seq=1180 ttl=121 time=8.512 ms
64 bytes from 8.8.8.8: seq=1181 ttl=121 time=8.393 ms
64 bytes from 8.8.8.8: seq=1182 ttl=121 time=9.980 ms
64 bytes from 8.8.8.8: seq=1183 ttl=121 time=9.001 ms
64 bytes from 8.8.8.8: seq=1184 ttl=121 time=0.991 ms
64 bytes from 8.8.8.8: seq=1185 ttl=121 time=0.758 ms
64 bytes from 8.8.8.8: seq=1186 ttl=121 time=3.291 ms
64 bytes from 8.8.8.8: seq=1187 ttl=121 time=1.169 ms
64 bytes from 8.8.8.8: seq=1188 ttl=121 time=0.613 ms
64 bytes from 8.8.8.8: seq=1189 ttl=121 time=0.597 ms
64 bytes from 8.8.8.8: seq=1190 ttl=121 time=0.772 ms
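I confirmed that from seq=1164 the RTT rises to about 8 to 10 ms for roughly 20 seconds (through seq=1183) before returning to normal.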
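The size of the added latency is roughly what you would expect from the time it takes to push one ping packet through a 100kbit link. The arithmetic below is a back-of-the-envelope sketch under several assumptions: a 98-byte frame (64 bytes of ICMP as reported by ping, plus a 20-byte IPv4 header and a 14-byte Ethernet header), the rate limit applying only to packets leaving the container, and "kbit" meaning 1000 bits per second as in tc. Exactly which bytes netem counts toward the rate is an assumption here.

```python
def serialization_delay_ms(frame_bytes, rate_bits_per_s):
    """Time in ms needed to transmit one frame at the given bit rate."""
    return frame_bytes * 8 / rate_bits_per_s * 1000


ICMP_BYTES = 64    # "64 bytes from ..." in the ping output
IPV4_HEADER = 20   # assumed: IPv4 header without options
ETH_HEADER = 14    # assumed: link-layer header is counted toward the rate

frame_bytes = ICMP_BYTES + IPV4_HEADER + ETH_HEADER
delay_ms = serialization_delay_ms(frame_bytes, 100_000)  # 100kbit
print(f"{frame_bytes} bytes at 100kbit: about {delay_ms:.2f} ms per packet")
```

This gives a little under 8 ms per packet, which is in line with the RTT going from around 0.6 ms to around 8.4 ms in the output above.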
Deploying pumba (corrupt)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pumba-corrupt
  namespace: pumba
spec:
  selector:
    matchLabels:
      app: pumba-corrupt
  template:
    metadata:
      labels:
        app: pumba-corrupt
        com.gaiaadm.pumba: "true" # prevent pumba from killing itself
      name: pumba-corrupt
    spec:
      containers:
      - image: gaiaadm/pumba:master
        imagePullPolicy: Always
        name: pumba-corrupt
        # Pumba command: modify it to suit your needs
        # Currently: every 1 minute, corrupt 100% of packets on a random matching container for 20 seconds
        args:
        - --random
        - --interval
        - "1m"
        - netem
        - --tc-image
        - "gaiadocker/iproute2"
        - --duration
        - "20s"
        - corrupt
        - --percent
        - "100"
        - re2:pumba-ping
        volumeMounts:
        - name: dockersocket
          mountPath: /var/run/docker.sock
      volumes:
      - hostPath:
          path: /var/run/docker.sock
        name: dockersocket
Result (corrupt)
64 bytes from 10.0.3.182: seq=333 ttl=62 time=0.540 ms
64 bytes from 10.0.3.182: seq=334 ttl=62 time=0.372 ms
64 bytes from 10.0.3.182: seq=335 ttl=62 time=0.272 ms
64 bytes from 10.0.3.182: seq=336 ttl=62 time=0.300 ms
64 bytes from 10.0.3.182: seq=337 ttl=62 time=6.558 ms
64 bytes from 10.0.3.182: seq=357 ttl=62 time=1.927 ms
64 bytes from 10.0.3.182: seq=358 ttl=62 time=0.221 ms
64 bytes from 10.0.3.182: seq=359 ttl=62 time=0.360 ms
64 bytes from 10.0.3.182: seq=360 ttl=62 time=0.376 ms
64 bytes from 10.0.3.182: seq=361 ttl=62 time=0.238 ms
64 bytes from 10.0.3.182: seq=362 ttl=62 time=1.223 ms
64 bytes from 10.0.3.182: seq=363 ttl=62 time=0.519 ms
We can see that no replies arrive between seq=337 and seq=357, a gap of 19 sequence numbers or roughly 20 seconds.
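Excluded
- duplicate: it does not seem meaningful with ping, so I skipped it this time
- loss-state: it looks much the same as loss, so I skipped it this time
- loss-gemodel: same as above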
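Side notes
- Unlike kube-monkey, pumba takes down containers rather than Pods, so the Pod treats it as an Error and the Restarts count keeps increasing
- /var/run/docker.sock is not a CRI socket but the socket for the docker client, so pumba may be Docker-dependent?
- If the container targeted by the chaos test does not have tc, pumba fails with an error (in that case you need to either install iproute2 in the target container or pass --tc-image gaiadocker/iproute2)
- I am doubtful whether it works with any tag other than gaiaadm/pumba:master...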