Tinkering with an Amazon ECS environment using Terraform (3): defining container resource monitoring with Datadog

Introduction

Monitoring container resources on ECS

Datadog looks like a good fit for monitoring container resources on ECS, so I am trying it out.

<a href="https://github.com/DataDog/docker-dd-agent">DataDog/docker-dd-agent</a>

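Note that, at the time of writing, CloudWatch metrics do not appear to cover the resources of individual containers, though I hope that changes. (Whether containers that are meant to be thrown away need metric monitoring at all is debatable, and one could also argue that monitoring the container instances alone is enough...)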

References

<a href="http://qiita.com/jhotta/items/6b7ba9cd5602d1f9208d">Monitoring - Tracing how the Datadog Agent collects Docker metrics - Qiita</a>

This article explains how the Datadog Agent monitors containers by walking through the source code of the Docker Integration.

Hands-on

Overview

f:id:inokara:20150731001435p:plain

Set up with docker run, the agent is started like this:

# DATADOG_API_KEY must hold your Datadog API key before running this.
docker run -d \
  --name dd-agent \
  -h "$(hostname)" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /proc/mounts:/host/proc/mounts:ro \
  -v /cgroup/:/host/sys/fs/cgroup:ro \
  -e API_KEY="${DATADOG_API_KEY}" \
  datadog/docker-dd-agent

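The container image is docker-dd-agent. The volume options mount /var/run/docker.sock and the other host paths into the container. Note that the cgroup path is /cgroup/ on Amazon Linux, whereas on other distributions you specify /sys/fs/cgroup/.

Then define a new Task definition with Terraform

As before, the example repository is...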

<a href="https://github.com/inokappa/oreno-terraform-ecs">inokappa/oreno-terraform-ecs</a>

The Datadog API key is passed as -var 'datadog_api_key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' when running terraform plan or terraform apply. To make that possible, container_definitions is written as raw JSON in a heredoc, with the API key interpolated from a variable.

resource "aws_ecs_task_definition" "datadog_agent" {
  family = "datadog_agent"
  volume {
    name = "docker"
    host_path = "/var/run/docker.sock"
  }
  volume {
    name = "mounts"
    host_path = "/proc/mounts"
  }
  volume {
    name = "cgroup"
    host_path = "/cgroup/"
  }
  container_definitions = <<EOF
[
  {
    "environment": [
        {
            "name": "API_KEY",
            "value": "${var.datadog_api_key}"
        }
    ],
    "mountPoints": [
        {
            "sourceVolume": "docker",
            "containerPath": "/var/run/docker.sock",
            "readOnly": false
        },
        {
            "sourceVolume": "mounts",
            "containerPath": "/host/proc/mounts",
            "readOnly": true
        },
        {
            "sourceVolume": "cgroup",
            "containerPath": "/host/sys/fs/cgroup",
            "readOnly": true
        }
    ],
    "name": "dd-agent",
    "image": "datadog/docker-dd-agent",
    "cpu": 0,
    "memory": 256,
    "portMappings": [
    ],
    "command": [
    ],
    "essential": true
  }
]
EOF
}

Each volume block defines a volume on the container instance, and mountPoints in the JSON ties each of those volumes to a mount point inside the container.
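The `${var.datadog_api_key}` interpolation above only resolves if the variable is declared somewhere in the configuration; the example repository presumably already does this. A minimal sketch of such a declaration follows, together with a hypothetical `cgroup_host_path` variable (my own addition, not part of the original configuration) that would let the cgroup path be switched for distributions other than Amazon Linux.

```hcl
# Declaration matching the -var 'datadog_api_key=...' flag used in this post.
# The description text is illustrative.
variable "datadog_api_key" {
  description = "Datadog API key handed to the dd-agent container"
}

# Hypothetical variable, not in the original configuration.
# Defaults to the Amazon Linux path; override with /sys/fs/cgroup/ elsewhere.
variable "cgroup_host_path" {
  default = "/cgroup/"
}
```

With something like this in place, the host_path of the cgroup volume could be written as "${var.cgroup_host_path}" instead of the literal /cgroup/.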

Don't forget to define the Service

Set desired_count to 1.

resource "aws_ecs_service" "dd-agent" {
  name = "dd-agent"
  cluster = "${aws_ecs_cluster.kappa-cluster.id}"
  task_definition = "${aws_ecs_task_definition.datadog_agent.arn}"
  desired_count = 1
}

terraform plan and apply

All that remains is to run plan and apply.

terraform plan \
 -var 'access_key=AK123456789123456789' \
 -var 'secret_key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' \
 -var 'ssh_key_name=your_ssh_key_name' \
 -var 'subnet=subnet-12345678' \
 -var 'securiy_group=sg-12345678' \
 -var 'iam_profile_name=your_iam_role_name' \
 -var 's3_bucket_name=your-s3-bucket-name' \
 -var 'datadog_api_key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

terraform apply \
 -var 'access_key=AK123456789123456789' \
 -var 'secret_key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' \
 -var 'ssh_key_name=your_ssh_key_name' \
 -var 'subnet=subnet-12345678' \
 -var 'securiy_group=sg-12345678' \
 -var 'iam_profile_name=your_iam_role_name' \
 -var 's3_bucket_name=your-s3-bucket-name' \
 -var 'datadog_api_key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
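Repeating the same -var flags for plan and apply is easy to get wrong. As an alternative sketch, Terraform automatically loads a file named terraform.tfvars from the current directory, so the values could live there instead. The file below is hypothetical and simply reuses the placeholder values and variable names from the commands above.

```hcl
# terraform.tfvars (hypothetical file; values are the placeholders used in this post)
access_key       = "AK123456789123456789"
secret_key       = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
ssh_key_name     = "your_ssh_key_name"
subnet           = "subnet-12345678"
# Spelled exactly as in the -var flag above.
securiy_group    = "sg-12345678"
iam_profile_name = "your_iam_role_name"
s3_bucket_name   = "your-s3-bucket-name"
datadog_api_key  = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
```

With this file present, terraform plan and terraform apply can be run without any -var flags. Because it holds credentials, it should be kept out of version control.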

The Docker dashboard on Datadog

This is the Docker dashboard that Datadog provides out of the box.

f:id:inokara:20150730233129p:plain f:id:inokara:20150730233142p:plain f:id:inokara:20150730233202p:plain

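Looks cool.

Wrapping up

The Datadog Agent

Setup is easy, since it comes down to a single docker run. The README also covers the options available at container startup and how to build a container with your own checks added, so you can extend the collected metrics as you see fit.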

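A belated caveat

When defining the Task definition JSON in Terraform's container_definitions, be aware that writing it exactly as shown in the Task Definition Template in the documentation causes a JSON syntax error on the Terraform side. Writing it as follows should avoid the error (I think).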

[
   {
      "name": "",
      "image": "",
      "cpu": 0,
      "memory": 0,
      "links": [
         ""
      ],
      "portMappings": [
         {
             "containerPort": 0,
             "hostPort": 0,
             "protocol": ""
         }
      ],
      "essential": true,
      "entryPoint": [
         ""
      ],
      "command": [
         ""
      ],
      "environment": [
         {
             "name": "",
             "value": ""
         }
      ],
      "mountPoints": [
         {
             "sourceVolume": "",
             "containerPath": "",
             "readOnly": true
         }
      ],
      "volumesFrom": [
         {
             "sourceContainer": "",
             "readOnly": true
         }
      ]
   }
]
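The post does not say why the documented template fails. One likely reason, and this is my assumption, is that the full template is a JSON object with top-level keys such as family, containerDefinitions and volumes, while container_definitions expects only the array of container definitions. Under that assumption, the pieces would map onto the Terraform resource roughly as in the sketch below; the resource name, family and image are placeholders.

```hcl
# Hypothetical minimal task definition; "example" and "busybox" are placeholders.
resource "aws_ecs_task_definition" "example" {
  # Top-level "family" in the template becomes this argument.
  family = "example"

  # Entries under top-level "volumes" become volume blocks like this one.
  volume {
    name = "docker"
    host_path = "/var/run/docker.sock"
  }

  # Only the array from "containerDefinitions" goes into the heredoc.
  container_definitions = <<EOF
[
  {
    "name": "example",
    "image": "busybox",
    "cpu": 0,
    "memory": 128,
    "essential": true
  }
]
EOF
}
```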