Cloudwatch의 Container insights로 EKS 어플리케이션 상태 확인하기

2021-10-19

.

Data_Engineering_TIL(20211019)

[학습자료]

“클라우드 네이티브를 위한 쿠버네티스 실전 프로젝트” 책을 읽고 정리한 내용입니다.

** 동양북스, 아이자와 고지&사토 가즈히코 지음, 박상욱 옮김

참고자료 URL : https://github.com/dybooksIT/k8s-aws-book

“EKS 클러스터에 배치 어플리케이션 배포하기”에 이어서 공부한 내용을 정리한 내용임

** URL : https://minman2115.github.io/DE_TIL259

AWS 참고자료 : https://docs.aws.amazon.com/ko_kr/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-metrics.html

[Container insights 개요]

쿠버네티스 클러스터 차원에서 전체 상태 파악도 중요하지만 어플리케이션 상태를 확인하는 것도 중요하다.

Cloudwatch에는 Container insights라는 기능이 있다. 이를 이용하면 클러스터 노드, 파드, 네임스페이스, 서비스 레벨의 메트릭을 참조가 가능하다.

Container insights의 구조는 아래와 같다. Cloudwatch 에이전트를 데몬셋으로 동작시킨후에 필요한 메트릭을 Cloudwatch로 전송한다.

0

[Container insights 구성실습]

STEP 1) 데이터 노드의 IAM 역할에 정책 추가

EKS에 배포한 데몬셋의 cloudwatch 에이전트와 연계해 container insights를 이용하려면, 아래 그림과 같이 EKS 클러스터의 데이터 노드에 속한 IAM 역할에 IAM 정책을 연결해준다.

1

STEP 2) cloudwatch용 네임스페이스 생성

# 현재 클러스터 현황 확인
[ec2-user@ip-10-10-1-195 ~]$ kubectl get all
NAME                              READY   STATUS      RESTARTS   AGE
pod/backend-app-7fb899969-hkvbq   1/1     Running     0          24h
pod/backend-app-7fb899969-wj2h5   1/1     Running     0          24h
pod/batch-app-1634650200-xppvm    0/1     Completed   0          14m
pod/batch-app-1634650500-lzsxs    0/1     Completed   0          9m50s
pod/batch-app-1634650800-j2mm8    0/1     Completed   0          4m49s

NAME                          TYPE           CLUSTER-IP     EXTERNAL-IP                                                                    PORT(S)          AGE
service/backend-app-service   LoadBalancer   10.100.92.76   xxxxxxxxxxxx-yyyyyyyyyyyy.ap-northeast-2.elb.amazonaws.com   8080:30609/TCP   24h

NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/backend-app   2/2     2            2           24h

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/backend-app-7fb899969   2         2         2       24h

NAME                             COMPLETIONS   DURATION   AGE
job.batch/batch-app-1634650200   1/1           10s        14m
job.batch/batch-app-1634650500   1/1           12s        9m50s
job.batch/batch-app-1634650800   1/1           11s        4m49s

NAME                      SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob.batch/batch-app   */5 * * * *   False     0        4m50s           23h

[ec2-user@ip-10-10-1-195 ~]$ curl -O https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cloudwatch-namespace.yaml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   141  100   141    0     0    383      0 --:--:-- --:--:-- --:--:--   383
                        
[ec2-user@ip-10-10-1-195 ~]$ ll
total 45812
-rw-rw-r--  1 ec2-user ec2-user      141 Oct 19 13:33 cloudwatch-namespace.yaml
drwxrwxr-x 15 ec2-user ec2-user      315 Oct 18 13:49 k8s-aws-book
-rw-rw-r--  1 ec2-user ec2-user 46907392 Oct 18 12:16 kubectl
[ec2-user@ip-10-10-1-195 ~]$ vim cloudwatch-namespace.yaml
[ec2-user@ip-10-10-1-195 ~]$ cat cloudwatch-namespace.yaml
# create amazon-cloudwatch namespace
apiVersion: v1
kind: Namespace
metadata:
  name: amazon-cloudwatch
  labels:
    name: amazon-cloudwatch

[ec2-user@ip-10-10-1-195 ~]$ kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cloudwatch-namespace.yaml
namespace/amazon-cloudwatch created

STEP 3) cloudwatch용 서비스 계정 생성

cloudwatch 에이전트 파드가 사용할 service account를 생성한다.

여기서 service account는 파드를 동작시키는 사용자와 같은 것이다.

[ec2-user@ip-10-10-1-195 ~]$ curl -O https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-serviceaccount.yaml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1120  100  1120    0     0   3010      0 --:--:-- --:--:-- --:--:--  3010
                        
[ec2-user@ip-10-10-1-195 ~]$ cat cwagent-serviceaccount.yaml
# create cwagent service account and role binding
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloudwatch-agent
  namespace: amazon-cloudwatch

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: cloudwatch-agent-role
rules:
  - apiGroups: [""]
    resources: ["pods", "nodes", "endpoints"]
    verbs: ["list", "watch"]
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["list", "watch"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["list", "watch"]
  - apiGroups: [""]
    resources: ["nodes/proxy"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["nodes/stats", "configmaps", "events"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cwagent-clusterleader"]
    verbs: ["get","update"]

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: cloudwatch-agent-role-binding
subjects:
  - kind: ServiceAccount
    name: cloudwatch-agent
    namespace: amazon-cloudwatch
roleRef:
  kind: ClusterRole
  name: cloudwatch-agent-role
  apiGroup: rbac.authorization.k8s.io
    
[ec2-user@ip-10-10-1-195 ~]$ kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-serviceaccount.yaml
serviceaccount/cloudwatch-agent created
clusterrole.rbac.authorization.k8s.io/cloudwatch-agent-role created
clusterrolebinding.rbac.authorization.k8s.io/cloudwatch-agent-role-binding created

STEP 4) Cloudwatch 에이전트가 사용할 컨피그맵 생성

cloudwatch 에이전트 파드는 컨피그맵을 사용해서 각종설정을 불러온다. 그래서 아래와 같이 컨피그맵을 생성한다.

[ec2-user@ip-10-10-1-195 ~]$ curl -O https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-configmap.yaml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   521  100   521    0     0   1214      0 --:--:-- --:--:-- --:--:--  1214

[ec2-user@ip-10-10-1-195 ~]$ cat cwagent-configmap.yaml
# create configmap for cwagent config
apiVersion: v1
data:
  # Configuration is in Json format. No matter what configure change you make,
  # please keep the Json blob valid.
  cwagentconfig.json: |
    {
      "logs": {
        "metrics_collected": {
          "kubernetes": {
            "cluster_name": "",
            "metrics_collection_interval": 60
          }
        },
        "force_flush_interval": 5
      }
    }
kind: ConfigMap
metadata:
  name: cwagentconfig
  namespace: amazon-cloudwatch
    
# "cluster_name": "eks-work-cluster" 로 변경하였다. EKS 클러스터 이름을 넣어준다.
[ec2-user@ip-10-10-1-195 ~]$ vim cwagent-configmap.yaml
# create configmap for cwagent config
apiVersion: v1
data:
  # Configuration is in Json format. No matter what configure change you make,
  # please keep the Json blob valid.
  cwagentconfig.json: |
    {
      "logs": {
        "metrics_collected": {
          "kubernetes": {
            "cluster_name": "eks-work-cluster",
            "metrics_collection_interval": 60
          }
        },
        "force_flush_interval": 5
      }
    }
kind: ConfigMap
metadata:
  name: cwagentconfig
  namespace: amazon-cloudwatch
    
[ec2-user@ip-10-10-1-195 ~]$ kubectl apply -f cwagent-configmap.yaml
configmap/cwagentconfig created

STEP 5) Cloudwatch 에이전트를 데몬셋으로 동작시키기

아래와 같이 명령어를 실행하여 동작을 시킨다.

[ec2-user@ip-10-10-1-195 ~]$ curl -O https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-daemonset.yaml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2560  100  2560    0     0   7293      0 --:--:-- --:--:-- --:--:--  7293
                        
[ec2-user@ip-10-10-1-195 ~]$ cat cwagent-daemonset.yaml
# deploy cwagent as daemonset
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cloudwatch-agent
  namespace: amazon-cloudwatch
spec:
  selector:
    matchLabels:
      name: cloudwatch-agent
  template:
    metadata:
      labels:
        name: cloudwatch-agent
    spec:
      containers:
        - name: cloudwatch-agent
          image: amazon/cloudwatch-agent:1.247348.0b251302
          #ports:
          #  - containerPort: 8125
          #    hostPort: 8125
          #    protocol: UDP
          resources:
            limits:
              cpu:  200m
              memory: 200Mi
            requests:
              cpu: 200m
              memory: 200Mi
          # Please don't change below envs
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: HOST_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: K8S_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: CI_VERSION
              value: "k8s/1.3.8"
          # Please don't change the mountPath
          volumeMounts:
            - name: cwagentconfig
              mountPath: /etc/cwagentconfig
            - name: rootfs
              mountPath: /rootfs
              readOnly: true
            - name: dockersock
              mountPath: /var/run/docker.sock
              readOnly: true
            - name: varlibdocker
              mountPath: /var/lib/docker
              readOnly: true
            - name: containerdsock
              mountPath: /run/containerd/containerd.sock
              readOnly: true
            - name: sys
              mountPath: /sys
              readOnly: true
            - name: devdisk
              mountPath: /dev/disk
              readOnly: true
      volumes:
        - name: cwagentconfig
          configMap:
            name: cwagentconfig
        - name: rootfs
          hostPath:
            path: /
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock
        - name: varlibdocker
          hostPath:
            path: /var/lib/docker
        - name: containerdsock
          hostPath:
            path: /run/containerd/containerd.sock
        - name: sys
          hostPath:
            path: /sys
        - name: devdisk
          hostPath:
            path: /dev/disk/
      terminationGracePeriodSeconds: 60
      serviceAccountName: cloudwatch-agent

[ec2-user@ip-10-10-1-195 ~]$ kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-daemonset.yaml
daemonset.apps/cloudwatch-agent created

[ec2-user@ip-10-10-1-195 ~]$ kubectl get pods -n amazon-cloudwatch
NAME                     READY   STATUS    RESTARTS   AGE
cloudwatch-agent-j2f74   1/1     Running   0          14s
cloudwatch-agent-shw84   1/1     Running   0          14s

STEP 6) container insights에서 수집한 메트릭을 확인할 수 있다.

아래 그림과 같이 AWS cloudwatch의 console에서 확인할 수 있다.

2

  • 실습종료후 리소스 삭제

앞서 생성한 cloudwatch 관련 리소스를 삭제하는 방법은 아래와 같다. curl 명령어로 설정에 필요한 야믈파일을 다운로드 했기 때문에 다운로드한 디렉토리에서 아래의 명령어를 실행하면 된다.

[ec2-user@ip-10-10-1-195 ~]$ ll
total 45824
-rw-rw-r--  1 ec2-user ec2-user      141 Oct 19 13:33 cloudwatch-namespace.yaml
-rw-rw-r--  1 ec2-user ec2-user      522 Oct 19 13:41 cwagent-configmap.yaml
-rw-rw-r--  1 ec2-user ec2-user     2560 Oct 19 13:43 cwagent-daemonset.yaml
-rw-rw-r--  1 ec2-user ec2-user     1120 Oct 19 13:35 cwagent-serviceaccount.yaml
drwxrwxr-x 15 ec2-user ec2-user      315 Oct 18 13:49 k8s-aws-book
-rw-rw-r--  1 ec2-user ec2-user 46907392 Oct 18 12:16 kubectl

# container insights 관련 리소스를 삭제하는 명령어. (야믈 파일을 다운로드한 디렉토리에서 실행한다.)
# 파일 이름 각각을 설정하여 삭제할 경우 '.' 대신에 파일이름을 입력한다.
[ec2-user@ip-10-10-1-195 ~]$ kubectl delete -f .
namespace "amazon-cloudwatch" deleted
configmap "cwagentconfig" deleted
daemonset.apps "cloudwatch-agent" deleted
serviceaccount "cloudwatch-agent" deleted
clusterrole.rbac.authorization.k8s.io "cloudwatch-agent-role" deleted
clusterrolebinding.rbac.authorization.k8s.io "cloudwatch-agent-role-binding" deleted