Implementing an Airflow Architecture with Cloud-Native Services

2020-10-25

Data_Engineering_TIL(20201024)

** These are my notes from studying and practicing the material 'Big Data Workflow Orchestration with Apache Airflow (SPOF configuration) and Amazon EMR - private' by 최정민 of Bespin Global.

[Lab objective]

Implement the Airflow CeleryExecutor architecture with cloud-native services, as in the diagram below.

(Using Airflow to create and terminate transient EMR clusters is omitted.)

(Figure: target architecture diagram)

[Lab steps]

STEP 1) Create the VPC

Create a VPC in the console with the following settings.

  • name : pms-oregon-vpc

  • IPv4 CIDR block : 10.0.0.0/16

  • IPv6 CIDR block : No IPv6 CIDR block

  • Tenancy : default

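The VPC is then created as shown in the figure below.

Note that a route table is also created automatically.

(Figure: the created VPC)

** After creating the VPC, enable DNS hostnames on it.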

STEP 2) Create the subnets

In the console, create one public subnet and two private subnets (a and c) as follows.

1) public subnet

  • Name tag : pms-public-subnet

  • vpc : select pms-oregon-vpc

  • Availability Zone : No preference

  • IPv4 CIDR block : 10.0.1.0/24

2) private subnet (AZ A)

  • Name tag : pms-private-subnet-a

  • vpc : select pms-oregon-vpc

  • Availability Zone : us-west-2a

  • IPv4 CIDR block : 10.0.2.0/24

3) private subnet (AZ C)

  • Name tag : pms-private-subnet-c

  • vpc : select pms-oregon-vpc

  • Availability Zone : us-west-2c

  • IPv4 CIDR block : 10.0.3.0/24
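
The subnet plan above can be sanity-checked before it is entered in the console. The following is a minimal sketch using only Python's standard `ipaddress` module. The function name `validate_subnets` is hypothetical; the CIDR values are the ones listed in STEP 1 and STEP 2.

```python
import ipaddress

# CIDR blocks taken from STEP 1 and STEP 2 of this post.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "pms-public-subnet": ipaddress.ip_network("10.0.1.0/24"),
    "pms-private-subnet-a": ipaddress.ip_network("10.0.2.0/24"),
    "pms-private-subnet-c": ipaddress.ip_network("10.0.3.0/24"),
}


def validate_subnets(vpc, subnets):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    names = list(subnets)
    # Every subnet must sit inside the VPC range.
    for name in names:
        if not subnets[name].subnet_of(vpc):
            problems.append(f"{name} ({subnets[name]}) is outside {vpc}")
    # No two subnets may share addresses.
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")
    return problems


if __name__ == "__main__":
    print(validate_subnets(VPC_CIDR, SUBNETS) or "subnet plan is consistent")
```

An empty result means each /24 lies inside 10.0.0.0/16 and none of them overlap, which is the condition the VPC console enforces when subnets are created.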

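STEP 3) Create the internet gateway

First create an internet gateway with the following option.

  • Name tag : pms-ig-test

Then click Attach to VPC and attach it to pms-oregon-vpc, as shown below.

(Figure: attaching the internet gateway to the VPC)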

STEP 4) Route table settings

Register the internet gateway in the route table that was created automatically with the VPC.

In the route table console, find and select the route table that was created automatically with the VPC, and configure it as shown below.

(Alternatively, create a new route table and configure it the same way.)

(Figure: route table with the internet gateway registered)

Then click subnet associations –> click Edit subnet associations –> add pms-public-subnet.

Next, create one more route table for the private subnets, named pms-route-table-for-private (in pms-oregon-vpc, of course).

Likewise, click subnet associations –> click Edit subnet associations –> add pms-private-subnet-a and pms-private-subnet-c.

STEP 5) Allocate an EIP for the NAT gateway

Elastic IPs menu –> click Allocate Elastic IP address –> click Allocate –> edit the Name and set it to pms-eip-test

STEP 6) Create the NAT gateway

Create it with the following options.

  • Name : pms-ng-test

  • Subnet : pms-public-subnet

  • Elastic IP allocation ID : pms-eip-test

Then go to the route tables menu, select pms-route-table-for-private, and click edit routes.

Add a route with destination 0.0.0.0/0 that targets the pms-ng-test NAT gateway.
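
With this route in place, instances in the private subnets reach the internet through the NAT gateway while traffic inside the VPC stays local, because a route table applies the most specific matching route. The sketch below illustrates that longest-prefix selection. It assumes the private route table holds only the automatic local route for the VPC CIDR plus the 0.0.0.0/0 route added above; `pick_route` is a hypothetical helper, and the targets are plain labels rather than real AWS IDs.

```python
import ipaddress

# Routes of pms-route-table-for-private after STEP 6.
# "local" is the route AWS adds automatically for the VPC CIDR.
PRIVATE_ROUTES = [
    (ipaddress.ip_network("10.0.0.0/16"), "local"),
    (ipaddress.ip_network("0.0.0.0/0"), "pms-ng-test"),
]


def pick_route(destination, routes):
    """Return the target of the most specific route that contains `destination`."""
    address = ipaddress.ip_address(destination)
    matches = [(net, target) for net, target in routes if address in net]
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The longest prefix (most specific network) wins.
    return max(matches, key=lambda item: item[0].prefixlen)[1]


if __name__ == "__main__":
    print(pick_route("10.0.1.7", PRIVATE_ROUTES))      # an address inside the VPC
    print(pick_route("151.101.0.223", PRIVATE_ROUTES))  # an address on the internet
```

This is why `yum` and `pip3` work later on the Airflow servers even though they have no public IP: their outbound requests match only the default route and leave through pms-ng-test.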

STEP 7) Create the security groups

Create three security groups as follows.

1) sg for the public subnet

  • Security group name : pms-sg-public

  • Description : pms-sg-public

  • VPC : pms-oregon-vpc

  • Add one inbound rule as follows

Type : all traffic / source : My IP / Description - optional : myhome ip

2) sg for the private subnets

  • Security group name : pms-sg-private

  • Description : pms-sg-private

  • VPC : pms-oregon-vpc

  • Add the two inbound rules below (a third rule, for the bastion, is added in STEP 9)

Type : all traffic / source : My IP / Description - optional : myhome ip

Type : all traffic / source : 10.0.0.0/16 / Description - optional : vpc ip range

3) sg for EFS

  • Security group name : pms-sg-efs

  • Description : pms-sg-EFS

  • VPC : pms-oregon-vpc

  • Add one inbound rule as follows

Type : TCP / port : 2049 / source : pms-sg-private / Description - optional : pms-sg-private sg

STEP 8) Create the key pair

Create the key pair that the servers will use.

  • Name : pms-oregon-key

STEP 9) Create the EC2 instance for the bastion server

  • Name : pms-airflow-bastion

  • OS : Amazon Linux 2 AMI

  • Network : pms-public-subnet in pms-oregon-vpc

  • volume : 30GB

  • Instance type : t3.micro

  • Auto-assign Public IP : enable

  • security group : pms-sg-public

  • key : pms-oregon-key

After creating the bastion, add the following inbound rule to pms-sg-private.

Type : all traffic / source : [bastion public ip] / Description - optional : bastion ip
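
Connections from the bastion to instances in the private subnets stay inside the VPC, so the security group sees the bastion's private address rather than its public one. The sketch below shows which source ranges of pms-sg-private would match such a connection. It is illustrative only: `matching_rules` is a hypothetical helper, "My IP" is replaced by a documentation address (203.0.113.10), and 10.0.1.7 is the bastion's private IP as it appears in the shell prompt in STEP 10.

```python
import ipaddress

# Inbound sources of pms-sg-private from STEP 7.
# "My IP" depends on the environment, so 203.0.113.10 stands in for it here.
INBOUND_SOURCES = {
    "myhome ip": ipaddress.ip_network("203.0.113.10/32"),
    "vpc ip range": ipaddress.ip_network("10.0.0.0/16"),
}


def matching_rules(source_ip, sources):
    """Return the descriptions of all rules whose source range contains `source_ip`."""
    address = ipaddress.ip_address(source_ip)
    return [name for name, network in sources.items() if address in network]


if __name__ == "__main__":
    # The bastion reaches the Airflow server from its private address.
    print(matching_rules("10.0.1.7", INBOUND_SOURCES))
```

Under these assumptions the "vpc ip range" rule is the one that admits SSH from the bastion.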

STEP 10) Create the EC2 instance for the Airflow server

First, open the required ports in the default security group in advance.

Airflow server located in AZ A

  • Name : pms-airflow-a

  • OS : Amazon Linux 2 AMI

  • Network : pms-private-subnet-a in pms-oregon-vpc

  • volume : 30GB

  • Instance type : t3.large

  • Auto-assign Public IP : disable

  • security group : pms-sg-private

  • key : pms-oregon-key

After creating the EC2 instance, test the connections to the bastion and the Airflow server as follows.

# copy the key that the bastion will use to the bastion
# 54.185.194.206 is the bastion's public IP
[local pc]$ scp -i pms-oregon-key.pem pms-oregon-key.pem ec2-user@54.185.194.206:~/
pms-oregon-key.pem                                                                                   100% 1678     8.8KB/s   00:00
    
# connect to the bastion
[local pc]$ ssh -i pms-oregon-key.pem ec2-user@54.185.194.206
Last login: Sat Oct 24 09:26:27 2020 from 1.233.58.248

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
2 package(s) needed for security, out of 13 available
Run "sudo yum update" to apply all updates.
[ec2-user@ip-10-0-1-7 ~]$ ls
pms-oregon-key.pem
            
[ec2-user@ip-10-0-1-7 ~]$ sudo chmod 0400 pms-oregon-key.pem
            
# test the connection to pms-airflow-a
[ec2-user@ip-10-0-1-7 ~]$ ssh -i pms-oregon-key.pem ec2-user@10.0.2.91

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
2 package(s) needed for security, out of 13 available
Run "sudo yum update" to apply all updates.

[ec2-user@ip-10-0-2-91 ~]$ exit
logout
Connection to 10.0.2.91 closed.

STEP 11) Create the ELB

Click the EC2 menu –> click the Load Balancers menu –> click create load balancer –> select Application Load Balancer

Create it with the following options.

  • Name : pms-airflow-alb

  • Scheme : internal

  • Listeners : HTTP, 80

  • Availability Zones : pms-private-subnet-a and pms-private-subnet-c in pms-oregon-vpc

  • Step 2: Configure Security Settings applies only to HTTPS listeners, so just click next to skip it.

  • sg : pms-sg-private

  • target group

Name : pms-tg-alb , Health checks path : /login/?next=http%3A%2F%2F[Bastion Public ip]%2Fhome

  • Step 5: Register Targets

Click add to registered for pms-airflow-a.

After creating the ELB, go to Load Balancing > Target Groups in the left pane of the AWS console –> click pms-tg-alb –> click Edit under Attributes –> check the stickiness box –> save

Health checks can only pass once Airflow and the bastion are running. Until then the targets report a failed health check.
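
The Health checks path in the target group settings above carries a percent-encoded URL in its `next` parameter. The sketch below shows how that path can be derived from the bastion address with the standard library. `health_check_path` is a hypothetical helper; the IP in the usage line is the bastion public IP that appears in STEP 10.

```python
from urllib.parse import quote


def health_check_path(bastion_host):
    """Build the /login/ path with a percent-encoded `next` parameter."""
    next_url = f"http://{bastion_host}/home"
    # safe="" also encodes ":" and "/", which yields the %3A and %2F seen in the path.
    return "/login/?next=" + quote(next_url, safe="")


if __name__ == "__main__":
    print(health_check_path("54.185.194.206"))
```

If the bastion's public IP changes (for example after a stop and start without an Elastic IP), the path has to be regenerated and updated in the target group.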

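STEP 12) Create the RDS database

Create the RDS database in two parts: a subnet group first, then the database itself.

First, create a subnet group with the following settings.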

  • Name : pms-airflow-rds-subnet-group

  • Description : pms-airflow-rds-subnet-group

  • VPC : pms-oregon-vpc

  • Availability Zones : us-west-2a, us-west-2c

  • Subnets : pms-private-subnet-a, pms-private-subnet-c

Then create the RDS database with the following settings.

1) Engine type : Amazon Aurora

2) Edition : Amazon Aurora with MySQL compatibility

3) Version : Aurora 2.09 (MySQL 5.7)

4) Template : Production

5) DB instance identifier : pms-airflow-rds

6) Master username : admin

7) Master password : (your password)

8) DB instance class : db.t3.small

9) Multi-AZ deployment : Do not create a standby instance

10) Virtual Private Cloud (VPC) : pms-oregon-vpc

11) subnet group : pms-airflow-rds-subnet-group

12) security group : pms-sg-private

13) Publicly accessible : No

14) Port : 3306

Leave all other options at their defaults.

STEP 13) Create ElastiCache

Go to the ElastiCache console and create the cluster as follows.

1) In the left sidebar, click [Redis] > [create]

2) Cluster engine : Redis (do not enable cluster mode)

3) Location : Amazon Cloud

4) Name : pms-airflow-redis

5) Engine version : 5.0.6

6) Port : 6379

7) Node type : cache.r6g.large

8) Number of replicas : 0

9) Multi-AZ : unchecked

10) Subnet group : create new

Name, Description : pms-redis-subnet-group

VPC : pms-oregon-vpc, with subnets pms-private-subnet-a and pms-private-subnet-c

11) Security group : pms-sg-private

Use the defaults for everything else.

** Note

airflow.cfg expects a password for Redis, but there is currently no password setting when the Redis cluster is created. After the cluster is created, go to user management in the left sidebar and click create.

Create the user as follows.

  • User id : redisairflowid

  • User name : redisairflow

  • password : (your password)

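STEP 14) Create the S3 bucket

Create a bucket named pms-oregon-bucket in the Oregon region.

STEP 15) Create the EFS file system

Go to the EFS console and create the file system as follows.

  • Name : pms-airflow-efs

  • VPC : pms-oregon-vpc

Do not click create right away. Click customize partway through and apply the options below before creating.

VPC : the same VPC as the Airflow servers

Mount targets : the availability zones and subnets where the Airflow servers are located (required, otherwise the file system cannot be mounted)

Security group : select the SG created for EFS

** Note : the DNS options must be enabled on the VPC

STEP 16) Set up the Airflow server

Connect to the Airflow server and set it up as follows.

# connect to the bastion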
Last login: Sat Oct 24 09:28:34 2020 from 1.233.58.248

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
2 package(s) needed for security, out of 13 available
Run "sudo yum update" to apply all updates.
            
# on the bastion, first configure nginx as follows
[ec2-user@ip-10-0-1-235 ~]$ sudo yum update -y
            
[ec2-user@ip-10-0-1-235 ~]$ sudo amazon-linux-extras install nginx1 -y
            
[ec2-user@ip-10-0-1-235 ~]$ cd /etc/nginx/
            
[ec2-user@ip-10-0-1-235 nginx]$ sudo vim nginx.conf
# modify only the location block of the server section, as shown below
    
# For more information on configuration, see:
#   * Official English Documentation: http://nginx.org/en/docs/
#   * Official Russian Documentation: http://nginx.org/ru/docs/

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 4096;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;

    server {
        listen       80;
        listen       [::]:80;
        server_name  _;
        root         /usr/share/nginx/html;

        # Load configuration files for the default server block.
        include /etc/nginx/default.d/*.conf;

        location / {
             proxy_redirect off;
             proxy_pass_header Server;
             proxy_set_header Host $http_host;
             proxy_set_header X-Real-IP $remote_addr;
             proxy_set_header X-Scheme $scheme;
             # requests that reach the bastion host are proxied to the internal ALB
             # proxy_pass http://[internal ALB DNS]:80;
             proxy_pass http://internal-pms-airflow-alb-969054821.us-west-2.elb.amazonaws.com:80;
        }

        error_page 404 /404.html;
            location = /40x.html {
        }

        error_page 500 502 503 504 /50x.html;
            location = /50x.html {
        }
    }

# Settings for a TLS enabled server.
#
#    server {
#        listen       443 ssl http2;
#        listen       [::]:443 ssl http2;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        ssl_certificate "/etc/pki/nginx/server.crt";
#        ssl_certificate_key "/etc/pki/nginx/private/server.key";
#        ssl_session_cache shared:SSL:1m;
#        ssl_session_timeout  10m;
#        ssl_ciphers PROFILE=SYSTEM;
#        ssl_prefer_server_ciphers on;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        error_page 404 /404.html;
#            location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#            location = /50x.html {
#        }
#    }

}
            
# start nginx
[ec2-user@ip-10-0-1-235 nginx]$ sudo service nginx start
Redirecting to /bin/systemctl start nginx.service
            
[ec2-user@ip-10-0-1-235 nginx]$ cd ~
            
# then use pms-oregon-key to connect to airflow server a
[ec2-user@ip-10-0-1-235 ~]$ ls
pms-oregon-key.pem
            
# copy the key for the SPOF-related setup later
[ec2-user@ip-10-0-1-235 ~]$ cp pms-oregon-key.pem ~/.ssh/

# 10.0.2.169 is the IP of pms-airflow-a in this capture (other captures in this post show 10.0.2.91)
[ec2-user@ip-10-0-1-235 ~]$ sudo scp -i pms-oregon-key.pem pms-oregon-key.pem ec2-user@10.0.2.169:~/.ssh/
The authenticity of host '10.0.2.169 (10.0.2.169)' can't be established.
ECDSA key fingerprint is SHA256:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
ECDSA key fingerprint is MD5:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.0.2.169' (ECDSA) to the list of known hosts.
pms-oregon-key.pem                                                                                                                100% 1678   785.0KB/s   00:00

# connect to airflow server a
[ec2-user@ip-10-0-1-235 ~]$ ssh -i pms-oregon-key.pem ec2-user@10.0.2.91
Last login: Sat Oct 24 09:34:03 2020 from 10.0.1.7

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
2 package(s) needed for security, out of 13 available
Run "sudo yum update" to apply all updates.
            
[ec2-user@ip-10-0-2-91 ~]$ sudo yum update -y
            
[ec2-user@ip-10-0-2-91 ~]$ sudo yum install python3 -y
            
[ec2-user@ip-10-0-2-91 ~]$ sudo yum install gcc python3-devel -y
            
# the CLI commands used below (initdb, create_user, list_users) belong to the Airflow 1.10 series
[ec2-user@ip-10-0-2-91 ~]$ sudo pip3 install apache-airflow
            
[ec2-user@ip-10-0-2-91 ~]$ sudo pip3 install boto3
            
[ec2-user@ip-10-0-2-91 ~]$ aws configure
AWS Access Key ID [None]: xxxxxxxxxxxxxxxxxxxxxx
AWS Secret Access Key [None]: xxxxxxxxxxxxxxxxxxxxxxxxxx
Default region name [None]: us-west-2
Default output format [None]: json
            
[ec2-user@ip-10-0-2-91 ~]$ airflow initdb
DB: sqlite:////home/ec2-user/airflow/airflow.db
[2020-10-24 11:00:44,331] {db.py:378} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.

            ...
            
Done.
            
[ec2-user@ip-10-0-2-91 ~]$ sudo yum install -y https://dev.mysql.com/get/mysql80-community-release-el7-3.noarch.rpm
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
mysql80-community-release-el7-3.noarch.rpm                                                             |  25 kB  00:00:00
Examining /var/tmp/yum-root-_ghXkL/mysql80-community-release-el7-3.noarch.rpm: mysql80-community-release-el7-3.noarch
Marking /var/tmp/yum-root-_ghXkL/mysql80-community-release-el7-3.noarch.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package mysql80-community-release.noarch 0:el7-3 will be installed
--> Finished Dependency Resolution
amzn2-core/2/x86_64                                                                                    | 3.7 kB  00:00:00

Dependencies Resolved

==============================================================================================================================
 Package                             Arch             Version         Repository                                         Size
==============================================================================================================================
Installing:
 mysql80-community-release           noarch           el7-3           /mysql80-community-release-el7-3.noarch            31 k

Transaction Summary
==============================================================================================================================
Install  1 Package

Total size: 31 k
Installed size: 31 k
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : mysql80-community-release-el7-3.noarch                                                                     1/1
  Verifying  : mysql80-community-release-el7-3.noarch                                                                     1/1

Installed:
  mysql80-community-release.noarch 0:el7-3

Complete!

[ec2-user@ip-10-0-2-91 ~]$ sudo yum repolist
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
37 packages excluded due to repository priority protections
repo id                                                      repo name                                                  status
amzn2-core/2/x86_64                                          Amazon Linux 2 core repository                             21,106
amzn2extra-docker/2/x86_64                                   Amazon Extras repo for docker                                  28
mysql-connectors-community/x86_64                            MySQL Connectors Community                                 138+37
mysql-tools-community/x86_64                                 MySQL Tools Community                                         120
mysql80-community/x86_64                                     MySQL 8.0 Community Server                                    211
repolist: 21,603

[ec2-user@ip-10-0-2-91 ~]$ sudo yum install -y mysql-community-server
            
[ec2-user@ip-10-0-2-91 ~]$ sudo systemctl enable --now mysqld
            
[ec2-user@ip-10-0-2-91 ~]$ systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2020-10-24 11:04:58 UTC; 9s ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
  Process: 8967 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
 Main PID: 9039 (mysqld)
   Status: "Server is operational"
   CGroup: /system.slice/mysqld.service
           └─9039 /usr/sbin/mysqld

Oct 24 11:04:51 ip-10-0-2-91.us-west-2.compute.internal systemd[1]: Starting MySQL Server...
Oct 24 11:04:58 ip-10-0-2-91.us-west-2.compute.internal systemd[1]: Started MySQL Server.
    
[ec2-user@ip-10-0-2-91 ~]$ sudo yum install -y mysql-devel

[ec2-user@ip-10-0-2-91 ~]$ sudo pip3 install 'apache-airflow[mysql]'
            
# mysql -u [admin username] -p -h [RDS Endpoint]
[ec2-user@ip-10-0-2-91 ~]$ mysql -u admin -p -h pms-airflow-rds-instance-1.cbk472w8gsxf.us-west-2.rds.amazonaws.com
Enter password: (enter the password)
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.7.12 MySQL Community Server (GPL)

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

# create the airflow metadata DB on RDS
mysql> create database airflow;
Query OK, 1 row affected (0.02 sec)

# create the airflow account on RDS
# local
mysql> CREATE USER 'airflow'@'localhost' IDENTIFIED BY 'admin1!';
Query OK, 0 rows affected (0.01 sec)

# remote
mysql> CREATE USER 'airflow'@'%' IDENTIFIED BY 'admin1!';
Query OK, 0 rows affected (0.02 sec)

# grant privileges on the DB
mysql> GRANT ALL privileges on airflow.* to airflow@localhost;
Query OK, 0 rows affected, 1 warning (0.02 sec)

mysql> GRANT ALL privileges on airflow.* to airflow@'%';
Query OK, 0 rows affected (0.02 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> exit;
Bye
            
[ec2-user@ip-10-0-2-91 ~]$ cd ~/airflow
            
[ec2-user@ip-10-0-2-91 airflow]$ mkdir dags

[ec2-user@ip-10-0-2-91 airflow]$ ls
airflow.cfg  airflow.db  dags  logs  unittests.cfg
            
[ec2-user@ip-10-0-2-91 airflow]$ sudo vim airflow.cfg

Modify the following settings.
            
# sql_alchemy_conn = mysql://[ID]:[PASSWORD]@[rds endpoint]:3306/airflow
sql_alchemy_conn = mysql://airflow:admin1!@pms-airflow-rds-instance-1.cbk472w8gsxf.us-west-2.rds.amazonaws.com:3306/airflow

load_examples = False
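
The `sql_alchemy_conn` value follows the `mysql://[ID]:[PASSWORD]@[rds endpoint]:3306/airflow` pattern shown above. A password that contains characters such as `@` or `/` would break that URI unless it is percent-encoded. The sketch below is one hedged way to build the string. `mysql_conn_uri` is a hypothetical helper, and the host and password in the usage line are made-up placeholders, not values from this lab.

```python
from urllib.parse import quote_plus


def mysql_conn_uri(user, password, host, database, port=3306):
    """Build a SQLAlchemy-style MySQL URI, percent-encoding the credentials."""
    return f"mysql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"


if __name__ == "__main__":
    # Placeholder values; substitute the real RDS endpoint and credentials.
    print(mysql_conn_uri("airflow", "p@ss/word", "db.example.internal", "airflow"))
```

The output can be pasted as the `sql_alchemy_conn` value in airflow.cfg.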
            
[ec2-user@ip-10-0-2-91 airflow]$ airflow initdb
DB: mysql://airflow:***@pms-airflow-rds-instance-1.cbk472w8gsxf.us-west-2.rds.amazonaws.com:3306/airflow
[2020-10-24 11:27:49,828] {db.py:378} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl MySQLImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
            
            ...
            
Done.
            
[ec2-user@ip-10-0-2-91 airflow]$ mysql -u admin -p -h pms-airflow-rds-instance-1.cbk472w8gsxf.us-west-2.rds.amazonaws.com
Enter password: (enter the password)
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 8.0.21 MySQL Community Server - GPL

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> alter table airflow.xcom modify value LONGBLOB;
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> exit
Bye
            
[ec2-user@ip-10-0-2-91 airflow]$ sudo pip3 install flask-bcrypt 
            
[ec2-user@ip-10-0-2-91 airflow]$ sudo vim airflow.cfg
# set authentication to use an ID and password
# RBAC is Role-Based Access Control: access permissions are assigned according to role

Modify the following part.

[webserver]

authenticate = True
rbac = True            
            
[api]

auth_backend = airflow.contrib.auth.backends.password_auth

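#############################  Note  ###############################################
# Password-based web authentication applies only to the Flask-Admin based web UI.
# With the FAB-based web UI that provides RBAC, accounts can be created from the CLI or the UI.
# However, the first admin account can only be created from the CLI.
###############################################################################


# create Airflow accounts
# after authentication is enabled for the first time, an admin account must exist or the web server will not work properly
# restart the web server after the admin account is created
# if the web server was already running, clear the browser cache and reconnect to reach the login screen

# create the admin account
# airflow create_user -r [Role] -u [User Name] -e [Email Address] -f [First Name] -l [Last Name] -p [Password]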
[ec2-user@ip-10-0-2-91 airflow]$ airflow create_user -r Admin -u soojung.kim -e soojung.kim@itcompany.com -f soojung -l kim -p password1!
[2020-10-24 11:38:19,381] {manager.py:710} WARNING - No user yet created, use flask fab command to do it.
[2020-10-24 11:38:22,017] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-10-24 11:38:22,017] {dagbag.py:417} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags
[2020-10-24 11:38:27,937] {security.py:477} INFO - Start syncing user roles.
[2020-10-24 11:38:28,238] {security.py:209} INFO - Initializing permissions for role:Viewer in the database.
[2020-10-24 11:38:28,865] {security.py:209} INFO - Initializing permissions for role:User in the database.
[2020-10-24 11:38:29,486] {security.py:209} INFO - Initializing permissions for role:Op in the database.
[2020-10-24 11:38:29,931] {security.py:387} INFO - Fetching a set of all permission, view_menu from FAB meta-table
[2020-10-24 11:38:30,448] {security.py:330} INFO - Cleaning faulty perms
Admin user soojung.kim created.
       
# create a regular user account
[ec2-user@ip-10-0-2-91 airflow]$ airflow create_user -r User -u jisung.park -e jisung.park@manunited.com -f jisung -l park -p password1!
[2020-10-24 11:41:12,037] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-10-24 11:41:12,038] {dagbag.py:417} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags
[2020-10-24 11:41:12,775] {security.py:477} INFO - Start syncing user roles.
[2020-10-24 11:41:13,280] {security.py:387} INFO - Fetching a set of all permission, view_menu from FAB meta-table
[2020-10-24 11:41:13,639] {security.py:330} INFO - Cleaning faulty perms
User user jisung.park created.
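
When several accounts have to be created, the `create_user` invocations above can be generated instead of typed by hand. The sketch below only assembles the command line and does not run it. `create_user_command` and `as_shell_line` are hypothetical helpers; the flags are the ones used in the two commands above.

```python
import shlex


def create_user_command(role, username, email, firstname, lastname, password):
    """Return the argv list for the `airflow create_user` command used above."""
    return [
        "airflow", "create_user",
        "-r", role,
        "-u", username,
        "-e", email,
        "-f", firstname,
        "-l", lastname,
        "-p", password,
    ]


def as_shell_line(argv):
    """Quote each argument so the line can be pasted into a shell safely."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    argv = create_user_command(
        "Admin", "soojung.kim", "soojung.kim@itcompany.com", "soojung", "kim", "password1!"
    )
    print(as_shell_line(argv))
```

Quoting matters for passwords like `password1!`, because an unquoted `!` can trigger history expansion in an interactive bash session. The argv list can also be passed directly to `subprocess.run`, which avoids the shell entirely.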
            
# FERNET_KEY setup
# By default, Airflow stores password information in the meta DB as plaintext. For security it supports Fernet (symmetric-key) encryption.
# install the package required for FERNET_KEY
            
[ec2-user@ip-10-0-2-91 airflow]$ sudo pip3 install 'apache-airflow[crypto]'
            
[ec2-user@ip-10-0-2-91 airflow]$ python3
Python 3.7.9 (default, Aug 27 2020, 21:59:41)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-9)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cryptography.fernet import Fernet
>>> fernet_key= Fernet.generate_key()
>>> print(fernet_key.decode())
ZDwacVwTY8LKvlaAgqvTXafxDzXXwI3fIXqIBPKtX5w=
# copy the key printed above
>>> exit()

[ec2-user@ip-10-0-2-91 airflow]$ sudo vim airflow.cfg

Modify the following part.

# Secret key to save connection passwords in the db
# paste the key copied above
fernet_key = ZDwacVwTY8LKvlaAgqvTXafxDzXXwI3fIXqIBPKtX5w=
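
A Fernet key is 32 random bytes encoded as URL-safe base64, which is why the value above is 44 characters long and ends in `=`. The sketch below checks that shape with the standard library only, which can catch a truncated copy-paste before the key goes into airflow.cfg. Both function names are hypothetical. For real keys, `Fernet.generate_key()` as used above remains the reference method.

```python
import base64
import os


def generate_fernet_style_key():
    """Return 32 random bytes, URL-safe base64-encoded (the same shape as a Fernet key)."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode()


def looks_like_fernet_key(value):
    """Check that `value` decodes from URL-safe base64 to exactly 32 bytes."""
    try:
        return len(base64.urlsafe_b64decode(value.encode())) == 32
    except (ValueError, UnicodeEncodeError):
        # binascii.Error (bad padding) is a subclass of ValueError.
        return False


if __name__ == "__main__":
    key = generate_fernet_style_key()
    print(len(key), looks_like_fernet_key(key))
```

Keep the `fernet_key` value on a line of its own in airflow.cfg. This sketch assumes, as Python's standard `configparser` does by default, that text following the value on the same line is read as part of the value.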

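############################ Role permissions (reference) ################################
# Admin : all possible permissions, including granting or revoking other users' permissions
# Public : no permissions
# Viewer : limited viewer permissions
# User : Viewer permissions plus additional user permissions
# Op : User permissions plus additional op permissions
###################################################################################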
            
[ec2-user@ip-10-0-2-91 airflow]$ airflow initdb
DB: mysql://airflow:***@pms-airflow-rds-instance-1.cbk472w8gsxf.us-west-2.rds.amazonaws.com:3306/airflow
[2020-10-24 11:46:14,599] {db.py:378} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl MySQLImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
Done.

[ec2-user@ip-10-0-2-91 airflow]$ airflow list_users
[2020-10-24 11:46:35,488] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-10-24 11:46:35,489] {dagbag.py:417} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags
[2020-10-24 11:46:36,294] {security.py:477} INFO - Start syncing user roles.
[2020-10-24 11:46:36,870] {security.py:387} INFO - Fetching a set of all permission, view_menu from FAB meta-table
[2020-10-24 11:46:37,257] {security.py:330} INFO - Cleaning faulty perms
╒══════╤═════════════╤═══════════════════════════╤══════════════╤═════════════╤═════════╕
│   Id │ Username    │ Email                     │ First name   │ Last name   │ Roles   │
╞══════╪═════════════╪═══════════════════════════╪══════════════╪═════════════╪═════════╡
│    1 │ soojung.kim │ soojung.kim@itcompany.com │ soojung      │ kim         │ [Admin] │
├──────┼─────────────┼───────────────────────────┼──────────────┼─────────────┼─────────┤
│    2 │ jisung.park │ jisung.park@manunited.com │ jisung       │ park        │ [User]  │
╘══════╧═════════════╧═══════════════════════════╧══════════════╧═════════════╧═════════╛
            
            
# configure task logs to be stored in S3
[ec2-user@ip-10-0-2-91 airflow]$ sudo vim airflow.cfg 
remote_logging = True
remote_log_conn_id = MyS3Conn
remote_base_log_folder = s3://pms-oregon-bucket/airflow_log/
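
The `remote_base_log_folder` value is an `s3://` URL that combines the bucket created in STEP 14 with a key prefix. The sketch below splits such a URL into its two parts, which makes it easy to confirm that the bucket name matches before the settings are saved. `split_s3_url` is a hypothetical helper, not part of Airflow.

```python
from urllib.parse import urlsplit


def split_s3_url(url):
    """Split s3://bucket/prefix into (bucket, key_prefix)."""
    parts = urlsplit(url)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    return parts.netloc, parts.path.lstrip("/")


if __name__ == "__main__":
    print(split_s3_url("s3://pms-oregon-bucket/airflow_log/"))
```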

# mount EFS on the airflow server
# first install the required helper package
[ec2-user@ip-10-0-2-91 airflow]$ sudo yum install -y amazon-efs-utils

# sudo mount -t efs [EFS file system ID]:/ ~/airflow/dags
[ec2-user@ip-10-0-2-91 airflow]$ sudo mount -t efs fs-9ecb899b:/ ~/airflow/dags
[ec2-user@ip-10-0-2-91 airflow]$ sudo chown ec2-user ~/airflow/dags


# CeleryExecutor settings
            
[ec2-user@ip-10-0-2-91 airflow]$ sudo pip3 install celery

[ec2-user@ip-10-0-2-91 airflow]$ sudo pip3 install 'apache-airflow[redis]'
            
[ec2-user@ip-10-0-2-91 airflow]$ sudo pip3 install 'apache-airflow[celery]'

[ec2-user@ip-10-0-2-91 airflow]$ sudo vim ~/airflow/airflow.cfg

Modify the following settings.

executor = CeleryExecutor

# broker_url = redis://{access string}@{Redis Endpoint}:6379/0
broker_url = redis://off -@all@pms-airflow-redis.wwoqjw.0001.usw2.cache.amazonaws.com:6379/0

# result_backend = db+mysql://{DB username}:{DB password}@{DB Endpoint}:3306/{DB name}
result_backend = db+mysql://airflow:admin1!@pms-airflow-rds-instance-1.cbk472w8gsxf.us-west-2.rds.amazonaws.com:3306/airflow
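
The three settings above are easy to mistype, so it can help to read the file back and check them before restarting the services. The sketch below does that with Python's standard `configparser`. It assumes the Airflow 1.10 layout, where `executor` lives under `[core]` and `broker_url` and `result_backend` under `[celery]`. `check_celery_settings` is a hypothetical helper, and the endpoints in `SAMPLE_CFG` are placeholders.

```python
import configparser

# Fragment mirroring the settings edited in this step; the endpoints are placeholders.
SAMPLE_CFG = """
[core]
executor = CeleryExecutor

[celery]
broker_url = redis://redis.example.internal:6379/0
result_backend = db+mysql://airflow:secret@db.example.internal:3306/airflow
"""


def check_celery_settings(cfg_text):
    """Return a list of problems found in the executor-related settings."""
    # interpolation=None keeps "%" characters in URLs from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    problems = []
    if parser.get("core", "executor", fallback="") != "CeleryExecutor":
        problems.append("core.executor is not CeleryExecutor")
    if not parser.get("celery", "broker_url", fallback="").startswith("redis://"):
        problems.append("celery.broker_url is not a redis:// URL")
    if not parser.get("celery", "result_backend", fallback="").startswith("db+"):
        problems.append("celery.result_backend is not a db+ URL")
    return problems


if __name__ == "__main__":
    print(check_celery_settings(SAMPLE_CFG) or "celery settings look consistent")
```

To check the real file, pass the contents of `~/airflow/airflow.cfg` to the function instead of `SAMPLE_CFG`.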

# update the airflow DB
[ec2-user@ip-10-0-2-91 airflow]$ airflow initdb
DB: mysql://airflow:***@pms-airflow-rds-instance-1.cbk472w8gsxf.us-west-2.rds.amazonaws.com:3306/airflow
[2020-10-24 12:30:42,211] {db.py:378} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl MySQLImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
Done.
            

# install and configure nginx
[ec2-user@ip-10-0-2-91 airflow]$ sudo amazon-linux-extras install nginx1 -y            
            
[ec2-user@ip-10-0-2-91 airflow]$ cd /etc/nginx/
            
[ec2-user@ip-10-0-2-91 nginx]$ sudo vim nginx.conf

# modify only the location block of the server section, as shown below

# For more information on configuration, see:
#   * Official English Documentation: http://nginx.org/en/docs/
#   * Official Russian Documentation: http://nginx.org/ru/docs/

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 4096;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;

    server {
        listen       80;
        listen       [::]:80;
        server_name  _;
        root         /usr/share/nginx/html;

        # Load configuration files for the default server block.
        include /etc/nginx/default.d/*.conf;

        location / {
            proxy_pass http://localhost:8080;
            # requests to this server's private IP on port 80 are proxied to localhost:8080,
            # so Airflow's base_url becomes http://localhost:8080
            proxy_set_header Host $host;
            proxy_redirect off;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }

        error_page 404 /404.html;
            location = /40x.html {
        }

        error_page 500 502 503 504 /50x.html;
            location = /50x.html {
        }
    }

# Settings for a TLS enabled server.
#
#    server {
#        listen       443 ssl http2;
#        listen       [::]:443 ssl http2;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        ssl_certificate "/etc/pki/nginx/server.crt";
#        ssl_certificate_key "/etc/pki/nginx/private/server.key";
#        ssl_session_cache shared:SSL:1m;
#        ssl_session_timeout  10m;
#        ssl_ciphers PROFILE=SYSTEM;
#        ssl_prefer_server_ciphers on;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        error_page 404 /404.html;
#            location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#            location = /50x.html {
#        }
#    }

}            
            
# after editing the config, start nginx
[ec2-user@ip-10-0-2-91 nginx]$ sudo service nginx start
Redirecting to /bin/systemctl start nginx.service                        

At this point, create an AMI image of airflow-a and launch a new instance from it with the options below.

Run create image on pms-airflow-a.

  • Image name : pms-airflow-image

Then launch pms-airflow-c from pms-airflow-image with the following options.

Airflow server located in AZ C

  • Name : pms-airflow-c

  • OS : Amazon Linux 2 AMI

  • Network : pms-private-subnet-c in pms-oregon-vpc

  • volume : 30GB

  • Instance type : t3.large

  • Auto-assign Public IP : disable

  • security group : pms-sg-private

  • key : pms-oregon-key

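After it is created, go to the ELB and register pms-airflow-c as a target.

Then connect to the bastion and run the following commands.

## The session to airflow-a was dropped while the image was being created, so reconnect to the airflow-a server.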
[ec2-user@ip-10-0-1-235 ~]$ ssh -i pms-oregon-key.pem ec2-user@10.0.2.91
Last login: Sun Oct 25 09:08:36 2020 from ip-10-0-1-235.us-west-2.compute.internal

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
            
# start nginx
[ec2-user@ip-10-0-2-91 airflow]$ sudo service nginx start
Redirecting to /bin/systemctl start nginx.service            
            
## start the Airflow services
[ec2-user@ip-10-0-2-91 airflow]$ airflow webserver -p 8080
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2020-10-24 12:31:23,510] {__init__.py:50} INFO - Using executor CeleryExecutor
[2020-10-24 12:31:23,511] {dagbag.py:417} INFO - Filling up the DagBag from /dev/null
[2020-10-24 12:31:24,253] {security.py:477} INFO - Start syncing user roles.
[2020-10-24 12:31:24,782] {security.py:387} INFO - Fetching a set of all permission, view_menu from FAB meta-table
[2020-10-24 12:31:25,152] {security.py:330} INFO - Cleaning faulty perms
Running the Gunicorn Server with:
Workers: 4 sync
Host: 0.0.0.0:8080
Timeout: 120
Logfiles: - -
=================================================================
[2020-10-24 12:31:25 +0000] [12694] [INFO] Starting gunicorn 20.0.4
[2020-10-24 12:31:25 +0000] [12694] [INFO] Listening at: http://0.0.0.0:8080 (12694)
[2020-10-24 12:31:25 +0000] [12694] [INFO] Using worker: sync
[2020-10-24 12:31:25 +0000] [12697] [INFO] Booting worker with pid: 12697
[2020-10-24 12:31:25 +0000] [12698] [INFO] Booting worker with pid: 12698
[2020-10-24 12:31:25 +0000] [12699] [INFO] Booting worker with pid: 12699
[2020-10-24 12:31:26 +0000] [12700] [INFO] Booting worker with pid: 12700
[2020-10-24 12:31:28,117] {__init__.py:50} INFO - Using executor CeleryExecutor
[2020-10-24 12:31:28,125] {dagbag.py:417} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags
[2020-10-24 12:31:28,623] {__init__.py:50} INFO - Using executor CeleryExecutor
[2020-10-24 12:31:28,624] {dagbag.py:417} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags
[2020-10-24 12:31:28,659] {__init__.py:50} INFO - Using executor CeleryExecutor
[2020-10-24 12:31:28,660] {dagbag.py:417} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags
[2020-10-24 12:31:28,789] {__init__.py:50} INFO - Using executor CeleryExecutor
[2020-10-24 12:31:28,789] {dagbag.py:417} INFO - Filling up the DagBag from /home/ec2-user/airflow/dags

...
            
# Open another terminal and run the following
[ec2-user@ip-10-0-2-91 ~]$ airflow scheduler
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2020-10-24 12:33:09,670] {__init__.py:50} INFO - Using executor CeleryExecutor
[2020-10-24 12:33:09,683] {scheduler_job.py:1367} INFO - Starting the scheduler
[2020-10-24 12:33:09,684] {scheduler_job.py:1375} INFO - Running execute loop for -1 seconds
[2020-10-24 12:33:09,684] {scheduler_job.py:1376} INFO - Processing each file at most -1 times
[2020-10-24 12:33:09,684] {scheduler_job.py:1379} INFO - Searching for files in /home/ec2-user/airflow/dags
[2020-10-24 12:33:09,686] {scheduler_job.py:1381} INFO - There are 0 files in /home/ec2-user/airflow/dags
[2020-10-24 12:33:09,686] {scheduler_job.py:1438} INFO - Resetting orphaned tasks for active dag runs
[2020-10-24 12:33:09,702] {dag_processing.py:562} INFO - Launched DagFileProcessorManager with pid: 12776
[2020-10-24 12:33:09,706] {settings.py:55} INFO - Configured default timezone <Timezone [UTC]>

# Open another terminal and run the following
[ec2-user@ip-10-0-2-91 airflow]$ airflow worker

 -------------- celery@ip-10-0-2-91.us-west-2.compute.internal v4.4.7 (cliffs)
--- ***** -----
-- ******* ---- Linux-4.14.193-149.317.amzn2.x86_64-x86_64-with-glibc2.2.5 2020-10-24 13:04:35
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         airflow.executors.celery_executor:0x7f9259419310
- ** ---------- .> transport:   redis://on%20~%2A%20%2B%40all@pms-airflow-redis.wwoqjw.0001.usw2.cache.amazonaws.com:6379/0
- ** ---------- .> results:     mysql://airflow:**@pms-airflow-rds-instance-1.cbk472w8gsxf.us-west-2.rds.amazonaws.com:3306/airflow
- *** --- * --- .> concurrency: 16 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> default          exchange=default(direct) key=default


[tasks]
  . airflow.executors.celery_executor.execute_command

Starting flask
           
            
# Open another terminal and run the following
[ec2-user@ip-10-0-2-91 airflow]$ airflow flower
[I 201024 12:55:41 command:140] Visit me at http://0.0.0.0:5555
[I 201024 12:55:41 command:145] Broker: redis://on%20~%2A%20%2B%40all@pms-airflow-redis.wwoqjw.0001.usw2.cache.amazonaws.com:6379/0
[I 201024 12:55:41 command:148] Registered tasks:
    ['celery.accumulate',
     'celery.backend_cleanup',
     'celery.chain',
     'celery.chord',
     'celery.chord_unlock',
     'celery.chunks',
     'celery.group',
     'celery.map',
     'celery.starmap']
[I 201024 12:55:41 mixins:229] Connected to redis://on%20~%2A%20%2B%40all@pms-airflow-redis.wwoqjw.0001.usw2.cache.amazonaws.com:6379/0

            
...

In the same way, connect to airflow-c and start the Airflow services there.

Then apply the following configuration on both airflow-a and airflow-c.

# SPOF configuration
# Earlier, the airflow server pem key was copied to ~/.ssh/ on the bastion host, and also to airflow-a and airflow-c.
# Both servers need the pem key so that the two airflow servers (airflow-a and airflow-c here) can each initiate a connection to the other.
# Now configure both airflow-a and airflow-c as follows.

# Install the airflow-scheduler-failover-controller tool from GitHub and apply it under {AIRFLOW_HOME}.
[ec2-user@ip-10-0-2-91 nginx]$ cd ~/airflow
            
[ec2-user@ip-10-0-2-91 airflow]$ sudo yum install -y git
            
[ec2-user@ip-10-0-2-91 airflow]$ sudo pip3 install git+https://github.com/teamclairvoyant/airflow-scheduler-failover-controller.git@v1.0.5            
            
[ec2-user@ip-10-0-2-91 airflow]$ scheduler_failover_controller init
Adding Scheduler Failover configs to Airflow config file...
Finished adding Scheduler Failover configs to Airflow config file.
Finished Initializing Configurations to allow Scheduler Failover Controller to run. Please update the airflow.cfg with your desired configurations.
            
[ec2-user@ip-10-0-2-91 airflow]$ sudo vim ~/airflow/airflow.cfg

Edit the scheduler_nodes_in_cluster entry as shown below.

[scheduler_failover]

scheduler_nodes_in_cluster = [IP of the other airflow server]
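
Before starting the controller, it can help to confirm that the edited value reads back as intended. The following is a minimal sketch, not part of the original material: it parses an airflow.cfg-style string with Python's standard configparser and returns the configured nodes. The function name `scheduler_nodes` and the `SAMPLE_CFG` text are hypothetical, the value is assumed to be a comma-separated list of hosts, and 10.0.3.20 is simply the peer address that appears in the test_connection output later in this post.

```python
import configparser
from typing import List

# Hypothetical sample of the section added by `scheduler_failover_controller init`.
# 10.0.3.20 is the peer address seen in the test_connection output of this post.
SAMPLE_CFG = """
[scheduler_failover]
scheduler_nodes_in_cluster = 10.0.3.20
"""


def scheduler_nodes(cfg_text: str) -> List[str]:
    """Return the hosts listed in scheduler_nodes_in_cluster.

    Assumption: the value is a comma-separated list of IPs or hostnames.
    """
    # interpolation=None because a real airflow.cfg contains literal '%' characters
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    raw = parser.get("scheduler_failover", "scheduler_nodes_in_cluster")
    # Drop surrounding whitespace and empty entries (e.g. a trailing comma)
    return [host.strip() for host in raw.split(",") if host.strip()]


if __name__ == "__main__":
    print(scheduler_nodes(SAMPLE_CFG))
```

To check the real file, the same function could be given the contents of ~/airflow/airflow.cfg instead of the sample string.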
            

# Next, set up SSH keys so that the scheduler servers can SSH into each other.
# If the scheduler will run on webserver 1 and webserver 2, SSH access must work in both directions: webserver 1 -> 2 and webserver 2 -> 1.
# Running the command below (press Enter three times at the prompts) creates id_rsa.pub in ~/.ssh/.
# Do this on both airflow-a and airflow-c.
[ec2-user@ip-10-0-2-91 airflow]$ cd ~

[ec2-user@ip-10-0-2-91 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/ec2-user/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/ec2-user/.ssh/id_rsa.
Your public key has been saved in /home/ec2-user/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:xxxxxxxxxxxxxxxxxxxxxxxxxxxxx ec2-user@ip-10-0-2-15.us-west-2.compute.internal
The key's randomart image is:
+---[RSA 2048]----+
|                  |
+----[SHA256]-----+

[ec2-user@ip-10-0-2-91 ~]$ cd ~/.ssh/

# cat ~/.ssh/id_rsa.pub | sudo ssh -i [pem key] ec2-user@[IP of the other airflow server] "cat >> ~/.ssh/authorized_keys"
[ec2-user@ip-10-0-2-91 .ssh]$ sudo cat ~/.ssh/id_rsa.pub | sudo ssh -i pms-oregon-key.pem ec2-user@[IP of the other airflow server] "cat >> ~/.ssh/authorized_keys"

# Verify that scheduler_failover_controller can reach the other node
[ec2-user@ip-10-0-2-15 .ssh]$ scheduler_failover_controller test_connection
Testing Connection for host '10.0.3.20'
The authenticity of host '10.0.3.20 (10.0.3.20)' can't be established.
ECDSA key fingerprint is SHA256:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
ECDSA key fingerprint is MD5:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
Are you sure you want to continue connecting (yes/no)? yes
(True, ["Warning: Permanently added '10.0.3.20' (ECDSA) to the list of known hosts.\r\n", 'Connection Succeeded\n'])

# Start scheduler_failover_controller on both airflow-a and airflow-c to complete the setup.
[ec2-user@ip-10-0-2-91 .ssh]$ scheduler_failover_controller start
[2020-10-25 10:25:20,487] {app.py:26} INFO - Scheduler Failover Controller Starting Up!
[2020-10-25 10:25:20,487] {app.py:29} INFO - Current Host: ip-10-0-2-15.us-west-2.compute.internal
[2020-10-25 10:25:20,487] {sql_metadata_service.py:33} INFO - Creating Metadata Table
[2020-10-25 10:25:20,543] {failover_controller.py:30} INFO - --------------------------------------
[2020-10-25 10:25:20,544] {failover_controller.py:31} INFO - Started Polling...
[2020-10-25 10:25:20,585] {failover_controller.py:37} INFO - Active Failover Node: ip-10-0-3-20.us-west-2.compute.internal
[2020-10-25 10:25:20,585] {failover_controller.py:38} INFO - Active Scheduler Node: 10.0.2.15
[2020-10-25 10:25:20,586] {failover_controller.py:39} INFO - Last Failover Heartbeat: 2020-10-25 10:25:14. Current time: 2020-10-25 10:25:20.585683.
[2020-10-25 10:25:20,586] {failover_controller.py:43} INFO - This Failover Controller is on Standby.
[2020-10-25 10:25:20,586] {failover_controller.py:46} INFO - There already is an active Failover Controller 'ip-10-0-3-20.us-west-2.compute.internal'
[2020-10-25 10:25:20,586] {failover_controller.py:133} INFO - This Failover Controller on STANDBY
[2020-10-25 10:25:20,586] {app.py:36} INFO - Finished Polling. Sleeping for 10 seconds
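
The log above shows the controller on this node comparing the last failover heartbeat with the current time and then staying on standby because another controller is already active. The sketch below illustrates that kind of heartbeat-based decision in simplified form. It is not the tool's actual implementation: the function name `next_state` and the 20-second staleness threshold are assumptions made for illustration, and only the hostnames and timestamps are taken from the log.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_state(
    this_host: str,
    active_host: Optional[str],
    last_heartbeat: Optional[datetime],
    now: datetime,
    max_heartbeat_age: timedelta,
) -> str:
    """Return "ACTIVE" or "STANDBY" for the controller running on this_host."""
    if active_host == this_host:
        return "ACTIVE"  # this node already holds the active role
    if active_host is None or last_heartbeat is None:
        return "ACTIVE"  # nobody holds the role yet, so claim it
    if now - last_heartbeat > max_heartbeat_age:
        return "ACTIVE"  # the active controller stopped sending heartbeats
    return "STANDBY"  # a healthy active controller exists elsewhere


if __name__ == "__main__":
    # Hostnames and timestamps from the log above; the threshold is an assumption.
    state = next_state(
        this_host="ip-10-0-2-15.us-west-2.compute.internal",
        active_host="ip-10-0-3-20.us-west-2.compute.internal",
        last_heartbeat=datetime(2020, 10, 25, 10, 25, 14),
        now=datetime(2020, 10, 25, 10, 25, 20, 585683),
        max_heartbeat_age=timedelta(seconds=20),
    )
    print(state)
```

With the values from the log the heartbeat is about six seconds old, so under the assumed threshold this node stays on standby, which matches what the log reports.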

Opening [bastion public ip] in a web browser shows the Airflow login screen, where you can sign in with the account created earlier.

When you connect via the bastion public IP, the URL takes the form http://[bastion public ip]/login/?next=http%3A%2F%2F[bastion public ip]%2Fhome.

In the ELB target group console you can also confirm that both nodes are healthy.

5

** Be sure to also refer to the material at the URL below.

https://github.com/minman2115/Data_engineering_studynotes_2020/blob/master/Cloud%20native%20%EC%84%9C%EB%B9%84%EC%8A%A4%EB%A5%BC%20%EC%9D%B4%EC%9A%A9%ED%95%9C%20Airflow%20%EC%95%84%ED%82%A4%ED%85%8D%EC%B2%98%20%EA%B5%AC%ED%98%84/%EC%B0%B8%EA%B3%A0%EC%9E%90%EB%A3%8C.zip