Example queries for searching EMR logs with Athena

2021-11-09

Data_Engineering_TIL(20211109)

** Reference: https://docs.aws.amazon.com/athena/latest/ug/emr-logs.html

  • Registering the Athena table

CREATE EXTERNAL TABLE `pms-emr-log-tb`(`data` string COMMENT 'from deserializer')
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://pms-bucket-test/emr_log/'
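
Once the table exists, a quick look at a few rows can confirm that Athena is reading the log files under the LOCATION. This is a minimal sketch, assuming the table was created in the pms-glue-db database that the query later in this post uses:

```sql
-- Sanity check: show a few raw log lines together with the S3 object each one came from.
-- Assumption: the table lives in "pms-glue-db", as in the data query in this post.
SELECT "data", "$PATH" AS filepath
FROM "pms-glue-db"."pms-emr-log-tb"
LIMIT 10;
```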

  • Querying the data

The following query retrieves ERROR lines from the bootstrap-actions logs of the master node 'i-0695e89e56483fa05'.

SELECT "data","$PATH" AS filepath
FROM "pms-glue-db"."pms-emr-log-tb"
WHERE regexp_like("$PATH",'i-0695e89e56483fa05')
AND regexp_like("$PATH",'bootstrap-actions')
AND regexp_like(data,'ERROR')
limit 100;
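
To see which log files contain the most errors before reading individual lines, the same filters can be combined with an aggregation. This is a sketch rather than part of the original query set, and the error_lines alias is a name chosen here for illustration:

```sql
-- Count ERROR lines per log file for the same node, most affected files first.
-- error_lines is an illustrative alias, not a column of the table.
SELECT "$PATH" AS filepath, count(*) AS error_lines
FROM "pms-glue-db"."pms-emr-log-tb"
WHERE regexp_like("$PATH",'i-0695e89e56483fa05')
AND regexp_like(data,'ERROR')
GROUP BY "$PATH"
ORDER BY error_lines DESC
LIMIT 100;
```

Dropping the bootstrap-actions condition, as done here, widens the search to every log directory for that instance.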