Example queries for searching EMR logs with Athena
2021-11-09
Data_Engineering_TIL(20211109)
** Reference: https://docs.aws.amazon.com/athena/latest/ug/emr-logs.html
- Register the Athena table
CREATE EXTERNAL TABLE `pms-emr-log-tb`(`data` string COMMENT 'from deserializer')
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://pms-bucket-test/emr_log/'
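Before filtering for errors, it can help to confirm that the table actually sees the log files under the S3 location. The following is a minimal sketch, assuming the table was created in the `pms-glue-db` database (the one the data query in this post refers to); the alias `filepath` and the limit of 20 are arbitrary choices for illustration.

```sql
-- Sanity check (sketch): list some of the S3 objects the table reads from.
-- "$path" is Athena's pseudo-column holding the source file of each row.
SELECT DISTINCT "$path" AS filepath
FROM "pms-glue-db"."pms-emr-log-tb"
LIMIT 20;
```

If this returns no rows, the LOCATION prefix is probably wrong or the logs have not been delivered to the bucket yet.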
- Query the data
Query that retrieves ERROR lines from the bootstrap-actions logs of the master node 'i-0695e89e56483fa05'
SELECT "data", "$PATH" AS filepath
FROM "pms-glue-db"."pms-emr-log-tb"
WHERE regexp_like("$PATH", 'i-0695e89e56483fa05')
  AND regexp_like("$PATH", 'bootstrap-actions')
  AND regexp_like(data, 'ERROR')
LIMIT 100;
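When many files match, a per-file count can show where the errors are concentrated before reading individual lines. This is a sketch of one possible variation on the query above, using the same table and filters; the column alias `error_lines` is a hypothetical name, and the pattern `'ERROR|Exception'` is borrowed from the AWS reference page rather than from this post's original query.

```sql
-- Variation (sketch): count matching lines per log file instead of listing them.
SELECT "$path" AS filepath,
       count(*) AS error_lines
FROM "pms-glue-db"."pms-emr-log-tb"
WHERE regexp_like("$path", 'i-0695e89e56483fa05')
  AND regexp_like("$path", 'bootstrap-actions')
  AND regexp_like(data, 'ERROR|Exception')   -- broader pattern; an assumption, adjust as needed
GROUP BY "$path"
ORDER BY error_lines DESC
LIMIT 100;
```

Files at the top of the result are the ones worth opening first with the line-level query.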