<데이터 가공을 위한 SQL 값 조작>
1. 코드 값을 레이블로 변환 : CASE 식
SELECT
user_id
, CASE
WHEN register_device = 1 THEN '데스크톱'
WHEN register_device = 2 THEN '스마트폰'
WHEN register_device = 3 THEN '애플리케이션'
ELSE '' -- 디폴트값 의미
END AS device_name
FROM mst_users
;
2. 지정한 값의 날짜/시각 데이터 추출하기
-> 날짜 자료형과 타임스탬프 자료형으로 변환
미들웨어에 따라 다양한 방법 : CAST 함수가 가장 범용적
SELECT
-- PostgreSQL, Hive, Redshift, Bigquery, SparkSQL 모두 CAST(value AS type) 사용 가능
CAST('2016-01-30' AS date) AS dt
, CAST('2016-01-30 12:00:00' AS timestamp) AS stamp
-- PostgreSQL, Hive, Redshift, Bigquery, SparkSQL 모두 type value 사용 가능
date '2016-01-30' AS dt
, timestamp '2016-01-30 12:00:00' AS stamp
-- Hive, Bigquery 사용
date('2016-01-30') AS dt
, timestamp('2016-01-30 12:00:00') AS stamp
-- PostgreSQL, Redshift 'value::type' 사용
'2016-01-30'::date AS dt
, '2016-01-30 12:00:00' ::timestamp AS stamp
타임스탬프 자료형에서 연,월,일,시간 값을 추출할 때 -> EXTRACT 함수 사용 (hive, SparkSQL 제외)
SELECT
-- PostgreSQL, Redshift, Bigquery : EXTRACT 함수 사용하기
EXTRACT(YEAR FROM stamp) AS year
, EXTRACT(MONTH FROM stamp) AS month
, EXTRACT(DAY FROM stamp) AS day
, EXTRACT(HOUR FROM stamp) AS hour
-- Hive, SparkSQL 함수 사용하기
YEAR(stamp) AS year
, MONTH(stamp) AS month
, DAY(stamp) AS day
, HOUR(stamp) AS hour
FROM
(SELECT CAST('2016-01-30 12:00:00' AS timestamp) AS stamp) AS t
;
문자열에서 연,월,일,시간 추출 -> SUBSTRING
SELECT
stamp
-- PostgreSQL, Hive, Redshift, Bigquery : SUBSTRING 함수 사용하기
substring(stamp, 1, 4) AS year
, substring(stamp, 6, 2) AS month
, substring(stamp, 9, 2) AS day
, substring(stamp, 12, 2) AS hour
, substring(stamp, 1, 7) AS year_month
-- PostgreSQL, Hive, Bigquery, SparkSQL : SUBSTR 함수 사용하기
SUBSTR(stamp, 1, 4) AS year
, SUBSTR(stamp, 6, 2) AS month
, SUBSTR(stamp, 9, 2) AS day
, SUBSTR(stamp, 12, 2) AS hour
, SUBSTR(stamp, 1, 7) AS year_month
FROM
-- PostgreSQL, Redshift CAST 방식 : text
(SELECT CAST('2016-01-30 12:00:00' AS text) AS stamp) AS t
-- Hive, Bigquery, SparkSQL CAST 방식 : string
(SELECT CAST('2016-01-30 12:00:00' AS string) AS stamp) AS t
;
3. NULL 값 대체하기 : COALESCE 함수
- 문자열 + NULL = NULL
- 숫자 + NULL = NULL
--NULL값을 방지하기 위해 0으로 대치
SELECT
purchase_id
, amount - COALESCE(coupon,0) AS discount_amount2
FROM
purchase_log_with_coupon
;
출처
[한빛미디어] 데이터 분석을 위한 SQL 레시피 - 가사키 나가토, 다미야 나오토, 윤인성 옮김
'Career > TIL' 카테고리의 다른 글
2024-07-12 (0) | 2024.07.13 |
---|---|
2024-07-08 (0) | 2024.07.09 |
댓글