
購入の価値ありです。
もくじ
SQL1. SELECT
postgres=# SELECT * FROM payment;
payment_id | customer_id | staff_id | rental_id | amount | payment_date
------------+-------------+----------+-----------+--------+----------------------------
16050 | 269 | 2 | 7 | 1.99 | 2007-01-24 21:40:19.996577
16051 | 269 | 1 | 98 | 0.99 | 2007-01-25 15:16:50.996577
16052 | 269 | 2 | 678 | 6.99 | 2007-01-28 21:44:14.996577
16053 | 269 | 2 | 703 | 0.99 | 2007-01-29 00:58:02.996577
16054 | 269 | 1 | 750 | 4.99 | 2007-01-29 08:10:06.996577
16055 | 269 | 2 | 1099 | 2.99 | 2007-01-31 12:23:14.996577
16056 | 270 | 1 | 193 | 1.99 | 2007-01-26 05:10:14.996577
16057 | 270 | 1 | 1040 | 4.99 | 2007-01-31 04:03:42.996577
16058 | 271 | 1 | 1096 | 8.99 | 2007-01-31 11:59:15.996577
16059 | 272 | 1 | 33 | 0.99 | 2007-01-25 02:47:17.996577
(略)
postgres=# SELECT * FROM customer;
customer_id | store_id | first_name | last_name | email | address_id | activebool | create_date | last_update | active
-------------+----------+-------------+--------------+------------------------------------------+------------+------------+-------------+---------------------+--------
1 | 1 | MARY | SMITH | MARY.SMITH@sakilacustomer.org | 5 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1
2 | 1 | PATRICIA | JOHNSON | PATRICIA.JOHNSON@sakilacustomer.org | 6 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1
3 | 1 | LINDA | WILLIAMS | LINDA.WILLIAMS@sakilacustomer.org | 7 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1
4 | 2 | BARBARA | JONES | BARBARA.JONES@sakilacustomer.org | 8 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1
5 | 1 | ELIZABETH | BROWN | ELIZABETH.BROWN@sakilacustomer.org | 9 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1
6 | 2 | JENNIFER | DAVIS | JENNIFER.DAVIS@sakilacustomer.org | 10 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1
7 | 1 | MARIA | MILLER | MARIA.MILLER@sakilacustomer.org | 11 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1
8 | 2 | SUSAN | WILSON | SUSAN.WILSON@sakilacustomer.org | 12 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1
9 | 2 | MARGARET | MOORE | MARGARET.MOORE@sakilacustomer.org | 13 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1
10 | 1 | DOROTHY | TAYLOR | DOROTHY.TAYLOR@sakilacustomer.org | 14 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1
(略)
SQL2. 条件を指定する WHERE
SELECT
payment_id,
customer_id
FROM
payment
WHERE
customer_id = 1;
payment_id | customer_id
------------+-------------
16677 | 1
16678 | 1
18495 | 1
18496 | 1
18497 | 1
18498 | 1
18499 | 1
18500 | 1
18501 | 1
22680 | 1
(略)
SQL3. cutomerテーブルの名前(first_name)がKELLYのレコードを抽出
SELECT first_name, last_name FROM customer WHERE first_name = 'KELLY'; first_name | last_name ------------+----------- KELLY | TORRES KELLY | KNOTT (2 rows)
SQL4. customerテーブルから、名字(last_name)がKNOTTで名前がKELLYの顧客を抽出する
SELECT first_name, last_name FROM customer WHERE first_name = 'KELLY' AND last_name = 'KNOTT' ; first_name | last_name ------------+----------- KELLY | KNOTT (1 row)
SQL5. customerテーブルから、名前がKELLYもしくはMARIAの顧客を抽出する
SELECT first_name, last_name FROM customer WHERE first_name = 'KELLY' OR first_name = 'MARIA' ; first_name | last_name ------------+----------- MARIA | MILLER KELLY | TORRES KELLY | KNOTT (3 rows)
SQL6. customerテーブルから、名前がKELLYやMARIA以外の顧客を抽出する
SELECT
first_name,
last_name
FROM
customer
WHERE
NOT (
first_name = 'KELLY'
OR
first_name = 'MARIA'
)
;
first_name | last_name
-------------+--------------
MARY | SMITH
PATRICIA | JOHNSON
LINDA | WILLIAMS
BARBARA | JONES
ELIZABETH | BROWN
JENNIFER | DAVIS
SUSAN | WILSON
MARGARET | MOORE
DOROTHY | TAYLOR
LISA | ANDERSON
NANCY | THOMAS
KAREN | JACKSON
(略)
SQL7. OR条件を列挙して記述できるIN句。cutomerテーブルから、名前がAARON, ADAM, ANNの顧客を抽出する
SELECT
first_name,
last_name
FROM
customer
WHERE
first_name IN ('AARON', 'ADAM', 'ANN')
;
first_name | last_name
------------+-----------
ANN | EVANS
ADAM | GOOCH
AARON | SELBY
(3 rows)
SQL8. 比較演算子 paymentテーブルから支払額(amount)が6.99ドル以上のレコードを抽出する
SELECT
payment_id,
amount
FROM
payment
WHERE
amount >= 6.99
;
payment_id | amount
------------+--------
16052 | 6.99
16058 | 8.99
16060 | 6.99
16061 | 6.99
16073 | 10.99
16074 | 6.99
16082 | 6.99
16086 | 6.99
16087 | 6.99
16092 | 6.9
(略)
SQL9. paymentテーブルから、支払額(amount)が0.99ドル以外のテーブルを抽出
SELECT
payment_id,
amount
FROM
payment
WHERE
amount != 0.99
;
payment_id | amount
------------+--------
16050 | 1.99
16052 | 6.99
16054 | 4.99
16055 | 2.99
16056 | 1.99
16057 | 4.99
16058 | 8.99
SQL11. rentalテーブルのreturn_dateがNULLではないレコードを抽出する
SELECT
rental_id,
return_date
FROM
rental
WHERE
return_date IS NOT NULL
;
rental_id | return_date
-----------+---------------------
2 | 2005-05-28 19:40:33
3 | 2005-06-01 22:12:39
4 | 2005-06-03 01:43:41
5 | 2005-06-02 04:33:21
6 | 2005-05-27 01:32:07
7 | 2005-05-29 20:34:53
8 | 2005-05-27 23:33:46
(略)
SQL12. BETWEEN演算子 customテーブルから、顧客IDが11から13の顧客をBETWEENを使って抽出
SELECT
customer_id,
first_name,
last_name
FROM
customer
WHERE
customer_id BETWEEN 11 AND 13
;
customer_id | first_name | last_name
-------------+------------+-----------
11 | LISA | ANDERSON
12 | NANCY | THOMAS
13 | KAREN | JACKSON
(3 rows)
SQL13. LIKE演算子 filmテーブルのdescriptionにAmazingが含まれているレコードを抽出
SELECT
title,
description
FROM
film
WHERE
description LIKE '%Amazing%'
;
title | description
-----------------------+--------------------------------------------------------------------------------------------------------------------------------
ANNIE IDENTITY | A Amazing Panorama of a Pastry Chef And a Boat who must Escape a Woman in An Abandoned Amusement Park
ANONYMOUS HUMAN | A Amazing Reflection of a Database Administrator And a Astronaut who must Outrace a Database Administrator in A Shark Tank
BRANNIGAN SUNRISE | A Amazing Epistle of a Moose And a Crocodile who must Outrace a Dog in Berlin
BUCKET BROTHERHOOD | A Amazing Display of a Girl And a Womanizer who must Succumb a Lumberjack in A Baloon Factory
BULWORTH COMMANDMENTS | A Amazing Display of a Mad Cow And a Pioneer who must Redeem a Sumo Wrestler in The Outback
CARRIE BUNCH | A Amazing Epistle of a Student And a Astronaut who must Discover a Frisbee in The Canadian Rockies
CASABLANCA SUPER | A Amazing Panorama of a Crocodile And a Forensic Psychologist who must Pursue a Secret Agent in The First Manned Space Station
CELEBRITY HORN | A Amazing Documentary of a Secret Agent And a Astronaut who must Vanquish a Hunter in A Shark Tank
CHAMPION FLATLINERS | A Amazing Story of a Mad Cow And a Dog who must Kill a Husband in A Monastery
CLASH FREDDY | A Amazing Yarn of a Composer And a Squirrel who must Escape a Astronaut in Australia
CLONES PINOCCHIO | A Amazing Drama of a Car And a Robot who must Pursue a Dentist in New Orleans
DEEP CRUSADE | A Amazing Tale of a Crocodile And a Squirrel who must Discover a Composer in Australia
EARLY HOME | A Amazing Panorama of a Mad Scientist And a Husband who must Meet a Woman in The Outback
(略)
SQL14. NOT LIKE filmテーブルのdescriptionにAmazingが含まれていないレコードの抽出
SELECT
title,
description
FROM
film
WHERE
description NOT LIKE '%Amaziong%'
;
title | description
-----------------------------+------------------------------------------------------------------------------------------------------------------------------------
ACADEMY DINOSAUR | A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies
ACE GOLDFINGER | A Astounding Epistle of a Database Administrator And a Explorer who must Find a Car in Ancient China
ADAPTATION HOLES | A Astounding Reflection of a Lumberjack And a Car who must Sink a Lumberjack in A Baloon Factory
AFFAIR PREJUDICE | A Fanciful Documentary of a Frisbee And a Lumberjack who must Chase a Monkey in A Shark Tank
AFRICAN EGG | A Fast-Paced Documentary of a Pastry Chef And a Dentist who must Pursue a Forensic Psychologist in The Gulf of Mexico
AGENT TRUMAN | A Intrepid Panorama of a Robot And a Boy who must Escape a Sumo Wrestler in Ancient China
AIRPLANE SIERRA | A Touching Saga of a Hunter And a Butler who must Discover a Butler in A Jet Boat
AIRPORT POLLOCK | A Epic Tale of a Moose And a Girl who must Confront a Monkey in Ancient India
(略)
SQL15. COUNT() paymentテーブルの総件数を求める
SELECT COUNT(*) FROM payment; count ------- 16049 (1 row)
SQL16. DISTINCT 重複を除外。 paymentテーブルから、支払いを行ったユニークな顧客IDを求める
SELECT
DISTINCT customer_id
FROM
payment
;
customer_id
-------------
251
106
120
285
264
497
452
496
455
209
(略)
SQL17. paymentテーブルから、支払いを行ったユニークな顧客数を求める
SELECT COUNT(DISTINCT customer_id) FROM payment ; count ------- 599 (1 row)
SQL18. ORDER BY 順序を指定する。 customerテーブルから、顧客の名字を昇順に出力する
SELECT
customer_id,
last_name
FROM
customer
ORDER BY
last_name
;
customer_id | last_name
-------------+--------------
505 | ABNEY
504 | ADAM
36 | ADAMS
96 | ALEXANDER
470 | ALLARD
27 | ALLEN
220 | ALVAREZ
11 | ANDERSON
326 | ANDREW
183 | ANDREWS
449 | AQUINO
368 | ARCE
560 | ARCHULETA
188 | ARMSTRONG
(略)
SQL20. LIMIT句。 customerテーブルから、最近登録された顧客3名を抽出する
SELECT first_name, last_name FROM customer ORDER BY customer_id DESC LIMIT 3 ; first_name | last_name ------------+----------- AUSTIN | CINTRON WADE | DELVALLE FREDDIE | DUGGAN (3 rows)
SQL21. GROUP BY句。paymentテーブルから、これまでの累計で支払い回数が多い顧客の上位3人の顧客IDを抽出する
SELECT
customer_id,
COUNT(*) AS payment_count
FROM
payment
GROUP BY
customer_id
ORDER BY
payment_count DESC
LIMIT 3
;
customer_id | payment_count
-------------+---------------
148 | 46
526 | 45
236 | 42
(3 rows)
SQL22. ROUND句。paymentテーブルの売上金額(amount)をドルから円に変換して、”109″のように小数点以下を四捨五入した形で抽出。1ドル110円とする。
SELECT
ROUND(amount * 110) AS amount_yen
FROM
payment
LIMIT 3
;
amount_yen
------------
219
109
769
(3 rows)
SQL23. CONCAT句。paymentテーブルの売上金額(amount)をドルから円に変換して、”109yen”のように小数点以下を四捨五入して単位をつける。
SELECT
CONCAT(
ROUND(amount * 110),
'yen'
) AS amount_yen
FROM
payment
LIMIT 3
;
amount_yen
------------
219yen
109yen
769yen
(3 rows)
結合
SQL24. LEFT JOIN句。paymentテーブルにcustomer_idでひも付くcustomerテーブルを結合し、payment_id, last_name, first_nameカラムのデータを抽出する
SELECT
payment_id,
last_name,
first_name
FROM
payment
LEFT JOIN customer
ON payment.customer_id
= customer.customer_id
;
または
SELECT
payment_id,
last_name,
first_name
FROM
payment
LEFT JOIN customer
USING(customer_id)
;
payment_id | last_name | first_name
------------+--------------+-------------
16050 | WALTERS | CASSANDRA
16051 | WALTERS | CASSANDRA
16052 | WALTERS | CASSANDRA
16053 | WALTERS | CASSANDRA
16054 | WALTERS | CASSANDRA
16055 | WALTERS | CASSANDRA
16056 | CURTIS | LEAH
16057 | CURTIS | LEAH
16058 | NEAL | PENNY
16059 | CALDWELL | KAY
16060 | CALDWELL | KAY
16061 | CALDWELL | KAY
16062 | CALDWELL | KAY
16063 | LOWE | PRISCILLA
(略)
SQL25. INNER JOIN句。paymentテーブルから、顧客名がBRIAN WYMANの支払いデータを抽出する
SELECT
payment_id,
payment.customer_id,
amount
FROM
payment
INNER JOIN customer
ON payment.customer_id
= customer.customer_id
WHERE
first_name = 'BRIAN'
AND last_name = 'WYMAN'
;
payment_id | customer_id | amount
------------+-------------+--------
16160 | 318 | 9.99
17400 | 318 | 2.99
17401 | 318 | 2.99
17402 | 318 | 0.99
17403 | 318 | 7.99
20001 | 318 | 2.99
25744 | 318 | 4.99
25745 | 318 | 2.99
25746 | 318 | 8.99
25747 | 318 | 0.99
25748 | 318 | 0.99
25749 | 318 | 5.99
(12 rows)
SQL26. WHERE >GROUP BY > HAVINGの優先順位。 filmテーブル, film_categoryテーブル, categoryテーブルから、カテゴリ名ごとに映画の作品数を集計し、65件件以上のものを抽出する
SELECT
category.name AS name,
COUNT(category.name) AS film_cnt
FROM
film
INNER JOIN film_category
USING(film_id)
INNER JOIN category
USING(category_id)
WHERE
category.name IN(
'Sports',
'Games',
'Travel'
)
GROUP BY
category.name
HAVING
COUNT(category.name) > 60
ORDER BY
film_cnt DESC
;
name | film_cnt
--------+----------
Sports | 74
Games | 61
(2 rows)
SQL27. CASE句。paymentテーブルで、支払い額が5を超える場合はexpensive, 1を超える場合はmodest, そうでなければcheapとして一覧表示する
SELECT
payment_id,
amount,
CASE
WHEN amount > 5 THEN 'expensive'
WHEN amount > 1 THEN 'modest'
ELSE 'cheap'
END AS price_range
FROM
payment
;
payment_id | amount | price_range
------------+--------+-------------
16050 | 1.99 | modest
16051 | 0.99 | cheap
16052 | 6.99 | expensive
16053 | 0.99 | cheap
16054 | 4.99 | modest
16055 | 2.99 | modest
16056 | 1.99 | modest
16057 | 4.99 | modest
16058 | 8.99 | expensive
16059 | 0.99 | cheap
16060 | 6.99 | expensive
(略)
SQL28. POSIX正規表現。filmテーブルのdescriptionに入っているThoughtfulまたは、Insightfulのレコード数を求める。
PostgreSQLの場合
SELECT COUNT(*) FROM film WHERE description ~ '(Thou|Insi)ghtful' ;
MySQL
SELECT COUNT(*) FROM film WHERE description REGEXP ~ '(Thou|Insi)ghtful' ;
正規表現
- ~ 正規表現に一致し、大文字小文字の区別を行う
- ~* 正規表現に一致し、大文字小文字の区別をしない
- !~ 正規表現に一致しない、大文字小文字の区別を行う
- !~* 正規表現に一致しない、大文字小文字の区別をしない
集計関数
- COUNT レコード数
- SUM 合計
- AVG 平均値
- MAX 最大値
- MIN 最小値
●SUM
SELECT
SUM(amount) AS total_sales
FROM
payment
;
total_sales
-------------
67416.51
(1 row)
●SUM + GROUP BY
SELECT
customer_id,
SUM(amount) AS total_sales
FROM
payment
GROUP BY
customer_id
;
customer_id | total_sales
-------------+-------------
251 | 120.69
106 | 100.77
120 | 143.68
285 | 135.74
264 | 98.75
497 | 129.72
452 | 107.68
496 | 88.79
455 | 87.76
SQL29. paymentテーブルから、支払い額上位5名の顧客データを抽出する
SELECT
customer_id,
SUM(amount) AS total_sales
FROM
payment
GROUP BY
customer_id
ORDER BY
total_sales DESC
LIMIT 5
;
customer_id | total_sales
-------------+-------------
526 | 221.55
148 | 216.54
144 | 195.58
137 | 194.61
178 | 194.61
(5 rows)
日付の条件指定
CAST(変換したいカラム AS 変換後のデータ型)
SQL30. CAST句。paymentテーブルから、日付ごとの売上金額を集計する。
SELECT CAST(payment_date AS DATE) AS p_date, SUM(amount) AS total_sales FROM payment GROUP BY p_date ORDER BY p_date ; p_date | total_sales ------------+------------- 2007-01-24 | 86.81 2007-01-25 | 568.61 2007-01-26 | 743.30 2007-01-27 | 708.27 2007-01-28 | 793.10 2007-01-29 | 655.42 2007-01-30 | 622.41 2007-01-31 | 646.51 2007-02-14 | 152.63 2007-02-15 | 1376.49 (略)
成功!
SELECT
payment_date,
SUM(amount) AS total_sales
FROM
payment
GROUP BY
payment_date
ORDER BY
payment_date
;
payment_date | total_sales
----------------------------+-------------
2007-01-24 21:21:56.996577 | 2.99
2007-01-24 21:22:59.996577 | 2.99
2007-01-24 21:32:05.996577 | 3.99
2007-01-24 21:33:07.996577 | 4.99
2007-01-24 21:33:47.996577 | 6.99
2007-01-24 21:36:33.996577 | 0.99
2007-01-24 21:40:19.996577 | 1.99
2007-01-24 22:00:12.996577 | 4.99
(略)
payment_dateがtimestamp型のため、
日付ごとでは集計が上手くいかない!!!!
SQL31. EXTRACT句。paymentテーブルから、月別の売上金額を集計する。
SELECT EXTRACT(YEAR FROM payment_date) As YYYY, EXTRACT(MONTH FROM payment_date) AS mm, SUM(amount) AS total_sales FROM payment GROUP BY yyyy, mm ORDER BY mm ; yyyy | mm | total_sales ------+----+------------- 2007 | 1 | 4824.43 2007 | 2 | 9631.88 2007 | 3 | 23886.56 2007 | 4 | 28559.46 2007 | 5 | 514.18 (5 rows)
または、
●LEFT句 文字列から文字数分の文字を左から切り出す関数。
LEFT(文字列, 文字数)
SELECT
LEFT(
CAST(payment_date AS VARCHAR),
7
) AS yyyymm,
SUM(amount) AS total_sales
FROM
payment
GROUP BY
yyyymm
ORDER BY
yyyymm
;
yyyymm | total_sales
---------+-------------
2007-01 | 4824.43
2007-02 | 9631.88
2007-03 | 23886.56
2007-04 | 28559.46
2007-05 | 514.18
(5 rows)
SQL32. 日付の条件指定。paymentテーブルから、2007年1月の売上データを抽出する。
●よくある間違い
SELECT
SUM(amount) AS total_sales
FROM
payment
WHERE
payment_date >= '2007-01-01'
AND payment_date <= '2007-01-31'
;
total_sales
-------------
4177.92 ←少ない。SQL31参照
(1 row)
2007-01-31のデータが含まれていない結果になっています。
●理由
WHERE payment_date <= ‘2007-01-31’
これは、2007-01-31 00:00:00までのデータを取得するから。
●正しい方法①
SELECT
SUM(amount) AS total_sales
FROM
payment
WHERE
payment_date >= '2007-01-01'
AND payment_date < '2007-02-01'
;
total_sales
-------------
4824.43
(1 row)
●正しい方法② BETWEENを利用する
SELECT
SUM(amount) AS total_sales
FROM
payment
WHERE
CAST(payment_date AS DATE)
BETWEEN '2007-01-01' AND '2007-01-31'
;
total_sales
-------------
4824.43
(1 row)
●正しい方法③
SELECT
SUM(amount) AS total_sales
FROM
payment
WHERE
payment_date >= '2007-01-01'
AND CAST(payment_date AS DATE) <= '2007-01-31'
;
total_sales
-------------
4824.43
(1 row)
●正しい方法④ EXTRACT
SELECT
SUM(amount) AS total_sales
FROM
payment
WHERE
EXTRACT(YEAR FROM payment_date) = 2007
AND EXTRACT(MONTH FROM payment_date) = 1
;
total_sales
-------------
4824.43
(1 row)
複数のクエリを組み合わせた抽出 FROM, IN, EXISTS
SQL33. paymentテーブルから顧客IDごとに累計売上を合計し、1顧客あたりの平均売上、最低売上、最高売上を求める。
●過程① 顧客IDごとに累計売上を合計する
SELECT
customer_id,
SUM(amount) AS total_sales
FROM
payment
GROUP BY
customer_id
;
customer_id | total_sales
-------------+-------------
251 | 120.69
106 | 100.77
120 | 143.68
285 | 135.74
264 | 98.75
これでcustomer_id毎の合計売上は出る。
●過程② ①で作った顧客IDごとに売上を合計したテーブルがcustomer_paymentというテーブル名で存在すると想像しながら書く。
SELECT AVG(total_sales), MIN(total_sales), MAX(total_sales) FROM customer_payment ; ERROR: relation "customer_payment" does not exist LINE 6: customer_payment
※当然エラーになるが、イメージはつかめる
●過程①と②を組み合わせる。
SELECT
AVG(total_sales),
MIN(total_sales),
MAX(total_sales)
FROM
(SELECT
customer_id,
SUM(amount) AS total_sales
FROM
payment
GROUP BY
customer_id
) AS customer_payment
;
avg | min | max
----------------------+-------+--------
112.5484307178631052 | 50.85 | 221.55
(1 row)
●JOINとサブクエリの使い分けなるべくインデックスが働くJOINを利用する。
サブクエリの場合は
- サブクエリの結果テーブルにはインデックスが働かない。
- ネストが深くなり、可読性が低くなる
→重くなりがち、可読性が低くなるサブクエリの欠点を解消するWITHを利用する。
SQL34. customerテーブルとpayment_p2007_05テーブルからJOINを用いて、2007年5月に支払いのあった顧客のlast_nameを抽出する
SELECT
last_name
FROM
customer AS c
INNER JOIN payment_p2007_05 AS p
ON c.customer_id = p.customer_id
;
last_name
-------------
BROWN
MOORE
ANDERSON
WHITE
HARRIS
HARRIS
CLARK
RODRIGUEZ
LEWIS
YOUNG
HERNANDEZ
(略)
一見上手くいったかのように見えるが、
2回購入した人は2回でる。DISTINCTを使って重複を削除することもできるが、
JOINを繋げていくケースでは
JOINするごとにどんどん行が膨れ上がって、
処理時間も膨大になる。
SQL35ではこれをINを使って解決する。
SQL35. INを用いて、2007年5月に支払いのあった顧客のlast_nameを抽出する。
SELECT
last_name
FROM
customer
WHERE
customer_id IN
(SELECT
customer_id
FROM
payment_p2007_05
)
;
last_name
-------------
BROWN
MOORE
ANDERSON
WHITE
HARRIS
CLARK
RODRIGUEZ
条件はcolumn IN (query)という形になり、
queryのところにSELECTクエリがかかれます。
ここでは5月に支払いのあったcsutomer_idを抽出しており、
その結果を値として親クエリではINを用いて絞りこみを行っています。
別のテーブルに存在する値だけを抽出するEXISTS
●EXISTS句の特徴
- ある外部キーが他のテーブルに存在するかを判定
- 一般的にJOINからDISTINCTを行うのに比べ高速である
- SELECT句のカラムは結果に影響せず、慣例的に1が使われることが多い
- これまでの条件演算と同様にNOT EXISTSとして、否定条件で使うこともできる。
●INとEXISTSのサブクエリの実行結果について
IN()のサブクエリ単体で実行結果が返るが、
EXISTS()のサブクエリは文法的にエラーの為実行結果は返らない。
●INとEXISTSの使い分け
- サブクエリの結果が小さくなる場合はINを使う
- 親クエリの結果が小さくなる場合はEXISTSを使う
SQL36. payment_p2007_05テーブルから、EXISTSを用いて、2007年5月に支払いのあった顧客のlast_nameを抽出する
SELECT
last_name
FROM
customer AS c
WHERE
EXISTS (
SELECT
1
FROM
payment_p2007_05 AS p
WHERE
c.customer_id = p.customer_id
)
;
last_name
-------------
BROWN
MOORE
ANDERSON
WHITE
HARRIS
CLARK
RODRIGUEZ
問い合わせの結合 UNION, INTERSECT, EXCEPT
●結果の和集合をとる UNION
2つ以上のクエリから、その結果の和集合(どちらか、もしくは両方に含まれるもの)
を取り出したい時はUNIONで繋ぎます。
SQL37. 1月と5月の支払い履歴(payment_p2007_01/05テーブル)から、どちらかに含まれるcustomer_idを抽出する
SELECT
DISTINCT customer_id
FROM
payment_p2007_01
UNION
SELECT
DISTINCT customer_id
FROM
payment_p2007_05
;
customer_id
-------------
251
106
120
264
497
452
496
455
209
276
577
550
(略)
●結果の積集合をとる INTERSECT
同様に積集合(両方に含まれるもの)をとりたい場合は、INTERSECT
SQL38. 1月と5月の支払い履歴(payment_p2007_01/05テーブル)から、両方に含まれるcustomer_idを抽出する
SELECT
DISTINCT customer_id
FROM
payment_p2007_05
INTERSECT
SELECT
DISTINCT customer_id
FROM
payment_p2007_01
;
customer_id
-------------
251
120
337
227
497
190
452
(略)
SQL39. 1, 2, 3月の支払い履歴(payment_p2007_01/02/03テーブル)から、全てに含まれるcustomer_idを抽出する。
SELECT
DISTINCT customer_id
FROM
payment_p2007_01
INTERSECT
SELECT
DISTINCT customer_id
FROM
payment_p2007_02
INTERSECT
SELECT
DISTINCT customer_id
FROM
payment_p2007_03
;
customer_id
-------------
251
106
120
497
452
496
455
209
276
(略)
●結果の差集合をとる EXCEPT
SELECT
DISTINCT customer_id
FROM
payment_p2007_05
EXCEPT
SELECT
DISTINCT customer_id
FROM
payment_p2007_01
;
customer_id
-------------
264
410
80
175
15
585
579
568
335
229
336
178
512
188
521
592
527
163
41
(19 rows)
SQL40. +1。 1月には支払いがあって、5月には支払いがなかったcustomer_idの抽出
SELECT
DISTINCT customer_id
FROM
payment_p2007_01
EXCEPT
SELECT
DISTINCT customer_id
FROM
payment_p2007_05
;
customer_id
-------------
106
455
209
276
577
599
151
(略)
(381 rows)
SQL41. payment_p2007_01/05テーブルから、1月には支払いがあって、5月には支払いがなかったcustomer_idを昇順で3件抽出する。
SELECT
DISTINCT customer_id
FROM
payment_p2007_01
EXCEPT
SELECT
DISTINCT customer_id
FROM
payment_p2007_05
ORDER BY
customer_id ASC
LIMIT 3
;
customer_id
-------------
1
2
3
(3 rows)
複雑なサブクエリを簡略化するWITH ※MySQLでは8.0から利用可能
●WITH
WITHを使えば、実行結果を1つの仮想テーブルとして扱うことが出来ます。
SQL42. WITH句を使って、payment_p2007_01テーブルから1月に7回以上支払いのあったアクティブなcustomerのemailを抽出する
WITH loyal_customer AS (
SELECT
customer_id,
COUNT(*) AS cnt
FROM
payment_p2007_01
GROUP BY
customer_id
HAVING
COUNT(*) >=7
)
SELECT
email
FROM
customer AS c
INNER JOIN loyal_customer AS lc
ON c.customer_id = lc.customer_id
WHERE c.active = 1
;
email
----------------------------------
EDNA.WEST@sakilacustomer.org
SUE.PETERS@sakilacustomer.org
LESLIE.SEWARD@sakilacustomer.org
(3 rows)
●WITHは2つ以上でも使える
WITH table1 AS ( query ), table2 AS ( query )
部分的に集計関数を適用する ウィンドウ関数
●ウィンドウ関数
構文
function_name() OVER (ORDER BY カラム名 DESC|ASC)
SQL44. ウィンドウ関数を使って、payment\p2007_01テーブルとcustomer_listテーブルから1月の利用回数が多かった顧客をその順位と一緒に表示する。
SELECT
cl.name,
COUNT(*) AS cnt,
RANK() OVER (
ORDER BY COUNT(*) DESC
) AS ranking
FROM
payment_p2007_01 AS p
INNER JOIN customer_list AS cl
ON p.customer_id = cl.id
GROUP BY
cl.name
;
name | cnt | ranking
-----------------------+-----+---------
SUE PETERS | 8 | 1
EDNA WEST | 7 | 2
LESLIE SEWARD | 7 | 2
MINNIE ROMERO | 6 | 4
BILLY POULIN | 6 | 4
CASSANDRA WALTERS | 6 | 4
NAOMI JENNINGS | 6 | 4
HEATHER MORRIS | 6 | 4
COURTNEY DAY | 6 | 4
RUTH MARTINEZ | 6 | 4
ENRIQUE FORSYTHE | 6 | 4
VICKIE BREWER | 6 | 4
GERALDINE PERKINS | 6 | 4
GERTRUDE CASTILLO | 6 | 4
JO FOWLER | 5 | 15
TIM CARY | 5 | 15
BETTY WHITE | 5 | 15
(略)
●範囲(パーティション)を指定する PARTITION
SQL44. payment_p2007_01テーブルとcustomer_listテーブルから、1月の顧客の利用回数順位を国別に表示する
SELECT
cl.id,
cl.country,
COUNT(*) AS cnt,
RANK() OVER (
PARTITION BY cl.country
ORDER BY COUNT(*) DESC
) AS rank
FROM
payment_p2007_01 AS p
INNER JOIN customer_list AS cl
ON p.customer_id = cl.id
GROUP BY
cl.id, cl.country
;
id | country | cnt | rank
-----+---------------------------------------+-----+------
176 | Algeria | 5 | 1
69 | Algeria | 2 | 2
441 | Algeria | 1 | 3
528 | Angola | 5 | 1
383 | Angola | 2 | 2
381 | Anguilla | 3 | 1
359 | Argentina | 3 | 1
89 | Argentina | 3 | 1
322 | Argentina | 2 | 3
405 | Argentina | 2 | 3
107 | Argentina | 2 | 3
445 | Argentina | 2 | 3
331 | Argentina | 2 | 3
24 | Argentina | 2 | 3
530 | Argentina | 1 | 9
243 | Argentina | 1 | 9
560 | Argentina | 1 | 9
219 | Argentina | 1 | 9
159 | Austria | 4 | 1
173 | Austria | 2 | 2
266 | Austria | 2 | 2
334 | Azerbaijan | 2 | 1
333 | Azerbaijan | 1 | 2
590 | Bahrain | 1 | 1
21 | Bangladesh | 3 | 1
332 | Bangladesh | 2 | 2
(略)
●ウィンドウ関数で平均を求める AVG
構文
AVG(カラム名)
SELECT
cl.id,
cl.country,
COUNT(*) AS cnt,
ROUND(AVG(COUNT(*)) OVER (
PARTITION BY cl.country
), 2) AS avg_pay,
RANK() OVER (
PARTITION BY cl.country
ORDER BY COUNT(*) DESC
) AS rank
FROM
payment_p2007_01 AS p
INNER JOIN customer_list AS cl
ON p.customer_id = cl.id
GROUP BY
cl.id, cl.country
;
id | country | cnt | avg_pay | rank
-----+---------------------------------------+-----+---------+------
176 | Algeria | 5 | 2.67 | 1
69 | Algeria | 2 | 2.67 | 2
441 | Algeria | 1 | 2.67 | 3
528 | Angola | 5 | 3.50 | 1
383 | Angola | 2 | 3.50 | 2
381 | Anguilla | 3 | 3.00 | 1
89 | Argentina | 3 | 1.83 | 1
359 | Argentina | 3 | 1.83 | 1
331 | Argentina | 2 | 1.83 | 3
445 | Argentina | 2 | 1.83 | 3
24 | Argentina | 2 | 1.83 | 3
107 | Argentina | 2 | 1.83 | 3
322 | Argentina | 2 | 1.83 | 3
405 | Argentina | 2 | 1.83 | 3
243 | Argentina | 1 | 1.83 | 9
530 | Argentina | 1 | 1.83 | 9
560 | Argentina | 1 | 1.83 | 9
219 | Argentina | 1 | 1.83 | 9
159 | Austria | 4 | 2.67 | 1
266 | Austria | 2 | 2.67 | 2
173 | Austria | 2 | 2.67 | 2
334 | Azerbaijan | 2 | 1.50 | 1
(略)
●ウィンドウ関数で合計を求める SUM
SQL46. ノック45のクエリを変更し、平均回数の代わりに国ごとの合計利用回数を表示する
SUM(COUNT(*))を用いて、国ごとの合計利用回数を並べて表示させます。
SELECT
cl.id,
cl.country,
COUNT(*) AS cnt,
SUM(COUNT(*)) OVER (
PARTITION BY cl.country
) AS total_pay,
RANK() OVER (
PARTITION BY cl.country
ORDER BY COUNT(*) DESC
) AS rank
FROM
payment_p2007_01 AS p
INNER JOIN customer_list AS cl
ON p.customer_id = cl.id
GROUP BY
cl.id, cl.country
;
id | country | cnt | total_pay | rank
-----+---------------------------------------+-----+-----------+------
176 | Algeria | 5 | 8 | 1
69 | Algeria | 2 | 8 | 2
441 | Algeria | 1 | 8 | 3
528 | Angola | 5 | 7 | 1
383 | Angola | 2 | 7 | 2
381 | Anguilla | 3 | 3 | 1
89 | Argentina | 3 | 22 | 1
359 | Argentina | 3 | 22 | 1
331 | Argentina | 2 | 22 | 3
445 | Argentina | 2 | 22 | 3
24 | Argentina | 2 | 22 | 3
107 | Argentina | 2 | 22 | 3
322 | Argentina | 2 | 22 | 3
405 | Argentina | 2 | 22 | 3
243 | Argentina | 1 | 22 | 9
530 | Argentina | 1 | 22 | 9
560 | Argentina | 1 | 22 | 9
219 | Argentina | 1 | 22 | 9
159 | Austria | 4 | 8 | 1
266 | Austria | 2 | 8 | 2
173 | Austria | 2 | 8 | 2
334 | Azerbaijan | 2 | 3 | 1
333 | Azerbaijan | 1 | 3 | 2
590 | Bahrain | 1 | 1 | 1
21 | Bangladesh | 3 | 5 | 1
332 | Bangladesh | 2 | 5 | 2
144 | Belarus | 2 | 2 | 1
455 | Bolivia | 2 | 3 | 1
431 | Bolivia | 1 | 3 | 2
246 | Brazil | 5 | 50 | 1
77 | Brazil | 5 | 50 | 1
551 | Brazil | 5 | 50 | 1
307 | Brazil | 4 | 50 | 4
●累積比率の為の累積回数
累積比率 = 累積利用回数 / 全体の利用回数累積比率 = 累積利用回数 / 全体の利用回数
- ROWS BETWEENで指定できる値
- UNBOUNDED PRECEDING パーティションの最初の行
- n PRECEDING n行前
- CURRENT_ROW 現在の行
- n FOLLOWING n行後
- UNBOUNDED FOLLOWING パーティションの最後の行
SQL47. 1月の国別の利用回数を降順に抽出し、累積回数と合わせて表示する。
SELECT
cl.country,
COUNT(*) AS count,
SUM(COUNT(*)) OVER (
ORDER BY COUNT(*) DESC
ROWS BETWEEN
UNBOUNDED PRECEDING
AND CURRENT ROW
) AS cumulative_count
FROM
payment_p2007_01 AS p
INNER JOIN customer_list AS cl
ON p.customer_id = cl.id
GROUP BY
cl.country
ORDER BY
count DESC
;
country | count | cumulative_count
---------------------------------------+-------+------------------
India | 111 | 111
China | 109 | 220
United States | 70 | 290
Japan | 65 | 355
Mexico | 63 | 418
Russian Federation | 60 | 478
Brazil | 50 | 528
Philippines | 38 | 566
Indonesia | 31 | 597
Nigeria | 29 | 626
Turkey | 27 | 653
South Africa | 24 | 677
Argentina | 22 | 699
Germany | 20 | 719
Taiwan | 16 | 735
United Kingdom | 16 | 751
Iran | 16 | 767
Spain | 15 | 782
Saudi Arabia | 13 | 795
Colombia | 13 | 808
South Korea | 12 | 820
Pakistan | 12 | 832
Ukraine | 11 | 843
Poland | 11 | 854
Israel | 11 | 865
Italy | 11 | 876
Vietnam | 9 | 885
Peru | 9 | 894
Egypt | 9 | 903
Congo, The Democratic Republic of the | 8 | 911
France | 8 | 919
Austria | 8 | 927
Venezuela | 8 | 935
●累積比率
SQL48. payment_p2007_01テーブルとcustomer_listテーブルから1月の国別の利用比率を降順に並べ、累積で表示する
SELECT
cl.country,
ROUND(
SUM(COUNT(*)) OVER (
ORDER BY COUNT(*) DESC
ROWS BETWEEN
UNBOUNDED PRECEDING
AND CURRENT ROW
) / SUM(COUNT(*)) OVER (),
2
) AS cumulative_percent
FROM
payment_p2007_01 AS p
INNER JOIN customer_list AS cl
ON p.customer_id = cl.id
GROUP BY
cl.country
ORDER BY
COUNT(*) DESC
;
country | cumulative_percent
---------------------------------------+--------------------
India | 0.10
China | 0.19
United States | 0.25
Japan | 0.31
Mexico | 0.36
Russian Federation | 0.41
Brazil | 0.46
Philippines | 0.49
Indonesia | 0.52
Nigeria | 0.54
Turkey | 0.56
South Africa | 0.59
Argentina | 0.60
Germany | 0.62
Taiwan | 0.64
United Kingdom | 0.65
Iran | 0.66
Spain | 0.68
Saudi Arabia | 0.69
(略)
●ウィンドウ関数で移動平均を求める
移動平均とは主に時系列データなどに対し、
ある一定区間ごとの平均値を、、区間をずらしながら求めるもの。
例えば曜日ごとの変動の大きいデータに対して7日間移動平均を計算することで、
曜日変動を抑えた傾向をみることが出来ます。
今回はpaymentテーブルを使い、4月6日から12日の利用回数の3日移動平均を見てましょう。
まずは当該っ期間について、日別の利用回数を計算します。
SQL49. paymentテーブルを使い、2007年4月6日から12日の日別利用回数を集計する。
SELECT
CAST(payment_date AS DATE) AS d,
COUNT(*)
FROM
payment AS p
WHERE
CAST(payment_date AS DATE)
BETWEEN '2007-04-06' AND '2007-04-12'
GROUP BY
d
ORDER BY
d ASC
;
d | count
------------+-------
2007-04-06 | 486
2007-04-07 | 472
2007-04-08 | 516
2007-04-09 | 514
2007-04-10 | 482
2007-04-11 | 468
2007-04-12 | 452
(7 rows)
SQL50. paymentテーブルを使い、2007年4月6日から12日の利用回数の3日移動平均を計算する。
SELECT
CAST(payment_date AS DATE) AS d,
COUNT(*),
ROUND(AVG(COUNT(*)) OVER (
ORDER BY
CAST(payment_date AS DATE) ASC
ROWS BETWEEN
2 PRECEDING
AND CURRENT ROW
), 2) AS moving_avg
FROM
payment AS p
WHERE
CAST(payment_date AS DATE)
BETWEEN '2007-04-06' AND '2007-04-12'
GROUP BY
d
ORDER BY
d ASC
;
d | count | moving_avg
------------+-------+------------
2007-04-06 | 486 | 486.00
2007-04-07 | 472 | 479.00
2007-04-08 | 516 | 491.33
2007-04-09 | 514 | 500.67
2007-04-10 | 482 | 504.00
2007-04-11 | 468 | 488.00
2007-04-12 | 452 | 467.33
(7 rows)



![[MysQL]オンラインDDL](https://www.yuulinux.tokyo/contents/wp-content/uploads/2020/04/mysql_logo-150x150.jpg)