購入の価値ありです。
もくじ
SQL1. SELECT
postgres=# SELECT * FROM payment; payment_id | customer_id | staff_id | rental_id | amount | payment_date ------------+-------------+----------+-----------+--------+---------------------------- 16050 | 269 | 2 | 7 | 1.99 | 2007-01-24 21:40:19.996577 16051 | 269 | 1 | 98 | 0.99 | 2007-01-25 15:16:50.996577 16052 | 269 | 2 | 678 | 6.99 | 2007-01-28 21:44:14.996577 16053 | 269 | 2 | 703 | 0.99 | 2007-01-29 00:58:02.996577 16054 | 269 | 1 | 750 | 4.99 | 2007-01-29 08:10:06.996577 16055 | 269 | 2 | 1099 | 2.99 | 2007-01-31 12:23:14.996577 16056 | 270 | 1 | 193 | 1.99 | 2007-01-26 05:10:14.996577 16057 | 270 | 1 | 1040 | 4.99 | 2007-01-31 04:03:42.996577 16058 | 271 | 1 | 1096 | 8.99 | 2007-01-31 11:59:15.996577 16059 | 272 | 1 | 33 | 0.99 | 2007-01-25 02:47:17.996577 (略) postgres=# SELECT * FROM customer; customer_id | store_id | first_name | last_name | email | address_id | activebool | create_date | last_update | active -------------+----------+-------------+--------------+------------------------------------------+------------+------------+-------------+---------------------+-------- 1 | 1 | MARY | SMITH | MARY.SMITH@sakilacustomer.org | 5 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1 2 | 1 | PATRICIA | JOHNSON | PATRICIA.JOHNSON@sakilacustomer.org | 6 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1 3 | 1 | LINDA | WILLIAMS | LINDA.WILLIAMS@sakilacustomer.org | 7 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1 4 | 2 | BARBARA | JONES | BARBARA.JONES@sakilacustomer.org | 8 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1 5 | 1 | ELIZABETH | BROWN | ELIZABETH.BROWN@sakilacustomer.org | 9 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1 6 | 2 | JENNIFER | DAVIS | JENNIFER.DAVIS@sakilacustomer.org | 10 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1 7 | 1 | MARIA | MILLER | MARIA.MILLER@sakilacustomer.org | 11 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1 8 | 2 | SUSAN | WILSON | SUSAN.WILSON@sakilacustomer.org | 12 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1 9 | 2 | MARGARET | MOORE | MARGARET.MOORE@sakilacustomer.org | 13 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1 10 | 1 | DOROTHY | TAYLOR | DOROTHY.TAYLOR@sakilacustomer.org | 14 | t | 2006-02-14 | 2006-02-15 09:57:20 | 1 (略)
SQL2. 条件を指定する WHERE
SELECT payment_id, customer_id FROM payment WHERE customer_id = 1; payment_id | customer_id ------------+------------- 16677 | 1 16678 | 1 18495 | 1 18496 | 1 18497 | 1 18498 | 1 18499 | 1 18500 | 1 18501 | 1 22680 | 1 (略)
SQL3. cutomerテーブルの名前(first_name)がKELLYのレコードを抽出
SELECT first_name, last_name FROM customer WHERE first_name = 'KELLY'; first_name | last_name ------------+----------- KELLY | TORRES KELLY | KNOTT (2 rows)
SQL4. customerテーブルから、名字(last_name)がKNOTTで名前がKELLYの顧客を抽出する
SELECT first_name, last_name FROM customer WHERE first_name = 'KELLY' AND last_name = 'KNOTT' ; first_name | last_name ------------+----------- KELLY | KNOTT (1 row)
SQL5. customerテーブルから、名前がKELLYもしくはMARIAの顧客を抽出する
SELECT first_name, last_name FROM customer WHERE first_name = 'KELLY' OR first_name = 'MARIA' ; first_name | last_name ------------+----------- MARIA | MILLER KELLY | TORRES KELLY | KNOTT (3 rows)
SQL6. customerテーブルから、名前がKELLYやMARIA以外の顧客を抽出する
SELECT first_name, last_name FROM customer WHERE NOT ( first_name = 'KELLY' OR first_name = 'MARIA' ) ; first_name | last_name -------------+-------------- MARY | SMITH PATRICIA | JOHNSON LINDA | WILLIAMS BARBARA | JONES ELIZABETH | BROWN JENNIFER | DAVIS SUSAN | WILSON MARGARET | MOORE DOROTHY | TAYLOR LISA | ANDERSON NANCY | THOMAS KAREN | JACKSON (略)
SQL7. OR条件を列挙して記述できるIN句。cutomerテーブルから、名前がAARON, ADAM, ANNの顧客を抽出する
SELECT first_name, last_name FROM customer WHERE first_name IN ('AARON', 'ADAM', 'ANN') ; first_name | last_name ------------+----------- ANN | EVANS ADAM | GOOCH AARON | SELBY (3 rows)
SQL8. 比較演算子 paymentテーブルから支払額(amount)が6.99ドル以上のレコードを抽出する
SELECT payment_id, amount FROM payment WHERE amount >= 6.99 ; payment_id | amount ------------+-------- 16052 | 6.99 16058 | 8.99 16060 | 6.99 16061 | 6.99 16073 | 10.99 16074 | 6.99 16082 | 6.99 16086 | 6.99 16087 | 6.99 16092 | 6.9 (略)
SQL9. paymentテーブルから、支払額(amount)が0.99ドル以外のテーブルを抽出
SELECT payment_id, amount FROM payment WHERE amount != 0.99 ; payment_id | amount ------------+-------- 16050 | 1.99 16052 | 6.99 16054 | 4.99 16055 | 2.99 16056 | 1.99 16057 | 4.99 16058 | 8.99
SQL11. rentalテーブルのreturn_dateがNULLではないレコードを抽出する
SELECT rental_id, return_date FROM rental WHERE return_date IS NOT NULL ; rental_id | return_date -----------+--------------------- 2 | 2005-05-28 19:40:33 3 | 2005-06-01 22:12:39 4 | 2005-06-03 01:43:41 5 | 2005-06-02 04:33:21 6 | 2005-05-27 01:32:07 7 | 2005-05-29 20:34:53 8 | 2005-05-27 23:33:46 (略)
SQL12. BETWEEN演算子 customテーブルから、顧客IDが11から13の顧客をBETWEENを使って抽出
SELECT customer_id, first_name, last_name FROM customer WHERE customer_id BETWEEN 11 AND 13 ; customer_id | first_name | last_name -------------+------------+----------- 11 | LISA | ANDERSON 12 | NANCY | THOMAS 13 | KAREN | JACKSON (3 rows)
SQL13. LIKE演算子 filmテーブルのdescriptionにAmazingが含まれているレコードを抽出
SELECT title, description FROM film WHERE description LIKE '%Amazing%' ; title | description -----------------------+-------------------------------------------------------------------------------------------------------------------------------- ANNIE IDENTITY | A Amazing Panorama of a Pastry Chef And a Boat who must Escape a Woman in An Abandoned Amusement Park ANONYMOUS HUMAN | A Amazing Reflection of a Database Administrator And a Astronaut who must Outrace a Database Administrator in A Shark Tank BRANNIGAN SUNRISE | A Amazing Epistle of a Moose And a Crocodile who must Outrace a Dog in Berlin BUCKET BROTHERHOOD | A Amazing Display of a Girl And a Womanizer who must Succumb a Lumberjack in A Baloon Factory BULWORTH COMMANDMENTS | A Amazing Display of a Mad Cow And a Pioneer who must Redeem a Sumo Wrestler in The Outback CARRIE BUNCH | A Amazing Epistle of a Student And a Astronaut who must Discover a Frisbee in The Canadian Rockies CASABLANCA SUPER | A Amazing Panorama of a Crocodile And a Forensic Psychologist who must Pursue a Secret Agent in The First Manned Space Station CELEBRITY HORN | A Amazing Documentary of a Secret Agent And a Astronaut who must Vanquish a Hunter in A Shark Tank CHAMPION FLATLINERS | A Amazing Story of a Mad Cow And a Dog who must Kill a Husband in A Monastery CLASH FREDDY | A Amazing Yarn of a Composer And a Squirrel who must Escape a Astronaut in Australia CLONES PINOCCHIO | A Amazing Drama of a Car And a Robot who must Pursue a Dentist in New Orleans DEEP CRUSADE | A Amazing Tale of a Crocodile And a Squirrel who must Discover a Composer in Australia EARLY HOME | A Amazing Panorama of a Mad Scientist And a Husband who must Meet a Woman in The Outback (略)
SQL14. NOT LIKE filmテーブルのdescriptionにAmazingが含まれていないレコードの抽出
SELECT title, description FROM film WHERE description NOT LIKE '%Amaziong%' ; title | description -----------------------------+------------------------------------------------------------------------------------------------------------------------------------ ACADEMY DINOSAUR | A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies ACE GOLDFINGER | A Astounding Epistle of a Database Administrator And a Explorer who must Find a Car in Ancient China ADAPTATION HOLES | A Astounding Reflection of a Lumberjack And a Car who must Sink a Lumberjack in A Baloon Factory AFFAIR PREJUDICE | A Fanciful Documentary of a Frisbee And a Lumberjack who must Chase a Monkey in A Shark Tank AFRICAN EGG | A Fast-Paced Documentary of a Pastry Chef And a Dentist who must Pursue a Forensic Psychologist in The Gulf of Mexico AGENT TRUMAN | A Intrepid Panorama of a Robot And a Boy who must Escape a Sumo Wrestler in Ancient China AIRPLANE SIERRA | A Touching Saga of a Hunter And a Butler who must Discover a Butler in A Jet Boat AIRPORT POLLOCK | A Epic Tale of a Moose And a Girl who must Confront a Monkey in Ancient India (略)
SQL15. COUNT() paymentテーブルの総件数を求める
SELECT COUNT(*) FROM payment; count ------- 16049 (1 row)
SQL16. DISTINCT 重複を除外。 paymentテーブルから、支払いを行ったユニークな顧客IDを求める
SELECT DISTINCT customer_id FROM payment ; customer_id ------------- 251 106 120 285 264 497 452 496 455 209 (略)
SQL17. paymentテーブルから、支払いを行ったユニークな顧客数を求める
SELECT COUNT(DISTINCT customer_id) FROM payment ; count ------- 599 (1 row)
SQL18. ORDER BY 順序を指定する。 customerテーブルから、顧客の名字を昇順に出力する
SELECT customer_id, last_name FROM customer ORDER BY last_name ; customer_id | last_name -------------+-------------- 505 | ABNEY 504 | ADAM 36 | ADAMS 96 | ALEXANDER 470 | ALLARD 27 | ALLEN 220 | ALVAREZ 11 | ANDERSON 326 | ANDREW 183 | ANDREWS 449 | AQUINO 368 | ARCE 560 | ARCHULETA 188 | ARMSTRONG (略)
SQL20. LIMIT句。 customerテーブルから、最近登録された顧客3名を抽出する
SELECT first_name, last_name FROM customer ORDER BY customer_id DESC LIMIT 3 ; first_name | last_name ------------+----------- AUSTIN | CINTRON WADE | DELVALLE FREDDIE | DUGGAN (3 rows)
SQL21. GROUP BY句。paymentテーブルから、これまでの累計で支払い回数が多い顧客の上位3人の顧客IDを抽出する
SELECT customer_id, COUNT(*) AS payment_count FROM payment GROUP BY customer_id ORDER BY payment_count DESC LIMIT 3 ; customer_id | payment_count -------------+--------------- 148 | 46 526 | 45 236 | 42 (3 rows)
SQL22. ROUND句。paymentテーブルの売上金額(amount)をドルから円に変換して、”109″のように小数点以下を四捨五入した形で抽出。1ドル110円とする。
SELECT ROUND(amount * 110) AS amount_yen FROM payment LIMIT 3 ; amount_yen ------------ 219 109 769 (3 rows)
SQL23. CONCAT句。paymentテーブルの売上金額(amount)をドルから円に変換して、”109yen”のように小数点以下を四捨五入して単位をつける。
SELECT CONCAT( ROUND(amount * 110), 'yen' ) AS amount_yen FROM payment LIMIT 3 ; amount_yen ------------ 219yen 109yen 769yen (3 rows)
結合
SQL24. LEFT JOIN句。paymentテーブルにcustomer_idでひも付くcustomerテーブルを結合し、payment_id, last_name, first_nameカラムのデータを抽出する
SELECT payment_id, last_name, first_name FROM payment LEFT JOIN customer ON payment.customer_id = customer.customer_id ;
または
SELECT payment_id, last_name, first_name FROM payment LEFT JOIN customer USING(customer_id) ; payment_id | last_name | first_name ------------+--------------+------------- 16050 | WALTERS | CASSANDRA 16051 | WALTERS | CASSANDRA 16052 | WALTERS | CASSANDRA 16053 | WALTERS | CASSANDRA 16054 | WALTERS | CASSANDRA 16055 | WALTERS | CASSANDRA 16056 | CURTIS | LEAH 16057 | CURTIS | LEAH 16058 | NEAL | PENNY 16059 | CALDWELL | KAY 16060 | CALDWELL | KAY 16061 | CALDWELL | KAY 16062 | CALDWELL | KAY 16063 | LOWE | PRISCILLA (略)
SQL25. INNER JOIN句。paymentテーブルから、顧客名がBRIAN WYMANの支払いデータを抽出する
SELECT payment_id, payment.customer_id, amount FROM payment INNER JOIN customer ON payment.customer_id = customer.customer_id WHERE first_name = 'BRIAN' AND last_name = 'WYMAN' ; payment_id | customer_id | amount ------------+-------------+-------- 16160 | 318 | 9.99 17400 | 318 | 2.99 17401 | 318 | 2.99 17402 | 318 | 0.99 17403 | 318 | 7.99 20001 | 318 | 2.99 25744 | 318 | 4.99 25745 | 318 | 2.99 25746 | 318 | 8.99 25747 | 318 | 0.99 25748 | 318 | 0.99 25749 | 318 | 5.99 (12 rows)
SQL26. WHERE >GROUP BY > HAVINGの優先順位。 filmテーブル, film_categoryテーブル, categoryテーブルから、カテゴリ名ごとに映画の作品数を集計し、65件件以上のものを抽出する
SELECT category.name AS name, COUNT(category.name) AS film_cnt FROM film INNER JOIN film_category USING(film_id) INNER JOIN category USING(category_id) WHERE category.name IN( 'Sports', 'Games', 'Travel' ) GROUP BY category.name HAVING COUNT(category.name) > 60 ORDER BY film_cnt DESC ; name | film_cnt --------+---------- Sports | 74 Games | 61 (2 rows)
SQL27. CASE句。paymentテーブルで、支払い額が5を超える場合はexpensive, 1を超える場合はmodest, そうでなければcheapとして一覧表示する
SELECT payment_id, amount, CASE WHEN amount > 5 THEN 'expensive' WHEN amount > 1 THEN 'modest' ELSE 'cheap' END AS price_range FROM payment ; payment_id | amount | price_range ------------+--------+------------- 16050 | 1.99 | modest 16051 | 0.99 | cheap 16052 | 6.99 | expensive 16053 | 0.99 | cheap 16054 | 4.99 | modest 16055 | 2.99 | modest 16056 | 1.99 | modest 16057 | 4.99 | modest 16058 | 8.99 | expensive 16059 | 0.99 | cheap 16060 | 6.99 | expensive (略)
SQL28. POSIX正規表現。filmテーブルのdescriptionに入っているThoughtfulまたは、Insightfulのレコード数を求める。
PostgreSQLの場合
SELECT COUNT(*) FROM film WHERE description ~ '(Thou|Insi)ghtful' ;
MySQL
SELECT COUNT(*) FROM film WHERE description REGEXP ~ '(Thou|Insi)ghtful' ;
正規表現
- ~ 正規表現に一致し、大文字小文字の区別を行う
- ~* 正規表現に一致し、大文字小文字の区別をしない
- !~ 正規表現に一致しない、大文字小文字の区別を行う
- !~* 正規表現に一致しない、大文字小文字の区別をしない
集計関数
- COUNT レコード数
- SUM 合計
- AVG 平均値
- MAX 最大値
- MIN 最小値
●SUM
SELECT SUM(amount) AS total_sales FROM payment ; total_sales ------------- 67416.51 (1 row)
●SUM + GROUP BY
SELECT customer_id, SUM(amount) AS total_sales FROM payment GROUP BY customer_id ; customer_id | total_sales -------------+------------- 251 | 120.69 106 | 100.77 120 | 143.68 285 | 135.74 264 | 98.75 497 | 129.72 452 | 107.68 496 | 88.79 455 | 87.76
SQL29. paymentテーブルから、支払い額上位5名の顧客データを抽出する
SELECT customer_id, SUM(amount) AS total_sales FROM payment GROUP BY customer_id ORDER BY total_sales DESC LIMIT 5 ; customer_id | total_sales -------------+------------- 526 | 221.55 148 | 216.54 144 | 195.58 137 | 194.61 178 | 194.61 (5 rows)
日付の条件指定
CAST(変換したいカラム AS 変換後のデータ型)
SQL30. CAST句。paymentテーブルから、日付ごとの売上金額を集計する。
SELECT CAST(payment_date AS DATE) AS p_date, SUM(amount) AS total_sales FROM payment GROUP BY p_date ORDER BY p_date ; p_date | total_sales ------------+------------- 2007-01-24 | 86.81 2007-01-25 | 568.61 2007-01-26 | 743.30 2007-01-27 | 708.27 2007-01-28 | 793.10 2007-01-29 | 655.42 2007-01-30 | 622.41 2007-01-31 | 646.51 2007-02-14 | 152.63 2007-02-15 | 1376.49 (略)
成功!
SELECT payment_date, SUM(amount) AS total_sales FROM payment GROUP BY payment_date ORDER BY payment_date ; payment_date | total_sales ----------------------------+------------- 2007-01-24 21:21:56.996577 | 2.99 2007-01-24 21:22:59.996577 | 2.99 2007-01-24 21:32:05.996577 | 3.99 2007-01-24 21:33:07.996577 | 4.99 2007-01-24 21:33:47.996577 | 6.99 2007-01-24 21:36:33.996577 | 0.99 2007-01-24 21:40:19.996577 | 1.99 2007-01-24 22:00:12.996577 | 4.99 (略)
payment_dateがtimestamp型のため、
日付ごとでは集計が上手くいかない!!!!
SQL31. EXTRACT句。paymentテーブルから、月別の売上金額を集計する。
SELECT EXTRACT(YEAR FROM payment_date) As YYYY, EXTRACT(MONTH FROM payment_date) AS mm, SUM(amount) AS total_sales FROM payment GROUP BY yyyy, mm ORDER BY mm ; yyyy | mm | total_sales ------+----+------------- 2007 | 1 | 4824.43 2007 | 2 | 9631.88 2007 | 3 | 23886.56 2007 | 4 | 28559.46 2007 | 5 | 514.18 (5 rows)
または、
●LEFT句 文字列から文字数分の文字を左から切り出す関数。
LEFT(文字列, 文字数)
SELECT LEFT( CAST(payment_date AS VARCHAR), 7 ) AS yyyymm, SUM(amount) AS total_sales FROM payment GROUP BY yyyymm ORDER BY yyyymm ; yyyymm | total_sales ---------+------------- 2007-01 | 4824.43 2007-02 | 9631.88 2007-03 | 23886.56 2007-04 | 28559.46 2007-05 | 514.18 (5 rows)
SQL32. 日付の条件指定。paymentテーブルから、2007年1月の売上データを抽出する。
●よくある間違い
SELECT SUM(amount) AS total_sales FROM payment WHERE payment_date >= '2007-01-01' AND payment_date <= '2007-01-31' ; total_sales ------------- 4177.92 ←少ない。SQL31参照 (1 row)
2007-01-31のデータが含まれていない結果になっています。
●理由
WHERE payment_date <= ‘2007-01-31’
これは、2007-01-31 00:00:00までのデータを取得するから。
●正しい方法①
SELECT SUM(amount) AS total_sales FROM payment WHERE payment_date >= '2007-01-01' AND payment_date < '2007-02-01' ; total_sales ------------- 4824.43 (1 row)
●正しい方法② BETWEENを利用する
SELECT SUM(amount) AS total_sales FROM payment WHERE CAST(payment_date AS DATE) BETWEEN '2007-01-01' AND '2007-01-31' ; total_sales ------------- 4824.43 (1 row)
●正しい方法③
SELECT SUM(amount) AS total_sales FROM payment WHERE payment_date >= '2007-01-01' AND CAST(payment_date AS DATE) <= '2007-01-31' ; total_sales ------------- 4824.43 (1 row)
●正しい方法④ EXTRACT
SELECT SUM(amount) AS total_sales FROM payment WHERE EXTRACT(YEAR FROM payment_date) = 2007 AND EXTRACT(MONTH FROM payment_date) = 1 ; total_sales ------------- 4824.43 (1 row)
複数のクエリを組み合わせた抽出 FROM, IN, EXISTS
SQL33. paymentテーブルから顧客IDごとに累計売上を合計し、1顧客あたりの平均売上、最低売上、最高売上を求める。
●過程① 顧客IDごとに累計売上を合計する
SELECT customer_id, SUM(amount) AS total_sales FROM payment GROUP BY customer_id ; customer_id | total_sales -------------+------------- 251 | 120.69 106 | 100.77 120 | 143.68 285 | 135.74 264 | 98.75
これでcustomer_id毎の合計売上は出る。
●過程② ①で作った顧客IDごとに売上を合計したテーブルがcustomer_paymentというテーブル名で存在すると想像しながら書く。
SELECT AVG(total_sales), MIN(total_sales), MAX(total_sales) FROM customer_payment ; ERROR: relation "customer_payment" does not exist LINE 6: customer_payment
※当然エラーになるが、イメージはつかめる
●過程①と②を組み合わせる。
SELECT AVG(total_sales), MIN(total_sales), MAX(total_sales) FROM (SELECT customer_id, SUM(amount) AS total_sales FROM payment GROUP BY customer_id ) AS customer_payment ; avg | min | max ----------------------+-------+-------- 112.5484307178631052 | 50.85 | 221.55 (1 row)
●JOINとサブクエリの使い分けなるべくインデックスが働くJOINを利用する。
サブクエリの場合は
- サブクエリの結果テーブルにはインデックスが働かない。
- ネストが深くなり、可読性が低くなる
→重くなりがち、可読性が低くなるサブクエリの欠点を解消するWITHを利用する。
SQL34. customerテーブルとpayment_p2007_05テーブルからJOINを用いて、2007年5月に支払いのあった顧客のlast_nameを抽出する
SELECT last_name FROM customer AS c INNER JOIN payment_p2007_05 AS p ON c.customer_id = p.customer_id ; last_name ------------- BROWN MOORE ANDERSON WHITE HARRIS HARRIS CLARK RODRIGUEZ LEWIS YOUNG HERNANDEZ (略)
一見上手くいったかのように見えるが、
2回購入した人は2回でる。DISTINCTを使って重複を削除することもできるが、
JOINを繋げていくケースでは
JOINするごとにどんどん行が膨れ上がって、
処理時間も膨大になる。
SQL35ではこれをINを使って解決する。
SQL35. INを用いて、2007年5月に支払いのあった顧客のlast_nameを抽出する。
SELECT last_name FROM customer WHERE customer_id IN (SELECT customer_id FROM payment_p2007_05 ) ; last_name ------------- BROWN MOORE ANDERSON WHITE HARRIS CLARK RODRIGUEZ
条件はcolumn IN (query)という形になり、
queryのところにSELECTクエリがかかれます。
ここでは5月に支払いのあったcsutomer_idを抽出しており、
その結果を値として親クエリではINを用いて絞りこみを行っています。
別のテーブルに存在する値だけを抽出するEXISTS
●EXISTS句の特徴
- ある外部キーが他のテーブルに存在するかを判定
- 一般的にJOINからDISTINCTを行うのに比べ高速である
- SELECT句のカラムは結果に影響せず、慣例的に1が使われることが多い
- これまでの条件演算と同様にNOT EXISTSとして、否定条件で使うこともできる。
●INとEXISTSのサブクエリの実行結果について
IN()のサブクエリ単体で実行結果が返るが、
EXISTS()のサブクエリは文法的にエラーの為実行結果は返らない。
●INとEXISTSの使い分け
- サブクエリの結果が小さくなる場合はINを使う
- 親クエリの結果が小さくなる場合はEXISTSを使う
SQL36. payment_p2007_05テーブルから、EXISTSを用いて、2007年5月に支払いのあった顧客のlast_nameを抽出する
SELECT last_name FROM customer AS c WHERE EXISTS ( SELECT 1 FROM payment_p2007_05 AS p WHERE c.customer_id = p.customer_id ) ; last_name ------------- BROWN MOORE ANDERSON WHITE HARRIS CLARK RODRIGUEZ
問い合わせの結合 UNION, INTERSECT, EXCEPT
●結果の和集合をとる UNION
2つ以上のクエリから、その結果の和集合(どちらか、もしくは両方に含まれるもの)
を取り出したい時はUNIONで繋ぎます。
SQL37. 1月と5月の支払い履歴(payment_p2007_01/05テーブル)から、どちらかに含まれるcustomer_idを抽出する
SELECT DISTINCT customer_id FROM payment_p2007_01 UNION SELECT DISTINCT customer_id FROM payment_p2007_05 ; customer_id ------------- 251 106 120 264 497 452 496 455 209 276 577 550 (略)
●結果の積集合をとる INTERSECT
同様に積集合(両方に含まれるもの)をとりたい場合は、INTERSECT
SQL38. 1月と5月の支払い履歴(payment_p2007_01/05テーブル)から、両方に含まれるcustomer_idを抽出する
SELECT DISTINCT customer_id FROM payment_p2007_05 INTERSECT SELECT DISTINCT customer_id FROM payment_p2007_01 ; customer_id ------------- 251 120 337 227 497 190 452 (略)
SQL39. 1, 2, 3月の支払い履歴(payment_p2007_01/02/03テーブル)から、全てに含まれるcustomer_idを抽出する。
SELECT DISTINCT customer_id FROM payment_p2007_01 INTERSECT SELECT DISTINCT customer_id FROM payment_p2007_02 INTERSECT SELECT DISTINCT customer_id FROM payment_p2007_03 ; customer_id ------------- 251 106 120 497 452 496 455 209 276 (略)
●結果の差集合をとる EXCEPT
SELECT DISTINCT customer_id FROM payment_p2007_05 EXCEPT SELECT DISTINCT customer_id FROM payment_p2007_01 ; customer_id ------------- 264 410 80 175 15 585 579 568 335 229 336 178 512 188 521 592 527 163 41 (19 rows)
SQL40. +1。 1月には支払いがあって、5月には支払いがなかったcustomer_idの抽出
SELECT DISTINCT customer_id FROM payment_p2007_01 EXCEPT SELECT DISTINCT customer_id FROM payment_p2007_05 ; customer_id ------------- 106 455 209 276 577 599 151 (略) (381 rows)
SQL41. payment_p2007_01/05テーブルから、1月には支払いがあって、5月には支払いがなかったcustomer_idを昇順で3件抽出する。
SELECT DISTINCT customer_id FROM payment_p2007_01 EXCEPT SELECT DISTINCT customer_id FROM payment_p2007_05 ORDER BY customer_id ASC LIMIT 3 ; customer_id ------------- 1 2 3 (3 rows)
複雑なサブクエリを簡略化するWITH ※MySQLでは8.0から利用可能
●WITH
WITHを使えば、実行結果を1つの仮想テーブルとして扱うことが出来ます。
SQL42. WITH句を使って、payment_p2007_01テーブルから1月に7回以上支払いのあったアクティブなcustomerのemailを抽出する
WITH loyal_customer AS ( SELECT customer_id, COUNT(*) AS cnt FROM payment_p2007_01 GROUP BY customer_id HAVING COUNT(*) >=7 ) SELECT email FROM customer AS c INNER JOIN loyal_customer AS lc ON c.customer_id = lc.customer_id WHERE c.active = 1 ; email ---------------------------------- EDNA.WEST@sakilacustomer.org SUE.PETERS@sakilacustomer.org LESLIE.SEWARD@sakilacustomer.org (3 rows)
●WITHは2つ以上でも使える
WITH table1 AS ( query ), table2 AS ( query )
部分的に集計関数を適用する ウィンドウ関数
●ウィンドウ関数
構文
function_name() OVER (ORDER BY カラム名 DESC|ASC)
SQL44. ウィンドウ関数を使って、payment\p2007_01テーブルとcustomer_listテーブルから1月の利用回数が多かった顧客をその順位と一緒に表示する。
SELECT cl.name, COUNT(*) AS cnt, RANK() OVER ( ORDER BY COUNT(*) DESC ) AS ranking FROM payment_p2007_01 AS p INNER JOIN customer_list AS cl ON p.customer_id = cl.id GROUP BY cl.name ; name | cnt | ranking -----------------------+-----+--------- SUE PETERS | 8 | 1 EDNA WEST | 7 | 2 LESLIE SEWARD | 7 | 2 MINNIE ROMERO | 6 | 4 BILLY POULIN | 6 | 4 CASSANDRA WALTERS | 6 | 4 NAOMI JENNINGS | 6 | 4 HEATHER MORRIS | 6 | 4 COURTNEY DAY | 6 | 4 RUTH MARTINEZ | 6 | 4 ENRIQUE FORSYTHE | 6 | 4 VICKIE BREWER | 6 | 4 GERALDINE PERKINS | 6 | 4 GERTRUDE CASTILLO | 6 | 4 JO FOWLER | 5 | 15 TIM CARY | 5 | 15 BETTY WHITE | 5 | 15 (略)
●範囲(パーティション)を指定する PARTITION
SQL44. payment_p2007_01テーブルとcustomer_listテーブルから、1月の顧客の利用回数順位を国別に表示する
SELECT cl.id, cl.country, COUNT(*) AS cnt, RANK() OVER ( PARTITION BY cl.country ORDER BY COUNT(*) DESC ) AS rank FROM payment_p2007_01 AS p INNER JOIN customer_list AS cl ON p.customer_id = cl.id GROUP BY cl.id, cl.country ; id | country | cnt | rank -----+---------------------------------------+-----+------ 176 | Algeria | 5 | 1 69 | Algeria | 2 | 2 441 | Algeria | 1 | 3 528 | Angola | 5 | 1 383 | Angola | 2 | 2 381 | Anguilla | 3 | 1 359 | Argentina | 3 | 1 89 | Argentina | 3 | 1 322 | Argentina | 2 | 3 405 | Argentina | 2 | 3 107 | Argentina | 2 | 3 445 | Argentina | 2 | 3 331 | Argentina | 2 | 3 24 | Argentina | 2 | 3 530 | Argentina | 1 | 9 243 | Argentina | 1 | 9 560 | Argentina | 1 | 9 219 | Argentina | 1 | 9 159 | Austria | 4 | 1 173 | Austria | 2 | 2 266 | Austria | 2 | 2 334 | Azerbaijan | 2 | 1 333 | Azerbaijan | 1 | 2 590 | Bahrain | 1 | 1 21 | Bangladesh | 3 | 1 332 | Bangladesh | 2 | 2 (略)
●ウィンドウ関数で平均を求める AVG
構文
AVG(カラム名)
SELECT cl.id, cl.country, COUNT(*) AS cnt, ROUND(AVG(COUNT(*)) OVER ( PARTITION BY cl.country ), 2) AS avg_pay, RANK() OVER ( PARTITION BY cl.country ORDER BY COUNT(*) DESC ) AS rank FROM payment_p2007_01 AS p INNER JOIN customer_list AS cl ON p.customer_id = cl.id GROUP BY cl.id, cl.country ; id | country | cnt | avg_pay | rank -----+---------------------------------------+-----+---------+------ 176 | Algeria | 5 | 2.67 | 1 69 | Algeria | 2 | 2.67 | 2 441 | Algeria | 1 | 2.67 | 3 528 | Angola | 5 | 3.50 | 1 383 | Angola | 2 | 3.50 | 2 381 | Anguilla | 3 | 3.00 | 1 89 | Argentina | 3 | 1.83 | 1 359 | Argentina | 3 | 1.83 | 1 331 | Argentina | 2 | 1.83 | 3 445 | Argentina | 2 | 1.83 | 3 24 | Argentina | 2 | 1.83 | 3 107 | Argentina | 2 | 1.83 | 3 322 | Argentina | 2 | 1.83 | 3 405 | Argentina | 2 | 1.83 | 3 243 | Argentina | 1 | 1.83 | 9 530 | Argentina | 1 | 1.83 | 9 560 | Argentina | 1 | 1.83 | 9 219 | Argentina | 1 | 1.83 | 9 159 | Austria | 4 | 2.67 | 1 266 | Austria | 2 | 2.67 | 2 173 | Austria | 2 | 2.67 | 2 334 | Azerbaijan | 2 | 1.50 | 1 (略)
●ウィンドウ関数で合計を求める SUM
SQL46. ノック45のクエリを変更し、平均回数の代わりに国ごとの合計利用回数を表示する
SUM(COUNT(*))を用いて、国ごとの合計利用回数を並べて表示させます。
SELECT cl.id, cl.country, COUNT(*) AS cnt, SUM(COUNT(*)) OVER ( PARTITION BY cl.country ) AS total_pay, RANK() OVER ( PARTITION BY cl.country ORDER BY COUNT(*) DESC ) AS rank FROM payment_p2007_01 AS p INNER JOIN customer_list AS cl ON p.customer_id = cl.id GROUP BY cl.id, cl.country ; id | country | cnt | total_pay | rank -----+---------------------------------------+-----+-----------+------ 176 | Algeria | 5 | 8 | 1 69 | Algeria | 2 | 8 | 2 441 | Algeria | 1 | 8 | 3 528 | Angola | 5 | 7 | 1 383 | Angola | 2 | 7 | 2 381 | Anguilla | 3 | 3 | 1 89 | Argentina | 3 | 22 | 1 359 | Argentina | 3 | 22 | 1 331 | Argentina | 2 | 22 | 3 445 | Argentina | 2 | 22 | 3 24 | Argentina | 2 | 22 | 3 107 | Argentina | 2 | 22 | 3 322 | Argentina | 2 | 22 | 3 405 | Argentina | 2 | 22 | 3 243 | Argentina | 1 | 22 | 9 530 | Argentina | 1 | 22 | 9 560 | Argentina | 1 | 22 | 9 219 | Argentina | 1 | 22 | 9 159 | Austria | 4 | 8 | 1 266 | Austria | 2 | 8 | 2 173 | Austria | 2 | 8 | 2 334 | Azerbaijan | 2 | 3 | 1 333 | Azerbaijan | 1 | 3 | 2 590 | Bahrain | 1 | 1 | 1 21 | Bangladesh | 3 | 5 | 1 332 | Bangladesh | 2 | 5 | 2 144 | Belarus | 2 | 2 | 1 455 | Bolivia | 2 | 3 | 1 431 | Bolivia | 1 | 3 | 2 246 | Brazil | 5 | 50 | 1 77 | Brazil | 5 | 50 | 1 551 | Brazil | 5 | 50 | 1 307 | Brazil | 4 | 50 | 4
●累積比率の為の累積回数
累積比率 = 累積利用回数 / 全体の利用回数累積比率 = 累積利用回数 / 全体の利用回数
- ROWS BETWEENで指定できる値
- UNBOUNDED PRECEDING パーティションの最初の行
- n PRECEDING n行前
- CURRENT_ROW 現在の行
- n FOLLOWING n行後
- UNBOUNDED FOLLOWING パーティションの最後の行
SQL47. 1月の国別の利用回数を降順に抽出し、累積回数と合わせて表示する。
SELECT cl.country, COUNT(*) AS count, SUM(COUNT(*)) OVER ( ORDER BY COUNT(*) DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS cumulative_count FROM payment_p2007_01 AS p INNER JOIN customer_list AS cl ON p.customer_id = cl.id GROUP BY cl.country ORDER BY count DESC ; country | count | cumulative_count ---------------------------------------+-------+------------------ India | 111 | 111 China | 109 | 220 United States | 70 | 290 Japan | 65 | 355 Mexico | 63 | 418 Russian Federation | 60 | 478 Brazil | 50 | 528 Philippines | 38 | 566 Indonesia | 31 | 597 Nigeria | 29 | 626 Turkey | 27 | 653 South Africa | 24 | 677 Argentina | 22 | 699 Germany | 20 | 719 Taiwan | 16 | 735 United Kingdom | 16 | 751 Iran | 16 | 767 Spain | 15 | 782 Saudi Arabia | 13 | 795 Colombia | 13 | 808 South Korea | 12 | 820 Pakistan | 12 | 832 Ukraine | 11 | 843 Poland | 11 | 854 Israel | 11 | 865 Italy | 11 | 876 Vietnam | 9 | 885 Peru | 9 | 894 Egypt | 9 | 903 Congo, The Democratic Republic of the | 8 | 911 France | 8 | 919 Austria | 8 | 927 Venezuela | 8 | 935
●累積比率
SQL48. payment_p2007_01テーブルとcustomer_listテーブルから1月の国別の利用比率を降順に並べ、累積で表示する
SELECT cl.country, ROUND( SUM(COUNT(*)) OVER ( ORDER BY COUNT(*) DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) / SUM(COUNT(*)) OVER (), 2 ) AS cumulative_percent FROM payment_p2007_01 AS p INNER JOIN customer_list AS cl ON p.customer_id = cl.id GROUP BY cl.country ORDER BY COUNT(*) DESC ; country | cumulative_percent ---------------------------------------+-------------------- India | 0.10 China | 0.19 United States | 0.25 Japan | 0.31 Mexico | 0.36 Russian Federation | 0.41 Brazil | 0.46 Philippines | 0.49 Indonesia | 0.52 Nigeria | 0.54 Turkey | 0.56 South Africa | 0.59 Argentina | 0.60 Germany | 0.62 Taiwan | 0.64 United Kingdom | 0.65 Iran | 0.66 Spain | 0.68 Saudi Arabia | 0.69 (略)
●ウィンドウ関数で移動平均を求める
移動平均とは主に時系列データなどに対し、
ある一定区間ごとの平均値を、、区間をずらしながら求めるもの。
例えば曜日ごとの変動の大きいデータに対して7日間移動平均を計算することで、
曜日変動を抑えた傾向をみることが出来ます。
今回はpaymentテーブルを使い、4月6日から12日の利用回数の3日移動平均を見てましょう。
まずは当該っ期間について、日別の利用回数を計算します。
SQL49. paymentテーブルを使い、2007年4月6日から12日の日別利用回数を集計する。
SELECT CAST(payment_date AS DATE) AS d, COUNT(*) FROM payment AS p WHERE CAST(payment_date AS DATE) BETWEEN '2007-04-06' AND '2007-04-12' GROUP BY d ORDER BY d ASC ; d | count ------------+------- 2007-04-06 | 486 2007-04-07 | 472 2007-04-08 | 516 2007-04-09 | 514 2007-04-10 | 482 2007-04-11 | 468 2007-04-12 | 452 (7 rows)
SQL50. paymentテーブルを使い、2007年4月6日から12日の利用回数の3日移動平均を計算する。
SELECT CAST(payment_date AS DATE) AS d, COUNT(*), ROUND(AVG(COUNT(*)) OVER ( ORDER BY CAST(payment_date AS DATE) ASC ROWS BETWEEN 2 PRECEDING AND CURRENT ROW ), 2) AS moving_avg FROM payment AS p WHERE CAST(payment_date AS DATE) BETWEEN '2007-04-06' AND '2007-04-12' GROUP BY d ORDER BY d ASC ; d | count | moving_avg ------------+-------+------------ 2007-04-06 | 486 | 486.00 2007-04-07 | 472 | 479.00 2007-04-08 | 516 | 491.33 2007-04-09 | 514 | 500.67 2007-04-10 | 482 | 504.00 2007-04-11 | 468 | 488.00 2007-04-12 | 452 | 467.33 (7 rows)