SQL太难?你离完全理解 SQL 就差这10步!
点击上方“数据管道”,选择“置顶星标”公众号
干货福利,第一时间送达
在工作中会用到 SQL 但是对它并不完全了解的人
能够熟练使用 SQL 但是并不了解其语法逻辑的人
想要教别人 SQL 的人
1
SQL 是一种声明式语言
SELECT first_name, last_name FROM employees WHERE salary > 100000
2
SQL 的语法并不按照语法顺序执行
SELECT[DISTINCT]
FROM
WHERE
GROUP BY
HAVING
UNION
ORDER BY
FROM
WHERE
GROUP BY
HAVING
SELECT
DISTINCT
UNION
ORDER BY
SELECT A.x + A.y AS z
FROM A
WHERE z = 10 -- z 在此处不可用,因为SELECT是最后执行的语句!
SELECT A.x + A.y AS z
FROM A
WHERE (A.x + A.y) = 10
注意:并非所有的数据库对 SQL 语句使用相同的解析方式。如 MySQL、PostgreSQL和 SQLite 中就不会按照上面第二点中所说的方式执行。
3
SQL 语言的核心是对表的引用
<from clause> ::= FROM <table reference> [ { <comma> <table reference> }... ]
FROM a, b
4
灵活引用表能使 SQL 语句变得更强大
<table reference> ::=
<table name>
| <derived table>
| <joined table>
FROM a, b
a1 JOIN a2 ON a1.id = a2.id
FROM a1 JOIN a2 ON a1.id = a2.id, b
5
SQL 语句中推荐使用表连接
FROM a, b
FROM a, b, c, d, e, f, g, h
WHERE a.a1 = b.bx
AND a.a2 = c.c1
AND d.d1 = b.bc
etc...
安全。JOIN 和要连接的表离得非常近,这样就能避免错误
更多连接的方式,JOIN 语句能去区分出来外连接和内连接等
6
简约框线标题
EQUI JOIN
SEMI JOIN
ANTI JOIN
CROSS JOIN
DIVISION
INNER JOIN(或者是 JOIN )
OUTER JOIN(包括:LEFT 、 RIGHT、 FULL OUTER JOIN)
-- This table reference contains authors and their books.
-- There is one record for each book and its author.
-- authors without books are NOT included
author JOIN book ON author.id = book.author_id
-- This table reference contains authors and their books
-- There is one record for each book and its author.
-- ... OR there is an "empty" record for authors without books
-- ("empty" meaning that all book columns are NULL)
author LEFT OUTER JOIN book ON author.id = book.author_id
-- Using IN
FROM author
WHERE author.id IN (SELECT book.author_id FROM book)
-- Using EXISTS
FROM author
WHERE EXISTS (SELECT 1 FROM book WHERE book.author_id = author.id)
IN比 EXISTS 的可读性更好
EXISTS 比IN 的表达性更好(更适合复杂的语句)
二者之间性能没有差异(但对于某些数据库来说性能差异会非常大)
-- Find only those authors who also have books
SELECT DISTINCT first_name, last_name
FROM author
JOIN book ON author.id = book.author_id
SQL 语句性能低下:因为去重操作( DISTINCT )需要数据库重复从硬盘中读取数据到内存中。(译者注:DISTINCT 的确是一种很耗费资源的操作,但是每种数据库对于 DISTINCT 的操作方式可能不同)。
这么写并非完全正确:尽管也许现在这么写不会出现问题,但是随着 SQL 语句变得越来越复杂,你想要去重得到正确的结果就变得十分困难。
-- Using IN
FROM author
WHERE author.id NOT IN (SELECT book.author_id FROM book)
-- Using EXISTS
FROM author
WHERE NOT EXISTS (SELECT 1 FROM book WHERE book.author_id = author.id)
-- Combine every author with every book
author CROSS JOIN book
7
SQL 中如同变量的派生表
-- A derived table
FROM (SELECT * FROM author)
-- A derived table with an alias
FROM (SELECT * FROM author) a
-- Get authors' first and last names, and their age in days
SELECT first_name, last_name, age
FROM (
SELECT first_name, last_name, current_date - date_of_birth age
FROM author
)
-- If the age is greater than 10000 days
WHERE age > 10000
WITH a AS (
SELECT first_name, last_name, current_date - date_of_birth age
FROM author
)
SELECT *
FROM a
WHERE age > 10000
8
SQL 语句中GROUP BY是对表的引用进行的操作
FROM a, b
GROUP BY A.x, A.y, B.z
SELECT A.x, A.y, SUM(A.z)
FROM A
GROUP BY A.x, A.y
9
SQL 语句中的 SELECT 实质上是对关系的映射
你仅能够使用那些能通过表引用而得来的字段
如果你有 GROUP BY 语句,你只能够使用 GROUP BY 语句后面的字段或者聚合函数
当你的语句中没有 GROUP BY 的时候,可以使用开窗函数代替聚合函数
当你的语句中没有 GROUP BY 的时候,你不能同时使用聚合函数和其它函数
有一些方法可以将普通函数封装在聚合函数中
凭直觉,这种做法从逻辑上就讲不通
如果直觉不能够说服你,那么语法肯定能。SQL : 1999 标准引入了 GROUPING SETS,SQL:2003 标准引入了 group sets : GROUP BY()
10
SQL 语句中的几个简单的关键词:DISTINCT,UNION ,ORDER BY 和 OFFSET
集合运算( DISTINCT 和 UNION )
排序运算( ORDER BY,OFFSET…FETCH)
集合运算( set operation)
DISTINCT 在映射之后对数据进行去重
UNION 将两个子查询拼接起来并去重
UNION ALL 将两个子查询拼接起来但不去重
EXCEPT 将第二个字查询中的结果从第一个子查询中去掉
INTERSECT 保留两个子查询中都有的结果并去重