An Introduction to Presto and a Summary of Common Query Optimization Techniques
程序源代码
2021-12-09 18:44
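Key characteristics of Presto: fully in-memory parallel computation; pipelined execution; data-local computation; dynamically compiled execution plans; careful use of memory and data structures; GC control; no fault tolerance.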
[GOOD]: SELECT ... GROUP BY uid, gender
[BAD]: SELECT ... GROUP BY gender, uid
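The [GOOD]/[BAD] pair above differs only in the order of the GROUP BY columns. A minimal sketch of the [GOOD] form as a complete statement follows. It assumes the intended rule is to list grouping columns from most to fewest distinct values. The table name `users` and the `count(*)` aggregate are hypothetical, chosen only for illustration.

```sql
-- Hypothetical table: users(uid, gender).
-- Assumption: uid has many distinct values and gender only a few,
-- so uid is listed first in GROUP BY.
SELECT uid, gender, count(*) AS cnt
FROM users
GROUP BY uid, gender;
```

Both column orders return the same groups. Any difference is in execution cost, so it is worth measuring on real data before relying on it.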
[GOOD]
SELECT ...
FROM access
WHERE regexp_like(method, 'GET|POST|PUT|DELETE')
[BAD]
SELECT ...
FROM access
WHERE
method LIKE '%GET%' OR
method LIKE '%POST%' OR
method LIKE '%PUT%' OR
method LIKE '%DELETE%'
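Both filters above keep rows whose `method` contains any of the four substrings, because `regexp_like` matches when the pattern occurs anywhere in the string. The sketch below checks this on inline data by evaluating both predicates side by side. The sample values and the aliases `t`, `via_regex`, and `via_like` are made up for illustration.

```sql
-- Inline sample data (hypothetical); each row is one value of method.
SELECT
    method,
    regexp_like(method, 'GET|POST|PUT|DELETE') AS via_regex,
    (method LIKE '%GET%' OR
     method LIKE '%POST%' OR
     method LIKE '%PUT%' OR
     method LIKE '%DELETE%') AS via_like
FROM (VALUES ('GET'), ('POST /x'), ('HEAD'), ('OPTIONS')) AS t (method);
-- 'GET' and 'POST /x' yield true in both columns;
-- 'HEAD' and 'OPTIONS' yield false in both columns.
```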
SET SESSION distributed_join = 'true';
SELECT ...
FROM
large_table1
join large_table2
on large_table1.id = large_table2.id
SELECT ...
FROM
t1
JOIN t2
ON t1.a1 = t2.a1 OR
t1.a2 = t2.a2
Rewrite as:
SELECT ...
FROM
t1
JOIN t2
ON t1.a1 = t2.a1
union
SELECT ...
FROM
t1
JOIN t2
ON t1.a2 = t2.a2
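A small worked example may help show why the rewrite uses `union` rather than `UNION ALL`. This is a sketch with made-up inline data; the `id` columns and the output aliases are hypothetical and stand in for whatever the elided select list contains.

```sql
-- Hypothetical inline data standing in for t1 and t2.
WITH t1 AS (
    SELECT * FROM (VALUES (1, 'x', 'p'), (2, 'y', 'q')) AS v (id, a1, a2)
),
t2 AS (
    SELECT * FROM (VALUES (10, 'x', 'p'), (11, 'z', 'q')) AS v (id, a1, a2)
)
SELECT t1.id AS t1_id, t2.id AS t2_id
FROM t1 JOIN t2 ON t1.a1 = t2.a1   -- matches (1, 10)
UNION
SELECT t1.id AS t1_id, t2.id AS t2_id
FROM t1 JOIN t2 ON t1.a2 = t2.a2;  -- matches (1, 10) and (2, 11)
-- Result (row order not guaranteed): (1, 10) and (2, 11).
```

The pair (1, 10) satisfies both join conditions. The OR join returns it once, and `UNION` removes the second copy that `UNION ALL` would keep. One caveat: `UNION` also collapses rows that are identical for other reasons, so the two forms agree only when the select list identifies each joined pair uniquely, as the two `id` columns do here.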
WITH tmp AS (
SELECT DISTINCT a1, a2
FROM t2
)
SELECT ...
FROM t1
JOIN tmp
ON t1.a1 = tmp.a1
union
SELECT ...
FROM t1
JOIN tmp
ON t1.a2 = tmp.a2;