2021年ClickHouse最王炸功能来袭,性能轻松提升40倍
大数据DT
共 19804字,需浏览 40分钟
·
2021-09-18 11:44
导读:Projection来了。
Order BY A,B,C
part-level 存储 相比普通物化视图是一张独立的表,Projection 物化的数据就保存在原表的分区目录中,支持明细数据的普通Projection 和 预聚合Projection 无感使用,自动命中 可以对一张 MergeTree 创建多个 Projection ,当执行 Select 语句的时候,能根据查询范围,自动匹配最优的 Projection 提供查询加速。如果没有命中 Projection , 就直接查询底表。 数据同源、同生共死 因为物化的数据保存在原表的分区,所以数据的更新、合并都是同源的,也就不会出现不一致的情况了
SELECT count(*)
FROM hits_100m_obfuscated
Query id: 813ba930-d299-47d8-9ac3-6d7dbde075b1
┌───count()─┐
│ 100000000 │
└───────────┘
1 rows in set. Elapsed: 0.004 sec.
ENGINE = MergeTree
PARTITION BY toYYYYMM(EventDate)
ORDER BY (CounterID, EventDate, intHash32(UserID), EventTime)
SELECT WatchID
FROM hits_100m_obfuscated
WHERE WatchID = 5814563137538961516
Query id: 20110b52-cac0-43b7-baf6-1931b94864a6
┌─────────────WatchID─┐
│ 5814563137538961516 │
└─────────────────────┘
1 rows in set. Elapsed: 0.262 sec. Processed 100.00 million rows, 800.00 MB (380.95 million rows/s., 3.05 GB/s.)
ALTER TABLE hits_100m_obfuscated ADD PROJECTION p1
(
SELECT
WatchID,Title
ORDER BY WatchID
)
alter table hits_100m_obfuscated MATERIALIZE PROJECTION p1
SELECT
table,
mutation_id,
command,
is_done
FROM system.mutations AS m
WHERE is_done = 0
Query id: 7ddc855a-acb5-4ca9-8c48-ad4f5a7b234e
┌─table────────────────┬─mutation_id─────┬─command───────────────────┬─is_done─┐
│ hits_100m_obfuscated │ mutation_99.txt │ MATERIALIZE PROJECTION p1 │ 0 │
└──────────────────────┴─────────────────┴───────────────────────────┴─────────┘
1 rows in set. Elapsed: 0.005 sec.
cd /data/default/hits_100m_obfuscated/201307_1_96_4_107/p1.proj
[root@ch9 p1.proj]# ll
total 5187772
-rw-r-----. 1 clickhouse clickhouse 278 Sep 8 23:43 checksums.txt
-rw-r-----. 1 clickhouse clickhouse 69 Sep 8 23:43 columns.txt
-rw-r-----. 1 clickhouse clickhouse 9 Sep 8 23:43 count.txt
-rw-r-----. 1 clickhouse clickhouse 10 Sep 8 23:43 default_compression_codec.txt
-rw-r-----. 1 clickhouse clickhouse 97672 Sep 8 23:43 primary.idx
-rw-r-----. 1 clickhouse clickhouse 4508224709 Sep 8 23:43 Title.bin
-rw-r-----. 1 clickhouse clickhouse 293016 Sep 8 23:43 Title.mrk2
-rw-r-----. 1 clickhouse clickhouse 803340103 Sep 8 23:43 WatchID.bin
-rw-r-----. 1 clickhouse clickhouse 293016 Sep 8 23:43 WatchID.mrk2
SET allow_experimental_projection_optimization = 1;
SELECT WatchID
FROM hits_100m_obfuscated
WHERE WatchID = 5814563137538961516
Query id: 38d2aa48-45da-4487-ab80-1cd02ee08ce2
┌─────────────WatchID─┐
│ 5814563137538961516 │
└─────────────────────┘
1 rows in set. Elapsed: 0.006 sec. Processed 8.19 thousand rows, 65.54 KB (1.41 million rows/s., 11.27 MB/s.)
SELECT
UserID,
SearchPhrase,
count()
FROM hits_100m_obfuscated
GROUP BY
UserID,
SearchPhrase
LIMIT 10
Query id: 42c941e0-c15a-4206-9c1b-7350a5a67984
┌───────────────UserID─┬─SearchPhrase─────────────────────────────────────────────────┬─count()─┐
│ 64240392369242065 │ │ 1 │
│ 2542641703475366060 │ galaxy s4 activerstovmamasumi x2 │ 3 │
│ 14973463213479722228 │ │ 17 │
│ 6604743450870066038 │ │ 1 │
│ 325929602194382277 │ вес гриппи игре aventity of wars 2 в в играть │ 1 │
│ 5481644077966220011 │ как леченский рецепты как почему конкая лето москва отдых на │ 1 │
│ 5965198553492672379 │ │ 1 │
│ 119657425828985633 │ │ 1 │
│ 8462750442030450647 │ рулонасточный+статив зомбинет магазин на айресу батл │ 1 │
│ 7510587892824469257 │ sia 265 сезон 6 серии │ 1 │
└──────────────────────┴──────────────────────────────────────────────────────────────┴─────────┘
10 rows in set. Elapsed: 2.190 sec. Processed 100.00 million rows, 2.44 GB (45.66 million rows/s., 1.11 GB/s.)
ALTER TABLE hits_100m_obfuscated ADD PROJECTION agg_p2
(
SELECT
UserID,
SearchPhrase,
count()
GROUP BY UserID, SearchPhrase
)
alter table hits_100m_obfuscated MATERIALIZE PROJECTION agg_p2
SELECT
UserID,
SearchPhrase,
count()
FROM hits_100m_obfuscated
GROUP BY
UserID,
SearchPhrase
LIMIT 10
Query id: 258e556e-ea5b-43f0-980a-997c02abc233
┌───────────────UserID─┬─SearchPhrase─────────────────────────────────────────────────┬─count()─┐
│ 64240392369242065 │ │ 1 │
│ 2542641703475366060 │ galaxy s4 activerstovmamasumi x2 │ 3 │
│ 14973463213479722228 │ │ 17 │
│ 6604743450870066038 │ │ 1 │
│ 325929602194382277 │ вес гриппи игре aventity of wars 2 в в играть │ 1 │
│ 5481644077966220011 │ как леченский рецепты как почему конкая лето москва отдых на │ 1 │
│ 5965198553492672379 │ │ 1 │
│ 119657425828985633 │ │ 1 │
│ 8462750442030450647 │ рулонасточный+статив зомбинет магазин на айресу батл │ 1 │
│ 7510587892824469257 │ sia 265 сезон 6 серии │ 1 │
└──────────────────────┴──────────────────────────────────────────────────────────────┴─────────┘
10 rows in set. Elapsed: 1.847 sec. Processed 24.07 million rows, 1.58 GB (13.04 million rows/s., 856.09 MB/s.)
SELECT
name,
partition,
formatReadableSize(bytes_on_disk) AS bytes,
formatReadableSize(parent_bytes_on_disk) AS parent_bytes,
parent_rows,
rows / parent_rows AS ratio
FROM system.projection_parts
Query id: 2887b0e1-b984-4274-862c-0b59c68693c5
┌─name───┬─partition─┬─bytes──────┬─parent_bytes─┬─parent_rows─┬──────ratio─┐
│ agg_p2 │ 201307 │ 490.40 MiB │ 14.06 GiB │ 100000000 │ 0.24070565 │
│ p1 │ 201307 │ 4.95 GiB │ 18.53 GiB │ 100000000 │ 1 │
└────────┴───────────┴────────────┴──────────────┴─────────────┴────────────┘
ALTER TABLE hits_100m_obfuscated DROP PROJECTION p1
ALTER TABLE hits_100m_obfuscated DROP PROJECTION agg_p2
CREATE TABLE xxx
(
`event_key` String,
`user` UInt32,
`dim1` String,
PROJECTION p1
(
SELECT
groupBitmap(user),
count(1)
GROUP BY dim1
)
)
ENGINE = MergeTree()
ORDER BY (event_key, user)
设置了 SET allow_experimental_projection_optimization = 1 返回的数据行小于基表总数 查询覆盖的分区 part 超过一半 Where 必须是 PROJECTION 定义中 GROUP BY 的子集 GROUP BY 必须是 PROJECTION 定义中 GROUP BY 的子集 SELECT 必须是 PROJECTION 定义中 SELECT 的子集 匹配多个 PROJECTION 的时候,选取读取 part 最少的
EXPLAIN
SELECT WatchID
FROM hits_100m_obfuscated
WHERE WatchID = 5814563137538961516
Query id: bf008e69-fd68-4928-83f6-a57a2d84e286
┌─explain───────────────────────────────────────────────────────────────────┐
│ Expression ((Projection + Before ORDER BY)) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromStorage (MergeTree(with 0 projection p1)) │
└───────────────────────────────────────────────────────────────────────────┘
(SelectExecutor): Choose normal projection p3
(SelectExecutor): projection required columns: dim1, dim3, event_time, dim2, event_key, user
(SelectExecutor): Key condition: (column 0 in ['dim12', 'dim12'])
评论