【干货】使用hive应该知道基础点

A小天

共 6847字,需浏览 14分钟

 ·

2021-07-29 09:34





















5a884304225d9441aed653d81c40b180.webp

正念的观察,让你又重回事物本身,看清自己的每一个动作、每一个瞬间,你将明白自己有权利快乐而从容地完成每一件事情!

《和繁重的工作一起修行》



一、数据类型

数值型
tinyint
1 byte 有符号整数
smallint
2 byte 有符号整数
int/integer
4 byte 有符号整数
bigint
8 byte 有符号整数
float4 byte 单精度浮点数
double8 byte 双精度浮点数
decimal指定小数位,最大38位精度。例decimal(5,3) 5位数保留3位小数
numeric类似decimal,从3.0.0开始支持
日期时间类型
timestamp年月日时分秒毫秒
date年月日
interval
String类型
string

varchar
char
其他类型
boolean
binary
复合类型
数组array<data_type>
mapsmap<key_type,data_type>
structsstruct<col_name:data_type[comment col_comment],...>

二、数据库操作

1.建库语法

create (database/schema) [if not exist] database_name  [comment database_comment]  [location hdfs_path]  [with dbproperties (pro_name=pro_val,...)];

2.示例

普通创建hive> create database mydb1;OKTime taken: 2.73 seconds
指定hdfs地址hive> create database mydb2 location '/opt';OKTime taken: 5.211 second
带有注释hive> create database mydb3 comment "这是数据库3";OKTime taken: 2.07 second
带有属性hive> create database mydb4 > with dbproperties('pro1'='val1','pro2'='val2');OKTime taken: 2.086 second
如果不存在就创建hive> create schema if not exists mydb2;OKTime taken: 0.017 seconds

3.查看

hive> show databases;OKdefaultmydbmydb1mydb2mydb3mydb4Time taken: 0.019 seconds, Fetched: 6 row(s)
hive> desc database mydb4;OKmydb4 hdfs://mycluster/user/hive/warehouse/mydb4.db root USERTime taken: 0.028 seconds, Fetched: 1 row(s)
hive> desc database mydb3;OKmydb3 这是数据库3 hdfs://mycluster/user/hive/warehouse/mydb3.db root USERTime taken: 0.018 seconds, Fetched: 1 row(s)

4.删库语法,默认restrict(严格模式)

drop (database/schema) [if exist] database_name[restrict|cascade];

5.删库示例

hive> drop database mydb1;OKTime taken: 2.146 secondshive> show databases;OKdefaultmydbmydb2mydb3mydb4Time taken: 0.011 seconds, Fetched: 5 row(s)

三、创建表

1.建表语法

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name    -- (Note: TEMPORARY available in Hive 0.14.0 and later)  [(col_name data_type [column_constraint_specification] [COMMENT col_comment], ... [constraint_specification])]  [COMMENT table_comment]  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]  [SKEWED BY (col_name, col_name, ...)                  -- (Note: Available in Hive 0.10.0 and later)]     ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)     [STORED AS DIRECTORIES]  [   [ROW FORMAT row_format]    [STORED AS file_format]     | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  -- (Note: Available in Hive 0.6.0 and later)  ]  [LOCATION hdfs_path]  [TBLPROPERTIES (property_name=property_value, ...)]   -- (Note: Available in Hive 0.6.0 and later)  [AS select_statement];

2.建表示例

普通建表hive> create table t_user(    > uid int comment 'id',    > uname string comment '姓名',    > sex boolean comment '性别',    > birth date comment '生日',    > likes array<string> comment '兴趣',    > other map<string,string> comment '其他'    > );OKTime taken: 2.342 seconds
指定数据在hdfs文件中的分隔符hive> create table t_user1( > uid int comment 'id', > uname string comment '姓名', > sex boolean comment '性别', > birth date comment '生日', > likes array<string> comment '兴趣', > other map<string,string> comment '其他' > ) > row format > delimited > fields terminated by ',' > collection items terminated by '-' > map keys terminated by ':' > lines terminated by '\n';OKTime taken: 5.186 seconds
指定存储格式hive> create table t_user2( > uid int comment 'id', > uname string comment '姓名', > sex boolean comment '性别', > birth date comment '生日', > likes array<string> comment '兴趣', > other map<string,string> comment '其他'    > ) > stored as orc;OKTime taken: 4.439 seconds
创建外部表hive> create external table t_user3( > uid int comment 'id', > uname string comment '姓名', > sex boolean comment '性别', > birth date comment '生日', > likes array<string> comment '兴趣', > other map<string,string> comment '其他' > );OKTime taken: 2.126 seconds
指定hdfs文件hive> create table t_user4( > uid int comment 'id', > uname string comment '姓名', > sex boolean comment '性别', > birth date comment '生日', > likes array<string> comment '兴趣', > other map<string,string> comment '其他' > ) > location '/user/root/user.txt';OKTime taken: 2.108 seconds
创建分区表hive> create table t_user5( > uid int comment 'id', > uname string comment '姓名', > birth date comment '生日', > likes array<string> comment '兴趣', > other map<string,string> comment '其他' > ) > partitioned by (sex string comment '性别',age int comment '年龄');OKTime taken: 2.11 seconds

3.表数据存储格式

stored as avro
avro格式存储数据
stored as sequencefile压缩的序列文件存储
stored as jsonfilejson文件存储
stored as orcorc文件格式,支持ACID事务并做了基于资源的优化。存储列级别的元数据
stored as parquetparquet列存储格式
stored as textfile默认,普通文本文件格式,    
stored as rcfile列记录文件格式
stored by使用非本地表格式存储。创建或链接到非本地表

4.内部表和外部表

内部表
表或分区删除了,根表或分区关联的数据和元数据一并删除
外部表在外部文件存储元数据及表结构,可以被外部进程管理和访问,可以访问存储于ASV或者远程HDFS上的数据。当表删除后不会删除数据

5.查看

hive> desc formatted t_user1;OK# col_name              data_type               comment
uid int iduname string 姓名sex boolean 性别birth date 生日likes array<string> 兴趣other map<string,string> 其他
# Detailed Table InformationDatabase: mydbOwner: rootCreateTime: Mon Jul 05 21:21:01 CST 2021LastAccessTime: UNKNOWNProtect Mode: NoneRetention: 0Location: hdfs://mycluster/user/hive/warehouse/mydb.db/t_user1Table Type: MANAGED_TABLETable Parameters: transient_lastDdlTime 1625491261
# Storage InformationSerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDeInputFormat: org.apache.hadoop.mapred.TextInputFormatOutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormatCompressed: NoNum Buckets: -1Bucket Columns: []Sort Columns: []Storage Desc Params: colelction.delim - field.delim , line.delim \n mapkey.delim : serialization.format ,Time taken: 0.12 seconds, Fetched: 35 row(s)

6.like和as建表

表1和表2表结构一样,没有数据create table table_name_1 like table_name_2;
从表2查询出col1、col2列的数据集,并建立成一个名为表1的表create table table_name1 as select col1,col2 from table_name2


至此,hive基础操作就完成了,至于查询语法和常规的数据库操作大同小异,欢迎朋友留言指正。。。

2b679458d2c56ef6ed6be03a6d2095a0.webpEND



浏览 16
点赞
评论
收藏
分享

手机扫一扫分享

分享
举报
评论
图片
表情
推荐
点赞
评论
收藏
分享

手机扫一扫分享

分享
举报