赞
踩
MSCK SQL 语法如下:
MSCK [REPAIR] TABLE table_name [ADD/DROP/SYNC PARTITIONS];
先创建3个分区,把分区文件删除。这时 metastore 有这个3个分区,文件上不存在。再在文件系统上创建其他两个分区,这两个分区在 metastore 不存在。
drop table if exists t_drop_partitions;
create table t_drop_partitions(c1 string)partitioned by (pt string,hr string) stored as textfile location '/tmp/t_drop_partitions';
alter table t_drop_partitions add partition(pt='20230101',hr='00') location '/tmp/t_drop_partitions/pt=20230101/hr=00';
alter table t_drop_partitions add partition(pt='20230101',hr='01') location '/tmp/t_drop_partitions/pt=20230101/hr=01';
alter table t_drop_partitions add partition(pt='20230101',hr='02') location '/tmp/t_drop_partitions/pt=20230101/hr=02';
! hadoop fs -rm -r /tmp/t_drop_partitions/pt=20230101/hr=00;
! hadoop fs -rm -r /tmp/t_drop_partitions/pt=20230101/hr=01;
! hadoop fs -rm -r /tmp/t_drop_partitions/pt=20230101/hr=02;
! hadoop fs -mkdir -p /tmp/t_drop_partitions/pt=20230101/hr=03;
! hadoop fs -mkdir -p /tmp/t_drop_partitions/pt=20230101/hr=04;
hive> show partitions t_drop_partitions;
OK
pt=20230101/hr=00
pt=20230101/hr=01
pt=20230101/hr=02
! hadoop fs -ls /tmp/t_drop_partitions/pt=20230101;
Found 2 items
drwxr-xr-x - hive hadoop 0 2023-04-23 14:52 /tmp/t_drop_partitions/pt=20230101/hr=03
drwxr-xr-x - hive hadoop 0 2023-04-23 14:52 /tmp/t_drop_partitions/pt=20230101/hr=04
执行初始化SQL。
hive> MSCK TABLE t_drop_partitions;
OK
Partitions not in metastore: t_drop_partitions:pt=20230101/hr=03 t_drop_partitions:pt=20230101/hr=04
Partitions missing from filesystem: t_drop_partitions:pt=20230101/hr=00 t_drop_partitions:pt=20230101/hr=01 t_drop_partitions:pt=20230101/hr=02
Time taken: 0.094 seconds, Fetched: 2 row(s)
MSCK TABLE 不做任何操作,仅仅显示 metastore 和 文件系统中不匹配的分区。
Partitions not in metastore: 这些分区存在于文件系统,但是不在 Metastore 。
Partitions missing from filesystem: 这些存在于Metastore,但是不在文件系统 。
hive> show partitions t_drop_partitions;
OK
pt=20230101/hr=00
pt=20230101/hr=01
pt=20230101/hr=02
Time taken: 0.105 s
MSCK [REPAIR] TABLE
对表的分区进行修复。后面可以根修复的选项,默认是 ADD。
ADD PARTITIONS: 把文件系统上存在,metastore 上不存在的分区添加到 metastore。
DROP PARTITIONS: 把 metastore 上存在,文件系统上不存在的分区从 metastore 删除。
SYNC PARTITIONS: 代表 ADD PARTITIONS 和 DROP PARTITIONS 同时选中。
执行初始化SQL。
执行 MSCK REPAIR TABLE。
hive> MSCK REPAIR TABLE t_drop_partitions;
OK
Partitions not in metastore: t_drop_partitions:pt=20230101/hr=03 t_drop_partitions:pt=20230101/hr=04
Partitions missing from filesystem: t_drop_partitions:pt=20230101/hr=00 t_drop_partitions:pt=20230101/hr=01 t_drop_partitions:pt=20230101/hr=02
Repair: Added partition to metastore t_drop_partitions:pt=20230101/hr=03
Repair: Added partition to metastore t_drop_partitions:pt=20230101/hr=04
Repair: Added partition to metastore: 部分显示了 metastore 添加了哪些分区。
hive> show partitions t_drop_partitions;
OK
pt=20230101/hr=00
pt=20230101/hr=01
pt=20230101/hr=02
pt=20230101/hr=03
pt=20230101/hr=04
现在有5个分区。
执行初始化SQL。
执行 MSCK REPAIR TABLE xxx DROP PARTITIONS。
hive> MSCK REPAIR TABLE t_drop_partitions DROP PARTITIONS;
OK
Partitions not in metastore: t_drop_partitions:pt=20230101/hr=03 t_drop_partitions:pt=20230101/hr=04
Partitions missing from filesystem: t_drop_partitions:pt=20230101/hr=00 t_drop_partitions:pt=20230101/hr=01 t_drop_partitions:pt=20230101/hr=02
Repair: Dropped partition from metastore test.t_drop_partitions:pt=20230101/hr=02
Repair: Dropped partition from metastore test.t_drop_partitions:pt=20230101/hr=01
Repair: Dropped partition from metastore test.t_drop_partitions:pt=20230101/hr=00
Repair: Dropped partition: 部分显示了从metastore 删除了哪些分区。
hive> show partitions t_drop_partitions;
OK
Time taken: 0.066 seconds
现在有0个分区。
执行初始化SQL。
执行 MSCK REPAIR TABLE xxx SYNC PARTITIONS。
Partitions not in metastore: t_drop_partitions:pt=20230101/hr=03 t_drop_partitions:pt=20230101/hr=04
Partitions missing from filesystem: t_drop_partitions:pt=20230101/hr=00 t_drop_partitions:pt=20230101/hr=01 t_drop_partitions:pt=20230101/hr=02
Repair: Added partition to metastore t_drop_partitions:pt=20230101/hr=03
Repair: Added partition to metastore t_drop_partitions:pt=20230101/hr=04
Repair: Dropped partition from metastore test.t_drop_partitions:pt=20230101/hr=00
Repair: Dropped partition from metastore test.t_drop_partitions:pt=20230101/hr=02
Repair: Dropped partition from metastore test.t_drop_partitions:pt=20230101/hr=01
Time taken: 0.225 seconds, Fetched: 7 row(s)
Repair: Added partition to metastore: 显示了在 metastore 添加了哪些分区。
Repair: Dropped partition from metastore : 显示了从metastore 删除了哪些分区。
hive> show partitions t_drop_partitions;
OK
Time taken: 0.066 seconds
现在有0个分区。
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。