Starting the HBase Shell
[root@master conf]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hadoop/hbase-1.6.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoop/hadoop-2.10.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.6.0, r5ec5a5b115ee36fb28903667c008218abd21b3f5, Fri Feb 14 12:00:03 PST 2020
hbase(main):001:0>
<!--Note-->
The hbase shell cannot delete text with Backspace by default, so we adjust this setting in Xshell first.
1. Connect to HBase
hbase shell
2. Display the hbase shell help text
type help
Then press Enter to display basic usage information for the HBase shell, along with some example commands. Note that table names, rows, and columns must all be enclosed in quotes.
3. Create a table
Use the create command to create a new table. You must specify the table name and the column family names.
create 'table_name','family1','family2','familyN'
4. List information about a table
Use the list command to confirm the table exists.
list 'table_name'
Now use the describe command to see details, including the configuration defaults.
describe 'table_name'
5. Put data into your table
To put data into a table, use the put command.
put 'table_name','rowkey','family:column','value'   # e.g. put 'test', 'row1', 'cf:a', 'value1'
This inserts the value value1 at row row1, column cf:a.
Columns in HBase are prefixed by their column family.
6. View all records
One of the ways to get data from HBase is to scan. Use the scan command to scan the table for data. You can limit the scan, but for now all data is fetched.
# show everything
scan 'table_name'
# show 10 records
scan 'table_name',{LIMIT=>10}
7. Get a single row of data
To get a single row of data at a time, use the get command.
get 'table_name','rowkey'
8. Disable a table
If you want to delete a table or change its settings, as well as in some other situations, you must first disable the table, using the disable command. You can re-enable it with the enable command.
disable 'table_name'
# enable 'table_name'
9. Drop a table
To drop (delete) a table, use the drop command. Note: the table must be disabled first.
disable 'table_name'
drop 'table_name'
10. Count the records in a table
This command is not fast, but currently there is no quicker way to count rows.
count 'table_name'
11. Delete records
The first form deletes a single column of one record.
The second form deletes an entire record.
# delete a single column of one record
delete 'table_name','rowkey','family_name:column'
# delete the entire record
delete 'table_name','rowkey'
12. Exit the hbase shell
To exit the HBase shell and disconnect from the cluster, use the quit command. HBase keeps running in the background.
quit
In HBase, data is stored in tables with rows and columns. This terminology overlaps with relational databases (RDBMS), but that is not a useful analogy. Instead, it helps to think of an HBase table as a multidimensional map.
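The multidimensional-map view can be sketched in a few lines of Python. This is a hypothetical in-memory model with made-up row and column names, not HBase client code:

```python
# Model: row key -> column family -> column qualifier -> timestamp -> value.
table = {
    "com.cnn.www": {                        # row key
        "contents": {                       # column family
            "html": {                       # column qualifier, with two versions
                1600000003: "<html>v3",
                1600000002: "<html>v2",
            },
        },
        "anchor": {
            "cnnsi.com": {1600000001: "CNN"},
        },
    },
}

def get_latest(table, row, family, qualifier):
    """Reading a cell without a timestamp returns the newest version."""
    versions = table[row][family][qualifier]
    return versions[max(versions)]
```

Here a "cell" is one innermost timestamp-to-value entry, which matches the Cell and Timestamp definitions below.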
HBase Data Model Terminology
Table
An HBase table consists of multiple rows.
At a conceptual level, a table can be viewed as a sparse set of rows, but physically rows are stored by column family. A new column qualifier (column family:column qualifier) can be added to an existing column family at any time.
Row key
A row in HBase consists of a row key and one or more columns, with values associated with them. Rows are sorted alphabetically by row key as they are stored, so row key design is very important. The goal is to store data in such a way that related rows are near each other. A common row-key pattern is a website domain. If your row keys are domains, you should store them reversed (org.apache.www, org.apache.mail, org.apache.jira). That way, all Apache domains are near each other in the table, rather than being spread out by the first letter of their subdomains.
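The effect of reversing domain row keys can be checked with a short Python sketch (the domain list is made up for illustration):

```python
# HBase stores rows sorted lexicographically by row key; reversing the
# domain makes all *.apache.org keys sort next to each other.
raw = ["www.apache.org", "mail.apache.org", "jira.apache.org", "www.example.com"]

def reverse_domain(d):
    """Turn www.apache.org into org.apache.www."""
    return ".".join(reversed(d.split(".")))

reversed_keys = sorted(reverse_domain(d) for d in raw)
# All org.apache.* keys now sort adjacently:
# ['com.example.www', 'org.apache.jira', 'org.apache.mail', 'org.apache.www']
```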
Column
A column in HBase consists of a column family and a column qualifier, separated by a : (colon) character.
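That naming rule can be sketched in one helper (plain Python, not client code):

```python
def split_column(col):
    """An HBase column name is family:qualifier; split on the first colon."""
    family, _, qualifier = col.partition(":")
    return family, qualifier

split_column("course:soft")    # -> ("course", "soft")
```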
Column family
A column family physically colocates a set of columns and their values, often for performance reasons. Each column family has a set of storage properties, such as whether its values should be cached in memory, how its data is compressed, how its row keys are encoded, and so on. Every row in a table has the same column families, although a given row might not store anything in a given column family.
Column qualifier
A column qualifier is added to a column family to provide an index for a given piece of data. Given a column family content, a qualifier might be content:html and another might be content:pdf. Although column families are fixed at table creation, column qualifiers are mutable and may differ greatly between rows.
Cell
A cell is a combination of row, column family, and column qualifier, and contains a value and a timestamp, which represents the value's version.
Timestamp
A timestamp is written alongside each value and is the identifier for a given version of a value. By default, the timestamp represents the time on the RegionServer when the data was written, but you can specify a different timestamp value when you put data into a cell.
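The versioning behavior can be modeled with a small sketch (a hypothetical in-memory stand-in, assuming millisecond timestamps like HBase's defaults, not the real API):

```python
import time

# One cell keeps multiple timestamped versions; a read with no timestamp
# returns the newest version.
cell = {}  # timestamp -> value

def put(cell, value, ts=None):
    """Default to 'now' in ms, like a RegionServer would; allow an explicit ts."""
    cell[ts if ts is not None else int(time.time() * 1000)] = value

def get(cell, ts=None):
    return cell[ts] if ts is not None else cell[max(cell)]

put(cell, "v1", ts=100)   # explicit timestamp, like put with a timestamp argument
put(cell, "v2", ts=200)
get(cell)        # -> "v2" (the latest version)
get(cell, 100)   # -> "v1" (an older version is still addressable)
```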
1. list command

# Syntax: list <table>
hbase(main):001:0> list
TABLE
0 row(s) in 3.6950 seconds

=> []
hbase(main):002:0>
2. create command

hbase(main):002:0> create 'scores2',{NAME=>'course',VERSIONS=>3},{NAME=>'grade',VERSIONS=>3}
0 row(s) in 3.0720 seconds

=> Hbase::Table - scores2
hbase(main):003:0> create 'scores','course','grade'
3. describe

hbase(main):005:0> describe 'scores'
Table scores is ENABLED
scores
COLUMN FAMILIES DESCRIPTION
{NAME => 'course', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'grade', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.1460 seconds
hbase(main):006:0>
4. disable

hbase(main):006:0> disable 'scores2'
0 row(s) in 2.4350 seconds
hbase(main):007:0> drop 'scores2'
0 row(s) in 1.3560 seconds
hbase(main):008:0> list
TABLE
scores
1 row(s) in 0.0500 seconds

=> ["scores"]
hbase(main):009:0>
5. exists

hbase(main):009:0> exists 'scores2'
Table scores2 does not exist
0 row(s) in 0.0200 seconds
hbase(main):010:0>
6. is_enabled
Check whether a table is enabled. Syntax: is_enabled <table>
hbase(main):013:0> is_enabled 'scores'
true
0 row(s) in 0.0280 seconds
hbase(main):014:0>
7. is_disabled
Check whether a table is disabled. Syntax: is_disabled <table>
hbase(main):014:0> is_disabled 'scores'
false
0 row(s) in 0.0450 seconds
hbase(main):015:0>
8. alter
Modify the table schema.
Syntax: alter <table>,{NAME=><family>},{NAME=><family>,METHOD=>'delete'}
# Add a column family address to the scores table, with the version count set to 3:
hbase(main):015:0> alter 'scores',NAME=>'address',VERSIONS=>3
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 3.6060 seconds
# Drop the grade column family from the scores table:
hbase(main):016:0> alter 'scores',NAME=>'grade',METHOD=>'delete'
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 3.7150 seconds
<!--Note-->
The table must be disabled before changing its schema, and re-enabled once the operation completes. For example:
disable 'scores'
# alter operations
enable 'scores'
9. Delete a column family
(1) Disable the table
hbase(main):017:0> disable 'scores'
0 row(s) in 2.3090 seconds
(2) Delete the column family (note that NAME and METHOD must be uppercase)
hbase(main):018:0> alter 'scores',NAME=>'course',METHOD=>'delete'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.2050 seconds
(3) Re-enable the table after deleting the column family
hbase(main):019:0> enable 'scores'
0 row(s) in 1.3910 seconds
(4) View the table information again; course has been deleted
hbase(main):020:0> describe 'scores'
Table scores is ENABLED
scores
COLUMN FAMILIES DESCRIPTION
{NAME => 'address', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0410 seconds
10. whoami
Show the user currently accessing HBase.
hbase(main):021:0> whoami
root (auth:SIMPLE)
    groups: root
hbase(main):022:0>
11. version
Show the HBase version information.
hbase(main):022:0> version
1.6.0, r5ec5a5b115ee36fb28903667c008218abd21b3f5, Fri Feb 14 12:00:03 PST 2020
12. status
Show the current HBase status.
# status
hbase(main):023:0> status
1 active master, 0 backup masters, 2 servers, 0 dead, 1.5000 average load
# status 'summary'
hbase(main):024:0> status 'summary'
1 active master, 0 backup masters, 2 servers, 0 dead, 1.5000 average load
# status 'detailed'
hbase(main):025:0> status 'detailed'
version 1.6.0
0 regionsInTransition
active master:  master:16000 1607158473117
0 backup masters
master coprocessors: null
2 live servers
    slave1:16020 1607158482456
        requestsPerSecond=0.0, numberOfOnlineRegions=1, usedHeapMB=20, maxHeapMB=235, numberOfStores=1, numberOfStorefiles=2, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=70, writeRequestsCount=9, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=21, currentCompactedKVs=21, compactionProgressPct=1.0, coprocessors=[MultiRowMutationEndpoint]
    "hbase:meta,,1"
        numberOfStores=1, numberOfStorefiles=2, storeRefCount=0, maxCompactedStoreFileRefCount=0, storefileUncompressedSizeMB=0, lastMajorCompactionTimestamp=1607159362926, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=70, writeRequestsCount=9, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=21, currentCompactedKVs=21, compactionProgressPct=1.0, completeSequenceId=45, dataLocality=1.0
    slave2:16020 1607158481937
        requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=13, maxHeapMB=235, numberOfStores=2, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=4, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[]
    "hbase:namespace,,1607137216837.828e7f51fd1fef7501b8ccc4c3b373ca."
        numberOfStores=1, numberOfStorefiles=1, storeRefCount=0, maxCompactedStoreFileRefCount=0, storefileUncompressedSizeMB=0, lastMajorCompactionTimestamp=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=4, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=1.0
    "scores,,1607147570746.91663a6c1657acef0ce3c114638e81af."
        numberOfStores=1, numberOfStorefiles=0, storeRefCount=0, maxCompactedStoreFileRefCount=0, storefileUncompressedSizeMB=0, lastMajorCompactionTimestamp=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=0.0
0 dead servers
hbase(main):026:0>
13. Permission management
(1) Granting permissions
The permissions are R (read), W (write), X (execute), C (create), and A (admin). Syntax:
grant <user>,<permissions>,<table>,<column family>,<column qualifier>
HBase permission management relies on coprocessors. Access control is implemented by the AccessController coprocessor framework, which supports per-user RWXCA permission control.
You need to set
hbase.security.authorization=true
and make hbase.coprocessor.region.classes and hbase.coprocessor.master.classes include org.apache.hadoop.hbase.security.access.AccessController to provide the security-management capability. So
set the following parameters: stop HBase, then edit hbase-site.xml:
<property>
  <name>hbase.superuser</name>
  <value>hbase</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.token.TokenProvider</value>
</property>
<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.rpc.engine</name>
  <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
After saving the configuration, restart HBase.
(2) Granting permissions
hbase(main)> grant '<user>', '<permission>', '<table>'
hbase(main)> grant 'user1', 'RWXCA', 'table1'
(3) Viewing permissions
hbase(main)> user_permission '<table>'
hbase(main)> user_permission 'table1'
User                 Namespace,Table,Family,Qualifier:Permission
 user1               default,table1,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]
Syntax: user_permission <table>
(4) Revoking permissions
hbase(main)> revoke '<user>', '<table>'
hbase(main)> revoke 'user1', 'table1'
Similar to granting permissions; syntax: revoke <user>,<table>,<column family>,<column qualifier>
1. put
Insert data into a table. Syntax: put <table>,<rowkey>,<family:column>,<value>,<timestamp>
For example, insert data into table scores2: rk001 is the row key, course is the column family, soft is the column name, and the value is database.
hbase(main):012:0> list
TABLE
scores
scores2
2 row(s) in 0.0170 seconds

=> ["scores", "scores2"]
hbase(main):013:0> enable 'scores2'
0 row(s) in 0.0130 seconds
hbase(main):014:0> put'scores2','rk001','course:soft','database'
0 row(s) in 0.1900 seconds
hbase(main):015:0>
(1) Updating a record with put
Update the previous value to english:
hbase(main):015:0> put 'scores2','rk001','course:soft','english'
0 row(s) in 0.0160 seconds
(2) Adding data in batches
Write a file one.txt with the following content:
put 'scores2','rk002','course:soft','database'
put 'scores2','rk002','course:jg','math'
put 'scores2','rk003','course:soft','c'
put 'scores2','rk004','course:soft','java'
Run the command hbase shell one.txt on the Linux side; the result:
[root@master usr]# mkdir egdata
[root@master usr]# cd egdata/
[root@master egdata]# vi one.txt
[root@master egdata]# hbase shell one.txt
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hadoop/hbase-1.6.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoop/hadoop-2.10.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
0 row(s) in 0.4800 seconds
0 row(s) in 0.0120 seconds
0 row(s) in 0.0100 seconds
0 row(s) in 0.0090 seconds
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.6.0, r5ec5a5b115ee36fb28903667c008218abd21b3f5, Fri Feb 14 12:00:03 PST 2020
hbase(main):001:0> list
TABLE
scores
scores2
2 row(s) in 0.0890 seconds

=> ["scores", "scores2"]
hbase(main):002:0> decribe scores2
NameError: undefined local variable or method `scores2' for #<Object:0x31a136a6>
hbase(main):003:0> describe 'scores2'
Table scores2 is ENABLED
scores2
COLUMN FAMILIES DESCRIPTION
{NAME => 'course', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0330 seconds
hbase(main):004:0> scan 'scores2'
ROW                  COLUMN+CELL
 rk001               column=course:soft, timestamp=1607221272704, value=english
 rk002               column=course:jg, timestamp=1607221711379, value=math
 rk002               column=course:soft, timestamp=1607221711165, value=database
 rk003               column=course:soft, timestamp=1607221711394, value=c
 rk004               column=course:soft, timestamp=1607221711402, value=java
4 row(s) in 0.0290 seconds
hbase(main):005:0>
2. get
Query data. Syntax: get <table>,<rowkey>,[<family:column>, ...]
# Get the value of column course:soft in row rk001 of scores2:
hbase(main):005:0> get 'scores2','rk001','course:soft'
COLUMN               CELL
 course:soft         timestamp=1607221272704, value=english
1 row(s) in 0.1590 seconds
# Get the values of the course column family in row rk001 of scores2:
hbase(main):006:0> get 'scores2','rk001','course'
COLUMN               CELL
 course:soft         timestamp=1607221272704, value=english
1 row(s) in 0.0130 seconds
# Get the values in row rk001 of scores2:
hbase(main):007:0> get 'scores2','rk001'
COLUMN               CELL
 course:soft         timestamp=1607221272704, value=english
1 row(s) in 0.0120 seconds
# Get the course column family in row rk002 of scores2, with the version count set to 3:
hbase(main):008:0> get 'scores2','rk002',{COLUMN=>'course',VERSIONS=>3}
COLUMN               CELL
 course:jg           timestamp=1607221711379, value=math
 course:soft         timestamp=1607221711165, value=database
1 row(s) in 0.0710 seconds
# The following form can retrieve previously saved historical versions.
# For example, get course:soft in row rk002 of scores2, with 3 versions, restricted
# to timestamps between 1607221711170 and 1607221711300 (TIMERANGE takes [min, max]):
hbase(main):011:0> get 'scores2','rk002',{COLUMN=>'course:soft',TIMERANGE=>[1607221711170,1607221711300],VERSIONS=>3}
Some advanced usage follows:
(1) ValueFilter
Filters on values.
# Find data in row rk001 of scores2 whose value is database:
hbase(main):012:0> get 'scores2','rk001',{FILTER=>"ValueFilter(=,'binary:database')"}
COLUMN               CELL
 course:soft         timestamp=1607221106518, value=database
1 row(s) in 0.5400 seconds
hbase(main):013:0>
# Find data in row rk002 of scores2 whose value contains the letter a:
hbase(main):013:0> get 'scores2','rk002',{FILTER=>"ValueFilter(=,'substring:a')"}
COLUMN               CELL
 course:jg           timestamp=1607221711379, value=math
 course:soft         timestamp=1607221711165, value=database
1 row(s) in 0.0310 seconds
hbase(main):014:0>
(2) QualifierFilter
Filters on column qualifiers.
# Find data in row rk001 of scores2 whose column name is db:
hbase(main):018:0> get 'scores2', 'rk001', {FILTER => "QualifierFilter(=, 'binary:db')"}
COLUMN               CELL
0 row(s) in 0.0390 seconds
# Find data in row rk001 of scores2 whose column name contains db:
hbase(main):019:0> get 'scores2', 'rk001', {FILTER => "QualifierFilter(=, 'substring:db')"}
COLUMN               CELL
0 row(s) in 0.0170 seconds
3. scan
Scan a table. Syntax: scan <table>,{COLUMNS=>[<family:column>, ...], LIMIT=>num}
You can also add advanced options such as STARTROW, TIMERANGE, and FILTER.
# Table scores2
# Scan the whole table:
hbase(main):021:0> scan 'scores2'
ROW                  COLUMN+CELL
 rk001               column=course:soft, timestamp=1607221272704, value=english
 rk002               column=course:jg, timestamp=1607221711379, value=math
 rk002               column=course:soft, timestamp=1607221711165, value=database
 rk003               column=course:soft, timestamp=1607221711394, value=c
 rk004               column=course:soft, timestamp=1607221711402, value=java
4 row(s) in 0.0160 seconds
hbase(main):022:0>
# Scan the course column family across the table:
hbase(main):022:0> scan 'scores2',{COLUMNS=>'course'}
ROW                  COLUMN+CELL
 rk001               column=course:soft, timestamp=1607221272704, value=english
 rk002               column=course:jg, timestamp=1607221711379, value=math
 rk002               column=course:soft, timestamp=1607221711165, value=database
 rk003               column=course:soft, timestamp=1607221711394, value=c
 rk004               column=course:soft, timestamp=1607221711402, value=java
4 row(s) in 0.0310 seconds
hbase(main):023:0>
# Scan the course column family while also setting start and end row keys:
hbase(main):027:0> scan 'scores2',{COLUMNS=>'course',STARTROW=>'rk001', ENDROW=>'rk003'}
ROW                  COLUMN+CELL
 rk001               column=course:soft, timestamp=1607221272704, value=english
 rk002               column=course:jg, timestamp=1607221711379, value=math
 rk002               column=course:soft, timestamp=1607221711165, value=database
2 row(s) in 0.0150 seconds
# Scan the course column family with the version count set to 3:
hbase(main):026:0> scan 'scores2',{COLUMNS=>'course',VERSIONS=>3}
ROW                  COLUMN+CELL
 rk001               column=course:soft, timestamp=1607221272704, value=english
 rk002               column=course:jg, timestamp=1607221711379, value=math
 rk002               column=course:soft, timestamp=1607221711165, value=database
 rk003               column=course:soft, timestamp=1607221711394, value=c
 rk004               column=course:soft, timestamp=1607221711402, value=java
4 row(s) in 0.0470 seconds
hbase(main):027:0>
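A scan with STARTROW=>'rk001', ENDROW=>'rk003' returns rk001 and rk002 but not rk003, because the start row is inclusive and the end row is exclusive over lexicographically sorted row keys. A small Python sketch of those range semantics (an illustrative model, not client code):

```python
# Row keys are kept sorted; a range scan takes [startrow, endrow).
rows = ["rk001", "rk002", "rk003", "rk004"]

def scan_range(rows, startrow, endrow):
    """Start row inclusive, end row exclusive, like an HBase range scan."""
    return [r for r in sorted(rows) if startrow <= r < endrow]

scan_range(rows, "rk001", "rk003")   # -> ['rk001', 'rk002']
```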
4. delete
Delete data. Syntax: delete <table>,<rowkey>,<family:column>,<timestamp>
(1) Deleting a single value in a row
Syntax: delete <table>,<rowkey>,<family:column>,<timestamp>; the column name must be specified.
# Delete the data in column course:soft of row rk001 in scores2:
hbase(main):028:0> delete 'scores2','rk001','course:soft'
0 row(s) in 0.0570 seconds
<!--This deletes all versions of the data in row rk001, column f1:coll-->
(2) Deleting a row
The column name can be omitted to target an entire row (note that in many HBase shell versions delete requires a column; deleteall, described next, is the command for removing a whole row).
# Delete the data in row rk002 of scores2:
hbase(main):002:0> delete'scores2','rk002'
5. deleteall
Delete a row. Syntax: deleteall <table>,<rowkey>,<family:column>,<timestamp>
# Delete all the data in row rk004 of table scores2:
hbase(main):003:0> deleteall 'scores2','rk004'
0 row(s) in 0.0330 seconds
6. count
Count how many rows of data a table contains in total. Syntax: count <table>
# Count all the data in scores2:
hbase(main):005:0> count 'scores2'
1 row(s) in 0.0210 seconds

=> 1
7. truncate
Empty a table. Syntax: truncate <table>
# Empty the table scores2:
hbase(main):006:0> truncate 'scores2'
Truncating 'scores2' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 4.2990 seconds
hbase(main):007:0> list
TABLE
scores
scores2
2 row(s) in 0.1550 seconds

=> ["scores", "scores2"]
hbase(main):008:0> scan 'scores2'
ROW                  COLUMN+CELL
0 row(s) in 0.3380 seconds
hbase(main):009:0>
HBase data typically comes from log files or an RDBMS. Common ways to migrate such data into HBase tables include the HBase Put API, the HBase bulk-load tool, and a custom MapReduce implementation.
[root@master egdata]# vi 1.tsv
[root@master egdata]# cat 1.tsv
1001	zhangsan	16
1002	lisi	18
1003	wangwu	19
1004	zhaoliu	20
1005	zhengqi	19
[root@master egdata]# hdfs dfs -mkdir -p /hbase/data1
[root@master egdata]# hdfs dfs -put 1.tsv /hbase/data1
[root@master egdata]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hadoop/hbase-1.6.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoop/hadoop-2.10.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.6.0, r5ec5a5b115ee36fb28903667c008218abd21b3f5, Fri Feb 14 12:00:03 PST 2020
hbase(main):001:0> create 'student2','info'
0 row(s) in 1.6560 seconds

=> Hbase::Table - student2
hbase(main):002:0> quit
[root@master egdata]# yarn jar /usr/hadoop/hbase-1.6.0/lib/hbase-server-1.6.0.jar importtsv -Dimporttsv.separator=\t-Dimporttsv.columns=HBASE_ROW_KEY,info:name student2 /hbase/data1/1.tsv
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/Filter
        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
        at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
        at java.lang.Class.getMethod0(Class.java:3018)
        at java.lang.Class.getMethod(Class.java:1784)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.<init>(ProgramDriver.java:59)
        at org.apache.hadoop.util.ProgramDriver.addClass(ProgramDriver.java:103)
        at org.apache.hadoop.hbase.mapreduce.Driver.main(Driver.java:42)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:244)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.filter.Filter
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        ... 14 more
[root@master egdata]#
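Classpath error aside, the mapping that importtsv performs on each TSV line can be sketched in Python. The info:age column spec below is a hypothetical extension of the command above, which mapped only HBASE_ROW_KEY and info:name:

```python
def tsv_to_put(line, columns):
    """Pair each tab-separated field with its column spec; HBASE_ROW_KEY
    becomes the row key and every other spec becomes a family:qualifier cell."""
    fields = line.rstrip("\n").split("\t")
    rowkey = None
    cells = {}
    for spec, value in zip(columns, fields):
        if spec == "HBASE_ROW_KEY":
            rowkey = value
        else:
            cells[spec] = value
    return rowkey, cells

tsv_to_put("1001\tzhangsan\t16", ["HBASE_ROW_KEY", "info:name", "info:age"])
# -> ('1001', {'info:name': 'zhangsan', 'info:age': '16'})
```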
......