GeoMesa Installation

Installing GeoMesa involves five components, one for each supported backend:
  • GeoMesa Accumulo
  • GeoMesa Kafka
  • GeoMesa HBase
  • GeoMesa Bigtable
  • GeoMesa Cassandra

1 Installing GeoMesa Accumulo

Apache Accumulo is a reliable, scalable, high-performance sorted distributed key-value store with cell-level access control and customizable server-side processing. It follows the design of Google BigTable and is built on Apache Hadoop, ZooKeeper, and Thrift. GeoMesa supports Accumulo as a backend, so spatio-temporal data can be stored in Accumulo.

1.1 Requirements

The operating system is Ubuntu 14.04 server; you need sudo privileges and at least 2 GB of swap space. GeoMesa currently supports Accumulo 1.7. Because Accumulo depends on Hadoop and ZooKeeper, both must be installed before Accumulo. For ease of demonstration, this tutorial installs single-node versions of Hadoop, ZooKeeper, and Accumulo on the same machine; for cluster installation and configuration, consult the relevant documentation. GeoMesa Accumulo can also be accessed through GeoServer; to use that feature you also need GeoServer, and the version currently supported by GeoMesa Accumulo is GeoServer 2.9.1.

1.2 Installing Hadoop

Installing Java
First, install Java 1.8. Open a terminal on Ubuntu, add the ppa:webupd8team/java repository, and update apt-get:
  root@HDMachine:~$ sudo add-apt-repository ppa:webupd8team/java
  root@HDMachine:~$ sudo apt-get update
Then install Java 1.8:
  root@HDMachine:~$ sudo apt-get install oracle-java8-installer
After installation, check the Java version with:
  root@HDMachine:~$ java -version
  java version "1.8.0_121"
  Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
  Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
Creating a hadoop user
Create a hadoop group and an hduser user with the following commands:
  root@HDMachine:~$ sudo addgroup hadoop
  Adding group `hadoop' (GID 1002) ...
  Done.
  root@HDMachine:~$ sudo adduser --ingroup hadoop hduser
  Adding user `hduser' ...
  Adding new user `hduser' (1001) with group `hadoop' ...
  Creating home directory `/home/hduser' ...
  Copying files from `/etc/skel' ...
  Enter new UNIX password:
  Retype new UNIX password:
  passwd: password updated successfully
  Changing the user information for hduser
  Enter the new value, or press ENTER for the default
  Full Name []:
  Room Number []:
  Work Phone []:
  Home Phone []:
  Other []:
  Is the information correct? [Y/n] Y
Add hduser to the sudo group:
  hduser@HDMachine:~$ su root
  Password:
  root@HDMachine:/home/hduser$ sudo adduser hduser sudo
  [sudo] password for root:
  Adding user `hduser' to group `sudo' ...
  Adding user hduser to group sudo
  Done.
Installing SSH
SSH has two parts, ssh and sshd:
  • ssh: the client, used to connect to remote machines.
  • sshd: the server daemon, which accepts connection requests from clients.
ssh is installed by default on Linux, but to get the sshd service you need to install the ssh package:
  root@HDMachine:~$ sudo apt-get install ssh
Check that both are installed with:
  root@HDMachine:~$ which ssh
  /usr/bin/ssh
  root@HDMachine:~$ which sshd
  /usr/sbin/sshd
Creating and installing SSH keys
Hadoop uses SSH to manage its nodes. For a single-node installation we need to configure passwordless SSH access to localhost. When ssh-keygen prompts for a file name, just press Enter to create a key pair without a passphrase:
  root@HDMachine:~$ su hduser
  Password:
  hduser@HDMachine:~$ ssh-keygen -t rsa -P ""
  Generating public/private rsa key pair.
  Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
  Created directory '/home/hduser/.ssh'.
  Your identification has been saved in /home/hduser/.ssh/id_rsa.
  Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
  The key fingerprint is:
  5c:9f:d5:64:8c:fa:2a:a0:a5:48:ff:5b:ed:9d:e0:85 hduser@HDMachine
  The key's randomart image is:
  +--[ RSA 2048]----+
  | oo|
  | .+.|
  | . .. .|
  | . . ..o |
  | S o. |
  | . o . .. |
  | . o + .. E.. |
  | . + ..o.+ . |
  | .o. .o o |
  +-----------------+
  hduser@HDMachine:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
The second command appends the newly created key to the list of authorized keys so that Hadoop can log in over SSH without a password. Check that SSH works with:
  hduser@HDMachine:~$ ssh localhost
  The authenticity of host 'localhost (127.0.0.1)' can't be established.
  ECDSA key fingerprint is e1:8b:a0:a5:75:ef:f4:b4:5e:a9:ed:be:64:be:5c:2f.
  Are you sure you want to continue connecting (yes/no)? yes
  Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
  Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-40-generic x86_64)
Downloading Hadoop
Download the Hadoop 2.8.0 package from a Hadoop mirror with wget and extract it:
  hduser@HDMachine:~$ wget http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.8.0/hadoop-2.8.0.tar.gz
  hduser@HDMachine:~$ tar xvzf hadoop-2.8.0.tar.gz
For easier management, move Hadoop to /usr/local/hadoop:
  hduser@HDMachine:~/hadoop-2.8.0$ sudo mv * /usr/local/hadoop
  [sudo] password for hduser:
Editing the Hadoop configuration files
Five configuration files need to be modified:
  • ~/.bashrc
  • /usr/local/hadoop/etc/hadoop/hadoop-env.sh
  • /usr/local/hadoop/etc/hadoop/core-site.xml
  • /usr/local/hadoop/etc/hadoop/mapred-site.xml (copied from mapred-site.xml.template)
  • /usr/local/hadoop/etc/hadoop/hdfs-site.xml
Configuring ~/.bashrc
Before editing .bashrc, find the Java installation path:
  hduser@HDMachine:~$ update-alternatives --config java
  There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-8-oracle/jre/bin/java
  Nothing to configure.
With the Java path in hand, set the JAVA_HOME environment variable in .bashrc. Open .bashrc with vi and append the following variables at the end of the file:
  hduser@HDMachine:~$ vi ~/.bashrc
  #HADOOP VARIABLES START
  export JAVA_HOME=/usr/lib/jvm/java-8-oracle
  export HADOOP_INSTALL=/usr/local/hadoop
  export PATH=$PATH:$HADOOP_INSTALL/bin
  export PATH=$PATH:$HADOOP_INSTALL/sbin
  export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
  export HADOOP_COMMON_HOME=$HADOOP_INSTALL
  export HADOOP_HDFS_HOME=$HADOOP_INSTALL
  export YARN_HOME=$HADOOP_INSTALL
  export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
  export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
  #HADOOP VARIABLES END
  hduser@HDMachine:~$ source ~/.bashrc
Finally, run source to load the new variables. Note that JAVA_HOME must point to the JVM installation root, not the java binary itself: from the full path '/usr/lib/jvm/java-8-oracle/jre/bin/java' reported above, the value to use is '/usr/lib/jvm/java-8-oracle'.
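The rule for deriving JAVA_HOME from the java binary path can be sketched as follows, stripping the trailing jre/bin/java components with dirname (the path is the one reported by update-alternatives above):

```shell
# Derive JAVA_HOME from the full java binary path by stripping
# the trailing /jre/bin/java components with three dirname calls.
JAVA_BIN=/usr/lib/jvm/java-8-oracle/jre/bin/java
JAVA_HOME=$(dirname "$(dirname "$(dirname "$JAVA_BIN")")")
echo "$JAVA_HOME"   # prints /usr/lib/jvm/java-8-oracle
```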
Configuring /usr/local/hadoop/etc/hadoop/hadoop-env.sh
The JAVA_HOME variable also needs to be set in hadoop-env.sh. Edit the file with vi and set it as follows:
  hduser@HDMachine:~$ vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
  export JAVA_HOME=/usr/lib/jvm/java-8-oracle
Adding this line to hadoop-env.sh ensures that the value of JAVA_HOME is available to Hadoop whenever it starts.
Configuring /usr/local/hadoop/etc/hadoop/core-site.xml
core-site.xml contains configuration options that Hadoop reads at startup; options set here override Hadoop's defaults. Before editing it, create a directory for Hadoop's temporary files:
  hduser@HDMachine:~$ sudo mkdir -p /app/hadoop/tmp
  hduser@HDMachine:~$ sudo chown hduser:hadoop /app/hadoop/tmp
Then open core-site.xml with vi and enter the following:
  hduser@HDMachine:~$ vi /usr/local/hadoop/etc/hadoop/core-site.xml
  <configuration>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/app/hadoop/tmp</value>
      <description>A base for other temporary directories.</description>
    </property>
    <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:54310</value>
      <description>The name of the default file system. A URI whose
      scheme and authority determine the FileSystem implementation. The
      uri's scheme determines the config property (fs.SCHEME.impl) naming
      the FileSystem implementation class. The uri's authority is used to
      determine the host, port, etc. for a filesystem.</description>
    </property>
  </configuration>
Note the HDFS URI here, 'hdfs://localhost:54310'; it will be needed again later when configuring Accumulo.
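If you need to look the URI up again later, one way is to pull it out of core-site.xml with sed. This is a hypothetical helper, not part of the Hadoop tooling; the sample file below stands in for the real /usr/local/hadoop/etc/hadoop/core-site.xml:

```shell
# Hypothetical check: extract the fs.default.name value (the HDFS URI
# that Accumulo's instance.volumes will reuse) from a core-site.xml-style file.
cat > /tmp/core-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
EOF
sed -n 's|.*<value>\(hdfs://[^<]*\)</value>.*|\1|p' /tmp/core-site-sample.xml
# prints hdfs://localhost:54310
```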
Configuring /usr/local/hadoop/etc/hadoop/mapred-site.xml
By default /usr/local/hadoop/etc/hadoop/ contains only mapred-site.xml.template, not mapred-site.xml, so copy the template and rename it:
  hduser@HDMachine:~$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
mapred-site.xml specifies the framework used by MapReduce. Edit it with vi and enter the following:
  hduser@HDMachine:~$ vi /usr/local/hadoop/etc/hadoop/mapred-site.xml
  <configuration>
    <property>
      <name>mapred.job.tracker</name>
      <value>localhost:54311</value>
      <description>The host and port that the MapReduce job tracker runs
      at. If "local", then jobs are run in-process as a single map
      and reduce task.
      </description>
    </property>
  </configuration>
Configuring /usr/local/hadoop/etc/hadoop/hdfs-site.xml
hdfs-site.xml must be configured on every machine in the cluster; it mainly specifies the directories used by the namenode and datanode. Before editing it, create those two directories:
  hduser@HDMachine:~$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
  hduser@HDMachine:~$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
  hduser@HDMachine:~$ sudo chown -R hduser:hadoop /usr/local/hadoop_store
Then edit hdfs-site.xml with vi and enter the following:
  hduser@HDMachine:~$ vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml
  <configuration>
    <property>
      <name>dfs.replication</name>
      <value>1</value>
      <description>Default block replication.
      The actual number of replications can be specified when the file is created.
      The default is used if replication is not specified in create time.
      </description>
    </property>
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
    </property>
  </configuration>
Formatting the Hadoop file system
Before Hadoop can be used, the Hadoop file system must be formatted:
  hduser@HDMachine:~$ hadoop namenode -format
  DEPRECATED: Use of this script to execute hdfs command is deprecated.
  Instead use the hdfs command for it.
  17/05/03 11:12:45 INFO namenode.NameNode: STARTUP_MSG:
  /************************************************************
  STARTUP_MSG: Starting NameNode
  STARTUP_MSG: user = hduser
  STARTUP_MSG: host = HDMachine/127.0.1.1
  STARTUP_MSG: args = [-format]
  STARTUP_MSG: version = 2.8.0
  STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r 91f2b7a13d1e97be65db92ddabc627cc29ac0009; compiled by 'jdu' on 2017-03-17T04:12Z
  STARTUP_MSG: java = 1.8.0_121
  ************************************************************/
  17/05/03 11:12:45 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
  17/05/03 11:12:45 INFO namenode.NameNode: createNameNode [-format]
  17/05/03 11:12:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  Formatting using clusterid: CID-ae42624b-2814-4b43-b473-a181d3075c2e
  17/05/03 11:12:49 INFO namenode.FSEditLog: Edit logging is async:false
  17/05/03 11:12:49 INFO namenode.FSNamesystem: KeyProvider: null
  17/05/03 11:12:49 INFO namenode.FSNamesystem: fsLock is fair: true
  17/05/03 11:12:49 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
  17/05/03 11:12:49 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
  17/05/03 11:12:49 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
  17/05/03 11:12:49 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
  17/05/03 11:12:49 INFO blockmanagement.BlockManager: The block deletion will start around 2017 May 03 11:12:49
  17/05/03 11:12:49 INFO util.GSet: Computing capacity for map BlocksMap
  17/05/03 11:12:49 INFO util.GSet: VM type = 64-bit
  17/05/03 11:12:49 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
  17/05/03 11:12:49 INFO util.GSet: capacity = 2^21 = 2097152 entries
  17/05/03 11:12:49 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
  17/05/03 11:12:49 INFO blockmanagement.BlockManager: defaultReplication = 1
  17/05/03 11:12:49 INFO blockmanagement.BlockManager: maxReplication = 512
  17/05/03 11:12:49 INFO blockmanagement.BlockManager: minReplication = 1
  17/05/03 11:12:49 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
  17/05/03 11:12:49 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
  17/05/03 11:12:49 INFO blockmanagement.BlockManager: encryptDataTransfer = false
  17/05/03 11:12:49 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
  17/05/03 11:12:49 INFO namenode.FSNamesystem: fsOwner = hduser (auth:SIMPLE)
  17/05/03 11:12:49 INFO namenode.FSNamesystem: supergroup = supergroup
  17/05/03 11:12:49 INFO namenode.FSNamesystem: isPermissionEnabled = true
  17/05/03 11:12:49 INFO namenode.FSNamesystem: HA Enabled: false
  17/05/03 11:12:49 INFO namenode.FSNamesystem: Append Enabled: true
  17/05/03 11:12:50 INFO util.GSet: Computing capacity for map INodeMap
  17/05/03 11:12:50 INFO util.GSet: VM type = 64-bit
  17/05/03 11:12:50 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
  17/05/03 11:12:50 INFO util.GSet: capacity = 2^20 = 1048576 entries
  17/05/03 11:12:50 INFO namenode.FSDirectory: ACLs enabled? false
  17/05/03 11:12:50 INFO namenode.FSDirectory: XAttrs enabled? true
  17/05/03 11:12:50 INFO namenode.NameNode: Caching file names occurring more than 10 times
  17/05/03 11:12:51 INFO util.GSet: Computing capacity for map cachedBlocks
  17/05/03 11:12:51 INFO util.GSet: VM type = 64-bit
  17/05/03 11:12:51 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
  17/05/03 11:12:51 INFO util.GSet: capacity = 2^18 = 262144 entries
  17/05/03 11:12:51 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
  17/05/03 11:12:51 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
  17/05/03 11:12:51 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
  17/05/03 11:12:51 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
  17/05/03 11:12:51 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
  17/05/03 11:12:51 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
  17/05/03 11:12:51 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
  17/05/03 11:12:51 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
  17/05/03 11:12:51 INFO util.GSet: Computing capacity for map NameNodeRetryCache
  17/05/03 11:12:51 INFO util.GSet: VM type = 64-bit
  17/05/03 11:12:51 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
  17/05/03 11:12:51 INFO util.GSet: capacity = 2^15 = 32768 entries
  17/05/03 11:12:51 INFO namenode.NNConf: ACLs enabled? false
  17/05/03 11:12:51 INFO namenode.NNConf: XAttrs enabled? true
  17/05/03 11:12:51 INFO namenode.NNConf: Maximum size of an xattr: 16384
  17/05/03 11:12:52 INFO namenode.FSImage: Allocated new BlockPoolId: BP-130729900-192.168.1.1-1429393391595
  17/05/03 11:12:52 INFO common.Storage: Storage directory /usr/local/hadoop_store/hdfs/namenode has been successfully formatted.
  17/05/03 11:12:52 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
  17/05/03 11:12:52 INFO util.ExitUtil: Exiting with status 0
  17/05/03 11:12:52 INFO namenode.NameNode: SHUTDOWN_MSG:
  /************************************************************
  SHUTDOWN_MSG: Shutting down NameNode at HDMachine/192.168.1.1
Note that hadoop namenode -format should be run only once, before you start using Hadoop. Running it again after Hadoop is in use destroys all data stored in HDFS.
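Because an accidental re-format is destructive, a simple guard can check whether the namenode storage directory was already initialized (a successful format creates a "current" subdirectory inside it). This is a hypothetical sketch using a stand-in path, not part of Hadoop itself:

```shell
# Hypothetical guard: refuse to format if the namenode storage directory
# already contains a "current" subdirectory (created by a successful format).
NAMENODE_DIR=/tmp/demo_namenode   # stand-in for /usr/local/hadoop_store/hdfs/namenode
mkdir -p "$NAMENODE_DIR"
if [ -d "$NAMENODE_DIR/current" ]; then
  echo "already formatted - skipping"
else
  echo "not formatted - safe to run: hadoop namenode -format"
fi
```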
Starting Hadoop
To start Hadoop, run start-all.sh from the sbin directory:
  hduser@HDMachine:~$ cd /usr/local/hadoop/sbin/
  hduser@HDMachine:/usr/local/hadoop/sbin$ ls
  distribute-exclude.sh hdfs-config.sh refresh-namenodes.sh start-balancer.sh start-yarn.cmd stop-balancer.sh stop-yarn.cmd
  hadoop-daemon.sh httpfs.sh slaves.sh start-dfs.cmd start-yarn.sh stop-dfs.cmd stop-yarn.sh
  hadoop-daemons.sh kms.sh start-all.cmd start-dfs.sh stop-all.cmd stop-dfs.sh yarn-daemon.sh
  hdfs-config.cmd mr-jobhistory-daemon.sh start-all.sh start-secure-dns.sh stop-all.sh stop-secure-dns.sh yarn-daemons.sh
  hduser@HDMachine:/usr/local/hadoop/sbin$ start-all.sh
  This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
  17/05/03 14:07:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  Starting namenodes on [localhost]
  hduser@localhost's password:
  localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduser-namenode-HDMachine.out
  hduser@localhost's password:
  localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduser-datanode-HDMachine.out
  Starting secondary namenodes [0.0.0.0]
  hduser@0.0.0.0's password:
  0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduser-secondarynamenode-HDMachine.out
  17/05/03 14:07:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  starting yarn daemons
  starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduser-resourcemanager-HDMachine.out
  hduser@localhost's password:
  localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduser-nodemanager-HDMachine.out
  hduser@HDMachine:/usr/local/hadoop/sbin$
Use jps to check whether Hadoop is running:
  hduser@HDMachine:/usr/local/hadoop/sbin$ jps
  51633 Jps
  50756 DataNode
  50981 SecondaryNameNode
  51318 NodeManager
  50570 NameNode
  51149 ResourceManager
If you see output like the above, Hadoop is running. Another way to check is with netstat:
  hduser@HDMachine:/usr/local/hadoop/sbin$ netstat -plten | grep java
  (Not all processes could be identified, non-owned process info
  will not be shown, you would have to be root to see it all.)
  tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 1003 119588 50570/java
  tcp 0 0 127.0.0.1:59447 0.0.0.0:* LISTEN 1003 127666 50756/java
  tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 1003 127653 50756/java
  tcp 0 0 0.0.0.0:50075 0.0.0.0:* LISTEN 1003 119763 50756/java
  tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 1003 128653 50756/java
  tcp 0 0 127.0.0.1:54310 0.0.0.0:* LISTEN 1003 128405 50570/java
  tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 1003 130314 50981/java
  tcp6 0 0 :::8088 :::* LISTEN 1003 129481 51149/java
  tcp6 0 0 :::8030 :::* LISTEN 1003 131806 51149/java
  tcp6 0 0 :::8031 :::* LISTEN 1003 131788 51149/java
  tcp6 0 0 :::8032 :::* LISTEN 1003 131810 51149/java
  tcp6 0 0 :::8033 :::* LISTEN 1003 137455 51149/java
  tcp6 0 0 :::60261 :::* LISTEN 1003 131852 51318/java
  tcp6 0 0 :::8040 :::* LISTEN 1003 131858 51318/java
  tcp6 0 0 :::8042 :::* LISTEN 1003 134564 51318/java
Ports 50070, 50010, 54310, and so on are the ports Hadoop is using.
Stopping Hadoop
Hadoop can be stopped by running the stop-all.sh script in the sbin directory:
  hduser@HDMachine:/usr/local/hadoop/sbin$ stop-all.sh
  This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
  17/05/03 14:25:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  Stopping namenodes on [localhost]
  hduser@localhost's password:
  localhost: stopping namenode
  hduser@localhost's password:
  localhost: stopping datanode
  Stopping secondary namenodes [0.0.0.0]
  hduser@0.0.0.0's password:
  0.0.0.0: stopping secondarynamenode
  17/05/03 14:25:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  stopping yarn daemons
  stopping resourcemanager
  hduser@localhost's password:
  localhost: stopping nodemanager
  no proxyserver to stop
Hadoop's web management page
Open http://localhost:50070 in a browser to view the namenode's status, as shown in Figure 1.
1.3 Installing ZooKeeper
ZooKeeper is a distributed, open-source coordination service for distributed applications. It is an open-source implementation of Google's Chubby and an important component of Hadoop and HBase. It provides consistency services for distributed applications, including configuration maintenance, naming, distributed synchronization, and group services. ZooKeeper's goal is to encapsulate these complex, error-prone services and expose a simple interface backed by an efficient, stable system.
Downloading ZooKeeper
Download the ZooKeeper 3.4.10 package from the ZooKeeper download site with wget and extract it:
  hduser@HDMachine:~$ wget http://www.eu.apache.org/dist/zookeeper/stable/zookeeper-3.4.10.tar.gz
  hduser@HDMachine:~$ tar xvzf zookeeper-3.4.10.tar.gz
For easier management, move ZooKeeper to /usr/local/zookeeper:
  hduser@HDMachine:~/zookeeper-3.4.10$ sudo mv * /usr/local/zookeeper
  [sudo] password for hduser:
Configuring ZooKeeper
Copy ZooKeeper's sample configuration file in the conf directory and rename it zoo.cfg:
  hduser@HDMachine:~$ cp /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg
Starting ZooKeeper
Start ZooKeeper with the zkServer.sh script in the bin directory:
  hduser@HDMachine:/usr/local/zookeeper$ bin/zkServer.sh start
If you see the following output, ZooKeeper started successfully:
  ZooKeeper JMX enabled by default
  Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
  Starting zookeeper ... STARTED

1.4 Installing Accumulo

Downloading Accumulo
Download the Accumulo 1.7 package from the Accumulo download site with wget and extract it:
  hduser@HDMachine:~$ wget https://www.apache.org/dyn/closer.lua/accumulo/1.7.3/accumulo-1.7.3-bin.tar.gz
  hduser@HDMachine:~$ tar xvzf accumulo-1.7.3-bin.tar.gz
For easier management, move Accumulo to /usr/local/accumulo:
  hduser@HDMachine:~/accumulo-1.7.3-bin$ sudo mv * /usr/local/accumulo
  [sudo] password for hduser:
Configuring Accumulo
Accumulo ships with example configurations for servers of various memory sizes: 512 MB, 1 GB, 2 GB, and 3 GB. This article uses the 512 MB configuration; choose the set that matches your server. Copy the 512 MB configuration files into the conf directory:
  hduser@HDMachine:~$ cp /usr/local/accumulo/conf/examples/512MB/standalone/* /usr/local/accumulo/conf/
Configuring ~/.bashrc
Edit ~/.bashrc with vi and set the HADOOP_HOME and ZOOKEEPER_HOME environment variables:
  hduser@HDMachine:~$ sudo vi ~/.bashrc
  export HADOOP_HOME=/usr/local/hadoop/
  export ZOOKEEPER_HOME=/usr/local/zookeeper/
Configuring accumulo-env.sh
Open accumulo-env.sh with vi and set ACCUMULO_MONITOR_BIND_ALL to true:
  hduser@HDMachine:~$ sudo vi /usr/local/accumulo/conf/accumulo-env.sh
  export ACCUMULO_MONITOR_BIND_ALL="true"
By default Accumulo's HTTP monitor binds only to the local network interface; to make it reachable from other machines, ACCUMULO_MONITOR_BIND_ALL must be set to true.
Configuring accumulo-site.xml
Accumulo's worker processes use a shared secret to communicate with one another; in accumulo-site.xml this should be changed to a secure value. Find instance.secret and change its value; this article uses PASS1234. The modified property looks like this:
  <property>
    <name>instance.secret</name>
    <value>PASS1234</value>
    <description>A secret unique to a given instance that all servers must know in order to communicate with one another.
    Change it before initialization. To
    change it later use ./bin/accumulo org.apache.accumulo.server.util.ChangeSecret --old [oldpasswd] --new [newpasswd],
    and then update this file.
    </description>
  </property>
Then add the instance.volumes property to accumulo-site.xml; it specifies the HDFS path where Accumulo stores its data:
  <property>
    <name>instance.volumes</name>
    <value>hdfs://localhost:54310/accumulo</value>
  </property>
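The instance.volumes value is simply the HDFS URI from core-site.xml with an /accumulo path appended; as a quick sketch:

```shell
# Compose instance.volumes from the fs.default.name URI set earlier
# in core-site.xml plus the directory Accumulo should own in HDFS.
FS_DEFAULT=hdfs://localhost:54310
INSTANCE_VOLUMES="${FS_DEFAULT}/accumulo"
echo "$INSTANCE_VOLUMES"   # prints hdfs://localhost:54310/accumulo
```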
Finally, find the trace.token.property.password option in accumulo-site.xml and change its value to a secure password. This password is used when Accumulo is initialized:
  <property>
    <name>trace.token.property.password</name>
    <value>mypassw</value>
  </property>
Initializing Accumulo
Initialize Accumulo with the accumulo script in the bin directory:
  hduser@HDMachine:/usr/local/accumulo$ bin/accumulo init
  2017-05-03 16:38:11,332 [conf.ConfigSanityCheck] WARN : Use of instance.dfs.uri and instance.dfs.dir are deprecated. Consider using instance.volumes instead.
  2017-05-03 16:38:12,800 [fs.VolumeManagerImpl] WARN : dfs.datanode.synconclose set to false in hdfs-site.xml: data loss is possible on hard system reset or power loss
  2017-05-03 16:38:12,802 [init.Initialize] INFO : Hadoop Filesystem is hdfs://localhost:54310
  2017-05-03 16:38:12,803 [init.Initialize] INFO : Accumulo data dirs are [hdfs://localhost:54310/accumulo]
  2017-05-03 16:38:12,803 [init.Initialize] INFO : Zookeeper server is localhost:2181
  2017-05-03 16:38:12,803 [init.Initialize] INFO : Checking if Zookeeper is available. If this hangs, then you need to make sure zookeeper is running
  Instance name : geomesa
  Enter initial password for root (this may not be applicable for your security setup): ******
  Confirm initial password for root: ******
  2017-05-03 16:38:28,350 [Configuration.deprecation] INFO : dfs.replication.min is deprecated. Instead, use dfs.namenode.replication.min
  2017-05-03 16:38:33,501 [Configuration.deprecation] INFO : dfs.block.size is deprecated. Instead, use dfs.blocksize
  2017-05-03 16:38:35,553 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKAuthorizor
  2017-05-03 16:38:35,568 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKAuthenticator
  2017-05-03 16:38:35,574 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKPermHandler
During initialization you are prompted for an instance name and a password. Here the instance name is set to geomesa, and the password is the trace.token.property.password value set in accumulo-site.xml.
Starting Accumulo
Start Accumulo with the start-all.sh script in the bin directory:
  hduser@HDMachine:/usr/local/accumulo$ ./bin/start-all.sh
  Starting monitor on localhost
  WARN : Max open files on localhost is 1024, recommend 32768
  Starting tablet servers .... done
  2017-05-03 16:44:46,682 [conf.ConfigSanityCheck] WARN : Use of instance.dfs.uri and instance.dfs.dir are deprecated. Consider using instance.volumes instead.
  2017-05-03 16:44:48,422 [fs.VolumeManagerImpl] WARN : dfs.datanode.synconclose set to false in hdfs-site.xml: data loss is possible on hard system reset or power loss
  2017-05-03 16:44:48,426 [server.Accumulo] INFO : Attempting to talk to zookeeper
  2017-05-03 16:44:48,578 [server.Accumulo] INFO : ZooKeeper connected and initialized, attempting to talk to HDFS
  2017-05-03 16:44:48,720 [server.Accumulo] INFO : Connected to HDFS
  Starting tablet server on localhost
  WARN : Max open files on localhost is 1024, recommend 32768
  Starting master on localhost
  WARN : Max open files on localhost is 1024, recommend 32768
  Starting garbage collector on localhost
  WARN : Max open files on localhost is 1024, recommend 32768
  Starting tracer on localhost
  WARN : Max open files on localhost is 1024, recommend 32768
Web management page
Once Accumulo is running, open http://localhost:50095 to reach the Accumulo web management page, as shown in Figure 3-2.

1.5 Installing GeoServer

GeoServer is a J2EE implementation of the OpenGIS web server specifications. It makes it easy to publish map data and lets users update, delete, and insert feature data, so spatial information can be shared quickly between users. GeoServer supports WMS and WFS; reads PostgreSQL, Shapefile, ArcSDE, Oracle, VPF, MySQL, and MapInfo; supports hundreds of projections; can render maps to JPEG, GIF, PNG, SVG, KML, and other formats; runs in any J2EE/Servlet container; embeds MapBuilder and supports the AJAX map client OpenLayers; and includes many other features.
Downloading GeoServer
Download the GeoServer 2.9.1 package from the GeoServer download site with wget and extract it:
  hduser@HDMachine:~$ wget https://sourceforge.net/projects/geoserver/files/GeoServer/2.9.1/geoserver-2.9.1-bin.zip
  hduser@HDMachine:~$ unzip geoserver-2.9.1-bin.zip
For easier management, move GeoServer to /usr/local/geoserver:
  hduser@HDMachine:~/geoserver-2.9.1-bin$ sudo mv * /usr/local/geoserver
  [sudo] password for hduser:
Configuring ~/.bashrc
Open ~/.bashrc with vi and set the GEOSERVER_HOME environment variable:
  hduser@HDMachine:~$ vi ~/.bashrc
  export GEOSERVER_HOME=/usr/local/geoserver
Reload ~/.bashrc with source so the new variable takes effect:
  hduser@HDMachine:~$ source ~/.bashrc
Changing the owner of the geoserver directory
Use chown to change the owner of the geoserver directory to the hadoop user, hduser:
  hduser@HDMachine:~$ sudo chown -R hduser /usr/local/geoserver/
Starting GeoServer
Enter the geoserver/bin directory and run startup.sh to start GeoServer:
  hduser@HDMachine:~$ cd /usr/local/geoserver/bin
  hduser@HDMachine:/usr/local/geoserver/bin$ ./startup.sh
Opening the GeoServer web console
Open http://localhost:8080/geoserver in a web browser to reach the GeoServer web console, as shown in Figure 3-3. You can log in with the default credentials admin/geoserver to administer it.

1.6 Installing GeoMesa Accumulo

GeoMesa Accumulo can be installed in two ways: from the prebuilt binary distribution, or by building from source.
1.6.1 Installing from the binary distribution
Installing from the prebuilt binary distribution is straightforward: download the package and extract it:
  $ wget http://repo.locationtech.org/content/repositories/geomesa-releases/org/locationtech/geomesa/geomesa-accumulo-dist_2.11/$VERSION/geomesa-accumulo-dist_2.11-$VERSION-bin.tar.gz
  $ tar xvf geomesa-accumulo-dist_2.11-$VERSION-bin.tar.gz
  $ cd geomesa-accumulo-dist_2.11-$VERSION
  $ ls
  bin/ conf/ dist/ docs/ emr4/ examples/ lib/ LICENSE.txt logs/
Here $VERSION is the GeoMesa Accumulo version; at the time of writing the latest release is 1.3.1.
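For example, with the 1.3.1 release the variable and the resulting artifact name would be (a sketch; substitute the version you need):

```shell
# Set VERSION once and reuse it to build the download file name,
# matching the geomesa-accumulo-dist_2.11-$VERSION pattern above.
VERSION=1.3.1
TARBALL="geomesa-accumulo-dist_2.11-${VERSION}-bin.tar.gz"
echo "$TARBALL"   # prints geomesa-accumulo-dist_2.11-1.3.1-bin.tar.gz
```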
1.6.2 Building from source
1) Prerequisites
Before building the code, make sure the following are installed:
  • Java JDK 8
  • Apache Maven 3.2.2+
  • a git client
2) Getting the code
Clone the repository with git:
  $ git clone https://github.com/locationtech/geomesa.git
  $ cd geomesa
Check out the latest release tag, currently 1.3.1 ($VERSION=1.3.1):
  $ git checkout tags/geomesa-$VERSION -b geomesa-$VERSION
3) Building the code
Build with Maven; the project file pom.xml is in the root of the repository:
  $ mvn clean install
Setting the skipTests property to true speeds up the build:
  $ mvn clean install -DskipTests=true
The build/mvn wrapper performs an incremental build based on Zinc:
  $ build/mvn clean install
1.6.3 Installing the Accumulo distributed runtime
The geomesa-accumulo-dist_2.11-$VERSION/dist/accumulo directory contains the libraries needed by the Accumulo server side at runtime; they must be deployed to every tablet server in the Accumulo cluster. Note that the directory contains two runtime JARs, one with raster support and one without. Install only one of them; installing both causes problems. Also, the version of the runtime JAR must match the version of the GeoMesa data store client library (used in GeoServer), otherwise queries may not work correctly.
1) Manual installation
Copy the runtime JAR into the $ACCUMULO_HOME/lib/ext directory on every tablet server in the cluster:
  # something like this for each tablet server
  $ scp dist/accumulo/geomesa-accumulo-distributed-runtime_2.11-$VERSION.jar \
  tserver1:$ACCUMULO_HOME/lib/ext
  # or for raster support
  $ scp dist/accumulo/geomesa-accumulo-distributed-runtime-raster_2.11-$VERSION.jar \
  tserver1:$ACCUMULO_HOME/lib/ext
Note that the Accumulo master server does not need the runtime JAR.
2) Namespace installation
Installing the runtime JAR manually, as above, is guaranteed to work. However, from Accumulo 1.6+ you can use a namespace to isolate the GeoMesa classpath from the rest of Accumulo. The setup-namespace.sh script in geomesa-accumulo-dist_2.11-$VERSION/bin performs a namespace-based installation:
  ./setup-namespace.sh -u myUser -n myNamespace
setup-namespace.sh accepts the following parameters:
  • -u <Accumulo username>
  • -n <Accumulo namespace>
  • -p <Accumulo password> (optional; prompted for if not provided)
  • -g <path of GeoMesa distributed runtime JAR> (optional; defaults to the one in the distribution folder, without raster support)
  • -h <HDFS URI, e.g. hdfs://localhost:54310> (optional; prompted for if not provided)
Alternatively, the distributed runtime can be installed manually with the following commands:
  $ accumulo shell -u root
  > createnamespace myNamespace
  > grant NameSpace.CREATE_TABLE -ns myNamespace -u myUser
  > config -s general.vfs.context.classpath.myNamespace=hdfs://NAME_NODE_FDQN:54310/accumulo/classpath/myNamespace/[^.].*.jar
  > config -ns myNamespace -s table.classpath.context=myNamespace
After running these commands, copy the distributed runtime JAR by hand into the HDFS directory specified above. The directory used here is only an example; you can use nested folders that include the project name, version number, and other information, so that different versions of GeoMesa can coexist on the same Accumulo instance.
1.6.4 Configuring the Accumulo command line tools
The geomesa-accumulo_2.11-$VERSION/bin/ directory contains command line tools that help manage Accumulo; the geomesa-env.sh script in the same directory can be used to set environment variables. From the geomesa-accumulo_2.11-$VERSION directory, run bin/geomesa configure to configure the tools:
  ### in geomesa-accumulo_2.11-$VERSION/:
  $ bin/geomesa configure
  Warning: GEOMESA_ACCUMULO_HOME is not set, using /path/to/geomesa-accumulo_2.11-$VERSION
  Using GEOMESA_ACCUMULO_HOME as set: /path/to/geomesa-accumulo_2.11-$VERSION
  Is this intentional? Y\n y
  Warning: GEOMESA_LIB already set, probably by a prior configuration.
  Current value is /path/to/geomesa-accumulo_2.11-$VERSION/lib.
  Is this intentional? Y\n y
  To persist the configuration please update your bashrc file to include:
  export GEOMESA_ACCUMULO_HOME=/path/to/geomesa-accumulo_2.11-$VERSION
  export PATH=${GEOMESA_ACCUMULO_HOME}/bin:$PATH
Then edit ~/.bashrc and add the following lines:
  export GEOMESA_ACCUMULO_HOME=/path/to/geomesa-accumulo_2.11-$VERSION
  export PATH=${GEOMESA_ACCUMULO_HOME}/bin:$PATH
Save the file and reload it:
  $ source ~/.bashrc
Due to licensing restrictions, the dependencies for shapefile and raster support must be installed separately:
  $ bin/install-jai.sh
  $ bin/install-jline.sh
To test the GeoMesa command line tools, simply run geomesa:
  $ geomesa
  Using GEOMESA_ACCUMULO_HOME = /path/to/geomesa-accumulo-dist_2.11-$VERSION
  Usage: geomesa [command] [command options]
  Commands:
  ...
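The PATH line written to ~/.bashrc above prepends the GeoMesa bin directory, so the geomesa launcher is found before anything else on the PATH. A minimal illustration, using a hypothetical install path:

```shell
# Prepend the GeoMesa bin directory to PATH, as the configure step
# recommends. /opt/geomesa-accumulo_2.11-1.3.1 is a hypothetical path.
GEOMESA_ACCUMULO_HOME=/opt/geomesa-accumulo_2.11-1.3.1
PATH="${GEOMESA_ACCUMULO_HOME}/bin:$PATH"
echo "$PATH" | cut -d: -f1   # prints /opt/geomesa-accumulo_2.11-1.3.1/bin
```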
1.6.5 Installing the GeoMesa Accumulo plugin in GeoServer
GeoMesa implements the GeoTools data store interface, so GeoMesa Accumulo can be used as a data source in GeoServer. Once GeoServer is running, the GeoServer WPS plugin must be installed; see the GeoServer installation documentation for details. The GeoMesa Accumulo plugin can then be installed with the manage-geoserver-plugins.sh script in the bin directory:
  $ bin/manage-geoserver-plugins.sh --lib-dir /path/to/geoserver/WEB-INF/lib/ --install
  Collecting Installed Jars
  Collecting geomesa-gs-plugin Jars
  Please choose which modules to install
  Multiple may be specified, eg: 1 4 10
  Type 'a' to specify all
  --------------------------------------
  0 | geomesa-accumulo-gs-plugin_2.11-$VERSION
  1 | geomesa-blobstore-gs-plugin_2.11-$VERSION
  2 | geomesa-process_2.11-$VERSION
  3 | geomesa-stream-gs-plugin_2.11-$VERSION
  Module(s) to install: 0 1
  0 | Installing geomesa-accumulo-gs-plugin_2.11-$VERSION-install.tar.gz
  1 | Installing geomesa-blobstore-gs-plugin_2.11-$VERSION-install.tar.gz
  Done
To install manually instead, extract the geomesa-accumulo-gs-plugin_2.11-$VERSION-install.tar.gz file from the geomesa-accumulo_2.11-$VERSION/dist/geoserver/ directory into GeoServer's lib directory. If GeoServer is deployed under Tomcat:
  $ tar -xzvf \
  geomesa-accumulo_2.11-$VERSION/dist/geoserver/geomesa-accumulo-gs-plugin_2.11-$VERSION-install.tar.gz \
  -C /path/to/tomcat/webapps/geoserver/WEB-INF/lib/
If GeoServer uses its built-in Jetty:
  $ tar -xzvf \
  geomesa-accumulo_2.11-$VERSION/dist/geoserver/geomesa-accumulo-gs-plugin_2.11-$VERSION-install.tar.gz \
  -C /path/to/geoserver/webapps/geoserver/WEB-INF/lib/
Some additional JARs, such as those for Accumulo, Zookeeper, Hadoop, and Thrift, must also be copied into GeoServer's WEB-INF/lib directory. The $GEOMESA_ACCUMULO_HOME/bin/install-hadoop-accumulo.sh script installs these dependencies conveniently:
  $ $GEOMESA_ACCUMULO_HOME/bin/install-hadoop-accumulo.sh /path/to/tomcat/webapps/geoserver/WEB-INF/lib/
  Install accumulo and hadoop dependencies to /path/to/tomcat/webapps/geoserver/WEB-INF/lib/?
  Confirm? [Y/n]y
  fetching https://search.maven.org/remotecontent?filepath=org/apache/accumulo/accumulo-core/1.6.5/accumulo-core-1.6.5.jar
  --2015-09-29 15:06:48-- https://search.maven.org/remotecontent?filepath=org/apache/accumulo/accumulo-core/1.6.5/accumulo-core-1.6.5.jar
  Resolving search.maven.org (search.maven.org)... 207.223.241.72
  Connecting to search.maven.org (search.maven.org)|207.223.241.72|:443... connected.
  HTTP request sent, awaiting response... 200 OK
  Length: 4646545 (4.4M) [application/java-archive]
  Saving to: '/path/to/tomcat/webapps/geoserver/WEB-INF/lib/accumulo-core-1.6.5.jar'
  ...
With the GeoMesa Accumulo plugin installed, Accumulo can be used as a data source in GeoServer; its use is covered in later chapters.

Original article: http://www.giser.net/?p=1559