After starting the datanode with

$HADOOP_PREFIX/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode

jps shows no DataNode process, and the datanode log contains the following errors:
2020-08-29 12:49:13,822 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid dfs.datanode.data.dir /data/datanode :
EPERM: Operation not permitted
	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
	at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:729)
	at org.apache.hadoop.fs.ChecksumFileSystem$1.apply(ChecksumFileSystem.java:505)
	at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:486)
	at org.apache.hadoop.fs.ChecksumFileSystem.setPermission(ChecksumFileSystem.java:502)
	at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:140)
	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:156)
	at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2385)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2427)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2409)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2301)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2348)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2530)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2554)
2020-08-29 12:49:13,824 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.io.IOException: All directories in dfs.datanode.data.dir are invalid: "/data/datanode/"
	at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2436)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2409)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2301)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2348)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2530)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2554)
2020-08-29 12:49:13,825 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2020-08-29 12:49:13,827 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at slave1/192.168.90.36
************************************************************/
Invalid dfs.datanode.data.dir /data/datanode :
EPERM: Operation not permitted
This message tells us that, at startup, the datanode lacks some permission when operating on the /data/datanode directory. So what exactly is the datanode doing to this directory, and what permission does it need?
EPERM: Operation not permitted
	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
	at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:729)
	at org.apache.hadoop.fs.ChecksumFileSystem$1.apply(ChecksumFileSystem.java:505)
	at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:486)
	at org.apache.hadoop.fs.ChecksumFileSystem.setPermission(ChecksumFileSystem.java:502)
	at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:140)
	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:156)
	at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2385)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2427)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2409)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2301)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2348)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2530)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2554)
From this stack trace we can infer that the datanode tries to chmod the configured data directory at startup.
static DataNode makeInstance(Collection<StorageLocation> dataDirs,
    Configuration conf, SecureResources resources) throws IOException {
  LocalFileSystem localFS = FileSystem.getLocal(conf);
  FsPermission permission = new FsPermission(
      conf.get(DFS_DATANODE_DATA_DIR_PERMISSION_KEY,
               DFS_DATANODE_DATA_DIR_PERMISSION_DEFAULT));
  DataNodeDiskChecker dataNodeDiskChecker =
      new DataNodeDiskChecker(permission);
  List<StorageLocation> locations =
      checkStorageLocations(dataDirs, localFS, dataNodeDiskChecker);
  DefaultMetricsSystem.initialize("DataNode");
  assert locations.size() > 0 : "number of data directories should be > 0";
  return new DataNode(conf, locations, resources);
}
The Hadoop source shows that the datanode needs to chmod each data directory to DFS_DATANODE_DATA_DIR_PERMISSION_DEFAULT, which is 700.
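That default comes from the config key dfs.datanode.data.dir.perm (DFS_DATANODE_DATA_DIR_PERMISSION_KEY in the snippet above). If a deployment genuinely needs a different mode, the key can be overridden in hdfs-site.xml; this is only a sketch of where the value lives, not something this cluster required:

```xml
<!-- hdfs-site.xml: mode applied to each dfs.datanode.data.dir entry.
     700 is the built-in default; shown only to illustrate the key. -->
<property>
  <name>dfs.datanode.data.dir.perm</name>
  <value>700</value>
</property>
```

Note that changing this value would not have fixed the failure here: the chmod itself is what requires ownership.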
In my environment, however, /data/datanode is owned by deploy, while the datanode is started as the hdfs user, and hdfs has no permission to change this directory's file mode (chmod is only permitted for the file's owner, or root):
[deploy@slave1 ~]$ ll /data/
total 0
drwxrwxr-x 2 deploy hadoop   6 Aug 28 15:48 datanode
drwxrwxr-x 2 deploy hadoop 115 Aug 28 19:50 logs
drwxrwxr-x 2 deploy hadoop   6 Aug 28 15:48 namenode
drwxrwxr-x 2 deploy hadoop  38 Aug 28 19:50 pids
drwxrwxr-x 4 deploy hadoop  80 Aug 28 16:22 software
drwxrwxr-x 3 deploy hadoop  26 Aug 28 19:53 tmp
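A quick pre-flight check can catch this mismatch before the daemon is even started. The sketch below is a hypothetical helper (not part of Hadoop) that compares a directory's owner against the user that will run the DataNode:

```shell
# check_dir_owner: warn when DIR is not owned by RUN_AS. DataNode's
# DiskChecker chmods each data dir to the expected mode, and chmod(2)
# returns EPERM unless the caller owns the file (or is root).
check_dir_owner() {
  dir="$1"; run_as="$2"
  # stat -c is GNU coreutils; stat -f is the BSD/macOS fallback
  owner=$(stat -c '%U' "$dir" 2>/dev/null || stat -f '%Su' "$dir")
  if [ "$owner" = "$run_as" ]; then
    echo "OK"
  else
    echo "WARN: $dir owned by '$owner', not '$run_as'; chmod will fail with EPERM"
  fi
}

# In the setup above, this would print the warning:
#   check_dir_owner /data/datanode hdfs
```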
Since the problem is directory ownership, the fix is simply to change the owner of /data/datanode to hdfs on each datanode host:
sudo chown -R hdfs:hdfs /data/datanode/
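After the chown, the hdfs user owns the directory, so the disk checker's chmod to 700 can succeed. The mode transition itself can be sketched locally with a throwaway directory owned by the current user (GNU stat assumed, with a BSD fallback):

```shell
d=$(mktemp -d)
chmod 775 "$d"   # the mode the data dir had above (drwxrwxr-x)
chmod 700 "$d"   # what DiskChecker applies when the current mode differs from dfs.datanode.data.dir.perm
stat -c '%a' "$d" 2>/dev/null || stat -f '%Lp' "$d"   # prints 700
rm -rf "$d"
```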
Start the datanode again:
[hdfs@slave1 ~]$ $HADOOP_PREFIX/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
slave2: starting datanode, logging to /data/logs/hadoop-hdfs-datanode-slave2.out
slave1: starting datanode, logging to /data/logs/hadoop-hdfs-datanode-slave1.out
slave3: starting datanode, logging to /data/logs/hadoop-hdfs-datanode-slave3.out
[hdfs@slave1 ~]$ jps
6771 Jps
6668 DataNode
As you can see, the DataNode has now started successfully.
For simple problems like this one, it is worth trying to read the source code; Hadoop's source is at https://github.com/apache/hadoop