Add Hadoop's classpath information to the CLASSPATH variable by appending the following lines to ~/.bashrc:
export HADOOP_HOME=/home/hadoop/app/hadoop
export JAVA_HOME=/home/hadoop/app/java/jdk
export SCALA_HOME=/home/hadoop/app/scala
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:${SCALA_HOME}/bin:${SPARK_HOME}/bin
export CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH
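To confirm that the Hadoop jars actually land on the classpath, reload the shell configuration and inspect the result (a minimal check, assuming the installation paths above; note that ${SPARK_HOME} is referenced in PATH but its export is not shown here and must also be defined):
source ~/.bashrc
hadoop classpath                      # prints the jar and conf directories that $(hadoop classpath) expands to
echo $CLASSPATH | tr ':' '\n' | head  # the same entries should now appear in CLASSPATH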
Add the following environment variables in spark-env.sh:
export JAVA_HOME=/home/hadoop/app/jdk
export SCALA_HOME=/home/hadoop/app/scala
export SPARK_MASTER_IP=zhangge
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=1g
export HADOOP_CONF_DIR=/home/hadoop/app/hadoop/etc/hadoop
export SPARK_DIST_CLASSPATH=$(hadoop classpath)  # a critically important setting; see the link below for an explanation
https://spark.apache.org/docs/latest/hadoop-provided.html#using-sparks-hadoop-free-build
- Using Spark's "Hadoop Free" Build
- Spark uses Hadoop client libraries for HDFS and YARN. Starting in version 1.4, the project packages “Hadoop free” builds that let you more easily connect a single Spark binary to any Hadoop version. To use these builds, you need to modify SPARK_DIST_CLASSPATH to include Hadoop's package jars. The most convenient place to do this is by adding an entry in conf/spark-env.sh.
- This page describes how to connect Spark to Hadoop for different types of distributions.
- Apache Hadoop
- For Apache distributions, you can use Hadoop's ‘classpath’ command. For instance:
### in conf/spark-env.sh ###

# If 'hadoop' binary is on your PATH
export SPARK_DIST_CLASSPATH=$(hadoop classpath)

# With explicit path to 'hadoop' binary
export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)

# Passing a Hadoop configuration directory
export SPARK_DIST_CLASSPATH=$(hadoop --config /path/to/configs classpath)
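With SPARK_DIST_CLASSPATH set, a quick smoke test is to pipe a one-liner into a local-mode spark-shell and read a file from HDFS; this is only a sketch, and the path hdfs:///tmp/test.txt is a hypothetical example that must already exist on your cluster:
# Hypothetical smoke test: if the Hadoop jars are picked up, this counts the lines of an HDFS file
echo 'sc.textFile("hdfs:///tmp/test.txt").count()' | spark-shell --master local[2]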