Tuesday, September 29, 2015

On Amazon EMR 4.0.0, setting /etc/spark/conf/spark-env.conf is ineffective

I'm launching my Spark-based HiveServer2 on Amazon EMR, and it has an extra classpath dependency. Due to this bug in Amazon EMR:

http://ift.tt/1iJodbh

my classpath cannot be submitted through the "--driver-class-path" option.

So I'm forced to modify /etc/spark/conf/spark-env.conf to add the extra classpath:

# Add Hadoop libraries to Spark classpath
SPARK_CLASSPATH="${SPARK_CLASSPATH}:${HADOOP_HOME}/*:${HADOOP_HOME}/../hadoop-hdfs/*:${HADOOP_HOME}/../hadoop-mapreduce/*:${HADOOP_HOME}/../hadoop-yarn/*:/home/hadoop/git/datapassport/*"

where "/home/hadoop/git/datapassport/*" is my classpath.
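As an aside, SPARK_CLASSPATH is deprecated in Spark 1.x in favor of the spark.driver.extraClassPath property, so one workaround sketch (an assumption, not a confirmed EMR fix) is to append the directory to that property in /etc/spark/conf/spark-defaults.conf instead. Shown here against a scratch copy of the file rather than the live one:

```shell
# Sketch: append the extra jars directory to an existing
# spark.driver.extraClassPath line in spark-defaults.conf.
# Using a temp file as a stand-in for /etc/spark/conf/spark-defaults.conf.
conf=$(mktemp)
echo 'spark.driver.extraClassPath /usr/lib/hadoop/*' > "$conf"

# Extend the existing value rather than replacing it, so the
# EMR-provisioned entries stay on the classpath.
sed 's|^\(spark\.driver\.extraClassPath.*\)$|\1:/home/hadoop/git/datapassport/*|' "$conf"
# prints: spark.driver.extraClassPath /usr/lib/hadoop/*:/home/hadoop/git/datapassport/*
```

Against the real file you would add `-i` to edit in place (with sudo) and restart the server so the driver picks up the new value.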

However, after launching the server successfully, the Spark environment parameters show that my change has no effect:

spark.driver.extraClassPath :/usr/lib/hadoop/*:/usr/lib/hadoop/../hadoop-hdfs/*:/usr/lib/hadoop/../hadoop-mapreduce/*:/usr/lib/hadoop/../hadoop-yarn/*:/etc/hive/conf:/usr/lib/hadoop/../hadoop-lzo/lib/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*
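That value matches what EMR provisions in spark-defaults.conf rather than anything from spark-env.conf, which suggests the env file is never read for this setting. One way to trace which conf file supplies a classpath is to grep the whole conf directory; a sketch against a scratch directory standing in for /etc/spark/conf (the file names mirror the EMR 4.x layout, an assumption):

```shell
# Hypothetical stand-in for /etc/spark/conf on EMR 4.x:
confdir=$(mktemp -d)
echo 'spark.driver.extraClassPath /usr/lib/hadoop/*' > "$confdir/spark-defaults.conf"
echo 'SPARK_CLASSPATH="${SPARK_CLASSPATH}:/home/hadoop/git/datapassport/*"' > "$confdir/spark-env.conf"

# List every conf file that sets some kind of classpath; the one Spark
# actually reads (spark-defaults.conf for extraClassPath) wins.
grep -il classpath "$confdir"/*
```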

Is this configuration file obsolete? Where is the new file, and how do I fix this problem?



