Wednesday, September 16, 2015

Can't connect to Spark cluster in EC2 using pyspark

I've followed the instructions on the Spark website and I have 1 master and 1 slave running on Amazon EC2. However, I'm not able to connect to the master node using pyspark.

I can connect to the master node using SSH without any problem.

Here's my command:

spark-ec2 --key-pair=graph-cluster --identity-file=/Users/.ssh/pem.pem --region=us-east-1 --zone=us-east-1a launch graph-cluster
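If the launch succeeded, the same spark-ec2 script can report the master's hostname and open an SSH session; a minimal sketch, assuming the same key pair, identity file, and region as the launch command above:

# Print the public hostname of the cluster's master node
spark-ec2 --key-pair=graph-cluster --identity-file=/Users/.ssh/pem.pem --region=us-east-1 get-master graph-cluster

# SSH into the master through the same script
spark-ec2 --key-pair=graph-cluster --identity-file=/Users/.ssh/pem.pem --region=us-east-1 login graph-cluster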

I can go to http://ec2-54-152-xx-xxx.compute-1.amazonaws.com:8080 (the master's web UI) and see that Spark is up and running. I also see this Spark Master URL:

spark://ec2-54-152-xx-xxx.compute-1.amazonaws.com:7077
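Since that URL points at port 7077, the port can be probed from the local machine before attaching a shell; a quick reachability check (a sketch; nc flags vary slightly between platforms):

# Verify the master's cluster port is reachable from this machine
nc -vz ec2-54-152-xx-xxx.compute-1.amazonaws.com 7077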

However, when I run the command

MASTER=spark://ec2-54-152-xx-xx.compute-1.amazonaws.com:7077 pyspark

I get this error:

2015-09-16 15:39:31,800 ERROR actor.OneForOneStrategy (Slf4jLogger.scala:apply$mcV$sp(66)) -
java.lang.NullPointerException
    at org.apache.spark.deploy.client.AppClient$ClientActor$$anonfun$receiveWithLogging$1.applyOrElse(AppClient.scala:160)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
    at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
    at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
    at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
    at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
    at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at org.apache.spark.deploy.client.AppClient$ClientActor.aroundReceive(AppClient.scala:61)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
    at akka.dispatch.Mailbox.run(Mailbox.scala:220)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2015-09-16 15:39:31,804 INFO  client.AppClient$ClientActor (Logging.scala:logInfo(59)) - Connecting to master akka.tcp://sparkMaster@ec2-54-152-xx-xxx.compute-1.amazonaws.com:7077/user/Master...
2015-09-16 15:39:31,955 INFO  util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52333.
2015-09-16 15:39:31,956 INFO  netty.NettyBlockTransferService (Logging.scala:logInfo(59)) - Server created on 52333
2015-09-16 15:39:31,959 INFO  storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Trying to register BlockManager
2015-09-16 15:39:31,964 INFO  storage.BlockManagerMasterEndpoint (Logging.scala:logInfo(59)) - Registering block manager xxx:52333 with 265.1 MB RAM, BlockManagerId(driver, xxx, 52333)
2015-09-16 15:39:31,969 INFO  storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Registered BlockManager
2015-09-16 15:39:32,458 ERROR spark.SparkContext (Logging.scala:logError(96)) - Error initializing SparkContext.
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
    at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)
    at org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1503)
    at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2007)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:543)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
2015-09-16 15:39:32,460 INFO  spark.SparkContext (Logging.scala:logInfo(59)) - SparkContext already stopped.
Traceback (most recent call last):
  File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/pyspark/shell.py", line 43, in <module>
    sc = SparkContext(appName="PySparkShell", pyFiles=add_files)
  File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/pyspark/context.py", line 113, in __init__
    conf, jsc, profiler_cls)
  File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/pyspark/context.py", line 165, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/pyspark/context.py", line 219, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/http://ift.tt/1vNTRoR", line 701, in __call__
  File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/http://ift.tt/10PrSvE", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
    at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)
    at org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1503)
    at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2007)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:543)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:214)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
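
For completeness, the master URL can also be passed as a flag instead of the MASTER environment variable; a minimal sketch using the same redacted hostname (this only changes how the URL is handed to the shell, not the connection itself):

# Equivalent to MASTER=... pyspark; uses the spark-submit-style options in Spark 1.x
pyspark --master spark://ec2-54-152-xx-xxx.compute-1.amazonaws.com:7077

If the shell does come up, sc.master inside it should echo the same spark:// URL.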



