I tried to run the K-Means code on an AWS EMR cluster (1 master, 6 core nodes, m3.xlarge). http://ift.tt/1LJIMBB
I am getting Spark exceptions ("timeout", "recipient termination", and "Java heap space"). None of these errors come from the kmeans.py code itself, since I am running the built-in example unmodified. K-Means runs fine as long as the input data is up to 3 GB, but once I give it an input file larger than 3 GB, it fails with Spark exceptions.
Command used to run the code:
./bin/spark-submit examples/src/main/python/mllib/kmeans.py k > output.txt
Any ideas on what is happening?
Regards -Ashwin