Friday, 2 October 2015

Which version of Spark to download?

I understand that you can download either the Spark source code (1.5.1) or prebuilt binaries for various versions of Hadoop. As of October 2015, the Spark download page http://ift.tt/1kIKRfk offers prebuilt binaries for Hadoop 2.6, 2.4, 2.3, and 1.X.

I'm not sure which version to download.

I want to run a Spark cluster in standalone mode on AWS machines. This means I need to access S3, but not HDFS. I do not have Hadoop installed.

I downloaded the Spark build for Hadoop 2.6. I can run it in local mode, for instance with the wordcount example. However, whenever I start it up, I get this message:

WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Is this a problem? Do I need Hadoop?

I currently cannot read files from S3. Is that related to the warning above?



