My question is straightforward.
I want to run a MapReduce job on EC2.
I have mapper.py, reducer.py, helper.py, and a package.
Basically, mapper.py calls helper.py, and helper.py imports from the modules in the package (which is a bunch of Python files).
What should my command look like when I run the Hadoop job?
Should I use -file or -cache? I tried both, but neither works.
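
To make the import chain concrete, here is a minimal local sketch of one possible layout, assuming the package is bundled as a zip archive and that archive is put on sys.path before helper.py does its imports. The names mypkg, textutil, and normalize are hypothetical stand-ins for the real package. This only shows that Python can import from a zip on sys.path; whether the archive actually lands in the task's working directory on the cluster depends on how the job ships it, which is the part I am unsure about.

```python
import os
import sys
import tempfile
import zipfile


def build_package_zip(zip_path):
    """Write a tiny stand-in package (hypothetical name: mypkg) into a zip."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("mypkg/__init__.py", "")
        zf.writestr(
            "mypkg/textutil.py",
            "def normalize(word):\n    return word.strip().lower()\n",
        )


def enable_zip_imports(zip_path):
    """Put the archive on sys.path so 'import mypkg' resolves from the zip."""
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)


# Build the archive in a temp directory; on a cluster it would instead be
# shipped alongside mapper.py (assumption: it ends up in the working dir).
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "mypkg.zip")
build_package_zip(zip_path)
enable_zip_imports(zip_path)

# This is the kind of import helper.py would perform.
from mypkg.textutil import normalize

result = normalize("  Hadoop ")
print(result)
```

In this layout, mapper.py would run the sys.path step first and only then import helper.py, so that helper.py's package imports resolve.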