I tried to run an mrjob script on Amazon EMR. It worked well with the c1.medium instance type, but it failed with an error when I changed the instance type to t2.micro. The full error message is shown below.
C:\Users\Administrator\MyIpython>python word_count.py -r emr 111.txt
using configs in C:\Users\Administrator.mrjob.conf
creating new scratch bucket mrjob-875a948553aab9e8
using s3://mrjob-875a948553aab9e8/tmp/ as our scratch dir on S3
creating tmp directory c:\users\admini~1\appdata\local\temp\word_count.Administrator.20150731.013007.592000
writing master bootstrap script to c:\users\admini~1\appdata\local\temp\word_count.Administrator.20150731.013007.592000\b.py
PLEASE NOTE: Starting in mrjob v0.5.0, protocols will be strict by default. It's recommended you run your job with --strict-protocols or set up mrjob.conf as described at http://ift.tt/1IvHDtU
creating S3 bucket 'mrjob-875a948553aab9e8' to use as scratch space
Copying non-input files into s3://mrjob-875a948553aab9e8/tmp/word_count.Administrator.20150731.013007.592000/files/
Waiting 5.0s for S3 eventual consistency
Creating Elastic MapReduce job flow
Traceback (most recent call last):
  File "word_count.py", line 16, in <module>
    MRWordFrequencyCount.run()
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\job.py", line 461, in run
    mr_job.execute()
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\job.py", line 479, in execute
    super(MRJob, self).execute()
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\launch.py", line 153, in execute
    self.run_job()
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\launch.py", line 216, in run_job
    runner.run()
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\runner.py", line 470, in run
    self._run()
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\emr.py", line 881, in _run
    self._launch()
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\emr.py", line 886, in _launch
    self._launch_emr_job()
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\emr.py", line 1593, in _launch_emr_job
    persistent=False)
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\emr.py", line 1327, in _create_job_flow
    self._job_name, self._opts['s3_log_uri'], **args)
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\retry.py", line 149, in call_and_maybe_retry
    return f(*args, **kwargs)
  File "F:\Program Files\Anaconda\lib\site-packages\mrjob\retry.py", line 71, in call_and_maybe_retry
    result = getattr(alternative, name)(*args, **kwargs)
  File "F:\Program Files\Anaconda\lib\site-packages\boto\emr\connection.py", line 581, in run_jobflow
    'RunJobFlow', params, RunJobFlowResponse, verb='POST')
  File "F:\Program Files\Anaconda\lib\site-packages\boto\connection.py", line 1208, in get_object
    raise self.ResponseError(response.status, response.reason, body)
boto.exception.EmrResponseError: EmrResponseError: 400 Bad Request
Sender
ValidationError
Instance type 't2.micro' is not supported
c3ee1107-3723-11e5-8d8e-f1011298229d
This is my config file:
runners:
  emr:
    aws_access_key_id: xxxxxxxxxxx
    aws_secret_access_key: xxxxxxxxxxxxx
    aws_region: us-east-1
    ec2_key_pair: EMR
    ec2_key_pair_file: C:\Users\Administrator\EMR.pem
    ssh_tunnel_to_job_tracker: false
    ec2_instance_type: t2.micro
    num_ec2_instances: 2
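For comparison, here is a sketch of the same config with the instance type set back to c1.medium, the one setup that ran successfully for me. I am assuming every other setting stays the same, and the keys are placeholders as above. Going by the error message, EMR rejects the t2.micro value in ec2_instance_type, so that line is the only difference.

```yaml
runners:
  emr:
    aws_access_key_id: xxxxxxxxxxx
    aws_secret_access_key: xxxxxxxxxxxxx
    aws_region: us-east-1
    ec2_key_pair: EMR
    ec2_key_pair_file: C:\Users\Administrator\EMR.pem
    ssh_tunnel_to_job_tracker: false
    # c1.medium is the instance type that worked; t2.micro triggers
    # "Instance type 't2.micro' is not supported"
    ec2_instance_type: c1.medium
    num_ec2_instances: 2
```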