EMR memory errors and job failures occur on large runs but not on smaller test runs. To save time and expense, I check the job logs to confirm that the expected parameters were handed over from boto to hadoop, in
http://s3job_flow_bucket/j_123ABC/jobs/job_..._conf.xml
Question 1: Is this the correct place to look for the purpose above?
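For what it's worth, that check can be scripted rather than done by hand in the S3 console. Here is a minimal sketch using boto's S3 API; the bucket name and job flow id are the placeholders from the URL above, and the assumption that name/value pairs sit adjacent in the conf XML may need adjusting:
import re
import boto

s3 = boto.connect_s3()
bucket = s3.get_bucket('s3job_flow_bucket')        # placeholder bucket name
for key in bucket.list(prefix='j_123ABC/jobs/'):   # placeholder job flow id
    if key.name.endswith('_conf.xml'):
        conf = key.get_contents_as_string()
        for prop in ('mapred.child.java.opts',
                     'mapred.cluster.reduce.memory.mb',
                     'mapred.job.reduce.memory.mb'):
            m = re.search(r'<name>%s</name>\s*<value>([^<]+)</value>'
                          % re.escape(prop), conf)
            print '%s -> %s' % (prop, m.group(1) if m else 'NOT FOUND')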
What I observe is that some of my bootstrap action parameters are not actually applied by hadoop. Also, adding a wrong or bogus parameter doesn't generate any warning in the logs (that I can find).
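One place a bad parameter might surface is the bootstrap action's own stdout/stderr, which EMR writes under node/<instance-id>/bootstrap-actions/ in the log bucket. A sketch that dumps whatever is there, assuming the log_uri from the code below and that the layout is as described:
import boto

s3 = boto.connect_s3()
bucket = s3.get_bucket('thebucket')                     # bucket from log_uri below
for key in bucket.list(prefix='jobflowlogs/j_123ABC/node/'):   # placeholder id
    if '/bootstrap-actions/' in key.name:
        print '==== %s ====' % key.name
        print key.get_contents_as_string()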
Question 2: If I use the following bootstrap action parameters, how can I confirm that the hadoop call actually sees the request?
At this point I have to wait over an hour for the memory error to occur. There has to be a more efficient way to debug such memory issues; one candidate, sketched after the code below, is to keep the cluster alive and read the live configuration off the master node.
from boto.emr.bootstrap_action import BootstrapAction

# Each '-m' pair sets one mapred-site.xml property via the
# configure-hadoop bootstrap action.
params = ['-m', 'mapred.child.java.opts=-Xmx2g',
          '-m', 'mapred.cluster.reduce.memory.mb=2000',
          '-m', 'mapred.job.reduce.memory.mb=2000']

config_bootstrapper = BootstrapAction(
    name="Bootstrap name",
    path='s3://elasticmapreduce/bootstrap-actions/configure-hadoop',
    bootstrap_action_args=params)

# conn is an EmrConnection and step is defined earlier (not shown).
jobid = conn.run_jobflow(name='The Debug Jobflow',
                         #api_params=api_params,
                         #ec2_keyname="thekey",
                         bootstrap_actions=[config_bootstrapper],
                         ami_version='latest',
                         log_uri='s3://thebucket/jobflowlogs',
                         master_instance_type='m1.medium',
                         slave_instance_type='m1.medium',
                         num_instances=4,
                         steps=[step],
                         enable_debugging=True,
                         keep_alive=False)
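To answer Question 2 without the hour-long wait, one approach (a sketch, not something I have from the docs) is to launch with keep_alive=True so the cluster survives, poll until it leaves the bootstrap phase, then SSH to the master and read the live mapred-site.xml. boto can report the state and the SSH target; the conf path on the node is an assumption:
import time

# Reuses conn and jobid from above; masterpublicdnsname is boto's
# lowercased mapping of the API's MasterPublicDnsName field.
jf = conn.describe_jobflow(jobid)
while jf.state not in ('WAITING', 'RUNNING', 'COMPLETED', 'FAILED'):
    time.sleep(30)
    jf = conn.describe_jobflow(jobid)
print '%s %s' % (jf.state, jf.masterpublicdnsname)
# Then SSH in (requires ec2_keyname to be set above) and check:
#   grep -A1 'mapred.child.java.opts' /home/hadoop/conf/mapred-site.xml
If the value is missing there, the bootstrap action never applied it, and the memory error was going to happen regardless of the step.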