Thursday, January 29, 2015

Trying to build an automation script on AWS Data Pipeline

I am trying to use the AWS Data Pipeline service in the following manner:



  1. Select the activity type as ShellCommandActivity, with the script URI set (to an S3 bucket) and "Stage input" set to true.

  2. Set the resource type of the activity as EC2.

  3. Use S3 as a data node.

  4. For the EC2 resource, I have selected the instance type as t2.medium and the Image ID as a custom AMI created by me.

  5. Schedule the pipeline to run every day at 10 PM.
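
The setup above corresponds roughly to a pipeline definition like the following. This is only a sketch: the object IDs, bucket name, AMI ID, and start time are placeholders, not the values from my actual pipeline.

```json
{
  "objects": [
    {
      "id": "DailySchedule",
      "type": "Schedule",
      "period": "1 day",
      "startDateTime": "2015-01-29T22:00:00"
    },
    {
      "id": "Ec2Instance",
      "type": "Ec2Resource",
      "instanceType": "t2.medium",
      "imageId": "ami-xxxxxxxx",
      "schedule": { "ref": "DailySchedule" }
    },
    {
      "id": "S3InputData",
      "type": "S3DataNode",
      "directoryPath": "s3://my-bucket/input/",
      "schedule": { "ref": "DailySchedule" }
    },
    {
      "id": "RunMyScript",
      "type": "ShellCommandActivity",
      "scriptUri": "s3://my-bucket/scripts/run.sh",
      "stage": "true",
      "input": { "ref": "S3InputData" },
      "runsOn": { "ref": "Ec2Instance" },
      "schedule": { "ref": "DailySchedule" }
    }
  ]
}
```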


The script specified in step 1 (i.e. as part of the script URI in the activity) has two lines: one to copy the S3 bucket data to the instance, and one to run the python command that executes my program. The AMI I have created is based on an Ubuntu EC2 instance, and it contains the Python software I need as well as the code I would like to run.
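
Concretely, the script at the script URI is just something like this (the bucket and file names below are placeholders; also, since "Stage input" is set to true, the input data should additionally be staged automatically under `${INPUT1_STAGING_DIR}`):

```shell
#!/bin/bash
# Placeholder paths: my real bucket and program names differ.
# Line 1: copy the S3 data onto the instance.
aws s3 cp s3://my-bucket/input/ /home/ubuntu/data/ --recursive
# Line 2: run the python program that is baked into the AMI.
python /home/ubuntu/code/my_program.py /home/ubuntu/data/
```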


Now, on initiation of the pipeline, I notice that the EC2 instance is indeed created and the S3 data is copied and made available to the instance, but the python command is not run. The instance stays in the running state, the pipeline remains in the "waiting for runner" state for some time, and then the pipeline fails with the message: "Resource stalled".
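
In case it helps with diagnosis: since the pipeline sits in "waiting for runner", one check (assuming SSH access to the launched instance) is whether a Task Runner process ever started there. Task Runner is a Java program, so the custom AMI would also need a working Java:

```shell
# On the EC2 instance launched by the pipeline:
java -version                     # Task Runner requires Java
ps aux | grep -i '[t]askrunner'   # any Task Runner process running?
```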


Can someone please let me know if I am doing something wrong, why my python code is not being executed, or why I am getting the "Resource stalled" error? The code works fine if I run it manually, outside the pipeline.


Thanks in advance!!




