Sunday, October 11, 2015

AWS Lambda: How to Move a Large File from an FTP Server to S3

I wrote an AWS Lambda function that uses the ftplib, gzip, and boto3 libraries to move a file from an FTP server to an S3 bucket. It works well for small files, but for larger ones it runs into the AWS Lambda timeout limit before the transfer finishes. Are there any strategies for dealing with larger files (and larger workloads in general), such as splitting the work across several Lambda function instances and then stitching the results back together?

The essential part of the current code is:

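import ftplib
import gzip
import io

import boto3

fname = 'large_file_to_move.txt'
target_bucket = 'some_s3_bucket'
s3 = boto3.client('s3')

# host, user, and passwd are assumed to be defined elsewhere
ftp = ftplib.FTP(host, user, passwd)

# Download the whole file into an in-memory buffer
with io.BytesIO() as data:
    ftp.retrbinary('RETR ' + fname, data.write)

    # Compress the buffer into Lambda's /tmp scratch space
    with gzip.open('/tmp/' + fname + '.gz', 'wb') as compressed:
        compressed.write(data.getvalue())

    # Upload the compressed copy to the target bucket
    s3.upload_file('/tmp/' + fname + '.gz', target_bucket, fname + '.gz')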
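
One way the "split and stitch" idea from the question might look is sketched below. It is only an illustration under stated assumptions, not a tested solution: the helper names `plan_byte_ranges` and `fetch_range` are hypothetical, and the approach assumes the FTP server supports the REST command so that a download can start at a byte offset. Each range could then be handled by a separate Lambda invocation.

```python
import ftplib


def plan_byte_ranges(total_size, part_size):
    """Split [0, total_size) into consecutive (offset, length) ranges."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [
        (offset, min(part_size, total_size - offset))
        for offset in range(0, total_size, part_size)
    ]


def fetch_range(ftp, fname, offset, length, blocksize=64 * 1024):
    """Read up to `length` bytes of `fname`, starting at byte `offset`.

    Assumption: the server honours REST (restart at offset). The data
    connection is closed before the transfer completes, so the control
    connection should be treated as single-use after this call.
    """
    ftp.voidcmd('TYPE I')  # binary mode, so offsets count raw bytes
    conn = ftp.transfercmd('RETR ' + fname, rest=offset)
    chunks = []
    remaining = length
    try:
        while remaining > 0:
            block = conn.recv(min(blocksize, remaining))
            if not block:  # server closed the connection: end of file
                break
            chunks.append(block)
            remaining -= len(block)
    finally:
        conn.close()
    return b''.join(chunks)
```

The total size passed to `plan_byte_ranges` could come from `ftp.size(fname)`, which in turn depends on the server supporting the SIZE command. For the stitching step, each invocation could upload its bytes as one part of an S3 multipart upload, which is completed once all parts are in; note that every part except the last must be at least 5 MiB. If each range is gzip-compressed separately, the concatenated members still form a valid gzip file, but the compressed parts may then fall below that minimum, so the range size would need to be chosen with that in mind.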



