Monday, August 3, 2015

S3StreamWrapper for Ruby

I need to stream large text files from S3 to users' browsers after performing some action on each line, via Rails middleware. I already have a solution in which I spawn a Python subprocess that reads the data from S3, transforms it, and pipes it to its parent (Rails), which then streams it to the browser.

But the system is rather unpredictable: only about 10-20% of the time does it stream the full file (~1 GB); the rest of the time it streams somewhere between 1 MB and 10 MB and then completes.

# in controller
  cmd = 'python script.py arguments'
  # IO.popen("-") forks the current process: the parent gets a pipe
  # connected to the child's stdin/stdout, the child gets nil.
  pipe = IO.popen("-", "w+")
  if pipe
    # parent: stream whatever the child writes to its stdout
    respond_to do |format|
      format.all { streaming_render(pipe, filename) }
    end
  else
    # child: replace the forked Ruby process with the Python script
    exec(cmd)
  end
  ....
  ....

 def streaming_render(pipe, filename)
   set_streaming_headers(filename)
   response.status = 200
   # Set the body to an enumerator; Rails iterates it and streams each
   # line read from the pipe to the client.
   self.response_body = pipe.enum_for
 end

 def set_streaming_headers(filename)
   headers["Content-Type"] = (params[:format] == 'csv') ? "text/csv" : "text/plain"
   headers["Content-disposition"] = "attachment; filename=\"#{filename}\""
   headers['X-Accel-Buffering'] = 'no' #for nginx
   headers["Cache-Control"] ||= "no-cache"
   headers.delete("Content-Length")
 end

The Python script itself is well tested; it consistently loads the data (~1 GB) in 4-5 seconds.

In one of the AWS blogs I saw that the AWS PHP SDK has built-in support for streaming. Does the AWS Ruby SDK have this support as well? If not, how can I change my current streaming logic to make it foolproof?
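
For reference, the AWS Ruby SDK (aws-sdk v2) can read an object in chunks without a subprocess: Aws::S3::Client#get_object yields pieces of the body to a block as they arrive off the socket. Below is a minimal sketch of how that could replace the Python pipe; the class name S3LineStream, the bucket/key values, and the transform method are placeholders I made up for illustration, not part of the existing code.

 # A minimal sketch, assuming the aws-sdk v2 gem is available.
 require 'aws-sdk'

 class S3LineStream
   def initialize(bucket, key)
     @bucket = bucket
     @key = key
     # Region/credentials are expected to come from the environment or Rails config.
     @client = Aws::S3::Client.new
   end

   # Rails calls #each on the response body and writes every yielded chunk.
   def each
     buffer = ''
     @client.get_object(bucket: @bucket, key: @key) do |chunk|
       buffer << chunk
       # Emit only complete lines; keep any trailing partial line buffered.
       while (newline = buffer.index("\n"))
         yield transform(buffer.slice!(0..newline))
       end
     end
     yield transform(buffer) unless buffer.empty?
   end

   private

   # Placeholder for the per-line transformation described above.
   def transform(line)
     line
   end
 end

With something along those lines, streaming_render could set self.response_body = S3LineStream.new('my-bucket', 'big-file.csv') instead of handing Rails the pipe, taking the subprocess out of the picture entirely.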



