I have a service that responds to user input and must return results in real time (under 500 ms). The service is CPU-bound, and running it single-threaded takes longer than that budget, but the work is highly parallelizable.
I am trying to evaluate a programming framework that will let me distribute these chunks of work across different machines on AWS EC2. Right now, our service is built on Celery. In a quick Celery sample, spinning up a task for the parallel part and then acting on the response takes 600 ms. To be clear, that test runs no real code inside the task, just a printf and a return.
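For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of the timing harness I have in mind. It is not my actual Celery test: it uses a local `ThreadPoolExecutor` from the standard library as a stand-in, so the numbers it prints only show the harness working and say nothing about network or broker latency. The names `noop_chunk` and `measure_overhead_ms` are made up for illustration. The `submit` argument is assumed to be any callable that takes `(func, arg)` and returns an object with a blocking `.result()` method; a Celery `AsyncResult` exposes `.get()` instead, so it would need a small adapter.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def noop_chunk(chunk_id):
    """Stand-in for one parallel unit of work; does nothing but return its id."""
    return chunk_id


def measure_overhead_ms(submit, n_chunks, repeats=5):
    """Time a full scatter/gather of n_chunks no-op tasks.

    Returns the fastest of `repeats` runs, in milliseconds, which
    approximates pure dispatch overhead since the tasks do no work.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Scatter: hand every chunk to the framework under test.
        futures = [submit(noop_chunk, i) for i in range(n_chunks)]
        # Gather: block until every chunk has come back.
        results = [f.result() for f in futures]
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        assert results == list(range(n_chunks))
        best = min(best, elapsed_ms)
    return best


if __name__ == "__main__":
    # Local thread pool is only a placeholder for the real framework.
    with ThreadPoolExecutor(max_workers=8) as pool:
        for n in (1, 8, 64):
            ms = measure_overhead_ms(pool.submit, n)
            print(f"{n:3d} chunks: {ms:.2f} ms")
```

Running the same harness with an increasing number of chunks is also a rough way to check whether the overhead stays flat as the fan-out grows.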
Are there low-latency frameworks out there that will serve my needs? The overhead of the framework needs to be under 100 ms and scale at roughly O(1). The code inside the parallel tasks needs to run Python, but the distributed framework itself doesn't have to be written in Python.
Thanks!