For a long time I didn’t understand the need for service discovery products. Services are often discovered for you, either behind load balancers or through traditional DNS lookups. So why would anyone need a dedicated service discovery product?
We are currently working on a machine learning and scientific computing product that spawns workers to fulfill requests. These workers received their work through an AWS SQS queue. The initial connection to SQS was taking a very long time, which added significant latency. That set us off on a path to optimize the heck out of SQS. Optimize, optimize, optimize! It got much better, but there was still a lot of latency, especially on the initial connection.
Knowing there is a limit to how much SQS latency we could optimize away, we started looking at alternatives. Unfortunately, there is no really good alternative that is “cloud native”. Products like RabbitMQ can be set up to handle this, but then you are paying for multiple EC2 instances that are always running. When you architect for the cloud, you need to architect for cost. We want to have our cake and eat it too: we only want to pay for what we use.
Problem mostly solved! We have a great solution, currently in testing, that eliminates the need for SQS altogether. It is based on service discovery, which allows incoming requests to discover workers and dispatch work appropriately. All of this without the need for additional infrastructure.
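To make the idea a bit more concrete, here is a minimal sketch of what "requests discover workers and dispatch work" could look like. This is not our actual implementation, and the post does not say which mechanism we use. Everything here is hypothetical and for illustration only: the in-memory `Registry` class, the heartbeat TTL, the worker addresses, and the round-robin rule in `dispatch`.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Hypothetical in-memory registry of workers (illustration only)."""

    ttl: float = 10.0  # assumed: seconds a worker stays listed without a heartbeat
    _workers: dict = field(default_factory=dict)  # address -> last heartbeat time

    def heartbeat(self, address, now=None):
        # A worker announces (or re-announces) that it is alive.
        self._workers[address] = time.monotonic() if now is None else now

    def discover(self, now=None):
        # Return the workers whose last heartbeat is within the TTL.
        now = time.monotonic() if now is None else now
        return sorted(a for a, seen in self._workers.items() if now - seen <= self.ttl)


def dispatch(registry, request_id, now=None):
    """Pick a live worker for a request (simple round-robin by request id)."""
    live = registry.discover(now)
    if not live:
        raise LookupError("no live workers registered")
    return live[request_id % len(live)]


if __name__ == "__main__":
    registry = Registry(ttl=10.0)
    registry.heartbeat("10.0.0.1:9000")
    registry.heartbeat("10.0.0.2:9000")
    print(dispatch(registry, request_id=3))
```

The point of the sketch is the shape of the flow: workers announce themselves, stale workers drop out on their own, and a request goes straight to a live worker instead of waiting in a queue.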