Why Idempotency Matters

Originally posted 2014-08-23

Tagged: computer_science, software_engineering

Obligatory disclaimer: all opinions are mine and not those of my employer


I finally understood why idempotency is such a big deal in computer science. The key realization came when I was reading about task queues and the need for request IDs with each task submitted.

First, a little background on task queues.

Task queues are a very useful abstraction when you want to separate your work into two halves: the half that generates the work (the producer), and the half that carries out the work (the consumer). This allows the producer to continue responding without blocking while the consumer works through the backlog. It seems like a task queue should be an extremely simple thing to implement, and it usually is. However, one must be careful to attach a request ID to each task. The obvious use for the ID is as a name by which to ask the task manager whether the task is done or not.
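To make the producer/consumer split concrete, here is a minimal sketch using Python's standard `queue` and `threading` modules. The names `submit`, `consumer`, and `status`, and the "square a number" task, are made up for illustration; a real task manager would look quite different.

```python
import queue
import threading
import uuid

tasks = queue.Queue()
status = {}  # request ID -> ("pending", None) or ("done", result)


def consumer():
    """Carry out tasks one at a time, recording each result under its request ID."""
    while True:
        request_id, number = tasks.get()
        status[request_id] = ("done", number * number)
        tasks.task_done()


def submit(number):
    """Producer side: enqueue the work and return immediately with a request ID."""
    request_id = str(uuid.uuid4())
    status[request_id] = ("pending", None)
    tasks.put((request_id, number))
    return request_id


threading.Thread(target=consumer, daemon=True).start()
request_id = submit(12)   # returns without waiting for the work to be done
tasks.join()              # only for the demo: wait until the queue is drained
print(status[request_id])
```

The producer never waits on the work itself. It only holds on to the request ID, which it can later use to look up the task's status.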

But this is not the only use for a request ID.

Having request IDs is necessary to prevent duplicate request submissions. Have you ever seen a website warn you not to hit the “Purchase” or “Submit Order” button more than once? That’s usually a sign of a task queue that doesn’t use request IDs. If there were a request ID submitted with the button press, you could hit the button all day long, and the task manager responsible for maintaining the backlog/history of purchase orders would be able to recognize the duplicate orders and ensure that only one task is enqueued.
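As a rough sketch of that idea, the toy queue below remembers every request ID it has accepted and silently drops repeats. The `TaskQueue` class and its method names are hypothetical, and it assumes the ID is generated once (say, when the order form is rendered) and sent along with every press of the button.

```python
import uuid
from collections import deque


class TaskQueue:
    """Toy in-memory queue that ignores tasks whose request ID it has already seen."""

    def __init__(self):
        self._pending = deque()  # tasks waiting for a consumer
        self._seen_ids = set()   # every request ID ever accepted

    def submit(self, request_id, payload):
        """Enqueue payload once per request_id; return True only if newly enqueued."""
        if request_id in self._seen_ids:
            return False
        self._seen_ids.add(request_id)
        self._pending.append((request_id, payload))
        return True

    def pending_count(self):
        return len(self._pending)


order_queue = TaskQueue()
order_id = str(uuid.uuid4())  # generated once, before the customer starts clicking

# The customer mashes the "Purchase" button three times.
results = [order_queue.submit(order_id, {"item": "book", "qty": 1}) for _ in range(3)]
print(results, order_queue.pending_count())
```

Only the first press enqueues anything; the other two are recognized as duplicates. A production system would also need to persist the set of seen IDs and expire old ones, which this sketch ignores.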

In an ideal world with perfect customers, you might think that the extra work of implementing request IDs is unnecessary. But eager customers are not the only source of duplicate submissions. Network connections can be unreliable, and programmers will often wrap their actions in their own retry logic, which at its core looks essentially like this:

```
retries, start_time = 0, current_time()
while not succeeded(try_action()) and retries < MAX_RETRIES and (current_time() - start_time) < CONNECTION_TIMEOUT:
    retries += 1
    sleep(BASE_RETRY_WAIT_TIME * 2**retries)
```

or in plain words, “try again with successively longer waiting times until either the action succeeds, we’ve tried too many times, or we’ve been trying for too long”.

It’s quite possible that only one half of your network connection is broken, leading your computer to say, “Well… I sent a packet, but I never heard back confirming receipt of the packet. Let’s try sending it again”. This can lead to multiple submissions through no fault of the customer. Request IDs solve this problem: every retry carries the same ID as the original request, so the receiver can tell that it has already handled it.
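The lost-acknowledgement scenario can be simulated in a few lines. Everything here is invented for illustration: `OrderServer`, `send_with_lost_acks`, and the `acks_lost` parameter simply pretend that the request always arrives but the first few confirmations vanish on the way back.

```python
class OrderServer:
    """Toy server that records orders and, when given a request ID, deduplicates on it."""

    def __init__(self):
        self.orders = []        # every order actually placed
        self._seen_ids = set()  # request IDs already processed

    def place_order(self, item, request_id=None):
        if request_id is not None:
            if request_id in self._seen_ids:
                return  # duplicate delivery: nothing new to do
            self._seen_ids.add(request_id)
        self.orders.append(item)


def send_with_lost_acks(server, item, request_id, acks_lost, max_retries=5):
    """Resend until an acknowledgement gets through; return how many sends it took."""
    for attempt in range(max_retries + 1):
        server.place_order(item, request_id)  # the request itself always arrives
        if attempt >= acks_lost:              # this acknowledgement is not dropped
            return attempt + 1
    raise TimeoutError("no acknowledgement received")


naive = OrderServer()
sends_naive = send_with_lost_acks(naive, "book", request_id=None, acks_lost=2)

careful = OrderServer()
sends_careful = send_with_lost_acks(careful, "book", request_id="order-42", acks_lost=2)

print(sends_naive, len(naive.orders))      # sends vs. orders without request IDs
print(sends_careful, len(careful.orders))  # sends vs. orders with a request ID
```

Both clients end up sending the request three times. The server without request IDs records three orders, while the one that checks the ID records exactly one.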

The key takeaway is this: using request IDs makes the act of submitting a task request idempotent, meaning that submitting it many times has the same effect as submitting it once. It therefore does not matter if the underlying connection accidentally sends the same request multiple times!