Introduction
This week I went through an interview process where I had to answer this question: determine a threadpool size to handle 5000 requests where each request takes 10 milliseconds. This is a common problem in a production environment; every day we need to think about how to scale a microservice or how many microservice instances to run. For this post I will assume that simply increasing the number of threads is the best solution, so I will not talk about concurrency and parallelism. Today I'll focus on how to solve this question, in case someday I need to come back here.
Solution
It's hard to determine the right size for a threadpool. What I can do is estimate a size that answers the question by assuming a scenario that does not exist: a server with UNLIMITED resources, where the only goal is to reach 5000 requests per second (RPS) with a response time of 10 ms. Under those assumptions, I can use the formula below.
Creating a formula based on requests per second
If each request took one second, and each thread handled one request at a time, then handling 5000 requests in one second would require 5000 threads. (Remember, this is just a fictional scenario to make it easy to understand.)
$$\text{Performance (RPS)} = \frac{\text{Threadpool size}}{\text{response time in seconds}}$$
$$ \text{RPS} = \frac{\text{5000}}{\text{1}} = \text{5000 requests(threads)/s}$$
In our case, however, each response takes only 10 milliseconds, and we still need to achieve 5000 requests per second. First, convert the response time to seconds:
$$\text{1 second = 1000 milliseconds} \Rightarrow \text{10 milliseconds = 0.01 s = } \frac{1}{100} \text{ s}$$
So we just rearrange the formula to solve for the threadpool size:
$$\text{Performance (RPS)} = \frac{\text{Threadpool size}}{\text{response time in seconds}}$$
$$ \text{Threadpool size} = \text{Performance (RPS)} \times \text{(response time in seconds)}$$
$$ \text{Threadpool size} = 5000 \times \frac{1}{100} = 50 \text{ threads}$$
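The same arithmetic can be sketched in Java. This is only a minimal illustration under the unlimited-resources assumption described above; the class name `ThreadPoolSizing` and the method `requiredThreads` are made up for this post and do not come from any library. The only real API used is `Executors.newFixedThreadPool`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper for this post, not part of any library.
final class ThreadPoolSizing {

    // Threadpool size = RPS * response time in seconds, rounded up.
    // Working in whole milliseconds keeps the arithmetic exact.
    static int requiredThreads(int targetRps, int responseTimeMillis) {
        long busyMillisPerSecond = (long) targetRps * responseTimeMillis;
        return (int) ((busyMillisPerSecond + 999) / 1000);
    }

    public static void main(String[] args) {
        // 5000 requests per second, 10 ms per request.
        int size = requiredThreads(5000, 10);
        System.out.println("Threadpool size: " + size); // prints "Threadpool size: 50"

        // A fixed pool with the estimated size; shut down right away in this demo.
        ExecutorService pool = Executors.newFixedThreadPool(size);
        pool.shutdown();
    }
}
```

Rounding up is a design choice: a fractional result such as 50.2 threads would otherwise be truncated and fall just short of the target throughput.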
Conclusion
Remember: when answering this question, we should make clear that we can't easily determine the right threadpool size from response time and target throughput alone. When we want better throughput, we should first identify all bottlenecks, which could be a database, an API, or even an algorithm. Pay attention to the server's limits, such as CPU, memory (RAM), and I/O, since these limits determine how many threads the system can handle. We should also consider whether the server is shared with other apps or systems. Increasing the number of threads without taking all these factors into account may not meet expectations, or may even create a point of failure in the system. And this post does not cover concurrency and parallelism, an awesome topic closely related to this one.
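As one example of taking the CPU limit into account, the chapter listed in the resources gives, as far as I recall, a sizing heuristic of the form threads = CPUs × target CPU utilization × (1 + wait time / compute time). The sketch below is my own illustration of that heuristic, not code from the book; the class and method names are hypothetical, and the utilization and wait/compute values are assumptions that would have to be measured on a real system.

```java
// Hypothetical helper for this post, not part of any library.
final class CpuAwareSizing {

    // threads = CPUs * target utilization * (1 + wait/compute), at least 1.
    static int estimateThreads(int cpus, double targetUtilization, double waitToComputeRatio) {
        double estimate = cpus * targetUtilization * (1 + waitToComputeRatio);
        return Math.max(1, (int) Math.round(estimate));
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();

        // Assumed values: aim for 80% CPU usage, and each request
        // waits on I/O about 9 times longer than it computes.
        int size = estimateThreads(cpus, 0.8, 9.0);
        System.out.println("Estimated threadpool size on this machine: " + size);
    }
}
```

Unlike the unlimited-resources formula, this estimate changes with the machine, which is the point: the same 5000 RPS target may need a different pool size, or more instances, depending on the hardware.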
This is just a short answer. In another post, I could go into more detail and take all the factors mentioned above into consideration, along with other examples.
I hope you enjoyed it. All comments are welcome, if possible with a reference to a book or post to enrich this post, which is open to edits and improvements!
I will leave some references below.
Resources
Java Concurrency in Practice - Chap. 8