Mathematical optimization is a compute-intensive task. For this reason, the choice of (virtual) hardware matters: it can have a real impact on the performance of your model and solver. Unfortunately, the terminology around concepts like cores and threads can be confusing. In this article we try to clarify some of these terms from a Gurobi perspective.
First, let’s distinguish between “computing power” (capacity, supply) and “computations” (work, demand). By viewing the concepts from this perspective, it will be easier to understand how changes to each of the sides will impact performance.
Computing Power
Regardless of whether you work on a personal laptop, a virtual machine in the cloud, or a serverless environment, calculations are ultimately performed on physical hardware. Specifically, physical cores, or CPUs, are the components that execute your instructions sequentially. If you dig a bit deeper, one physical core consists of various execution resources. Besides the execution engine itself, there are resources like caches and system bus interfaces.
Now there may be situations where the execution engine "stalls" because it is waiting for something else to happen (for example, retrieving data from memory). This is where hyperthreading comes into the picture. Modern hardware and operating systems can present a single physical core as two or more virtual cores that can each execute a series of instructions. While this does not actually increase the physical capacity of the machine, the stalls create an opportunity to make better use of the execution engine. To add to the confusion, virtual cores are often called threads, just like the concept we will introduce next.
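On most systems you can query the number of virtual cores directly. A minimal Python sketch using only the standard library (note that `os.cpu_count()` reports logical cores, i.e. hardware threads; the standard library does not expose the physical core count):

```python
import os

# os.cpu_count() returns the number of logical (virtual) cores the OS sees.
# On a hyperthreaded machine this is typically twice the number of
# physical cores.
logical_cores = os.cpu_count()
print(f"Logical cores visible to the OS: {logical_cores}")
```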
Computations
When you run software, you typically start what is called a process. This applies to Gurobi as well: when you run your Python script and call Gurobi, all of that logic runs within a single process. Instructions within the software are typically executed sequentially. However, with the introduction of machines with more than one core came the desire to run multiple instructions at the same time, to make the best use of the available computing power.
This is where threads come in. Every process has a main thread, which represents the instructions to be executed when the process starts. One of those instructions may be a request to create a new thread, with its own set of instructions. This is what we call multi-threading.
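The idea can be illustrated with a short Python sketch (the names here are illustrative, not Gurobi internals): the main thread spawns a second thread with its own instructions, keeps working, and then waits for the new thread to finish.

```python
import threading

results = []

def worker():
    # Instructions executed by the newly created thread.
    results.append(sum(range(1_000_000)))

# The main thread requests creation of a second thread...
t = threading.Thread(target=worker)
t.start()

# ...and can keep executing its own instructions in the meantime.
main_result = sum(range(1_000))

t.join()  # wait for the worker thread to finish
print(main_result, results[0])
```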
Gurobi utilizes threads for several reasons. For example, some of the calculations in our Barrier algorithm for solving LP models and relaxations can be performed nicely in parallel because they are independent of each other. Similarly, multiple nodes during MIP branch-and-bound may be considered at the same time. And we may run more than one method for solving the root relaxation concurrently and use the first result found.
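The "run several methods and take whichever finishes first" idea is not specific to Gurobi. A hedged sketch of the pattern with Python's `concurrent.futures` (the two "methods" below are stand-ins with artificial sleeps, not Gurobi's actual algorithms):

```python
import time
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def method_a():
    time.sleep(0.5)  # pretend this algorithm is slower on this instance
    return "result from method A"

def method_b():
    time.sleep(0.05)
    return "result from method B"

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = {pool.submit(method_a), pool.submit(method_b)}
    # Block only until the first method finishes.
    done, pending = wait(futures, return_when=FIRST_COMPLETED)
    winner = next(iter(done)).result()
    for f in pending:
        # A thread that is already running cannot be interrupted this way,
        # but its eventual result is simply ignored.
        f.cancel()

print(winner)
```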
Thread scheduling
So now we have defined computing resources (in the form of physical and virtual cores) and demand for those resources (in the form of software threads within a process). The final piece of the puzzle is how these threads are being allocated to cores. This is the responsibility of the operating system.
In practice this means that one Gurobi process may have one or more threads. The exact number can be chosen by the client application, using the Threads parameter. The operating system allocates these threads to cores; ideally, each thread is assigned its own core, but this is not always possible.
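As a concrete illustration, the Threads parameter can be set from Python before optimizing. A minimal sketch, assuming a working gurobipy installation and license; the model contents are placeholders:

```python
import gurobipy as gp

# Build a trivial placeholder model.
m = gp.Model("example")
x = m.addVar(name="x")
m.setObjective(x, gp.GRB.MAXIMIZE)
m.addConstr(x <= 10)

# Limit Gurobi to at most 4 software threads for this solve.
m.Params.Threads = 4
m.optimize()
```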
Final words
By default, the number of threads used by Gurobi is calculated automatically based on the number of cores of the machine. Often this is exactly what you want, but there are some situations where you may want to deviate from this behaviour:
- Depending on your license, you may run multiple models at the same time, with each model requiring one or more threads. You should make sure that the total number of threads does not exceed the number of virtual cores. Whether exceeding the number of physical cores degrades performance can only be determined through testing.
- Even when solving a single model, using all available cores for Gurobi does not always yield the best performance. Threads need to synchronize their work (e.g., exchange information, distribute remaining work, ensure deterministic behaviour). Adding more threads means the existing and new threads have more dependencies on each other. At some point the added overhead outweighs the benefit of doing more work in parallel.
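The synchronization mentioned above typically happens through primitives such as locks. A toy Python sketch of why coordination has a cost: every thread must take turns acquiring the lock, so the locked section effectively runs sequentially no matter how many threads exist.

```python
import threading

counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        with lock:         # all threads serialize on this lock...
            counter += 1   # ...so this section cannot run in parallel

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # correct total, but the locked work was serialized
```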