Background
Gurobi Compute Server is a software component that helps you solve multiple optimization models concurrently on one or more nodes (machines or containers). These models are submitted by your application(s) running on dedicated resources managed by the user.
Mathematical optimization is computationally intensive, so it’s important to ensure each model has sufficient resources available to allow for good performance. Here, “resources” mainly refers to the number of CPU cores. You can learn more about the relationship between performance and CPU cores in How many cores does my model need?
Without proper capacity management, all models you submit would run concurrently on your Compute Server. In addition, each model would by default attempt to use as many threads as you have cores on your machine. The total number of running threads would then significantly exceed the number of cores, which negatively impacts performance.
For that reason, we let you control how many models can run concurrently on your Compute Server. There are two approaches to this: (a) based on the number of jobs, and (b) based on the number of threads.
Job-based capacity
When all models you solve are relatively similar, it’s usually safe to pick a number of threads per model once and apply it to all your models. Dividing the number of cores on your Compute Server by that number of threads per model gives you the number of models that can run concurrently. The way to configure your environments is as follows:
- Use the JobLimit setting on the Compute Server side to control the number of models that will run concurrently.
- Use the Threads parameter on your Gurobi model/environment to control the number of threads to be used for each model.
For example, you could have an 8-core Compute Server where you set JobLimit=4 and then submit models with Threads=2.
- Note that if you forget to set the Threads parameter, each model would request 8 threads. You end up with 32 threads on your 8-core machine, which should be avoided.
- On the other hand, if you forget to set JobLimit and leave it at its default (2 jobs), you will never use more than 4 cores total, which is also not ideal: your machine has unused capacity, and you’re not using your license to its full potential.
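The arithmetic behind this example can be sketched in a few lines of plain Python. This is only an illustration of the numbers above, not Gurobi code; the function names `job_limit_for` and `peak_threads` are hypothetical helpers introduced here.

```python
def job_limit_for(cores: int, threads_per_model: int) -> int:
    """Number of models that fit when every model uses the same thread count."""
    if cores < 1 or threads_per_model < 1:
        raise ValueError("cores and threads_per_model must be positive")
    return max(1, cores // threads_per_model)


def peak_threads(job_limit: int, threads_per_model: int) -> int:
    """Worst-case number of solver threads when every job slot is busy."""
    return job_limit * threads_per_model


cores = 8
print(job_limit_for(cores, 2))  # 4  -> JobLimit to pair with Threads=2
print(peak_threads(4, 2))       # 8  -> balanced: matches the 8 cores
print(peak_threads(4, cores))   # 32 -> Threads left unset, each job asks for 8
print(peak_threads(2, 2))       # 4  -> JobLimit left at its default of 2
```

The last two lines reproduce the two mistakes described above: 32 threads on 8 cores, and only 4 of 8 cores in use.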
Thread-based capacity
When there are significant differences between the models you solve (e.g. because of the dimensions of your input data, or because you share a Compute Server between multiple use cases), it may not be desirable to pick a single value for the Threads parameter. And if you use different values, you will have to calculate your JobLimit based on the maximum value of Threads across your models to avoid having more threads than cores. As a result, smaller models leave cores idle, so capacity management based on jobs only is not ideal for this scenario.
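To see why sizing JobLimit for the largest model is wasteful, here is a small plain-Python sketch. The mix of Threads values is a made-up example, and `safe_job_limit` is a hypothetical helper, not a Gurobi function.

```python
def safe_job_limit(cores: int, thread_settings: list[int]) -> int:
    """Largest JobLimit that can never oversubscribe the machine.

    It has to be sized for the model with the highest Threads value.
    """
    return max(1, cores // max(thread_settings))


cores = 8
thread_settings = [2, 2, 4, 8]  # hypothetical Threads values across models

limit = safe_job_limit(cores, thread_settings)
print(limit)  # 1 -> only one job at a time is guaranteed to be safe

# With that limit, a 2-thread model running on its own leaves cores unused.
idle_cores = cores - min(thread_settings)
print(idle_cores)  # 6
```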
Fortunately, Gurobi 12 introduced new settings.
- Use the Node_ThreadLimit setting on the Compute Server to control the total number of threads you want to allow on your Compute Server. Usually, this value would equal the number of cores you have available.
- Use the ThreadLimit parameter on your Gurobi environment object when initializing Gurobi, to define the maximum number of threads you will use for your models within that environment.
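On the client side, one way to supply ThreadLimit at initialization is to collect the parameters in a dictionary before the environment is created. The sketch below only builds that dictionary, so it runs without Gurobi installed. The helper name `compute_server_params` and the server address are placeholders; the commented-out part assumes gurobipy's `Env(params=...)` constructor and a reachable Compute Server.

```python
def compute_server_params(server: str, thread_limit: int) -> dict:
    """Collect the client-side parameters for one environment (hypothetical helper)."""
    if thread_limit < 0:
        raise ValueError("thread_limit must be 0 (use Threads value) or positive")
    return {
        "ComputeServer": server,      # placeholder address, replace with your own
        "ThreadLimit": thread_limit,  # capacity this environment asks the node for
    }


params = compute_server_params("server.example.com:61000", 4)
print(params)

# With gurobipy installed and a Compute Server available (not run here):
#
#   import gurobipy as gp
#   with gp.Env(params=params) as env, gp.Model(env=env) as model:
#       model.Params.Threads = 2  # per-model value, still capped by ThreadLimit
#       model.optimize()
```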
For example, on your 8-core machine, you could set Node_ThreadLimit=8. You could then submit several jobs:
- Job A with ThreadLimit=4 will start immediately.
- Job B with ThreadLimit=8 would have to wait until job A completes.
- Job C with ThreadLimit=2 will start while job A is still active, since 4 threads of capacity are still available. Note that Job B is bypassed in the queue.
- Job D with ThreadLimit=2 will start while jobs A and C are still active (assuming you increased JobLimit above 2).
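The admission behavior in this example can be mimicked with a toy simulation. This is a simplified model written for illustration, not the actual Compute Server scheduler: it assumes jobs are considered in queue order, that a job starts only when both a job slot and enough thread capacity are free, and that a job that does not fit is skipped. The `Job` class and `admit` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    thread_limit: int


def admit(queue, running, node_thread_limit, job_limit):
    """Start every queued job that fits, in queue order, skipping those that do not."""
    started = []
    for job in list(queue):  # iterate over a copy, since we remove from the queue
        used = sum(j.thread_limit for j in running)
        has_slot = len(running) < job_limit
        has_threads = used + job.thread_limit <= node_thread_limit
        if has_slot and has_threads:
            running.append(job)
            queue.remove(job)
            started.append(job.name)
    return started


queue = [Job("A", 4), Job("B", 8), Job("C", 2), Job("D", 2)]
running = []

# JobLimit raised to 4 so that only thread capacity is the limiting factor.
started = admit(queue, running, node_thread_limit=8, job_limit=4)
print(started)                  # ['A', 'C', 'D']
print([j.name for j in queue])  # ['B'] -> still waiting for 8 free threads
```

Rerunning the same queue with `job_limit=2` starts only A and C, which is why the example for Job D assumes JobLimit was increased above its default.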
Settings, defaults and their interaction
The default values for the settings mentioned in this article are as follows.
Setting | Scope | Default | Meaning |
Threads | Client | 0 | Automatic; usually the number of cores |
ThreadLimit | Client | 0 | Use Threads value |
JobLimit | Server | 2 | Max 2 concurrent jobs |
Node_ThreadLimit | Server | 0 | Unlimited |
Notes
- Jobs will only be accepted when both job capacity and thread capacity are available. So, if you only want to use thread-based capacity, make sure to set JobLimit to a sufficiently large number.
- The number of threads used for optimizing a model is the minimum of Threads and ThreadLimit. So, if you submit a model with ThreadLimit=2 and Threads=4, you would still only get 2 threads. However, if you do not specify either of these two parameters, the model will request and use a number of threads equal to the number of cores. In other words, if Node_ThreadLimit equals the number of cores, only one model can run at any point in time.
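The interaction between Threads and ThreadLimit can be written down as a small function. This is a sketch of the rule as described in the notes above, with one assumption made explicit: a value of 0 is treated as "not set", following the defaults table (Threads=0 means automatic, taken here as the number of cores; ThreadLimit=0 means the Threads value is used). The function name `effective_threads` is hypothetical.

```python
def effective_threads(threads: int, thread_limit: int, cores: int) -> int:
    """Threads a model ends up with, given both client-side settings."""
    # Threads=0 is automatic; assumed here to equal the number of cores.
    wanted = threads if threads > 0 else cores
    # ThreadLimit=0 means no separate cap, so the Threads value applies.
    if thread_limit > 0:
        return min(wanted, thread_limit)
    return wanted


cores = 8
print(effective_threads(threads=4, thread_limit=2, cores=cores))  # 2
print(effective_threads(threads=2, thread_limit=0, cores=cores))  # 2
print(effective_threads(threads=0, thread_limit=0, cores=cores))  # 8
```

The last call is the case where neither parameter is set: the model asks for all 8 cores, which fills a Node_ThreadLimit of 8 on its own.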