Introduction

Gurobi can be used in many different architectures, ranging from development on a single laptop or Databricks notebook, to a cluster of Compute Server instances that together solve models submitted from multiple applications. As you start using Gurobi and think about an architecture that fits your needs, you will be facing several questions:

What license types best fit our desired architecture?
How many of those licenses do we need for our usage?
What hardware should we select for each component?

In this article, we look at these questions. By “sizing” we refer to the second and third questions, but both are closely tied to the choice of license type.

Determine expected usage

Before we can answer the questions above, we first need to look at how you plan to use your application. When we discuss usage, for simplicity we look at “models” and assume each (automated or manual) request triggers the creation and optimization of a single model. Try answering the questions below for each load scenario you might face (for example, average and peak load). Note that answers may differ between environments: load on a test machine may be very different from production load.

How many models would be generated in parallel?
How much time would it take to generate these models (without solving them)?
How much time would Gurobi need to solve one such model?
Is queuing of your models acceptable? What is the maximum waiting time for an optimization request?

Determine core/runtime relationship

Next, we should understand how Gurobi behaves on your hardware. CPU cores are the main resources that determine Gurobi performance, as they deliver the computational power for optimizing your models. Two important choices will be the number of models that are being solved at the same time on a single machine, as well as the number of cores allocated to each of those models. Learn more about performing this analysis; this will help you in the next step, where we look at license requirements.
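One way to collect these measurements is to solve a representative model several times with different thread counts and record the wall-clock time of each run. The sketch below is a minimal illustration under stated assumptions, not an official benchmarking procedure: `measure_runtimes` and `solve_with_gurobi` are hypothetical helper names, and it assumes `gurobipy` is installed and that the `Threads` parameter is used to cap the number of threads per solve.

```python
import time
from typing import Callable, Dict, Iterable


def measure_runtimes(solve: Callable[[int], None],
                     thread_counts: Iterable[int],
                     repeats: int = 1) -> Dict[int, float]:
    """Return the best wall-clock time (in seconds) seen for each thread count."""
    results = {}
    for threads in thread_counts:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            solve(threads)  # solve one model using `threads` threads
            timings.append(time.perf_counter() - start)
        results[threads] = min(timings)
    return results


def solve_with_gurobi(model_path: str, threads: int) -> None:
    """Hypothetical helper: solve one model file with a fixed thread count."""
    import gurobipy as gp  # imported lazily so the timing helper works without it

    with gp.Env() as env, gp.read(model_path, env=env) as model:
        model.Params.Threads = threads  # limit the threads used by this solve
        model.optimize()
```

A call such as `measure_runtimes(lambda t: solve_with_gurobi("model.mps", t), [1, 2, 4, 8])` would then give one runtime per thread count, where `model.mps` is a placeholder for one of your own representative models. Repeating each run and testing several models gives a more reliable picture than a single measurement.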

Calculate requirements for each suitable license type

Once you’ve established the relationship between number of cores and runtime, and the expected usage patterns, we can calculate the total number of cores required to cover the average and peak situation.
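As a rough worked example, the core requirement for a scenario can be approximated as the number of models solved at the same time multiplied by the threads assigned to each model. The sketch below makes that arithmetic explicit. The function names, the scenario figures, and the simplifying assumption that every concurrent solve gets its own dedicated cores are illustrative only; your own measurements should replace them.

```python
import math


def required_cores(concurrent_models: int, threads_per_model: int) -> int:
    """Cores needed if every concurrent solve gets its own dedicated cores."""
    return concurrent_models * threads_per_model


def machines_needed(concurrent_models: int, threads_per_model: int,
                    cores_per_machine: int) -> int:
    """Machines needed when a single solve cannot span several machines."""
    solves_per_machine = cores_per_machine // threads_per_model
    if solves_per_machine == 0:
        raise ValueError("one solve needs more cores than a machine offers")
    return math.ceil(concurrent_models / solves_per_machine)


# Illustrative figures only: replace them with your own measurements.
scenarios = {"average": 3, "peak": 10}  # models solved at the same time
THREADS_PER_MODEL = 4
CORES_PER_MACHINE = 16

for name, concurrent in scenarios.items():
    cores = required_cores(concurrent, THREADS_PER_MODEL)
    machines = machines_needed(concurrent, THREADS_PER_MODEL, CORES_PER_MACHINE)
    print(f"{name}: {cores} cores, {machines} machine(s)")
```

If queuing is acceptable, the peak figure can be lowered, because some requests may wait instead of being solved immediately.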

Now we can throw license types into the mix. (Standard) pricing of each license typically depends on the number of cores used and/or available on your machines, as well as the concurrency allowed within and across machines. Some license types provide queuing, while others don’t. And by applying client-server separation using Compute Server, you may dedicate a machine purely to solving models, allowing you to solve more models concurrently than if you also used the same machine for pre- and postprocessing.

We recommend working with your Technical Account Manager (TAM) to identify a small set of license types that may fit your architecture. Your TAM can also help you (re)design parts of your architecture if that would benefit licensing requirements. Then, for each license type, the total required license volume and hardware can be calculated using the information you have collected in the previous steps. Your TAM can also provide you with a technical comparison between these license types (e.g. advantages and disadvantages of each license type).

Make your sizing decision

The final choice of license sizing will always depend on the identified load scenarios and core/runtime relationship. Additional factors will typically include budget and technical requirements. We are pleased to support this exercise by guiding you through the steps required to come to an informed decision. Don’t hesitate to reach out to your (Technical) Account Manager for help validating concepts and assumptions based on experience.

Frequently asked questions

Do I need to test running multiple models concurrently? If you plan on doing that in practice as well, then ideally, yes. If a single model uses 4 cores, then you might expect to see the same runtime when solving two models concurrently on an 8-core machine. However, besides the CPU there are other hardware components (e.g. memory, bus) that are being shared. Also, chip temperature will impact clock speed. Finally, you will want to make sure that the cores are exclusively available to the solver and not accidentally used by other processes.
Are cores and threads the same? For hardware we differentiate between physical and virtual cores. These often have a 1:2 ratio, but not always. Cores are the “capacity” to perform calculations. By (software) threads we refer to tasks defined by a single process that can be performed concurrently. The operating system assigns those threads to cores. Note that (to make things more confusing) virtual cores are often also called “threads”.
Should we calculate the number of physical or logical (virtual) cores? For licensing purposes, Gurobi looks at physical cores. There is always some performance loss when applying hyperthreading (i.e. having multiple virtual cores per physical core). So, when you have a machine with 8 physical and 16 virtual cores and you wish to solve each model with 2 threads, you would be able to solve between 4 (8/2) and 8 (16/2) models concurrently without significant performance loss. Which number to choose should be confirmed through testing.
How do I solve models with different core requirements on a single Compute Server? With the addition of new settings in Gurobi 12, this has recently become a lot easier. Have a look at the article about "Capacity management with Gurobi Compute Server" for more information.
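The arithmetic from the physical versus virtual cores question above can be written as a small helper. This is a sketch only: the function name is hypothetical, and the result is a range to validate through testing, not a guarantee.

```python
from typing import Tuple


def concurrent_solve_range(physical_cores: int, virtual_cores: int,
                           threads_per_model: int) -> Tuple[int, int]:
    """Return (lower, upper) bounds on the number of models to solve at once.

    The lower bound counts only physical cores; the upper bound counts
    virtual cores, which assumes hyperthreading costs little performance.
    """
    lower = physical_cores // threads_per_model
    upper = virtual_cores // threads_per_model
    return lower, upper


# Machine with 8 physical and 16 virtual cores, 2 threads per model.
print(concurrent_solve_range(8, 16, 2))  # (4, 8)
```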