While it's uncommon, the model-building phase can occasionally become a bottleneck in the process. When this happens, it's a good practice to start by profiling your model-building code to identify areas for improvement. Once you've pinpointed the bottlenecks, you can focus on improving those specific parts.
Finding bottlenecks
To find these possible bottlenecks in constructing a model, the first step is to determine the routines in which the time is spent. This can be accomplished in a number of ways, including:
- Third-party modeling tools: Using a third-party modeling language can significantly slow down model construction. Consider switching to one of Gurobi's native APIs, e.g., the Python API.
- Profiling: Use a profiler on your application. Example profilers for Python include cProfile, line_profiler, and scalene. For other APIs, you can check the articles "Overview of the Profiling Tools" and "Top 10 Profiler Tools for Optimizing Software Performance" to learn about available options.
- Timing: Add timings to pinpoint which blocks of code are taking longer than expected. For example:
    from timeit import default_timer as timer

    start = timer()
    # code to time goes here
    end = timer()
    print(end - start)
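As a complement to manual timings, the profiling approach can be sketched with cProfile from the Python standard library. The function build_model below is a hypothetical stand-in for your own model-building code, not part of any Gurobi API.

```python
import cProfile
import io
import pstats


def build_model(n):
    # Hypothetical stand-in for model-building code; replace with your own function.
    coeffs = {(i, j): i * j for i in range(n) for j in range(n)}
    return sum(coeffs.values())


profiler = cProfile.Profile()
profiler.enable()
result = build_model(200)
profiler.disable()

# Sort by cumulative time and report the ten most expensive calls.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

The functions at the top of the report are usually the first candidates to inspect.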
Once you've established where the bottlenecks are occurring, the following suggestions may be useful for designing more efficient code.
Inefficiencies in data slicing/queries
For Python programs, the most common cause of bottlenecks in model construction is inefficient data queries. This is especially true when using Pandas dataframes with methods such as iterrows, loc, and groupby, as well as when making redundant queries for the same data instead of storing the needed values in a smaller (temporary) data structure. For efficient and convenient handling of Pandas dataframes, see gurobipy-pandas.
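A minimal sketch of the "store the needed values once" idea, assuming a small hypothetical dataframe of arcs with made-up column names: the values are extracted in a single pass into plain dictionaries, which can then be used repeatedly while building the model.

```python
import pandas as pd

# Hypothetical arc data: one row per (origin, destination) pair.
arcs = pd.DataFrame(
    {
        "origin": ["A", "A", "B"],
        "dest": ["B", "C", "C"],
        "cost": [4.0, 7.5, 2.0],
        "capacity": [10, 5, 8],
    }
)

# Slow pattern (not executed here): a separate dataframe query per value, e.g.
#   arcs.loc[(arcs.origin == o) & (arcs.dest == d), "cost"].iloc[0]
# inside a loop over all (o, d) pairs.

# Faster pattern: one pass over the rows, storing values in dictionaries.
cost = {}
capacity = {}
for row in arcs.itertuples(index=False):
    key = (row.origin, row.dest)
    cost[key] = row.cost
    capacity[key] = row.capacity

print(cost[("A", "C")], capacity[("B", "C")])
```

Dictionary lookups such as cost[o, d] are then cheap inside the constraint-building loops.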
Usage of tupledicts
For Python, try using Model.addVars() to create a sparse tupledict of variables, then use the select(), sum(), and prod() methods to iterate over only the matching variables. This is illustrated in the netflow.py example.
Using a for-loop together with the Model.addConstr() / Model.addVar() function is slightly faster than Model.addConstrs() and Model.addVars(), especially if large numbers of objects are added. To retain convenient access to all objects, a variable dictionary needs to be constructed. Note that avoiding the retrieval of variables and constraints by name, for example with Model.getVarByName() or Model.getConstrByName(), can also improve performance.
Efficiency of building expressions
For .NET and Python programs, it is more efficient to build a linear or quadratic expression by modifying an existing expression rather than repeatedly creating new expressions. For .NET, use the AddTerm()/AddTerms() methods instead of the overloaded operators. For Python, use the overloaded += or -= operators and/or add()/addTerms() rather than creating new expressions.
Efficiency of adding linear constraints
When adding individual linear constraints to a model in Python, the Model.addLConstr() method is a faster alternative to the more general Model.addConstr(), but it can only be used for linear constraints. It can be up to 50% faster, particularly for very sparse constraints.
Matrix API
When used properly, the Python Matrix API can be significantly faster than the traditional Model.addConstr() process. Watch the webinar Matrix-friendly Modeling with Gurobipy for a comprehensive introduction to this API.
In some cases it makes sense to combine the Python Matrix API and the classic term-based Python API, for example when a set of constraints is naturally modeled with the matrix approach, but another set of constraints is not. If you choose to do this, please be aware of these caveats:
- Accessing individual variables from an MVar object is very slow. Be sure to convert your MVar object to a list first using the MVar.tolist() method. MConstr, MQConstr and MGenConstr objects also have analogous tolist() methods.
- If you have a sparse list of tuples [(i1, j1), (i2, j2), ...], the term-based method Model.addVars() only creates the variables corresponding to those tuples. In contrast, the Python Matrix API does not allow sparse indexing, so an MVar requires a full 2D dense matrix of variables in order to access the individual variables corresponding to the tuples. Mixing the two approaches can therefore lead to creating many unnecessary variables.
Additional remarks
- Version: If your application was developed using an older version of Gurobi that required frequent calls to Model.update(), remove those method calls and use the latest version of Gurobi.
- Memory: If your machine doesn't have adequate memory, some disk-swapping may occur, leading to slower performance. To monitor memory usage, use the Activity Monitor on a Mac, Task Manager on a Windows machine, or the top command on Linux. You can also use memory profilers to analyze the memory usage of your script over time.
- Helper functions: Sometimes, performance bottlenecks may be caused by user-defined helper functions.
- API: In rare cases, switching to a lower-level language such as C, C++, or Java may also improve model construction time.