Bug with (Soft)MemLimit? Premature stop
I use Gurobi Optimizer version 13.0.0 build v13.0.0rc1 (linux64 - "Rocky Linux 8.10 (Green Obsidian)") to solve the model neos-3402454-bohle.mps (https://miplib.zib.de/instance_details_neos-3402454-bohle.html) from MIPLIB 2017. The machine is a dual-socket node with 2x Intel® Xeon® Platinum 8360Y processors (https://www.intel.com/content/www/us/en/products/sku/212459/intel-xeon-platinum-8360y-processor-54m-cache-2-40-ghz/specifications.html) with configuration 0, hyper-threading turned on (i.e., 72 physical and 144 logical cores in total), and 256 GB RAM.
Non-default parameters:
TimeLimit 3600
(Soft)MemLimit 244
Threads -1
Could there be a bug with the (Soft)MemLimit parameters? I get (after a runtime of approximately 42 minutes):
- "Memory limit reached" for SoftMemLimit=244
- “Solve interrupted (error code 10001)” and “Error 10001: Out of memory” for MemLimit=244
However, Slurm and xbat https://xbat.dev/docs/user/metrics/memory report that only up to approximately 210 GB DRAM is used.
I also tested Seed=1 and Seed=2; both runs stopped due to the memory limit even earlier, at approximately 100 GB DRAM (according to Slurm and xbat). Moreover, with a larger memory limit such as SoftMemLimit=544, or with no memory limit at all, the run reaches the time limit of 1h, i.e., the solver continues for approximately 20 more minutes and does not fail with a memory error.
Hello Willi,
Thanks for the detailed information. Based on what you’re seeing, the issue is most likely not a bug but rather a consequence of how Gurobi behaves when it is run with a very large number of threads. When Threads = -1 is used, Gurobi launches one thread per logical core, which on your machine means 144 threads. Each of these threads maintains its own full copy of the model, so memory usage grows extremely quickly, and the solver must constantly synchronize those threads, which adds substantial overhead. This combination of many model copies and heavy synchronization generally hurts both memory consumption and performance.

It’s also worth mentioning that performance usually stops improving (and often gets worse) once you exceed roughly 32 threads. Gurobi has a soft recommendation around this range because above 32 threads you commonly see little to no speedup, sometimes even slower performance, and significantly higher memory usage. There are a few special cases, such as the NoRel heuristic, where larger thread counts scale better, but for most models we don’t recommend using Threads = -1. A better approach is to test specific thread counts and choose the most effective configuration for your model. I would suggest trying 1, 2, 4, 16, and 32 threads and comparing results. Many users find that fewer threads actually give better performance and much lower memory consumption.

Since each thread creates a full model copy, reducing the number of threads can also help avoid memory issues. In more advanced setups, you can even interrupt a solve, reduce the number of threads, and resume. Starting with Gurobi 11, changing the thread count before resuming optimization is allowed and will immediately reduce the number of model copies. Here is an example snippet:

    import gurobipy as gp

    # m is an existing gurobipy Model
    m.Params.Threads = 8
    m.Params.SoftMemLimit = 4
    m.optimize()
    if m.status == gp.GRB.MEM_LIMIT:
        # Soft memory limit was hit: resume with one thread (fewer model copies)
        m.Params.Threads = 1
        m.optimize()

One more factor to keep in mind is your dual-socket machine. When Gurobi spreads threads across sockets, memory access latency increases and synchronization becomes more expensive. As threads on different sockets must communicate frequently, performance can degrade, especially when running with very high thread counts. This is a common effect on multi-socket systems and another reason why using all available cores is often counterproductive.
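To make the thread-count comparison suggested above (1, 2, 4, 16, and 32 threads) a bit more concrete, here is a minimal sketch of how such a sweep could be scripted. The helper names run_thread_sweep and solve_with_gurobi and the local file path are assumptions for illustration, not part of Gurobi; TimeLimit and SoftMemLimit are simply copied from your post. MaxMemUsed is the model attribute that, as far as I know, reports Gurobi's own peak memory accounting in GB.

```python
import time


def run_thread_sweep(solve, thread_counts=(1, 2, 4, 16, 32)):
    """Call solve(threads) once per thread count and collect one row per run."""
    rows = []
    for threads in thread_counts:
        start = time.perf_counter()
        metrics = solve(threads)  # expected to return a dict of metrics
        row = dict(metrics, threads=threads,
                   wall_seconds=time.perf_counter() - start)
        rows.append(row)
    return rows


def solve_with_gurobi(threads, path="neos-3402454-bohle.mps"):
    """Hypothetical callback: needs gurobipy and a local copy of the model."""
    import gurobipy as gp  # imported lazily so the sweep helper works without it

    # A fresh environment per run keeps the runs independent of each other.
    with gp.Env() as env, gp.read(path, env=env) as m:
        m.Params.Threads = threads
        m.Params.TimeLimit = 3600     # value taken from the original post
        m.Params.SoftMemLimit = 244   # value taken from the original post
        m.optimize()
        return {
            "status": m.Status,
            "runtime": m.Runtime,
            "max_mem_gb": m.MaxMemUsed,  # peak memory as counted by Gurobi
        }


# Usage (not executed here, since it needs a Gurobi license and the model file):
# for row in run_thread_sweep(solve_with_gurobi):
#     print(row)
```

Running each thread count in a separate Slurm job instead of one loop would probably give cleaner memory numbers, since the OS-level peak of one process then belongs to exactly one configuration.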
Regarding your memory measurements, different tools report usage differently, and what Gurobi considers “used” memory may not always match what the OS reports. But before digging further into that, I recommend starting with controlled thread experiments as described above. You may find that reducing thread count not only avoids these memory-limit stops but also gives you better performance overall.
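If you later want to compare the two views of memory, the sketch below may help as a starting point. It assumes Linux (where ru_maxrss is reported in KiB) and assumes, which the thread does not confirm, that Slurm and xbat report binary units (GiB). To my knowledge the MemLimit and SoftMemLimit parameters are specified in GB as 10^9 bytes. The helper names are hypothetical.

```python
import resource

GB = 10**9    # decimal gigabyte, the unit assumed here for MemLimit/SoftMemLimit
GIB = 2**30   # binary gibibyte, the unit many monitoring tools use


def gib_to_gb(gib):
    """Convert a value in GiB (2**30 bytes) to GB (10**9 bytes)."""
    return gib * GIB / GB


def peak_rss_gb():
    """Peak resident set size of this process in GB (Linux: ru_maxrss is KiB)."""
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak_kib * 1024 / GB


# Usage after a solve on an existing gurobipy Model m (not executed here):
# print("Gurobi peak (GB):", m.MaxMemUsed)
# print("OS peak RSS (GB):", peak_rss_gb())
```

Under the GiB assumption, the reported 210 would correspond to roughly 225.5 GB, which is still below 244, so a unit mismatch alone would not account for the gap. Peak RSS also does not necessarily include everything Gurobi counts against its limit.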
Let me know what you see with the above test.