Problem with enlarging the model
Hi,
I am solving MILP models and trying to scale them up. For a smaller model like the one below, Gurobi quickly enters the branch-and-bound procedure and solves the model very fast. Here is one of the logs:
Gurobi Optimizer version 10.0.3 build v10.0.3rc0 (linux64)
CPU model: 12th Gen Intel(R) Core(TM) i7-12800H, instruction set [SSE2|AVX|AVX2]
Thread count: 20 physical cores, 20 logical processors, using up to 20 threads
Optimize a model with 306397 rows, 141376 columns and 677415 nonzeros
Model fingerprint: 0x59ae2153
Variable types: 141283 continuous, 93 integer (93 binary)
Coefficient statistics:
Matrix range [1e+00, 1e+00]
Objective range [6e-04, 1e+00]
Bounds range [1e+00, 1e+00]
RHS range [1e+00, 1e+00]
Presolve removed 292022 rows and 105508 columns
Presolve time: 0.34s
Presolved: 14375 rows, 35868 columns, 76512 nonzeros
Variable types: 35784 continuous, 84 integer (84 binary)
Root relaxation: objective 3.738655e+01, 3700 iterations, 0.17 seconds (0.36 work units)
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 37.38655 0 48 - 37.38655 - - 0s
H 0 0 46.9796408 37.38655 20.4% - 0s
H 0 0 43.7945878 37.38655 14.6% - 0s
0 0 37.39450 0 48 43.79459 37.39450 14.6% - 0s
H 0 0 42.9770354 37.39450 13.0% - 0s
0 2 37.39450 0 48 42.97704 37.39450 13.0% - 0s
* 17 15 4 41.9998970 40.28538 4.08% 256 1s
Explored 50 nodes (13847 simplex iterations) in 1.47 seconds (2.45 work units)
Thread count was 20 (of 20 available processors)
Solution count 4: 41.9999 42.977 43.7946 46.9796
Optimal solution found (tolerance 1.00e-04)
Best objective 4.199989696177e+01, best bound 4.199989696177e+01, gap 0.0000%
However, when I solve a larger model (~10x the continuous variables and constraints, but only 2x the binaries), Gurobi spends significantly more time (a few hundred seconds) before entering the branch-and-bound procedure, and oftentimes gets killed by the terminal. One of its logs is below.
Are there any settings I can tune to speed up the phase before branch and bound?
Thank you!
Gurobi Optimizer version 10.0.3 build v10.0.3rc0 (linux64)
CPU model: 12th Gen Intel(R) Core(TM) i7-12800H, instruction set [SSE2|AVX|AVX2]
Thread count: 20 physical cores, 20 logical processors, using up to 20 threads
Optimize a model with 1258522 rows, 575854 columns and 3275266 nonzeros
Model fingerprint: 0xa520414e
Variable types: 575698 continuous, 156 integer (156 binary)
Coefficient statistics:
Matrix range [1e+00, 1e+00]
Objective range [1e-03, 1e+00]
Bounds range [1e+00, 1e+00]
RHS range [1e+00, 1e+00]
Presolve removed 875034 rows and 188550 columns (presolve time = 5s) ...
Presolve removed 875847 rows and 191471 columns
Presolve time: 6.37s
Presolved: 382675 rows, 384383 columns, 1476259 nonzeros
Variable types: 384236 continuous, 147 integer (147 binary)
Deterministic concurrent LP optimizer: primal simplex, dual simplex, and barrier
Showing barrier log only...
Root barrier log...
Elapsed ordering time = 5s
Ordering time: 15.36s
Barrier statistics:
AA' NZ : 2.831e+07
Factor NZ : 2.211e+08 (roughly 2.0 GB of memory)
Factor Ops : 3.934e+11 (roughly 1 second per iteration)
Threads : 17
Objective Residual
Iter Primal Dual Primal Dual Compl Time
0 5.89645754e+04 -1.00511635e+06 3.91e+02 1.21e-02 1.03e+01 28s
1 2.98458760e+04 -6.37207040e+05 1.99e+02 6.69e-13 4.82e+00 30s
2 4.55143879e+03 -2.77530735e+05 3.02e+01 6.38e-13 8.13e-01 32s
3 8.40600305e+02 -7.61758834e+04 5.09e+00 7.59e-13 1.55e-01 35s
4 2.83066540e+02 -2.80240755e+04 1.21e+00 5.58e-13 4.52e-02 38s
5 1.22362341e+02 -8.34800970e+03 9.80e-02 7.41e-13 9.13e-03 41s
6 1.04239613e+02 -2.88132719e+03 9.19e-04 5.73e-13 2.68e-03 44s
7 9.61674688e+01 -1.81023666e+03 2.65e-04 5.67e-13 1.71e-03 46s
8 9.15046558e+01 -1.27754879e+03 1.65e-04 6.24e-13 1.23e-03 49s
9 8.46631300e+01 -7.03884473e+02 8.93e-05 7.67e-13 7.06e-04 51s
10 7.16732788e+01 -4.00237880e+02 2.62e-05 1.11e-12 4.22e-04 54s
11 6.36585887e+01 -2.21452245e+02 1.28e-05 2.52e-12 2.55e-04 57s
12 6.04512884e+01 -1.36590287e+02 9.17e-06 2.48e-12 1.76e-04 60s
13 5.32618159e+01 -1.30030603e+01 2.47e-06 3.78e-12 5.93e-05 64s
14 4.99021696e+01 2.91666070e+01 6.83e-07 6.05e-12 1.86e-05 66s
15 4.84331859e+01 3.56747085e+01 2.24e-07 5.44e-12 1.14e-05 69s
16 4.74550235e+01 4.24719670e+01 2.41e-08 1.07e-11 4.46e-06 71s
17 4.70727618e+01 4.53268859e+01 7.72e-09 2.83e-11 1.56e-06 75s
18 4.70378712e+01 4.54638768e+01 7.12e-09 2.53e-11 1.41e-06 77s
19 4.70024353e+01 4.55834071e+01 6.55e-09 2.75e-11 1.27e-06 80s
20 4.68236145e+01 4.58219291e+01 3.87e-09 2.84e-11 8.96e-07 83s
21 4.67348599e+01 4.61479663e+01 4.39e-09 2.60e-11 5.25e-07 87s
22 4.65918579e+01 4.64010654e+01 8.52e-09 3.15e-11 1.71e-07 89s
23 4.65739346e+01 4.64430414e+01 6.88e-09 2.69e-11 1.17e-07 93s
24 4.65076332e+01 4.64975526e+01 8.76e-10 5.41e-11 9.06e-09 95s
25 4.64978782e+01 4.64978777e+01 3.79e-12 6.32e-11 4.22e-13 99s
26 4.64978781e+01 4.64978780e+01 4.02e-13 6.72e-11 6.31e-15 102s
Barrier solved model in 26 iterations and 102.13 seconds (84.62 work units)
Optimal objective 4.64978781e+01
Root crossover log...
357868 DPushes remaining with DInf 0.0000000e+00 102s
0 DPushes remaining with DInf 0.0000000e+00 103s
509 PPushes remaining with PInf 0.0000000e+00 103s
0 PPushes remaining with PInf 0.0000000e+00 103s
Push phase complete: Pinf 0.0000000e+00, Dinf 4.3288386e+01 103s
Root simplex log...
Iteration Objective Primal Inf. Dual Inf. Time
75267 4.6497878e+01 0.000000e+00 4.328839e+01 103s
75270 4.6497878e+01 0.000000e+00 0.000000e+00 103s
Concurrent spin time: 0.01s
Solved with barrier
75270 4.6497878e+01 0.000000e+00 0.000000e+00 104s
Root relaxation: objective 4.649788e+01, 75270 iterations, 95.44 seconds (74.62 work units)
Total elapsed time = 106.04s
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 46.49788 0 60 - 46.49788 - - 107s
H 0 0 53.8437709 46.49788 13.6% - 108s
0 0 46.51863 0 94 53.84377 46.51863 13.6% - 119s
H 0 0 52.2681304 46.51863 11.0% - 122s
0 0 46.57028 0 96 52.26813 46.57028 10.9% - 131s
H 0 0 52.0500611 46.57028 10.5% - 134s
0 0 46.58058 0 98 52.05006 46.58058 10.5% - 145s
0 0 46.58058 0 98 52.05006 46.58058 10.5% - 149s
0 0 46.58487 0 96 52.05006 46.58487 10.5% - 157s
0 0 46.58615 0 99 52.05006 46.58615 10.5% - 168s
0 0 46.58615 0 99 52.05006 46.58615 10.5% - 170s
0 0 46.58698 0 99 52.05006 46.58698 10.5% - 178s
Hi Xuan,
I'd address this first:
"oftentimes gets killed by the terminal"
I suspect you are running out of memory. We have some tips on reducing the chances of this in the following article: https://support.gurobi.com/hc/en-us/articles/360013195772-How-do-I-avoid-an-out-of-memory-condition.
Once you've got that fixed, you can explore performance parameters.
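As a concrete starting point, here is a minimal gurobipy sketch of memory-related parameters along the lines of what that article discusses (the file name and values are placeholders, not recommendations):

import gurobipy as gp

# Placeholder file name; read whichever model is being solved.
m = gp.read("large_model.mps")

# Offload branch-and-bound nodes to disk once they would use more than
# 0.5 GB of memory, instead of keeping the whole tree in RAM.
m.Params.NodefileStart = 0.5

# Fewer threads means fewer simultaneous copies of the LP data in memory.
m.Params.Threads = 4

m.optimize()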
- Riley
Hi Riley,
Thanks for the answer! I think using simplex instead of the concurrent solver helps reduce memory usage. I also managed to reduce the model size by improving the formulation.
I have a more general question: for large LPs/QPs, and MILPs/MIQPs with large convex relaxations, can we parallelize the LP/QP solves on GPUs to speed them up? It seems that Gurobi uses concurrent algorithms, but I wonder whether that is the most elegant solution.
Xuan Lin
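On the "simplex instead of concurrent" point: the algorithm used for the continuous relaxations is controlled by the Method parameter, so this is a one-line change in gurobipy (a minimal sketch, assuming an already-built model m; other APIs expose the same parameter):

# Method=1 selects dual simplex. The default (Method=-1) lets Gurobi choose,
# which for large models is typically the concurrent LP optimizer running
# simplex and barrier side by side, so it needs noticeably more memory.
m.Params.Method = 1
m.optimize()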
Hi Xuan,
At the moment there is no support for using GPUs. It is theoretically possible, but in practice it would currently be slower. This is an active area of research, though, so we hope this may change in the future. Please see the following article for more information:
https://support.gurobi.com/hc/en-us/articles/360012237852-Does-Gurobi-support-GPUs
- Riley
Hi Riley,
Thanks for the answer! I personally haven't used any GPU-based QP solver. I have heard about some research on accelerating QP solvers with GPUs, for example, this one.
ADMM is known to be parallelizable. However, it has its issues, such as low precision. Perhaps that is why ADMM is not suitable for a general-purpose QP solver?
Thank you!
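To illustrate the parallelizable-but-low-precision point, here is a minimal NumPy sketch of ADMM on a box-constrained QP (purely illustrative: the data, penalty parameter, and tolerance are made up, and this is not how Gurobi solves QPs):

import numpy as np

# Box-constrained QP:  minimize 0.5*x'Px + q'x  subject to  lb <= x <= ub
rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))
P = A @ A.T + np.eye(n)                        # positive definite Hessian
q = rng.standard_normal(n)
lb, ub = -np.ones(n), np.ones(n)

rho = 1.0                                      # ADMM penalty parameter
x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
L = np.linalg.cholesky(P + rho * np.eye(n))    # factor once, reuse every iteration

for k in range(1000):
    # x-update: solve (P + rho*I) x = -q + rho*(z - u) with the cached factor
    rhs = -q + rho * (z - u)
    x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
    # z-update: projection onto the box (element-wise, trivially parallel)
    z = np.clip(x + u, lb, ub)
    # dual update
    u = u + x - z
    if np.linalg.norm(x - z) < 1e-6:           # primal residual only, for brevity
        break

# Typical behaviour: quick progress to ~1e-3 accuracy, then a long tail,
# which is why first-order methods like ADMM usually target modest tolerances.

This kind of splitting is what makes ADMM attractive on GPUs, but the slow tail convergence is exactly the precision issue mentioned above.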