Solver gets stuck in "Root Simplex" after finding optimal solution

Official comment

Gurobi Staff

December 06, 2024 00:25

This post is more than three years old. Some information may not be up to date. For current information, please check the Gurobi Documentation or Knowledge Base. If you need more help, please create a new post in the community forum. Or why not try our AI Gurobot?.

Jaromił Najman

Gurobi Staff

September 27, 2021 11:40

Hi Axel,

Could you share the LP file as described in Posting to the Community Forum. This way, the Community can try to reproduce this issue.

Best regards,
Jaromił

0

Axel Bruck

September 27, 2021 13:19

Hi Jaromil,

Thanks for the answer. Unfortunately, I cannot because it is more than 1000 lines of code and depending on input files.

Also, for some certain input scenarios it works. Can it be due to the large matrix coefficient range? I am currently trying to resolve that issue.

Thanks

Axel

0

Jaromił Najman

Gurobi Staff

September 27, 2021 13:59

Hi Axel,

The coefficient range can definitely play a role as it is extremely large. You could have a look at our Guidelines for Numerical Issues for tips on how to scale your model properly.

You could also try tracking the memory usage just to make sure that you don't run into an out-of-memory condition.

Best regards,
Jaromił

0

Axel Bruck

September 27, 2021 14:24

I am working on this right now thanks.

It is definitely not a memory issue. Do you have any idea what could cause this. I find it weird as according to the gurobi documentation the fact that Primal Inf and Dual Inf is zero means that the model is successfully solved. So actually it would only miss the final output but it does not show up...

0

Jaromił Najman

Gurobi Staff

September 28, 2021 07:49

Hi Axel,

It is very hard to tell from the log only. It seems very strange though. Numerical issues can cause almost any kind of behavior so I would not be surprised if a better scaling actually solves this issue.

What happens if you set the DegenMoves parameter to 0? This parameter turns off integrality improvement of the root relaxation solution. Maybe this is the issue here.

Best regards,
Jaromił

0

Axel Bruck

September 28, 2021 08:24

Thanks Jaromil,

Yes I can imagine that it is hardly to tell just from the log file ;)

It is not the numeric issue. I resolved them (it was a huge big M number).

I will try the suggestion with the DegenMoves and let you know. Thanks!

0

Axel Bruck

September 28, 2021 09:21

Another thing that is quite weird:

This is another example but why does it do iteration 965072 twice? It should be done after the first time as far as I understand.

Same happened in the example above...

0

Jaromił Najman

Gurobi Staff

September 28, 2021 09:22

Hi Axel,

If this does not help, you could consider using the write() function to generate an MPS file and share one problematic instance.

Since you are using Pyomo, you could also have a look at How do I write an MPS model file from a third-party API?

Best regards,
Jaromił

0

Axel Bruck

September 28, 2021 09:45

I ran out of memory when trying the DegenMove == 0.

Maybe in the end this is the problem but I do not understand why... and also why I do not get a memory error.

Before I had two quadratic constraints that I linearized and I hoped that this would solve the memory issues

I will try writing a MPS file

0

Jaromił Najman

Gurobi Staff

September 28, 2021 11:06

The double printing of the last line in the simplex log may be some syncing issue so I would not bother much.

I ran out of memory when trying the DegenMove == 0.

Maybe in the end this is the problem but I do not understand why... and also why I do not get a memory error.

Depending on the OS settings, you don't run into an error but your OS may start memory swapping. This then results in extremely slow progress such that you can get the feeling of "getting stuck". Could you run the model without DegenMoves=0 and track your memory via the task manager?

With over 7 million nonzeros in the presolved problem, your model definitely one of the larger ones so an out-of-memory condition definitely makes sense. You could try setting the number of Threads to 1 to limit memory usage. You can find more info in How do I avoid an out-of-memory condition?

Best regards,
Jaromił

0

Axel Bruck

September 28, 2021 13:46

Thanks Jaromil,

I am always tracking my memory simultaneously. It is at around 85-95% most of the time until it hits the root simplex log where I have the feeling of getting stuck. Then it slowly drops. I also left it overnight and nothing happens. No memory error, nothing.

I tried all the out-of-memory suggestions but yeah as I said I have the feeling it is something else...

This is the memory after arriving at the stuck point and waiting for a couple fo minutes. Python still seems to do something in parallel according to the CPU usage. But then I do not understand what since the solver log seems to be done.

0

Jaromił Najman

Gurobi Staff

September 28, 2021 14:43 Edited

Hi Axel,

These might be Gurobi's parallel Threads running, that's why you have more than 1 Python process with large CPU and memory demands.

Does this behavior only occur for one model of your data set? Are you able to reproduce this behavior when running the same model written to an MPS file using the gurobi command line tool?

In case, that you cannot share the model file due to data security, please let me know and I will open a Support Ticket for you to share the model file.

Best regards,
Jaromił

0

Axel Bruck

September 29, 2021 08:59

Hi Jaromil,

I couldn't try the MPS version yet, sorry.

The problem occurs for some inputs and for some not. But it is always the same inputs where it occurs. So yes it is reproducible. I have the feeling for inputs that are slightly more challenging (by far not infeasible) it starts doing this. I also tried it on two different machiens (16 vs 8 GB RAM) and both showed the exact same behaviour and the exact same optimal objective. For me this means that the problem is solvable.

Example of not functioning --> getting stuck here

Example of functioning:

0

Axel Bruck

September 29, 2021 09:21

Regarding the code, this is the part that makes the issues: It is a heat pump that can run either in cooling or heating mode but not at the same time both. So I need binary variable(s). If I say there is no cooling demand (so the heat pump works in heating only and I do not need binary variables I do not have this problem). All lower case parameter are decision variables for each time step, except capacity with is not indexed. p is the electric power consumption, h is the heating output, c is the cooling output, b_h is a binary variable that is 1 during heating and 0 during cooling, capacity is the maximal heating capacity available. M is a big value (bigger than capacity could possibly be) and COP is the efficiency of the heat pump.

Before the model had quadratic constraints which I did not see and therefore I often ran into memory issues. However, it worked and I never observed the behaviour described. Did I do something wrong converting the MIQP to a MILP model?

MILP:
p = h/COPh + c/COPc
h <= capacity
c <= capacity * (COPc/(COCc+1))
h <= M * b_h + 0 * (1 - b_h)
c <= 0 * b_h + M * (1 - b_h)

MIQP:
p = h * b_h/COPh + c * b_c/COPc
h <= capacity
c <= capacity * (COPc/(COCc+1))
b_h + b_c <= 1

0

Jacob Mannhardt

January 23, 2023 14:35

Hey Jaromił,

to pick up the thread here, I observe exactly the same behavior: The root simplex gets stuck after the crossover after a barrier solve.

Set parameter BarHomogeneous to value 1
Set parameter Presolve to value 2
Set parameter Threads to value 46
Set parameter BarConvTol to value 1e-10
Set parameter ScaleFlag to value 2
Gurobi Optimizer version 9.5.2 build v9.5.2rc0 (win64)
Thread count: 24 physical cores, 48 logical processors, using up to 46 threads
Optimize a model with 14034346 rows, 11277165 columns and 37822849 nonzeros
Model fingerprint: 0x398518b9
Variable types: 11263485 continuous, 13680 integer (13680 binary)
Coefficient statistics:
  Matrix range     [1e-04, 4e+02]
  Objective range  [1e+00, 1e+00]
  Bounds range     [1e+00, 1e+00]
  RHS range        [5e-05, 2e+05]
Presolve removed 6661620 rows and 5913204 columns (presolve time = 5s) ...
Presolve removed 12283329 rows and 9175563 columns (presolve time = 11s) ...
Presolve removed 12530368 rows and 9544990 columns (presolve time = 15s) ...
Presolve removed 12530556 rows and 9553811 columns (presolve time = 20s) ...
Presolve removed 12530556 rows and 9605612 columns
Presolve time: 23.95s
Presolved: 1503790 rows, 1671553 columns, 6014614 nonzeros
Variable types: 1671553 continuous, 0 integer (0 binary)

Deterministic concurrent LP optimizer: primal simplex, dual simplex, and barrier
Showing barrier log only...

Root relaxation presolve removed 0 rows and 0 columns (presolve time = 5s) ...
Root relaxation presolved: 1503790 rows, 1671553 columns, 6014614 nonzeros

Root barrier log...

Elapsed ordering time = 15s
Elapsed ordering time = 20s
Ordering time: 67.88s
Elapsed ordering time = 68s
Elapsed ordering time = 70s
Elapsed ordering time = 78s
Elapsed ordering time = 81s
Elapsed ordering time = 86s
Elapsed ordering time = 90s
Elapsed ordering time = 95s
Elapsed ordering time = 100s
Elapsed ordering time = 105s
Elapsed ordering time = 110s
Ordering time: 118.48s

Barrier statistics:
 Dense cols : 1869
 AA' NZ     : 4.411e+07
 Factor NZ  : 9.331e+08 (roughly 9.0 GB of memory)
 Factor Ops : 2.999e+12 (roughly 5 seconds per iteration)
 Threads    : 44

                  Objective                Residual
Iter       Primal          Dual         Primal    Dual     Compl     Time
   0  -1.76457355e+12 -4.81475228e+13  2.05e+10 5.74e+01  3.78e+09   173s
   1  -1.32230738e+12 -4.83695600e+13  1.73e+10 1.31e+03  3.16e+09   178s
   2  -8.72868704e+11 -4.84380510e+13  1.32e+10 7.43e+02  2.37e+09   183s
   3  -5.30988895e+11 -4.94625874e+13  8.72e+09 2.62e+02  1.61e+09   188s
   4  -5.80100146e+10 -3.69274261e+12  8.26e+08 2.02e+01  1.79e+07   194s
   5  -1.66100902e+10 -1.19694881e+12  2.02e+08 7.54e+00  2.34e+06   200s
   6  -1.00545343e+10 -5.13877819e+11  1.22e+08 2.91e+00  9.48e+05   206s
   7  -6.31994078e+09 -2.63781108e+11  7.49e+07 1.47e+00  4.37e+05   211s
   8  -3.58323317e+09 -1.46278501e+11  4.30e+07 7.70e-01  2.02e+05   215s
   9  -2.22651740e+09 -8.49240385e+10  2.89e+07 6.03e-01  1.12e+05   220s
  10  -1.28718951e+09 -5.36329257e+10  1.87e+07 3.66e-01  6.31e+04   226s
  ...
 269   5.05601584e+06  5.05601597e+06  1.88e-05 1.16e-05  2.53e-09  3501s
 270   5.05601584e+06  5.05601597e+06  1.88e-05 1.16e-05  2.53e-09  3514s
 271   5.05601584e+06  5.05601610e+06  1.88e-05 2.51e-05  2.59e-09  3526s
 272   5.05601584e+06  5.05601609e+06  1.88e-05 2.35e-05  2.72e-09  3539s
 273   5.05601584e+06  5.05601605e+06  1.88e-05 1.99e-05  2.78e-09  3552s
 274   5.05601585e+06  5.05601588e+06  1.42e-05 1.58e-06  1.93e-09  3565s
 275   5.05601585e+06  5.05601587e+06  1.38e-05 9.08e-07  1.88e-09  3578s
 276   5.05601585e+06  5.05601586e+06  7.97e-06 3.49e-07  1.08e-09  3591s
 277   5.05601585e+06  5.05601587e+06  7.34e-06 5.54e-07  9.89e-10  3605s
 278   5.05601586e+06  5.05601586e+06  2.57e-07 1.73e-08  3.51e-11  3618s
 279   5.05601586e+06  5.05601586e+06  3.96e-09 2.45e-10  9.54e-13  3631s

Barrier solved model in 279 iterations and 3631.48 seconds (4423.13 work units)
Optimal objective 5.05601586e+06


Root crossover log...

 1252594 variables added to crossover basis                     3635s

  251106 DPushes remaining with DInf 1.7417488e-06              3639s
   11931 DPushes remaining with DInf 0.0000000e+00              3790s
    9115 DPushes remaining with DInf 0.0000000e+00              3799s
    7603 DPushes remaining with DInf 0.0000000e+00              3804s
    6566 DPushes remaining with DInf 0.0000000e+00              3809s
    5737 DPushes remaining with DInf 0.0000000e+00              3826s
    4849 DPushes remaining with DInf 0.0000000e+00              3833s
    4132 DPushes remaining with DInf 0.0000000e+00              3841s
    3516 DPushes remaining with DInf 0.0000000e+00              3853s
    2910 DPushes remaining with DInf 0.0000000e+00              3872s
    2307 DPushes remaining with DInf 0.0000000e+00              3896s
    1704 DPushes remaining with DInf 0.0000000e+00              3917s
    1101 DPushes remaining with DInf 0.0000000e+00              3940s
     498 DPushes remaining with DInf 0.0000000e+00              3963s
       0 DPushes remaining with DInf 0.0000000e+00              3987s

   16276 PPushes remaining with PInf 0.0000000e+00              3987s
   15289 PPushes remaining with PInf 0.0000000e+00              3997s
   14378 PPushes remaining with PInf 0.0000000e+00              4001s
   13770 PPushes remaining with PInf 0.0000000e+00              4006s
   13167 PPushes remaining with PInf 0.0000000e+00              4011s
   12559 PPushes remaining with PInf 0.0000000e+00              4016s
   11215 PPushes remaining with PInf 0.0000000e+00              4023s
   10607 PPushes remaining with PInf 0.0000000e+00              4027s
    9836 PPushes remaining with PInf 0.0000000e+00              4031s
    8475 PPushes remaining with PInf 0.0000000e+00              4037s
    7857 PPushes remaining with PInf 0.0000000e+00              4041s
    6510 PPushes remaining with PInf 0.0000000e+00              4047s
    5640 PPushes remaining with PInf 0.0000000e+00              4050s
    3707 PPushes remaining with PInf 0.0000000e+00              4055s
     405 PPushes remaining with PInf 0.0000000e+00              4060s
       0 PPushes remaining with PInf 0.0000000e+00              4062s

  Push phase complete: Pinf 0.0000000e+00, Dinf 4.2525637e+00   4063s


Root simplex log...

Iteration    Objective       Primal Inf.    Dual Inf.      Time
   96383    5.0560159e+06   0.000000e+00   4.252564e+00   4064s
   96385    5.0560159e+06   0.000000e+00   0.000000e+00   4066s
   96385    5.0560159e+06   0.000000e+00   0.000000e+00   4068s
[here, Gurobi was stuck for quite a while]
Concurrent spin time: 3980.12s (can be avoided by choosing Method=3)

Solved with barrier

Root relaxation: objective 5.056016e+06, 96385 iterations, 8016.25 seconds (5184.61 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

*    0     0               0    5056015.8599 5056015.86  0.00%     - 8048s

Explored 1 nodes (96385 simplex iterations) in 8049.39 seconds (5228.09 work units)
Thread count was 46 (of 48 available processors)

Solution count 2: 5.05602e+06 5.05602e+06

Optimal solution found (tolerance 1.00e-04)
Best objective 5.056015859939e+06, best bound 5.056015859939e+06, gap 0.0000%

I am painfully aware that my RHS ranges are not great, however, I still do not understand what triggers this behavior. While I wrote this comment, it actually continued after waiting 4000 seconds.

Do you know if I can expedite this process because it does not seem very efficient to idle around there?

Thanks so much in advance (for your always fantastic help!)
Best,

Jacob

0

Jaromił Najman

Gurobi Staff

January 23, 2023 15:06

Hi Jacob,

Is there a particular reason why you set the parameters? BarHomogeneous will make Gurobi use the homogeneous Barrier algorithm which is expected to be way slower than the standard Barrier algorithm. Presolve=2 and ScaleFlag=2 might or might not have a positive impact. Reducing the BarConvTol parameter will also make the Barrier algorithm require more iterations.

You should update to the latest Gurobi version 10.

You could try avoiding Crossover completely for MIPs by setting Method=2 Crossover=0 and NodeMethod=2. Note that these setting make only sense if the B&B search requires only very few nodes. In your case, you need only the root node, thus this might be a suitable setting for your model.

The big waiting time after Crossover and before the B&B tree output is often caused by degenerate simplex moves. Setting DegenMoves=0 might help.

Best regards,
Jaromił

0

Jacob Mannhardt

January 23, 2023 16:36

Hi Jaromił,

thanks for your quick reply! The parameter settings grew historically, but I am always looking for a better configuration, so your help is always much appreciated!

I used BarHomogeneous before to find out if my problem is infeasible or unbounded, but I don't need it anymore. Since the problem is large with nasty numerics, I wanted to use Presolve and ScaleFlag to tame that a bit. And I have observed before that a tighter BarConvTol leads to shorter crossovers.

I have disabled the Crossover before, but if I remember correctly, it leads to "ugly" variable values. I am using some variable values as parameters in a subsequent optimization. Then, pre-Crossover variable values lead to bad numerics in the next optimization, doesn't it?

I will try the DegenMoves = 0, thanks!

Thanks for recommending the update! I will see if we can do that in our project.

Many thanks and best regards
Jacob

0

Jaromił Najman

Gurobi Staff

January 24, 2023 08:17

Hi Jacob,

The parameter settings grew historically, but I am always looking for a better configuration, so your help is always much appreciated!

Understandable, but I can imagine that your model also grew and improved. Thus, maybe now is a good point to start with a fresh set of parameters.

I have disabled the Crossover before, but if I remember correctly, it leads to "ugly" variable values. I am using some variable values as parameters in a subsequent optimization. Then, pre-Crossover variable values lead to bad numerics in the next optimization, doesn't it?

This cannot be said in general. However, if the model is numerically difficult, then yes, Crossover often takes care of "un-clean" Barrier solutions. And yes, it is often best to use Crossover, when you want to re-use the solution point in a subsequent optimization. Still, it might be worth a try.

Best regards,
Jaromił

0

Jacob Mannhardt

January 24, 2023 08:19

Hi Jaromił,

thanks again for your continued help! I will try that!

Best wishes and many thanks
Jacob

0

Solver gets stuck in "Root Simplex" after finding optimal solution

Comments

Didn't find what you were looking for?