MIP model gets stuck at very specific variable bound values

Hello,

I have a makespan minimization MIP model which is getting stuck with no solution at very specific set values for upper variable bounds and big-M values. I'm not sure if it's numerical issues, big-M issues, complexity issues, or if I made a mistake with the model formulation.

Here is the code for the model I am using:

K = [k.arm_n for k in arm]      # list of arm numbers
    N = [i.index for i in fruit]    # list of fruit indexes
    Y = [i.y_coord for i in fruit]  # list of fruits' y-coordinate (x-coordinate in the paper)

    total_fruit = len(N) # needed to constraint FPE to a high picking percentage

    
    ### Create model
    m = gp.Model("makespan_mip")
    
    # if velocity becomes a variable, it is multiplied with another variable requiring the following settings
    m.params.NonConvex = 2
    m.setParam('NonConvex', 2)
    
    
    ### Decision variables
    # Arm-fruit assignment (is fruit i picked by arm k)
    x = m.addVars(K, N, vtype=GRB.BINARY, name="x")
    
    # Time arm k reaches fruit i
    t = m.addVars(K, N, lb=0, name="t")
    
    # Start and end of time window arm k can reach fruit i
    tw_s = m.addVars(K, N, lb=0, name="tw_s")
    tw_e = m.addVars(K, N, lb=0, name="tw_e")

    # required because gurobi doesn't like >= or <= constraints that deal with two variables
    aux_max = m.addVars(K, N, lb=0, name="aux_max")
    aux_min = m.addVars(K, N, lb=0, name="aux_min")

    # in m/s, vehicle velocity along orchard row
    # bounded by the cell length and Td (melon paper) and bounded by max velocity of the lift (90 cm/s)
    v_vy_lb = l / Td
    v_vy = m.addVar(lb=v_vy_lb, ub=v_vy_ub, name="v_vy")
    
    # create a variable that saves the last picking time, or makespan
    makespan = m.addVar(lb=0, name="makespan")
    t_max_arm = m.addVars(K, name='t_max')  # max t value for each arm
    
    
    ### Constraints
    # At most one arm can be assigned to a fruit (1)
    m.addConstrs((x.sum('*', i) <= 1 for i in N), name="assignOne")
    
    # Time elapsed between pickup of any two fruit reached by the same arm is at least Td (2)
    m.addConstrs((t[k, i] + Td - t[k, j] <= M * (2 - x[k, j] - x[k, i]) for i in N for j in N for k in K if Y[j] > Y[i]), name="atLeast")

    m.addConstrs(((tw_s[k, i] * v_vy == (Y[i] + k * l)) for i in N for k in K), name="TW_start")
    m.addConstrs(((tw_e[k, i] * v_vy == (Y[i] + (k + 1) * l)) for i in N for k in K), name="TW_end")
    
    # to learn how to deal with max/min with variables
    # see https://support.gurobi.com/hc/en-us/community/posts/360076808711-how-to-add-a-max-constr-General-expressions-can-only-be-equal-to-a-single-var
    m.addConstrs((aux_max[k, i] == gp.max_(tw_s[k, i], tw_e[k, i]) for i in N for k in K), name="auxMax")
    m.addConstrs((aux_min[k, i] == gp.min_(tw_s[k, i], tw_e[k, i]) for i in N for k in K), name="auxMin")

    # Ensure each node is visited within the given time window (3) and (4)
    # TW_start and TW_end are matching the fruit index number exactly (disctionary), so [2][0] == index 2 (even 
    # though it starts at zero, second arm back from 0th arm)  
    m.addConstrs(((t[k, i] <= aux_max[k, i]) for i in N for k in K), name="timeWinA")
    m.addConstrs(((t[k, i] >= aux_min[k, i]) for i in N for k in K), name="timeWinB")
    
    # Ensure at least 90% (or desired percentage) of available fruit are harvested
    m.addConstr((gp.quicksum(x[k, i] for i in N for k in K)/total_fruit >= FPE_min), name="percentHarvest")
    
    # set makespan as the max t^k_i value
    # see https://support.gurobi.com/hc/en-us/community/posts/360071830171-Use-index-of-decision-variable-in-max-constraint
    m.addConstrs((t_max_arm[k] == gp.max_(t.select(k, '*')) for k in K), name='max_value')
    
    m.addConstrs((makespan >= t_max_arm[k] for k in K), name='makespan_contraint')
    
    
    ### Objective function
    m.setObjective((makespan), GRB.MINIMIZE)

The model is trying to solve when there are six or less arms/machines and 80 or less fruit in a line. It is an extension of the MIP model in the article https://link.springer.com/content/pdf/10.1007/s10846-015-0211-5.pdf page 5 out of 13.

The issue first showed up when trying to solve the same problem with increasing number of arms and increasing vehicle speed (K and v_vy). It took a really long time to solve for K=1 to 3 arms if it even produced a solution (once waited 6 hours with no final results), but would almost instantly solve K=4 to 6 arms. At K=4 to 6 arms, the results would use the upper bound for the v_vy variable, which made me think it was a complexity issue. However, when testing this, I found out that the difference between getting a solution and not getting a solution was a very specific upper bound value, say 0.74275 worked, but 0.7428 did not, which doesn't seem like a complexity issue.

I tried relaxing the constraint that uses the velocity variable by changing it to:

     u1 = m.addVars(K, N, lb=-gp.GRB.INFINITY, name="u1")
     u2 = m.addVars(K, N, lb=-gp.GRB.INFINITY, name="u2")
     epsilon_min = m.addVar(ub=epsilon, name="epsilon_min")
     epsilon_max = m.addVar(ub=epsilon, name="epsilon_max") 
    
     m.addConstrs(((u1[k, i] == tw_s[k, i] * v_vy - (Y[i] + k * l)) for i in N for k in K), name="TW_start")
     m.addConstrs(((u2[k, i] == tw_e[k, i] * v_vy - (Y[i] + (k + 1) * l)) for i in N for k in K), name="TW_end")

     m.addConstrs(((epsilon_min == gp.abs_(u1[k, i]) for i in N for k in K)), name="TW_start_relaxed")
     m.addConstrs(((epsilon_max == gp.abs_(u2[k, i]) for i in N for k in K)), name="TW_end_relaxed")

Which worked slightly: at epsilon = 0.01, I could get results if I increased the v_vy upper bound to 0.8 m/s, instead of stopping at 0.7 m/s. However, when epsilon = 0.02, it reversed and I could only get results if the v_vy upper bound was 0.75 m/s. Whenever I had to manually stop the solver, I would also get the following Warning:

Warning: max constraint violation (5.4631e-06) exceeds tolerance

With one even resulting in:

Warning: max constraint violation (2.7679e-06) exceeds tolerance

Best objective 6.299051555350e+01, best bound 6.299051555350e+01, gap 0.0000%

Even though the solution printout had been and was stuck at a gap of 8.77%. This did not seem to be working, so I switched the model back and tried to look into numerical issues.

I found out about big-M and wondered if that was my problem. It had been set at 600, so I changed it to 100, 80, 50, and 30 to see the effects. For each decrease I would get the solution for an increase in the upper bound of 0.05 m/s, up to 0.8 m/s upper bound when M = 50. When M = 30, however, got stuck again if the upper bound was above 0.75 m/s. Furthermore, the gap value at which the solver got stuck for each upper bound value was the same. If [ub = 0.75, gap = 0.98%], if [ub = 0.8, gap = 7.16%], if [up = 0.85, gap = 12.6%]. It seems like the solver is getting stuck on a local minima but I'm not sure if that's correct or even what to do about it.

Here's an example of what I see when the solver get's stuck:

Set parameter NonConvex to value 2
Gurobi Optimizer version 9.5.0 build v9.5.0rc5 (linux64)
Thread count: 2 physical cores, 4 logical processors, using up to 4 threads
Optimize a model with 10044 rows, 1445 columns and 39366 nonzeros
Model fingerprint: 0x88cd75d2
Model has 480 quadratic constraints
Model has 483 general constraints
Variable types: 1205 continuous, 240 integer (240 binary)
Coefficient statistics:
  Matrix range     [1e-02, 5e+01]
  QMatrix range    [1e+00, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [1e-01, 1e+00]
  RHS range        [9e-01, 1e+02]
  QRHS range       [3e-01, 5e+01]
Presolve added 284 rows and 1186 columns
Presolve time: 0.19s
Presolved: 12244 rows, 2631 columns, 41803 nonzeros
Presolved model has 479 bilinear constraint(s)
Variable types: 1913 continuous, 718 integer (718 binary)

Root relaxation: objective 5.579053e+01, 1467 iterations, 0.08 seconds (0.04 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0   55.79053    0  500          -   55.79053      -     -    0s
     0     0   55.79053    0  625          -   55.79053      -     -    1s
     0     0   55.79053    0   51          -   55.79053      -     -    1s
     0     0   55.79053    0   29          -   55.79053      -     -    1s
     0     0   55.79053    0   29          -   55.79053      -     -    1s
     0     0   55.79053    0   27          -   55.79053      -     -    1s
     0     0   55.79053    0   26          -   55.79053      -     -    1s
     0     0   55.79053    0   22          -   55.79053      -     -    2s
     0     0   55.79053    0   24          -   55.79053      -     -    2s
     0     0   55.79053    0   17          -   55.79053      -     -    2s
     0     0   55.79053    0   36          -   55.79053      -     -    2s
     0     0   55.79053    0   26          -   55.79053      -     -    2s
     0     0   55.79053    0  262          -   55.79053      -     -    2s
     0     0   55.79053    0   20          -   55.79053      -     -    2s
     0     0   55.79053    0   20          -   55.79053      -     -    5s
     0     2   55.79053    0   20          -   55.79053      -     -    5s
*  280   285             124     129.6930469   55.79053  57.0%  10.8    6s
*  285   285             126     127.7909714   55.79053  56.3%  13.4    6s
*  289   285             128     127.7044331   55.79053  56.3%  14.0    6s
H  347   283                     106.2459014   55.79053  47.5%  15.3    6s
H  379   297                     106.2458884   55.79053  47.5%  18.5    6s
   751   456   56.71502   65   20  106.24589   55.79053  47.5%  21.2   11s
   756   459   55.79053   31   22  106.24589   55.79053  47.5%  21.1   15s
   879   637   55.79053   41   18  106.24589   55.79053  47.5%  43.0   20s
H 1031   785                     106.2458672   55.79053  47.5%  46.0   21s
* 1087   742             102     106.2458316   55.79053  47.5%  46.0   21s
  1851  1104 infeasible  106       106.24583   55.79053  47.5%  44.6   25s
* 1893   921             118      71.9117657   55.79053  22.4%  44.1   25s
* 1896   876             120      71.0673666   55.79053  21.5%  44.1   25s
* 1898   865             121      71.0262332   55.79053  21.5%  44.0   25s
* 1900   865             122      71.0246575   55.79053  21.4%  44.0   25s
* 2384  1094             102      71.0246567   55.79053  21.4%  40.8   26s
* 3171  1344              92      64.2692145   55.79053  13.2%  37.2   29s
* 3173  1328              93      63.9715547   55.79053  12.8%  37.2   30s
* 3174  1328              93      63.9572361   55.79053  12.8%  37.2   30s
* 4000  1697              84      63.8531093   55.79053  12.6%  36.4   32s
* 4001  1697              85      63.8522165   55.79053  12.6%  36.4   32s
  4455  2100 infeasible  107        63.85222   55.79053  12.6%  36.2   35s
* 5482  2439             126      63.8522164   55.79053  12.6%  33.4   37s
  6245  2983   56.23009   78   33   63.85222   55.79053  12.6%  32.9   40s
  7563  3485   56.21312   86   33   63.85222   55.79053  12.6%  32.5   45s
  8482  3968   60.47607   68  501   63.85222   55.79053  12.6%  33.3   50s
H 9762  4139                      63.8522100   55.79053  12.6%  32.4   53s
H10019  4068                      63.8522063   55.79053  12.6%  32.5   53s
 10211  4181 infeasible   71        63.85221   55.79053  12.6%  32.4   55s
H10720  4137                      63.8521980   55.79053  12.6%  32.6   57s
 11423  4460     cutoff   91        63.85220   55.79053  12.6%  33.5   60s
 12483  4883   55.79080   73  225   63.85220   55.79053  12.6%  34.7   66s
 13471  5360   55.79053   89    8   63.85220   55.79053  12.6%  35.1   70s
 14744  5703 infeasible  101        63.85220   55.79053  12.6%  36.1   75s
 15517  5951   55.79053   68   14   63.85220   55.79053  12.6%  37.2   80s
 16613  6342   55.79053   77   14   63.85220   55.79053  12.6%  38.8   85s
*17571  6556              71      63.8521898   55.79053  12.6%  39.5   89s
 17646  6674     cutoff  109        63.85219   55.79053  12.6%  39.7   90s
H17824  6671                      63.8521877   55.79053  12.6%  39.7   90s
 18698  6954   56.23009   83   32   63.85219   55.79053  12.6%  40.4   95s
 20008  7303 infeasible   70        63.85219   55.79053  12.6%  41.1  100s
 20789  7498 infeasible  107        63.85219   55.79053  12.6%  42.0  105s
 21765  7792   55.79180   74  235   63.85219   55.79053  12.6%  42.5  110s
 22852  8116 infeasible  103        63.85219   55.79053  12.6%  43.0  115s
 24024  8483 infeasible   82        63.85219   55.79053  12.6%  43.5  120s
 24982  8657 infeasible   94        63.85219   55.79053  12.6%  44.5  125s
 25577  8914 infeasible   84        63.85219   55.79053  12.6%  45.1  131s
H26611  9088                      63.8521827   55.79053  12.6%  45.3  135s
 27683  9367 infeasible   94        63.85218   55.79053  12.6%  45.9  140s
 28835  9655   55.79053   76   15   63.85218   55.79053  12.6%  46.4  145s
 29895  9904   63.85218   79  206   63.85218   55.79053  12.6%  47.2  150s
 30418 10020   61.74468   72  499   63.85218   55.79053  12.6%  47.6  155s
 31399 10271   61.20316   78  500   63.85218   55.79053  12.6%  47.9  160s
 32288 10456   55.79053   71   13   63.85218   55.79053  12.6%  48.4  165s
 33351 10740   55.79226   69  275   63.85218   55.79053  12.6%  48.8  171s
 34315 11052     cutoff   88        63.85218   55.79053  12.6%  49.0  175s
 35465 11355   61.47373   86  495   63.85218   55.79053  12.6%  49.2  180s
 35872 11458 infeasible   93        63.85218   55.79053  12.6%  49.4  185s
 36981 11721 infeasible   72        63.85218   55.79053  12.6%  49.8  191s
 37762 11973   59.65038   82  515   63.85218   55.79053  12.6%  50.1  195s
 38961 12448     cutoff   80        63.85218   55.79053  12.6%  50.5  200s
 39558 12597   55.79567   72  486   63.85218   55.79053  12.6%  50.7  205s
 40683 12951 infeasible   91        63.85218   55.79053  12.6%  50.9  212s
 41002 13251 infeasible   62        63.85218   55.79053  12.6%  51.0  215s
 42541 13712   60.98319  103  490   63.85218   55.79053  12.6%  50.8  221s
 43807 13980     cutoff   72        63.85218   55.79053  12.6%  51.0  226s
 44597 14122 infeasible   80        63.85218   55.79053  12.6%  51.2  230s
 45754 14439     cutoff   73        63.85218   55.79053  12.6%  51.6  235s
 46432 14737 infeasible   85        63.85218   55.79053  12.6%  51.7  240s
 47726 15156 infeasible   72        63.85218   55.79053  12.6%  51.7  245s
 48559 15450   55.79266   76  348   63.85218   55.79053  12.6%  51.9  250s
 49742 15897   61.83518   67  501   63.85218   55.79053  12.6%  51.9  255s
 50525 16102 infeasible   70        63.85218   55.79053  12.6%  52.1  260s
 51074 16230   55.79053   67   18   63.85218   55.79053  12.6%  52.3  265s
 52314 16745   55.79053   69   12   63.85218   55.79053  12.6%  52.5  270s
 53020 16966   55.79053   58   16   63.85218   55.79053  12.6%  52.7  275s
 54181 17318 infeasible   72        63.85218   55.79053  12.6%  52.8  280s
    ...
 159573 48892 infeasible   65        63.85218   55.79053  12.6%  57.1  820s
 160352 49190 infeasible   84        63.85218   55.79053  12.6%  57.2  825s
 161469 49607 infeasible   98        63.85218   55.79053  12.6%  57.2  830s
 162407 49831 infeasible   91        63.85218   55.79053  12.6%  57.2  835s
 163245 49980   61.31945   79  508   63.85218   55.79053  12.6%  57.3  840s
 164316 50316   58.56964   82  503   63.85218   55.79053  12.6%  57.3  846s
 164966 50596   55.79053   59   21   63.85218   55.79053  12.6%  57.4  850s
 166166 50836   55.79053   96   17   63.85218   55.79053  12.6%  57.4  856s
 167091 51263   55.79053   64   21   63.85218   55.79053  12.6%  57.4  860s
 168027 51464     cutoff   78        63.85218   55.79053  12.6%  57.4  865s
H168417 51541                      63.8521767   55.79053  12.6%  57.4  866s
 169110 51688   55.79053   70   16   63.85218   55.79053  12.6%  57.4  870s
 169950 51836     cutoff   67        63.85218   55.79053  12.6%  57.5  877s
 169983 51883     cutoff   72        63.85218   55.79053  12.6%  57.5  880s
 170690 52183   55.79053   84   15   63.85218   55.79053  12.6%  57.6  885s
 171982 52533   61.97459   71  501   63.85218   55.79053  12.6%  57.6  891s
 173063 52675 infeasible   63        63.85218   55.79053  12.6%  57.6  895s
 174187 53004 infeasible   72        63.85218   55.79053  12.6%  57.7  900s
 175186 53583   55.79053   61   11   63.85218   55.79053  12.6%  57.6  905s

Cutting planes:
  Gomory: 10
  Implied bound: 24
  MIR: 8
  Flow cover: 65
  RLT: 67
  Relax-and-lift: 1

Explored 175781 nodes (10121466 simplex iterations) in 905.82 seconds (850.40 work units)
Thread count was 4 (of 4 available processors)

Solution count 10: 63.8522 63.8522 63.8522 ... 63.8522

Solve interrupted
Best objective 6.385217673573e+01, best bound 5.579053132290e+01, gap 12.6255%
Optimization terminated with status 11

Does anyone have any experience with this type of issue and/or knows how to figure out what the problem might be?

Thank you!

Natalie

Jaromił Najman

Gurobi Staff

March 01, 2022 09:46

Hi Natalie,

There could be many explanations for the behavior you observe.

Could you try providing finite bounds for all variables participating in nonlinear terms and general constraints. In your case, this would mean to provide good bounds for \(\texttt{tw_s}\) and \(\texttt{tw_e}\).

Please note that the code you provided is not reproducible, because definitions of several objects and variables, e.g., \(\texttt{arm}\), are missing. You could share an MPS or LP file instead, see Posting to the Community Forum. This way, we can be sure which exact model you are using and talking about.

You said, that there is kind of a hard gap when the model jumps from solvable within minutes to unsolvable within hours. Could you please provide both models? This should make the investigation easier.

Best regards,
Jaromił

Natalie Pueyo

March 02, 2022 00:49

Hi Jaromil,

Thanks for the reply!

The gap isn't between solving quickly and solving after hours. It's between getting a solution (usually within 300 seconds) or not getting anything (after more than four hours stopped manually). I've been trying to get more information, and it seems that the best objective gets stuck on one value, say 64.08 (stays there all four hours). By changing the v_vy upper bound, \(\texttt{v_vy_ub}\), or the \(\texttt{M}\) value, I can get the starting best bound value closer say 63.23 (or further 59.28) to this stuck best objective value which can lead to the solver getting a result (or not). I also got slightly better results after switching \(\texttt{v_vy}\) to cm/s and set it as an integer variable. It increased the best objective value at which the solver got stuck.

When I lowered \(\texttt{K}\) (number of arms/machines) from 3 to 2 the solver has been unable to solve any of the same problems that were being solved for \(\texttt{K}\) = 3.

I also tried giving the solver a starting \(\texttt{v_vy}\) value I know worked since I got it from solving for a smaller number of fruit (\(\texttt{N}\) = 40) with all the same settings. At too high a number of fruit (\(\texttt{N}\) = ~70 to 80), the solver gives a "no new incumbent solution" and it gets stuck on the same place as before. At \(\texttt{N}\) = 60, the solver finds a solution that is higher than the first best bound, but then increases the best bound to match the solution.

The current .mps model with velocity in cm/s and set as integers, with settings that produce a solution:

https://drive.google.com/file/d/1vObD78pNAXrCnwAbIBCvcxdLUbY0Dhsz/view?usp=sharing

The same model stops working if the \(\texttt{v_vy_ub}\) is set to 90 cm/s or if \(\texttt{M}\) = 80 and \(\texttt{v_vy_ub}\) = 75 cm/s (this one was weird because it can be solved if \(\texttt{v_vy_ub}\) goes to 80 cm/s, though back to unsolvable at 90 cm/s).

I'll try adding good bounds to \(\texttt{tw_s}\) and \(\texttt{tw_e}\) next.

Thank you,

March 02, 2022 10:42 Edited

It looks like finding feasible solutions for you model is very hard and depends on variable bounds. Could you try experimenting with the NoRelHeurTime parameter which controls the time spent in the no relaxation heuristic? For example, set it to 600 seconds.

Please let me know whether providing tight finite bounds for your \(\texttt{tw_e}\) and \(\texttt{tw_s}\) variables helps.

March 04, 2022 23:10

Hi Jaromił,

Adding the bounds to tw_s and tw_e sped up solving for three arms below 0.9 m/s v_vy upper bound. Adding NoRelTime = 800s, led to finding solutions very quickly (all, except one, below 30s when before it could take hundreds of seconds) for 3 arms and all v_vy upper bounds up to 0.9 m/s. I then tested it on 2 arms and it can solve it if I set NoRelTime to 1000s, it just takes over 13.5 hours if the v_vy upper bound is 0.9 m/s. I also tested NoRelTime = 3000s with no change. Still, adding the bounds and NoRel was a huge step forward.

Thank you so much!

MIP model gets stuck at very specific variable bound values

Comments

Didn't find what you were looking for?