メインコンテンツへスキップ

Setting IntegralityFocus=1 leads to (very) suboptimal solution when "manually" implementing SOS1

回答済み

コメント

5件のコメント

  • Riley Clement
    • Gurobi Staff

    Hi Pieter,

    The instance/settings where the optimal objective value is -9998 looks buggy.  I'll open up a ticket so I can liaise with our dev team.  You'll receive an email regarding this shortly.

    Some other comments:

    * Do you intend to have two variables, StartUp and StartUP?  Or is there a typo?
    * R3 is implied by R6
    * I haven't thought too deeply about R4 and R5, but the slack variable seems unusual.  This means there are also 7 orders of magnitude between the smallest coefficient and the largest coefficient in this constraint, and this is opening the door to numerical issues.
    * For a simple translation of SOS1(x_1, x_2, …, x_n) you can introduce binaries y_1, …, y_n, with \sum_i^n y_i = 1 and for each i: x_i ≤ x^{UB}_i y_i, x ≥ x^{LB}_i y_i, where x^{UB} and x^{LB} are upper and lower bounds
    * For other possible SOS translations see this paper by Yıldız and Vielma: https://juan-pablo-vielma.github.io/publications/Incremental-and-Encoding-Formulations.pdf
    * If you are interested in tight formulations for models with startup variables then see (Proposition 12.4) pg 378 of Pochet, Yves, and Laurence A. Wolsey. Production planning by mixed integer programming. If you don't have access to this you can find a similar result (with shutdown variables also) in Section 6.2 of my PhD thesis

    - Riley

     

    0
  • Pieter Bots
    • Conversationalist
    • First Question

    Hi Riley,

    Thanks for your quick response and for your comments. In a way, I'm glad that you agree that it looks like a bug, rather than a modelling mistake.

    In reply to your first two comments: indeed a typo (just one variable: StartUp), and indeed R6 implies R3, so it is redundant.

    The rest pertains to the core of what I try to do. To me, it appears that R3 and R4 are essential to make the solver faithfully reflect the “status” of X: is X < 0 (only then I want NEG=1), is X > 0 (only then I want POS > 0) or is X = 0 (only then I want ZERO=1). My minimum working example achieves this behavior with help of the SOS1 constraint and R3 and R4, which basically are “tricks” to prevent “fake” POS=1 or NEG=1. You're right about the looming numerical problems, but when I test this method for problems with 10 thousand variables (of which about ⅓ binary) I get no warnings for numerical issues on any of the solvers I test with.

    The problem with the standard way of “manually” implementing SOS1 (the same that I use in Linny-R when a solver has no built-in SOS) is that they fail when X has a lower bound < 0. When I generalize using your notation (mapping my ZERO, neg_X and pos_X onto x_1, x_2 and x_3), my demonstration problem becomes as follows:

    I have three non-negative variables: x_1 (upper bound 1), x_2 (upper bound 5) and x_3 (upper bound 5) of which at most one may be > 0 (so a typical SOS1 constraint). Then by the standard approach, we define three binary variables y_1, y_2 and y_3 you get these constraints:

    C0: y_1 + y_2 + y_3 ≤ 0 
    a1: x_1 ≤ y_1
    a2: x_2 ≤ 5 y_2
    a3: x_3 ≤ 5 y_3
    b1: x_1 ≥ 0 y_1
    b2: x_2 ≥ 0 y_2
    b3: x_3 ≥ 0 y_3

    Precisely because LB = 0 for all three x_i, the “b” constraints are ineffective Without a lower bound > 0 for x_i, the indicator y_i can be 1 even when x_i = 0.

    Hence I believe that (given my specific purpose) solvers that do not treat explicit SOS1 constraints as “special” for the B&B phase will come up with imperfect solutions when the objective function rewards choosing y_i = 1 for some i when x_i = 0 for all i. And hence my “trick” of modifying the “b” constraints by using LB=epsilon (1e-6) plus a slack variable that will be needed only when x_2 or x_3 has a value close to 0. 

    Thanks for the references – indeed I have access, and your thesis is going to be a valuable resource for me.

    Best regards,

    Pieter.

    0
  • Riley Clement
    • Gurobi Staff

    Hi Pieter,

    when I test this method for problems with 10 thousand variables (of which about ⅓ binary) I get no warnings for numerical issues on any of the solvers I test with

    I suspect most solvers will advocate for a range of coefficient magnitudes that is 6 orders or less, but the warnings likely kick in a bit above this.  For example, we suggest in our reference manual to keep the spread no more than 6 orders of magnitude but we won't log a warning until the range is 9 orders of magnitude.

    And hence my “trick” of modifying the “b” constraints by using LB=epsilon (1e-6) plus a slack variable that will be needed only when x_2 or x_3 has a value close to 0. 

    If you need y = 1, implies x is nonzero, then typically this would be done with ay ≤ x (or equivalent) with a small value for a, but larger than the feasibility tolerance, so 1e-4 for example.  The slack variable is what I'm finding unusual, and I not understanding the motivation for.

    - Riley

    0
  • Pieter Bots
    • Conversationalist
    • First Question

    Hi Riley,

    The slack variable is what I'm finding unusual, and I not understanding the motivation for.

    Without slack, the solver finds the problem to be infeasible when X is near-zero. When I modify my working example in line with your suggestion by removing the slack variable like so:

    \ Minimum working example: X represents the production level of some asset
    Maximize
     POS + NEG + ZERO + StartUp
    Subject To
    R0: X - pos_X + neg_X = 0
    a1: - 6 NEG + neg_X <= 0   \ 6 suffices as "big M" because -5 <= X <= 5
    a2: - 6 POS + pos_X <= 0
    b1: 0.0001 NEG - neg_X <= 0
    b2: 0.0001 POS - pos_X <= 0 
    R6: POS + NEG + ZERO = 1
    R7: was_UP - POS - NEG + StartUP >= 0  \ so StartUp must be 1 when POS+NEG=1 and asset was not already UP
    R8: POS + NEG - StartUp >= 0           \ so StartUp must be 0 when asset remains DOWN (X=0, so POS+NEG=0)
    R9: was_UP + POS + NEG + StartUp <= 2  \ so StartUp cannot be 1 when asset was already UP in previous time step
    R10: was_UP = 0  \ Let was_UP be zero to represent that asset was DOWN in the previous time step
    R11: X - Y = 0   \ Let X be zero to demonstrate that solver nevertheless sets POS=1 and StartUp=1
    Bounds
    -5 <= X <= 5
    Y = 0.00009
    pos_X <= 5
    neg_X <= 5
    Binaries
    NEG ZERO POS was_UP StartUp
    SOS
    nzp:  S1 :: ZERO:1 neg_X:2 pos_X:3
    End

    Gurobi gives this result:

    Non-default parameters:
    IntegralityFocus  1
    Optimize a model with 11 rows, 10 columns and 28 nonzeros (Max)
    Model fingerprint: 0x1697c235
    Model has 4 linear objective coefficients
    Variable types: 5 continuous, 5 integer (5 binary)
    Coefficient statistics:
     Matrix range     [1e-04, 6e+00]
     Objective range  [1e+00, 1e+00]
     Bounds range     [9e-05, 5e+00]
     RHS range        [1e+00, 2e+00]
    Presolve removed 9 rows and 8 columns
    Presolve time: 0.02s
    Explored 0 nodes (0 simplex iterations) in 0.02 seconds (0.00 work units)
    Thread count was 1 (of 16 available processors)
    Solution count 0
    No other solutions better than -1e+100
    Model is infeasible
    Best objective -, best bound -, gap -
    Wrote result file 'x5.json'
    No solution available
    

    This makes sense because by R11, X = 0.00009, and then by a2, POS = 1, while by b2, POS must be 0 because 1e-5 is not ≤ 0.

    Hence my "workaround" with a slack variable. Maybe I'm overlooking something, but this is exactly the issue that's been puzzling me (on and off) for years.

    Still highly appreciating your comments,

    Pieter.

    0
  • Riley Clement
    • Gurobi Staff

    Hi Pieter,

    Without slack, the solver finds the problem to be infeasible when X is near-zero.

    This is the point of the b1,b2 constraints though.  They define a region from 0 to a threshold, where the continuous variable cannot land, and practically if the continuous variable is non-zero it is because it is close enough to be zero (up to tolerances).  If you need a smaller thresholds so that pos_X = 0.00009 is feasible, then you can do this, and for example use 1e-6 POS - pos_X <= 0 (or scaled equivalent), but you will need to reduce the FeasibilityTol parameter accordingly.

    The slack allows you to violate the b1,b2 constraints, making it a “soft constraint”, but I don't understand the motivation behind making it soft.  The penalty on the slack variable is large, so presumably an optimal solution would never violate the original hard constraint anyway?  When the slack variable is included you are permitting solutions where X=0, and POS=1, which is what you didn't want to have.

    - Riley

     

    0

サインインしてコメントを残してください。