Do we need bounds for all newly introduced variables when reformulating nonlinear constraints?
回答済みWhen modeling nonlinear constraints in Gurobi, I introduce new variables to convert them into bilinear or quadratic forms.
My question is:
Do we need to set lower and upper bounds for all newly introduced variables?
By “original variables,” I mean the decision variables that are directly defined in the model before any reformulation. I have already set bounds for these variables, but sometimes Gurobi cannot determine bounds for the newly introduced variables.
For example, consider
z = x^2 y + t^3, with x in [-2, 3], y in [0, 4], and t in [1, 5]
Introduce auxiliary variables:
u = x^2
r = t^2
Then:
z = u * y + r * t
Derived bounds:
u in [0, 9]
r in [1, 25]
-
Hi Huan,
You do not necessarily need to manually provide bounds for every auxiliary variable if those bounds are implied by other bounds in the model. Gurobi's presolve performs bound propagation, so in cases like an auxiliary variable defined by a nonlinear equality over bounded variables, presolve will typically compute meaningful bounds for the auxiliary variable.
That said, it is still good modeling practice to provide tight, valid bounds whenever they are easy to derive. Explicit bounds can support these presolve routines, strengthen relaxations, improve numerical behavior, and help in nonlinear/nonconvex optimization.
In your case, Gurobi appears capable of inferring the bounds for the auxiliary variables during presolve:
import gurobipy as gp from gurobipy import GRB model = gp.Model() model.Params.OutputFlag = 0 x = model.addVar(lb=-2, ub=3, name="x") y = model.addVar(lb=0, ub=4, name="y") t = model.addVar(lb=1, ub=5, name="t") u = model.addVar(name="u") # Default: [0, inf] r = model.addVar(name="r") # Default: [0, inf] model.addConstr(u == x * x) model.addConstr(r == t * t) # Prevent presolve from eliminating the variables. model.addConstr(u + r >= 2) model.setObjective(u * y + r * t, GRB.MINIMIZE) model.update() print("ORIGINAL MODEL:") print(f" u: [{u.LB}, {u.UB}]") print(f" r: [{r.LB}, {r.UB}]") presolved = model.presolve() print(f"\nPRESOLVED MODEL:") for var in presolved.getVars(): if var.VarName in ["u", "r"]: ub = f"{var.UB:.1f}" if var.UB < 1e20 else "inf" print(f" {var.VarName}: [{var.LB:.1f}, {ub}]")Running this Python script on my machine with Gurobi 13.0.1 gives:
ORIGINAL MODEL: u: [0.0, inf] r: [0.0, inf] PRESOLVED MODEL: u: [0.0, 9.0] r: [1.0, 25.0]So here, Gurobi derives your expected bounds for
uandr.That said, the practical recommendation is: add tight, valid bounds when you know them.
If you have an example where you believe the bounds should be easily implied by the formulation but Gurobi's presolve does not derive them, could you share a small reproducible model? Gurobi's presolve methods are powerful, but they may not expose every implied bound in every formulation.
1 -
Thank you, that clears up my confusion.
Here are two small examples that illustrate my observation.
Let me describe the key relationship. Suppose we have a constraint of the form
r * x^(4/3) = 0.5 * c* vc,
and assume that bounds for x, c and vc are already known. In principle, this should imply bounds on r as well.
Example 1 (reformulated with bilinear/quadratic constraints):
In this formulation, I introduce auxiliary variables and rewrite the nonlinear expressions using bilinear relationships. However, Gurobi’s presolve does not seem to infer meaningful bounds forr.import gurobipy as gp from gurobipy import GRB model = gp.Model("Presolve_Test") model.Params.NonConvex = 2 model.Params.OutputFlag = 0 # power-related variables x = model.addVar(lb=1, ub=10, name="x") x_pow_2_3 = model.addVar(name="x_pow_2_3") x_pow_4_3 = model.addVar(name="x_pow_4_3") x_pow_2 = model.addVar(name="x_pow_2") # main variable r = model.addVar(name="r") # auxiliary variables c = model.addVar(lb=0.1, ub=100, name="c") vc = model.addVar(lb=0, ub=100, name="vc") # constraints model.addConstr(x_pow_2 == x_pow_2_3 * x_pow_4_3, name="link_pow2") model.addConstr(x_pow_4_3 == x_pow_2_3 * x_pow_2_3, name="square_relation") z = model.addVar(name="z") model.addConstr(z == c * vc, name="product_c_vc") model.addConstr(r * x_pow_4_3 == 0.5 * z, name="define_r") # objective model.setObjective(x_pow_2_3 + x_pow_4_3 + x_pow_2, GRB.MINIMIZE) model.update() print("--- BOUNDS BEFORE PRESOLVE ---") print(f"r: [{r.LB}, {r.UB}]") presolved = model.presolve() print("\n--- BOUNDS AFTER PRESOLVE ---") for var in presolved.getVars(): if var.VarName == "r": ub_str = f"{var.UB:.3f}" if var.UB < GRB.INFINITY else "inf" lb_str = f"{var.LB:.3f}" if var.LB > -GRB.INFINITY else "-inf" print(f"{var.VarName}: [{lb_str}, {ub_str}]")Example 2 (direct nonlinear constraint with
addGenConstrNL):
When I model the same relationship using a native nonlinear constraint, presolve appears able to derive bounds forr.import gurobipy as gp from gurobipy import GRB model = gp.Model("Presolve_Test_NL_Direct") model.Params.NonConvex = 2 model.Params.FuncNonlinear = 1 model.Params.OutputFlag = 0 # base variable and powers x = model.addVar(lb=1, ub=10, name="x") x_pow_2 = model.addVar(name="x_pow_2") # auxiliary variables c = model.addVar(lb=0.1, ub=100, name="c") vc = model.addVar(lb=0, ub=100, name="vc") # main variable r = model.addVar(name="r") # constraints model.addConstr(x_pow_2 == x * x, name="square_relation") model.addGenConstrNL( r, (0.5 * c * vc) / (x**(4/3)), name="define_r_nl" ) # objective model.setObjective(r, GRB.MINIMIZE) model.update() print("--- BOUNDS BEFORE PRESOLVE ---") print(f"r: [{r.LB}, {r.UB}]") presolved = model.presolve() print("\n--- BOUNDS AFTER PRESOLVE ---") for var in presolved.getVars(): if var.VarName == "r": ub_str = f"{var.UB:.3f}" if var.UB < GRB.INFINITY else "inf" lb_str = f"{var.LB:.3f}" if var.LB > -GRB.INFINITY else "-inf" print(f"{var.VarName}: [{lb_str}, {ub_str}]")So in my tests, the direct nonlinear formulation allows presolve to propagate bounds to
r, while the reformulated bilinear version does not.Is this difference in behavior expected?
1 -
Hi Huan,
Thanks for sharing the examples. Yes, this difference in behavior is expected.
One small point first: in Example 1 as written,
xis not actually linked to the auxiliary power variables. You would also need a constraint such asmodel.addConstr(x_pow_2 == x * x, name="x_square")However, even after adding this constraint, it is not surprising that presolve's bound propagation does not derive the same bounds as in the direct nonlinear formulation.
The reason is that the bilinear reformulation hides the original power structure from presolve. If we write
\begin{equation} a = \textrm{x_pow_2_3}, ~ b = \textrm{x_pow_4_3}, ~ q = \textrm{x_pow_2}, \end{equation}
then the reformulated model contains
\begin{equation} q = ab, ~ b = a^2, ~ q = x^2. \end{equation}
Since the variables are nonnegative and \(x \in [1,10]\), these constraints imply
\begin{equation} q \in [1,100], ~ a^3 = q, ~ b = a^2, \end{equation}
and hence
\begin{equation} a \in [1,100^{1/3}], ~ b \in [1,100^{2/3}]. \end{equation}
So strictly positive lower bounds and finite upper bounds are indeed implied by the formulation. However, deriving them requires combining several nonlinear constraints and recognizing the implied cubic relationship \(a^3=q\).
Presolve bound propagation works by passing bound information through the constraints that are present in the model. Looking at the constraint \(q=ab\) alone, with \(a,b \in [0,\infty]\) and \(q \in [1,100]\), neither \(a\) nor \(b\) gets a nonzero lower bound nor a finite upper bound: either variable can be very small if the other is very large. Looking at \(b=a^2\) alone also does not give a finite upper bound on \(a\) or \(b\) if neither already has a finite upper bound.
The direct nonlinear formulation is different. When you write
model.addGenConstrNL( r, (0.5 * c * vc) / (x**(4/3)), name="define_r_nl" )Gurobi sees the expression tree for the function directly. Specifically, it sees the expression \(x^{4/3}\) with \(x \in [1,10]\). Gurobi's nonlinear constraints are represented internally as expression trees composed of arithmetic operations and nonlinear functions. This makes it easier to propagate bounds through the original expression.
In this example, the direct nonlinear expression gives
\begin{equation} x^{4/3} \in [1,10^{4/3}], \end{equation}
and
\begin{equation} 0.5 c v_{c} \in [0,5000], \end{equation}
so
\begin{equation} r = \frac{0.5 c v_{c}}{x^{4/3}} \in [0,5000]. \end{equation}
That bound is easy to obtain from the direct expression because the denominator is recognized as a positive bounded function of \(x\) (with a lower bound of 1). This is not the case in Example 1, where
x_pow_4_3has a lower bound of zero even after bound propagation.Finally, note that
Model.presolve()is useful for diagnostics, but it is not a complete guarantee of exactly what will happen internally duringoptimize(). The Python documentation says it returns an almost equivalent presolved model, but that this model may differ from the one computed during optimization.In short, the behavior you observe is expected. The direct nonlinear formulation preserves the original function structure, while the bilinear reformulation requires presolve to infer bounds through a chain of auxiliary constraints involving variables with lower bounds of zero and no upper bounds. If you must reformulate the model in this way, the practical recommendation remains: provide tight, valid bounds whenever they are easy to derive.
1 -
Thank you very much for the clear answer.
Regarding your last sentence:
“If you must reformulate the model in this way, the practical recommendation remains: provide tight, valid bounds whenever they are easy to derive.”I can use
addGenConstrNLto avoid the reformulation. However, it seems thataddGenConstrNLis slower than the reformulation, which is why I tried to reformulate it into quadratic and bilinear terms myself.Is this observation correct, or am I misunderstanding something?
1 -
Hi Huan,
Yes, your observation can be correct for your particular model. A manually introduced bilinear reformulation could sometimes solve faster than modeling the nonlinear relationship directly.
However, I would not interpret this as a general rule. Both formulations are nonconvex, so solving them to global optimality generally relies on the same techniques. The two formulations can nevertheless lead to different internal representations, presolve reductions, branching decisions, and numerical behavior. As a result, either formulation may be faster for a given model or instance.
For a fair comparison, I recommend testing with the latest Gurobi version, currently 13.0.1. Gurobi 13.0 introduced performance improvements across several model families. I would also compare the two formulations over several Seed values rather than relying on a single run. Changing the Seed parameter typically leads to different solution paths. This is a useful way to account for performance variability when comparing formulations.
So my practical recommendation would be:
- If you aren't already, use the latest Gurobi release.
- Compare the direct nonlinear formulation and the bilinear reformulation over several Seed values, and ideally over multiple representative instances.
- If the bilinear reformulation is consistently faster, then it may be the more favorable formulation for this particular model.
- In that case, still provide tight, valid bounds for the auxiliary variables whenever possible.
1 -
Thanks for the suggestions.
That also answers my question in the other post.1
サインインしてコメントを残してください。
コメント
6件のコメント