More breakpoints using "addGenConstrPWL" reduce elapsed time?
Hello, community,
I have been doing some experiments using the syntax "addGenConstrPWL". Adding more breakpoints speeds up the optimization, which seems counterintuitive to me. The documentation says, "If your piecewise-linear function contains a large number of segments, the specialized algorithm will be much faster than the standard simplex solver." However, my formulation also relates the continuous variable to a binary variable, so I am not entirely sure how it would work in that case.
It would be great if you have any references to help me understand this behavior more deeply.
Below is part of my code that uses this feature. "lamb" is x, and "w" is f(x). Moreover, the values "lamb" takes depend on binary variables "y."
I realized that when using 30 breakpoints, the optimal solution is reached in a shorter time than when using 10 breakpoints.
# Decision variables
y = mdl.addVars(set_it, vtype=GRB.BINARY, name='y')
lamb = mdl.addVars(set_itl, vtype=GRB.CONTINUOUS, name='lambda') # x
w = mdl.addVars(set_itl, vtype=GRB.CONTINUOUS, name='w') # fx
# Piecewise constraints
for i in units:
    for t in horizon:
        mdl.addGenConstrPWL(lamb[i, t], w[i, t], break_points_x[t], break_points_fx[t])
for t in horizon:
    mdl.addGenConstrPWL(lamb[1, t], w[1, t], break_points_x[t], break_points_fx[t])
mdl.addConstrs(lamb[i, t] == quicksum(p_min[k] * y[k, t] for k in units) +
               quicksum(p[j, t] for j in range(i + 1)) for i in units for t in horizon)
mdl.addConstrs(lamb[1, t] == quicksum(p_min[k] * y[k, t] for k in units) for t in horizon)
# Objective function
mdl.setObjective(quicksum(fc_1[i] * (w[i, t] - w[i - 1, t]) for i in units for t in horizon) +
                 penalty * quicksum(mu[t] - w[len(units) - 1, t] for t in horizon), GRB.MINIMIZE)
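For reference, what a PWL constraint enforces can be sketched in plain Python: for a given value of x, the constrained f(x) equals the linear interpolation of the breakpoints. The breakpoint data below is hypothetical, not the `break_points_x`/`break_points_fx` from the model above, and out-of-range values are simply clamped here for simplicity (Gurobi's exact out-of-range behavior may differ):

```python
# Plain-Python sketch of the relationship a PWL constraint enforces:
# f(x) is the linear interpolation of the breakpoints at x.
from bisect import bisect_right

def pwl(x, xpts, ypts):
    """Evaluate the piecewise-linear function defined by breakpoints."""
    if x <= xpts[0]:
        return ypts[0]          # clamp below the first breakpoint
    if x >= xpts[-1]:
        return ypts[-1]         # clamp above the last breakpoint
    i = bisect_right(xpts, x) - 1          # segment containing x
    t = (x - xpts[i]) / (xpts[i + 1] - xpts[i])
    return ypts[i] + t * (ypts[i + 1] - ypts[i])

xpts = [0.0, 1.0, 2.0]
ypts = [0.0, 1.0, 4.0]  # e.g. sampling f(x) = x**2 at the breakpoints
print(pwl(0.5, xpts, ypts))  # 0.5 on the first segment
print(pwl(1.5, xpts, ypts))  # 2.5 on the second segment
```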

Hi Carlos,
I have been doing some experiments using the syntax "addGenConstrPWL." Adding more breakpoints speeds up the optimization, which seems counterintuitive to me.
How big is the speedup? Can you see where the speedup comes from? Is it from improving the best bound or from finding the best feasible solution faster? Could you please share log outputs of the two optimization runs?
Although I read the documentation, it says, "If your piecewise-linear function contains a large number of segments, the specialized algorithm will be much faster than the standard simplex solver." My formulation also relates the continuous variable with a binary variable.
The specialized algorithm will not be used in this case, because your model is a MIP. However, it might be used at the B&B nodes, speeding up the overall optimization process.
I realized that when using 30 breakpoints, the optimal solution is reached in a shorter time than when using 10 breakpoints.
What do those PWL functions describe? Are they some approximation of a nonlinear function? Are they convex? When you use more break points, do you still approximate the same domain but now use more points to improve the accuracy?
Below is part of my code that uses this feature. "lamb" is x, and "w" is f(x). Moreover, the values "lamb" takes depend on binary variables "y."
Please note that your example is not reproducible. There are multiple object definitions missing. Moreover, it is best to provide a minimal reproducible example instead of providing the whole code.
Best regards,
Jaromił
Hi Jaromil,
How big is the speedup? Can you see where the speedup comes from? Is it from improving the best bound or from finding the best feasible solution faster? Could you please share log outputs of the two optimization runs?
Here is the log with 10 breakpoints and here with 30 breakpoints.
This table shows the average and standard deviation over 30 experiments per instance. It is hard to explain why the model with 30 breakpoints, which actually provides a better approximation, is easier to solve than the one with 10 breakpoints.
Breakpoints  Average (s)  SD (s)
5            56           1
10           88           36
15           93           34
20           92           40
25           126          3
30           67           3

What do those PWL functions describe? Are they some approximation of a nonlinear function? Are they convex? When you use more break points, do you still approximate the same domain but now use more points to improve the accuracy?
They are approximations of several nonlinear functions, represented by Gamma(x)_t, which are convex. See the objective function below. The Gamma function is linearized.
I will try to provide a reproducible example.
Thanks for your help.
Hi Carlos,
Here is the log with 10 breakpoints and here with 30 breakpoints.
It looks like the 10 breakpoints log is no longer accessible (at least for me).
This table shows the average and standard deviation over 30 experiments per instance. It is hard to explain why the model with 30 breakpoints, which actually provides a better approximation, is easier to solve than the one with 10 breakpoints.
You can see that the optimization time increases with more breakpoints and then drops at 30 breakpoints. Could you try going even further, up to 50 breakpoints? It is possible that for the particular 30-breakpoint case the solver just got "lucky", meaning that a feasible point happened to be found earlier. If the best bound is already very good and the only difficulty is finding a good enough feasible point, then it is very possible that the optimization time fluctuates across model sizes: a heuristic might just get "lucky" and find a good feasible point more quickly for a bigger case than for a smaller one. This may sound counterintuitive, but this is how heuristics behave from time to time.
If you notice that finding feasible points is the main difficulty for your models, you should try experimenting with the parameters MIPFocus, NoRelHeurTime, and Heuristics.
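A minimal sketch of how such settings might look; the values below are illustrative examples, not tuned recommendations, and should be adjusted per model using Gurobi's parameter documentation:

```python
# Illustrative parameter settings for when finding feasible points is the
# bottleneck. The values are examples only, not tuned recommendations.
params = {
    "MIPFocus": 1,        # 1 = focus on finding feasible solutions quickly
    "NoRelHeurTime": 60,  # seconds to spend in the no-relaxation heuristic
    "Heuristics": 0.2,    # fraction of runtime devoted to MIP heuristics
}

# With gurobipy, these would be applied to the model as:
# for name, value in params.items():
#     mdl.setParam(name, value)
print(params)
```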
Best regards,
Jaromił