Barrier: Inflated problem iterates and solves much faster
I’m conducting research on energy systems by modelling them as continuous linear optimization problems. These problems always have the same structure: in short, the amount of an energy carrier supplied must be greater than or equal to the amount used at every time step. This is achieved by various technologies that generate or convert energy carriers. These technologies incur costs that are minimized in the objective function.
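For illustration, the balance structure looks roughly like the following minimal JuMP sketch, with made-up technologies, costs and demand rather than my actual model:

using JuMP, Gurobi

T = 1:24                                         # time steps, e.g. hours
techs = [:pv, :gas_turbine]                      # made-up technologies supplying electricity
cost = Dict(:pv => 10.0, :gas_turbine => 50.0)   # made-up cost coefficients
demand = 80 .+ 20 .* rand(length(T))             # made-up electricity demand per time step

m = Model(Gurobi.Optimizer)
@variable(m, gen[techs, T] >= 0)                 # generation per technology and time step

# amount supplied must be greater than or equal to the amount used at every time step
@constraint(m, balance[t in T], sum(gen[te, t] for te in techs) >= demand[t])

# minimize the total cost of the technologies
@objective(m, Min, sum(cost[te] * gen[te, t] for te in techs, t in T))

optimize!(m)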
Usually these models apply the same temporal resolution to every energy carrier, which needlessly inflates the problem, since this is not necessary from an engineering perspective. I therefore developed a method that allows the temporal resolution to vary by energy carrier (e.g. the equations for electricity are hourly, while those for gas are daily). If I benchmark a lean problem that applies this method against the same problem with every carrier at the finest resolution, I get the following logs:
Lean problem with varied resolution
Optimize a model with 1386117 rows, 936526 columns and 5404759 nonzeros
Coefficient statistics:
Matrix range [2e-04, 2e+02]
Objective range [1e+00, 1e+02]
Bounds range [0e+00, 0e+00]
RHS range [2e-02, 2e+02]
Presolve removed 542346 rows and 190832 columns
Presolve time: 4.02s
Presolved: 843771 rows, 745694 columns, 3758236 nonzeros
Barrier statistics:
Dense cols : 126
AA' NZ : 4.517e+06
Factor NZ : 1.333e+08 (roughly 1.7 GBytes of memory)
Factor Ops : 5.380e+11 (roughly 8 seconds per iteration)
Threads : 4
Barrier solved model in 94 iterations and 1706.71 seconds
Optimal objective 1.81628103e+05
Inflated problem with uniform resolution
Optimize a model with 3031004 rows, 2645660 columns and 10466499 nonzeros
Coefficient statistics:
Matrix range [2e-04, 1e+02]
Objective range [1e+00, 1e+02]
Bounds range [0e+00, 0e+00]
RHS range [6e-03, 2e+02]
Presolve removed 1369127 rows and 989179 columns (presolve time = 6s) ...
Presolve removed 1369127 rows and 989179 columns
Presolve time: 8.65s
Presolved: 1661877 rows, 1656481 columns, 7161402 nonzeros
Barrier statistics:
Dense cols : 136
AA' NZ : 7.220e+06
Factor NZ : 3.767e+07 (roughly 1.7 GBytes of memory)
Factor Ops : 2.222e+09 (less than 1 second per iteration)
Threads : 4
Barrier solved model in 167 iterations and 423.31 seconds
Optimal objective 1.81209707e+05
As you can see, the objective values are very similar, as desired, and the varied resolution does reduce the problem size significantly. However, since iterations on the lean problem take much longer, it solves much more slowly, which is the opposite of what I was hoping to achieve. I also ran the same benchmark on a scaled-up system (14e6 and 26e6 non-zeros) with even more pronounced results (factor memory and time per iteration of roughly 5 GB and 3 seconds for the inflated problem, versus 10 GB and 200 seconds for the lean one).
Can you provide me with a better understanding of what drives these differences? And do you have any suggestions, for example regarding parameter settings or problem re-formulation, that would make it possible to actually solve the lean problem faster than the inflated one?
I’m using Julia/JuMP to create the problem. It was solved with Gurobi.jl v0.7.3 and the following Gurobi parameters: Method = 2, Crossover = 0, BarConvTol = 1e-5
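For reference, these parameters are set roughly as follows (a sketch; the exact call names depend on the JuMP and Gurobi.jl versions in use):

using JuMP, Gurobi

m = Model(Gurobi.Optimizer)
set_optimizer_attribute(m, "Method", 2)          # 2 = barrier
set_optimizer_attribute(m, "Crossover", 0)       # skip crossover, accept the interior-point solution
set_optimizer_attribute(m, "BarConvTol", 1e-5)   # barrier convergence tolerance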
Thanks and regards,
Leo
-
Hello Leo:
The differences are likely to come from the way AA' is factored inside the barrier algorithm. At each barrier iteration the solver forms and factors a matrix with the sparsity pattern of AA' via a sparse Cholesky factorization, so the fill-in of that factor, rather than the raw problem size, dominates the cost per iteration. You can see this by comparing the 'lean' and 'inflated' log lines, focusing on the number of non-zeros in the factor and the respective estimated number of operations required, namely
<lean>
Barrier statistics:
Dense cols : 126
AA' NZ : 4.517e+06
Factor NZ : 1.333e+08 (roughly 1.7 GBytes of memory)
Factor Ops : 5.380e+11 (roughly 8 seconds per iteration)
<inflated>
Barrier statistics:
Dense cols : 136
AA' NZ : 7.220e+06
Factor NZ : 3.767e+07 (roughly 1.7 GBytes of memory)
Factor Ops : 2.222e+09 (less than 1 second per iteration)
Threads : 4

There are a number of things you can try to see if the number of non-zeros and the FLOP count can be reduced for the leaner model.
1) The row and column ordering could affect the factorization (even dramatically, in some cases). Since you seem to do well for the inflated model, I would try to closely match its ordering, if possible, for the leaner model.
2) You can experiment with the choice of the ordering algorithm Gurobi uses; basically, try all the options on the leaner model and see what happens (you do not need to push for the full solution; you can terminate once you see the barrier statistics). A sketch of such a parameter sweep follows at the end of this list. See here:
https://www.gurobi.com/documentation/9.0/refman/barorder.html
3) The number of dense columns may also play a big role in the factorization -- you can think of these columns as being initially removed from A and added back only once the factorization is complete. Only as a last resort, you can try changing the dense-column threshold by setting the GURO_PAR_BARDENSETHRESH parameter to some value, e.g., set it to 50, 100, 200, 400 etc., and 40, 25, 12 ... and see what happens. Note that the latter setting is undocumented and we do not guarantee that it will remain in place in later versions of Gurobi, so again, I would experiment with this only as a last resort.
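A minimal sketch of experiments 2) and 3) driven from JuMP could look like the following. Here build_lean_model() is just a placeholder for however you construct the leaner model, and whether the undocumented dense-column parameter can be passed this way depends on your Gurobi.jl version:

using JuMP, Gurobi

# build_lean_model() is assumed to return the leaner JuMP model with a Gurobi backend
for barorder in (0, 1)                             # BarOrder: 0 = approximate minimum degree, 1 = nested dissection
    for densethresh in (12, 25, 40, 50, 100, 200, 400)
        m = build_lean_model()
        set_optimizer_attribute(m, "Method", 2)    # barrier
        set_optimizer_attribute(m, "Crossover", 0)
        set_optimizer_attribute(m, "BarOrder", barorder)
        set_optimizer_attribute(m, "GURO_PAR_BARDENSETHRESH", densethresh)
        set_optimizer_attribute(m, "TimeLimit", 60)   # no need to solve fully; the barrier statistics appear early in the log
        optimize!(m)
    end
end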
It is also possible that the leaner model is sufficiently different that you will be stuck with an expensive factorization no matter what. But I would say it is worth a try.
Hope this helps.
-
Thank you for this very comprehensive answer! It provided me with a much better understanding of what is going on. I was actually able to solve the problem by updating the Gurobi package for Julia (not Gurobi itself), which probably affected the internal order of rows and columns. I’m now also achieving the performance improvements on the leaner model that I was hoping for.