How can I make the search over the B&B tree faster?
Hello all,
I have an issue for which I desperately need your help. I have opened other tickets before, but this one will be the most comprehensive, so that I can get your help faster and more clearly.
I have a capacity expansion model, supported with hourly operational detail, for the power system of the western states of the U.S. Typical model statistics are shown in the figure below:
Let's call this model the "baseline". The matrix coefficient range and the objective range are larger than suggested. I also know that tiny values in the matrix coefficients and large values in the objective coefficients may slow down the optimization process, and I have both. I replaced some of the tiny matrix coefficients with larger ones (let's call this "scaled (1)"). The resulting model statistics are shown in the figure below:
Afterwards, I divided all objective coefficients by 10,000. I just wanted to get rid of the large values and did not care about ending up with smaller values in the objective (let's call this "scaled (2)"). The resulting model statistics are shown in the figure below:
Finally, I performed both tasks (removed tiny values from the matrix coefficients and divided the objective by 10,000). Let's call this "scaled (3)". The resulting model statistics are shown in the figure below:
I ran these models one by one (the only Gurobi parameter I set is Method=2). I was able to solve the "baseline" model to optimality, but could not solve the rest. However, the scaled models' barrier and crossover procedures are much faster than those of the "baseline" model. Let me summarize the time performance of these models:
I used the following approaches to calculate time spent for barrier, crossover, finding the first feasible solution, and finding the optimal solution.
 Barrier: Elapsed time at the end of barrier method
 Crossover: Elapsed time at the end of crossover minus elapsed time at the end of barrier method
 Finding the first feasible solution: Elapsed time when first feasible solution was found (labeled with 'H' mark) minus elapsed time at the end of crossover
 Finding the optimal solution: Elapsed time when optimal solution was found minus elapsed time when first feasible solution was found
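In other words, the four metrics are simple differences of the elapsed-time stamps in the log. A minimal Python sketch, with hypothetical timestamp values standing in for the real log entries:

```python
# Hypothetical elapsed-time stamps (in seconds) read from a Gurobi log;
# these numbers are placeholders, not values from my actual runs.
t_barrier_end = 6600.0       # elapsed time at the end of the barrier method
t_crossover_end = 9800.0     # elapsed time at the end of crossover
t_first_incumbent = 12500.0  # elapsed time at the first incumbent ('H' line)
t_optimal = 30000.0          # elapsed time when the optimal solution is found

barrier_time = t_barrier_end
crossover_time = t_crossover_end - t_barrier_end
first_feasible_time = t_first_incumbent - t_crossover_end
optimal_time = t_optimal - t_first_incumbent

print(barrier_time, crossover_time, first_feasible_time, optimal_time)
```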
"" means in the above table that optimal solution could not be found. I was able to solve "baseline" model to optimality, but it spent so much time on crossover and for finding the optimal solution. Scaling objective coefficients (refer to Scaled (2)) and removing tiny values from coefficient matrix (refer to Scaled (1)) are incredibly fast in crossover and finding the first feasible solution, but they can not solve the model to optimality! Another note is that Scaled (3) is worse than Scaled (1) and Scaled (2). I could not understand the reason.
Okay, so Scaled (2) is our target model because it is the fastest up to finding the first feasible solution. I thought that if I could tweak some Gurobi parameters to find more and better feasible solutions, I could probably speed up the search for the optimal solution in Scaled (2). I tried several parameters, and none of them has worked so far. Let me summarize the parameters I tried and their results in the following table:
Note that the first row in the above table is the same as the Scaled (2) row in the first table. This is a comprehensive list of the parameters I have tried so far. My main aim is to devote more effort to finding more and better heuristic solutions so that the model is solved to optimality faster. As far as I can tell from the log files, the best bound (LP relaxation) in the B&B tree is very tight. Hence, I do not need to focus on moving it; rather, more time should be spent finding more and better feasible solutions.
Can you please help me make this model faster? I need your help and guidance. What other parameters should I try? It may not be related to Gurobi parameters at all; in that case, what do you suggest I do? In the next comment, I will provide the log files so that you can investigate.
Kind regards

Here are the log files for all models:
Baseline model: https://www.dropbox.com/s/duoaid6m1sn96sm/renewableopt.stdout?dl=0
Scaled (1): https://www.dropbox.com/s/jl3i74zpbcgvf9r/renewableopt.stdout?dl=0
Scaled (2): https://www.dropbox.com/s/5udziog0rikymzq/renewableopt.stdout?dl=0
Scaled (3): https://www.dropbox.com/s/gz73p9hsc6lxd77/renewableopt.stdout?dl=0
If you want to see the log file corresponding to any of the parameters tried in the second table, I can share it too.
Kind regards
Hi Fikri,
Thanks for the detailed explanation of the issue and attaching the log files.
Please see below a discussion of the relevant parameters that you might want to try. You can find more detailed explanations of these parameters in our Performance Tuning for Gurobi's Barrier Algorithm video presented at the 2021 Gurobi Digital Days.
 The barrier runtime is the first bottleneck in your model. In the best case, it takes around 110 minutes to finish. Your model after the default presolve has around 2.5 million rows and 1.3 million columns.
 Try Presolve=2 to see whether this helps further reduce the problem size. It would be great to check the model coefficient statistics after the presolve to ensure intensifying the presolve does not worsen the numerics.
 The baseline model's log file shows that the first iteration of the barrier takes ~4000 seconds (more than an hour), which is odd given that the number of nonzeros in the factored matrix only increases by a factor of around 7 (1.805e+08 / 2.695e+07), not a very large increase.
 Try using a more expensive ordering at the beginning by setting BarOrder=1.
 Try increasing the number of dense columns by decreasing the value of an integer internal parameter. You can find this parameter on slide 38 of the presentation linked above. Please be aware that increasing the number of dense columns might worsen the numerics.
 Since the aspect ratio of the presolved model (num_cols/num_rows) is less than 1 (~0.5), try forcing the barrier algorithm to solve the dual of the presolved model by setting PreDual=1.
 Although the barrier algorithm is very amenable to parallelization, using a large number of Threads can oversubscribe the machine and hence slow down the computation. Try experimenting with lower values for Threads.
 The barrier algorithm in all four attached logs encounters numerical issues and terminates suboptimally. Scaling the objective coefficients definitely helped reach smaller residuals, though still beyond the default BarConvTol value of 1e-8.
 Try setting the NumericFocus parameter to a value >= 2. This setting forces the barrier to use more conservative step sizes, which can improve the numerics.
 After the barrier terminates suboptimally, there are a large number of superbasic dual and primal variables that need to be pushed to their bounds. Further, there are warning messages such as "Warning: Markowitz tolerance tightened to X" indicating numerical instability.
 Try using a more numerically stable basis by setting CrossoverBasis=1.
Please consider the above as suggestions. You would need to experiment with them to see whether changing the default behaviour of any of the parameters above can improve the performance of the barrier.
After tuning the performance of the barrier algorithm, you can then try to use parameters such as Cuts=0, Heuristics, RINS, MIPFocus=1, and ConcurrentMIP to potentially find a better incumbent faster.
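If it is convenient, the barrier-related settings above can be collected in a Gurobi parameter file (readable via gurobi_cl or Model.read()). This is just a sketch; it is best to test these settings one at a time, since they interact:

```
# barrier_tuning.prm -- sketch of the settings discussed above;
# keep only the lines you actually want to test in a given run.
Method 2
Presolve 2
BarOrder 1
PreDual 1
NumericFocus 2
CrossoverBasis 1
```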
A few other points:
 The log file for the scaled (2) model shows that it was killed after 16 hours while the baseline model found the optimal solution after 25 hours. If you let the scaled (2) model run longer, you might reach the optimal solution before 25 hours.
 I did not fully understand how you scaled the constraint coefficients. Did you multiply some of the constraints or substitute some of the variables?
Please experiment with the parameters above and let us know how things progress.
Best regards,
Maliheh
Hi Maliheh,
Thank you so much for your detailed answer. I have already started experimenting with the parameters you suggested. I will report what I get here.
Regarding your questions;
 Some jobs were killed due to memory limit. I use 16 processors, and should not use more than that because other researchers in the HPC may want to run their own jobs. I am able to solve the "baseline" model in 25 hours with 16 processors. I would like to reduce the solution time when I still use 16 processors.
 In one constraint, I use capacity factors (generator variability for solar and wind resources). These numbers are between 0 and 1, and some of them are tiny, like 1e-6. I updated the capacity factor input file by manually replacing any number less than 0.01 with 0.01, so the minimum capacity factor should be 0.01. This is what I called "scaling" for this constraint. (Do not be confused that the lower bound of the matrix coefficient range is 7e-4 after this scaling; apparently, there are other input data on the order of 1e-4. In any case, with this scaling I was able to get rid of numbers like 1e-5 and 1e-6.)
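For reference, the replacement I did can be sketched like this (the values below are illustrative, not my actual input data):

```python
# Clamp tiny capacity factors to a floor so the constraint matrix
# contains no coefficients smaller than 0.01 (illustrative values).
capacity_factors = [0.0, 1e-6, 0.004, 0.35, 0.97]
floor = 0.01

clamped = [max(cf, floor) for cf in capacity_factors]
print(clamped)  # every value is now at least 0.01
```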
Kind regards
Hi again,
I have completed the experiments you suggested, except the one that requires increasing the number of dense columns; I could not figure out which parameter I should tweak. I ran all these experiments on the Scaled (2) model (where the objective is divided by 10,000). Below, you can see the summary table:
As this table shows, only one of these parameter combinations (Method = 2, PreDual = 1) was able to find the optimal solution. The total solution time is 17 hours, significantly less than the previous 25 hours. This is good news, even though I do not understand why this particular combination worked. The 0 (zero) under the column heading "Finding optimal solution" for Method = 2, PreDual = 1 means that the first feasible solution found was also the optimal solution; hence, no time was spent finding better feasible solutions or improving the best bound.
What I do not understand is why MIPFocus, ImproveStartTime, ImproveStartGap, etc. do not help find better feasible solutions. I need more and better feasible solutions, and I tweaked these parameters accordingly. However, they fail. For instance, I set Heuristics = 0.4 (the default is 0.05) and expected to see more feasible solutions, but the solver could not find even one, whereas with the default setting it was finding feasible solutions.
Could you please guide me here? Should I stop tweaking parameters and spend more time on scaling instead? What do you understand from the above table in general?
In the next message, I will share the log files.
Kind regards
I think I have already shared the log file of Method = 2. This is Scaled (2)'s log file included in the message I sent 14 days ago. Below, you will see the log files of each of the remaining experiments:
Method = 2, MIPFocus = 1: https://www.dropbox.com/s/db350zp393taaw6/renewableopt.stdout?dl=0
Method = 2, MIPFocus = 1, ImproveStartTime = 26000, ImproveStartGap = 0.99: https://www.dropbox.com/s/yzbgjll2memlgm5/renewableopt.stdout?dl=0
Method = 2, Heuristics = 0.3, NoRelHeurTime = 7200: https://www.dropbox.com/s/orhcc5exfpjegzw/renewableopt.stdout?dl=0
Method = 2, Presolve = 0, Aggregate = 0: https://www.dropbox.com/s/n55y2ddphooj157/renewableopt.stdout?dl=0
Method = 2, Cuts = 1: https://www.dropbox.com/s/q42b5j7ml810q60/renewableopt.stdout?dl=0
Method = 2, BarConvTol = 1e-9, FeasibilityTol = 1e-7, IntFeasTol = 1e-6, OptimalityTol = 1e-7: https://www.dropbox.com/s/93s5hazjved459c/renewableopt.stdout?dl=0
Method = 1: https://www.dropbox.com/s/fjcq13d48zldh04/renewableopt.stdout?dl=0
Method = 0: https://www.dropbox.com/s/npeia3o24jmj1kk/renewableopt.stdout?dl=0
Presolve = 2: https://www.dropbox.com/s/jejfzaaa531d6m7/renewableopt.stdout?dl=0
Method = 2, BarOrder = 1: https://www.dropbox.com/s/2yq0h02wbgr3pn2/renewableopt.stdout?dl=0
Method = 2, NumericFocus = 3: https://www.dropbox.com/s/1b6958phaw0vc6g/renewableopt.stdout?dl=0
Method = 2, Heuristics = 0.4: https://www.dropbox.com/s/i31kg6zs2jfkvyq/renewableopt.stdout?dl=0
Method = 2, CrossoverBasis = 1: https://www.dropbox.com/s/lii68yxc7ira3id/renewableopt.stdout?dl=0
Method = 2, RINS = 1: https://www.dropbox.com/s/yi6i00q5e57bune/renewableopt.stdout?dl=0
Method = 2, ConcurrentMIP = 4: https://www.dropbox.com/s/tj7j8q44vme9j8g/renewableopt.stdout?dl=0
Method = 2, Cuts = 0: https://www.dropbox.com/s/g484rdzw50d9yl6/renewableopt.stdout?dl=0
Method = 2, PreDual = 1: https://www.dropbox.com/s/z7zen7jbmt144xc/renewableopt.stdout?dl=0
Method = 2, Number of threads is 8: https://www.dropbox.com/s/tjqx14lonavwr0t/renewableopt.stdout?dl=0
Method = 2, ScaleFlag = 3: https://www.dropbox.com/s/rxpi9cnemchcq0w/renewableopt.stdout?dl=0
Hi Fikri,
Thank you for the detailed message and sharing all the experiments.
I have completed the experiments that you suggested, except the one which requires me to increase the number of dense columns. I could not figure out which parameter I should tweak.
Since it is an undocumented parameter, I will not write it here. However, if you look at time 44:01 in the video linked in my previous message, you will see the name of the parameter mentioned in the second bullet.
Here are few comments on the experiment results you shared:
 Presolve=2: The log file associated with this parameter seems mixed up with another log file (it seems there is information for two different presolve settings). The default presolve in the Scaled (2) model reduces the problem size to "2,555,779 rows, 1,331,529 columns, 11,274,371 nonzeros" (see the log file that you posted 14 days ago). If setting Presolve=2 further reduces the problem size, I would suggest sticking to Presolve=2. It is always useful to check the numerics of the presolved model by calling the printStats() method on the presolved model.
 Setting PreDual=1 to force solving the dual of the presolved model increases the number of nonzeros in the factored matrix by a large factor of 25 despite having a huge number of dense columns (~3500). As expected, this setting has the largest barrier + crossover run time among all the settings you have tried. However, the extra run time has paid off when exploring the search tree resulting in the shortest time to optimality overall. It might be worth trying the setting Method=2, PreDual=1, CrossoverBasis=1 plus Presolve=2 in case you decide to use Presolve=2 based on the discussion in bullet # 1. It would be great to test the winning setting with different seed values to evaluate how robust this setting is.
 We still see the message "Suboptimal termination" for the barrier algorithm in all the settings you have tried, which leads to spending a large amount of time in crossover. This message means that the barrier cannot reach the convergence threshold (default 1e-8) due to numerical issues. We also see signs of numerical issues (likely parallel rows/columns) in crossover through messages such as "Warning: Markowitz tolerance tightened to 0.5". One idea is to loosen the BarConvTol (i.e., increase it to 1e-6, for example). However, this might lead to spending even more time in crossover. A better approach, which answers your last question about what to focus on next, is to spend some time addressing the numerical issues. Some general ideas to consider are:
 Avoid very small and very large values in the model
 Use reasonable units for the variables (for example, it might be sufficient to represent electricity production in megawatt-hours instead of kilowatt-hours).
 Avoid using big-M formulations in constraints; use indicator and SOS constraints instead.
 Use double precision and avoid rounding the coefficients
 Since you have access to a machine with 16 cores, it might be useful to test the winning setting with ConcurrentMIP=2 and Threads=16 as well. This will assign 8 threads to each of the two independent solves and will use different values for the Seed and MIPFocus parameters.
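As a sketch, the combined setting from the bullets above could be written as a parameter file (the ConcurrentMIP/Threads lines are only for the 16-core concurrent test):

```
# winning_setting.prm -- sketch of the combined setting discussed above
Method 2
PreDual 1
CrossoverBasis 1
Presolve 2
# optional, only for the 16-core concurrent test:
ConcurrentMIP 2
Threads 16
```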
What I do not understand is why MIPFocus, ImproveStartTime, ImproveStartGap, etc. do not work to find better feasible solutions. I need more and better feasible solutions and tweak these parameters accordingly.
These are just parameters that trigger a higher utilization of heuristic approaches. Spending more time on using heuristics does not guarantee finding a better incumbent. Since your model is big with ~5.5 million constraints, ~2 million variables, and ~18 million nonzeros and the relaxation is very expensive, I am not very surprised at this. The NoRel heuristic in Gurobi does not rely on the relaxation, however, your table shows that the NoRel heuristic also did not help.
In summary, I think spending some time on improving the model to address the numerical issues is likely the best next step.
Best regards,
Maliheh
Hi Maliheh,
Thank you so much for your answer.
 I am sorry for the confusion around the Presolve log files. Yes, I tried Presolve = 0 and Presolve = 2 at different times; that is probably why you were confused. If you open the log file shared next to Presolve = 2 (the 9th link in my previous message), you will see the corresponding results. Presolve = 2 results in "1,942,475 rows, 1,331,327 columns, 9,858,430 nonzeros", which is a significant reduction compared to the original size of "2,555,779 rows, 1,331,529 columns, 11,274,371 nonzeros". I will stick with Presolve = 2.
 I will try Method = 2, PreDual = 1, CrossoverBasis = 1, and Presolve = 2. Just a question: how do I try different seeds? In your bullet point #3, you mention ConcurrentMIP. Is that what you were referring to when you said "It would be great to test the winning setting with different seed values to evaluate how robust this setting is"?
 I will try the BarConvTol parameter with a higher value. Also, I will keep your other suggestions in mind.
You were so helpful. Thank you so much for your patience and support.
Kind regards
Hi Fikri,
How do I try different seeds?
There is a parameter named Seed in Gurobi that you can use like any other parameter. The default value is 0.
You are very welcome!
Best regards,
Maliheh
Okay, thank you so much.
Kind regards
Hi Maliheh,
I have a quick (and probably very beginner-level) question. You mention that I need to "Use double precision and avoid rounding the coefficients".
 I have no idea what precision I am using in the default setting. Is there a way to figure this out?
 How will I use double precision, if I use something else in the default setting?
 I did not quite understand the way of avoiding rounding the coefficients (are you referring to what I did above in Scaled (1) model? I replaced all values less than 0.01 in matrix coefficient with 0.01. Is this what you refer as rounding?).
I am not a computer scientist, and hence I have difficulty understanding the importance of precision here.
I look forward to hearing from you.
Kind regards
Okay, I think I have an answer for the first and the second questions. In the link below, it is stated that Gurobi prefers double precision.
https://www.gurobi.com/documentation/9.5/refman/real_numbers_are_not_real.html
So, the default is double precision. However, the algorithm may sometimes switch to quad precision, as implied by the warning "switch to quad precision". Hence, unless I see such a warning, Gurobi always uses double precision. Am I right?
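For what it's worth, a quick Python check (standard library only, nothing Gurobi-specific) confirms the double-precision properties mentioned in that link:

```python
import sys

# IEEE 754 double precision: 53-bit significand, 15 guaranteed decimal
# digits, and a machine epsilon of about 2.22e-16.
mantissa_bits = sys.float_info.mant_dig
decimal_digits = sys.float_info.dig
machine_epsilon = sys.float_info.epsilon

print(mantissa_bits, decimal_digits, machine_epsilon)
```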
I still need a clarification about the third bullet point in my previous message.
Kind regards
Hi Fikri,
Yes, you are right. Almost all modern computers use double precision (64 bits) nowadays. The Quad parameter is specific to the simplex algorithm, and the warning message "switch to quad precision" implies that the basis matrix became near-singular due to bad numerics, so the solver switches to a higher precision.
I did not quite understand the way of avoiding rounding the coefficients (are you referring to what I did above in Scaled (1) model? I replaced all values less than 0.01 in matrix coefficient with 0.01. Is this what you refer as rounding?).
No, I was not referring to the replacement you did. Check the "Avoid rounding of input" section in our numerical guidelines; there is an example that will hopefully make the point clear.
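To illustrate with a toy example of my own (the numbers are made up, not taken from your model): rounding a coefficient before passing it to the solver can introduce a residual far larger than the feasibility tolerance when variable values are large:

```python
# A variable value of realistic magnitude and a constraint whose
# right-hand side was built from the full-precision coefficient 1/3.
x = 3000000.0
exact_rhs = (1.0 / 3.0) * x

# The same coefficient rounded to five decimal digits before input.
rounded_coeff = 0.33333
residual = abs(rounded_coeff * x - exact_rhs)

print(residual)  # roughly 10, far above the default FeasibilityTol of 1e-6
```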
Best regards,
Maliheh
Okay, thank you.
Kind regards