How to make the search over B&B tree faster?

Answered

13 comments

  • Fikri Kucuksayacigil

    Here are the log files for all models:

    Baseline model: https://www.dropbox.com/s/duoaid6m1sn96sm/renewableopt.stdout?dl=0

    Scaled (1): https://www.dropbox.com/s/jl3i74zpbcgvf9r/renewableopt.stdout?dl=0

    Scaled (2): https://www.dropbox.com/s/5udziog0rikymzq/renewableopt.stdout?dl=0

    Scaled (3): https://www.dropbox.com/s/gz73p9hsc6lxd77/renewableopt.stdout?dl=0

    If you want to see any log file which corresponds to one of the tried parameters in the second table, I can share them too.

    Kind regards

  • Maliheh Aramon
    Gurobi Staff

    Hi Fikri, 

    Thanks for the detailed explanation of the issue and for attaching the log files.

    Please see below a discussion of the relevant parameters that you might want to try. You can find more detailed explanations of these parameters in our Performance Tuning for Gurobi's Barrier Algorithm video presented at the 2021 Gurobi Digital Days.

    • The barrier runtime is the first bottleneck in your model. In the best case, it takes around 110 minutes to finish. After the default presolve, your model has around 2.5 million rows and 1.3 million columns.
    • The baseline model's log file shows that the first iteration of the barrier takes ~4000 seconds (more than an hour), which is odd given that the number of non-zeros in the factored matrix increases only by a factor of about 7 (1.805e+08/2.695e+07), not a very large increase.
      • Try using a more expensive ordering at the beginning by setting BarOrder=1.
      • Try increasing the number of dense columns by decreasing the value of an integer internal parameter. You can find this parameter on slide 38 of the presentation linked above. Please be aware that increasing the number of dense columns might worsen the numerics.
      • Since the aspect ratio of the presolved model (num_cols/num_rows) is less than 1 (~0.5), try forcing the barrier algorithm to solve the dual of the presolved model by setting PreDual=1.
      • Although the barrier algorithm is very amenable to parallelization, using a large number of Threads can oversubscribe the machine and hence slow down the computation. Try experimenting with lower values for Threads.
    • The barrier algorithm in all four attached logs encounters numerical issues and terminates sub-optimally. Scaling the objective coefficients definitely helped the barrier reach smaller tolerances, though still above the default BarConvTol value of 1e-8.
      • Try setting the NumericFocus parameter to a value >= 2. This setting forces the barrier to use more conservative step sizes, which can improve the numerics. 
    • After the barrier terminates sub-optimally, there are a large number of super-basic dual and primal variables that need to be pushed to their bounds. Further, there are warning messages such as "Warning: Markowitz tolerance tightened to X" indicating numerical instability.

    Please consider the above as suggestions. You would need to experiment with them to see whether changing the default behaviour of any of the parameters above can improve the performance of the barrier.

    After tuning the performance of the barrier algorithm, you can then try to use parameters such as Cuts=0, Heuristics, RINS, MIPFocus=1, and ConcurrentMIP to potentially find a better incumbent faster.
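    A minimal gurobipy sketch of how such parameters could be set is below; the file name is a placeholder (not from this thread) and the specific values are only illustrative, not a recommended configuration:

    import gurobipy as gp

    # Read the model from file (placeholder name)
    model = gp.read("renewableopt.mps")

    # Barrier-related settings discussed above (values are illustrative)
    model.Params.Method = 2        # use the barrier algorithm for the root relaxation
    model.Params.BarOrder = 1      # more expensive (nested dissection) ordering
    model.Params.PreDual = 1       # solve the dual of the presolved model
    model.Params.NumericFocus = 2  # more conservative barrier step sizes
    model.Params.Threads = 8       # fewer threads to avoid oversubscribing the machine

    # MIP-related settings to experiment with after the barrier is tuned
    model.Params.MIPFocus = 1      # focus on finding feasible solutions
    model.Params.Cuts = 0          # turn off cutting planes

    model.optimize()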

     A few other points:

    • The log file for the scaled (2) model shows that it was killed after 16 hours, while the baseline model found the optimal solution after 25 hours. If you let the scaled (2) model run longer, it might reach the optimal solution in less than 25 hours.
    • I did not fully understand how you scaled the constraint coefficients. Did you multiply some of the constraints or substitute some of the variables? 

    Please experiment with the parameters above and let us know how things progress.

    Best regards,

    Maliheh
  • Fikri Kucuksayacigil

    Hi Maliheh,

    Thank you so much for your detailed answer. I have already started experimenting with the parameters you suggested and will report back here with what I get.

    Regarding your questions:

    • Some jobs were killed due to the memory limit. I use 16 processors and should not use more than that because other researchers on the HPC cluster may want to run their own jobs. I am able to solve the "baseline" model in 25 hours with 16 processors, and I would like to reduce the solution time while still using 16 processors.
    • In one constraint, I use capacity factors (generator variability for solar and wind resources). These numbers are between 0 and 1, and some of them are tiny, like 1e-6. I updated the capacity factor input data file by manually replacing any number less than 0.01 with 0.01, so the minimum capacity factor should be 0.01. This is what I called "scaling" for this constraint; a small sketch of this replacement is shown below. (Do not be confused by the fact that the lower bound of the matrix coefficient range is still 7e-4 after this change; apparently there are other input data on the order of 1e-4. In any case, with this change I was able to get rid of numbers like 1e-5 and 1e-6.)
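    For clarity, here is a minimal sketch of the replacement described above; the function name and data layout are hypothetical and only illustrate the idea:

    def floor_capacity_factors(factors, floor=0.01):
        # Replace any capacity factor below `floor` with `floor`
        return [max(f, floor) for f in factors]

    raw = [0.85, 1e-6, 0.42, 5e-5]
    print(floor_capacity_factors(raw))  # [0.85, 0.01, 0.42, 0.01]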

    Kind regards

  • Fikri Kucuksayacigil

    Hi again,

    I have completed the experiments that you suggested, except the one that requires increasing the number of dense columns; I could not figure out which parameter I should tweak. I ran all these experiments on the Scaled (2) model (where the objective is divided by 10,000). Below, you can see the summary table:

    As this table shows, only one of these parameter combinations (Method = 2, PreDual = 1) was able to find the optimal solution. The total solution time is 17 hours, which is significantly less than the previous 25 hours. This is good news, even though I do not understand why this parameter combination worked. The 0 (zero) under the column heading "Finding optimal solution" for Method = 2, PreDual = 1 means that the first feasible solution found was also the optimal solution; hence, no time was spent finding better feasible solutions or improving the best bound.

    What I do not understand is why MIPFocus, ImproveStartTime, ImproveStartGap, etc. do not help find better feasible solutions. I need more and better feasible solutions, so I tweak these parameters accordingly, but they fail. For instance, I set Heuristics = 0.4 (default is 0.05) and expected to see more feasible solutions, but the solver could not find even one. With the default setting, it was finding feasible solutions.

    Could you please guide me here? Should I stop tweaking parameters and spend more time on scaling instead? What do you conclude from the table above in general?

    In the next message, I will share the log files.

    Kind regards

  • Fikri Kucuksayacigil

    I think I have already shared the log file for Method = 2; it is the Scaled (2) log file included in the message I sent 14 days ago. Below are the log files for each of the remaining experiments:

    Method = 2, MIPFocus = 1: https://www.dropbox.com/s/db350zp393taaw6/renewableopt.stdout?dl=0

    Method = 2, MIPFocus = 1, ImproveStartTime = 26000, ImproveStartGap = 0.99: https://www.dropbox.com/s/yzbgjll2memlgm5/renewableopt.stdout?dl=0

    Method = 2, Heuristics = 0.3, NoRelHeurTime = 7200: https://www.dropbox.com/s/orhcc5exfpjegzw/renewableopt.stdout?dl=0

    Method = 2, Presolve = 0, Aggregate = 0: https://www.dropbox.com/s/n55y2ddphooj157/renewableopt.stdout?dl=0

    Method = 2, Cuts = 1: https://www.dropbox.com/s/q42b5j7ml810q60/renewableopt.stdout?dl=0

    Method = 2, BarConvTol = 1e-9, FeasibilityTol = 1e-7, IntFeasTol = 1e-6, OptimalityTol = 1e-7: https://www.dropbox.com/s/93s5hazjved459c/renewableopt.stdout?dl=0

    Method = 1: https://www.dropbox.com/s/fjcq13d48zldh04/renewableopt.stdout?dl=0

    Method = 0: https://www.dropbox.com/s/npeia3o24jmj1kk/renewableopt.stdout?dl=0

    Presolve = 2: https://www.dropbox.com/s/jejfzaaa531d6m7/renewableopt.stdout?dl=0

    Method = 2, BarOrder = 1: https://www.dropbox.com/s/2yq0h02wbgr3pn2/renewableopt.stdout?dl=0

    Method = 2, NumericFocus = 3: https://www.dropbox.com/s/1b6958phaw0vc6g/renewableopt.stdout?dl=0

    Method = 2, Heuristics = 0.4: https://www.dropbox.com/s/i31kg6zs2jfkvyq/renewableopt.stdout?dl=0

    Method = 2, CrossoverBasis = 1: https://www.dropbox.com/s/lii68yxc7ira3id/renewableopt.stdout?dl=0

    Method = 2, RINS = 1: https://www.dropbox.com/s/yi6i00q5e57bune/renewableopt.stdout?dl=0

    Method = 2, ConcurrentMIP = 4: https://www.dropbox.com/s/tj7j8q44vme9j8g/renewableopt.stdout?dl=0

    Method = 2, Cuts = 0: https://www.dropbox.com/s/g484rdzw50d9yl6/renewableopt.stdout?dl=0

    Method = 2, PreDual = 1: https://www.dropbox.com/s/z7zen7jbmt144xc/renewableopt.stdout?dl=0

    Method = 2, Threads = 8: https://www.dropbox.com/s/tjqx14lonavwr0t/renewableopt.stdout?dl=0

    Method = 2, ScaleFlag = 3: https://www.dropbox.com/s/rxpi9cnemchcq0w/renewableopt.stdout?dl=0

  • Maliheh Aramon
    Gurobi Staff

    Hi Fikri, 

    Thank you for the detailed message and for sharing all the experiments.

     I have completed the experiments that you suggested, except the one which requires me to increase the number of dense columns. I could not figure out which parameter I should tweak.

    Since it is an undocumented parameter, I will not write it here. However, if you look at time 44:01 in the video linked in my previous message, you will see the name of the parameter mentioned in the second bullet. 

    Here are a few comments on the experiment results you shared:

    1. Presolve=2: The log file associated with this parameter seems mixed up with another log file (it seems to contain information for two different presolve settings). The default presolve in the Scaled (2) model reduces the problem size to "2,555,779 rows, 1,331,529 columns, 11,274,371 nonzeros" (see the log file that you posted 14 days ago). If setting Presolve=2 further reduces the problem size, I would suggest sticking with Presolve=2. It is always useful to check the numerics of the presolved model by calling the printStats() method on the presolved model (see the sketch after this list).
    2. Setting PreDual=1 to force solving the dual of the presolved model increases the number of non-zeros in the factored matrix by a large factor of 25, despite having a huge number of dense columns (~3500). As expected, this setting has the largest barrier + crossover runtime among all the settings you have tried. However, the extra runtime has paid off when exploring the search tree, resulting in the shortest time to optimality overall. It might be worth trying the setting Method=2, PreDual=1, CrossoverBasis=1 plus Presolve=2 in case you decide to use Presolve=2 based on the discussion in bullet #1. It would also be great to test the winning setting with different seed values to evaluate how robust it is.
    3. We still see the message "Sub-optimal termination" for the barrier algorithm for all the settings you have tried, which leads to a large amount of time being spent in crossover. This message means that the barrier cannot reach the convergence threshold (default 1e-8) due to numerical issues. We also see signs of numerical issues (likely parallel rows/columns) in crossover through messages such as "Warning: Markowitz tolerance tightened to 0.5". One idea is to loosen BarConvTol (for example, increase it to 1e-6). However, this might lead to spending even more time in crossover. A better approach, which answers your last question about what to focus on next, is to spend some time addressing the numerical issues. Some general ideas to consider are:
      • Avoid very small and very large values in the model
      • Use reasonable units for the variables (for example, it might be sufficient to express the electricity produced in megawatt-hours instead of kilowatt-hours).
      • Avoid using Big-M in constraints and formulate them using indicator and SOS constraints
      • Use double precision and avoid rounding the coefficients
    4. Since you have access to a machine with 16 cores, it might be useful to test the winning setting with ConcurrentMIP=2 and Threads=16 as well. This will assign 8 threads to each of the two independent solves and will use different values for the Seed and MIPFocus parameters.
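    As a reference for bullets 1, 2, and 4, here is a minimal gurobipy sketch of checking the numerics of the presolved model and running the candidate winning setting; the file name is a placeholder and the values simply mirror the suggestions above:

    import gurobipy as gp

    model = gp.read("renewableopt.mps")   # placeholder file name
    model.Params.Presolve = 2             # aggressive presolve (bullet 1)

    # Inspect the statistics (including coefficient ranges) of the presolved model
    presolved = model.presolve()
    presolved.printStats()

    # Candidate winning setting (bullet 2); also test with several Seed values
    model.Params.Method = 2
    model.Params.PreDual = 1
    model.Params.CrossoverBasis = 1

    # Optionally, two concurrent MIP solves with 8 threads each (bullet 4)
    model.Params.ConcurrentMIP = 2
    model.Params.Threads = 16

    model.optimize()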

    What I do not understand is why MIPFocus, ImproveStartTime, ImproveStartGap, etc. do not work to find better feasible solutions. I need more and better feasible solutions and tweak these parameters accordingly.

    These are just parameters that trigger a higher utilization of heuristic approaches. Spending more time on heuristics does not guarantee finding a better incumbent. Since your model is big, with ~5.5 million constraints, ~2 million variables, and ~18 million non-zeros, and the relaxation is very expensive, I am not very surprised by this. The NoRel heuristic in Gurobi does not rely on the relaxation; however, your table shows that the NoRel heuristic also did not help.

    In summary, I think spending some time on improving the model to address the numerical issues is likely the best next step. 

    Best regards,

    Maliheh
  • Fikri Kucuksayacigil

    Hi Maliheh,

    Thank you so much for your answer.

    1. I am sorry for the confusion about the Presolve log files. Yes, I tried Presolve = 0 and Presolve = 2 at different times; that is probably why you were confused. If you open the log file shared next to Presolve = 2 (the 9th link in my previous message), you will see the corresponding results. Presolve = 2 results in "1,942,475 rows, 1,331,327 columns, 9,858,430 nonzeros", which significantly reduces the problem size compared to the original "2,555,779 rows, 1,331,529 columns, 11,274,371 nonzeros". I will stick with Presolve = 2.
    2. I will try Method = 2, PreDual = 1, CrossoverBasis = 1, and Presolve = 2. Just a question: how do I try different seeds? In your bullet point #4, you mention ConcurrentMIP. Is that what you were referring to when you said "It would be great to test the winning setting with different seed values to evaluate how robust this setting is"?
    3. I will try the BarConvTol parameter with a looser (larger) value. I will also keep your other suggestions in mind.

    You were so helpful. Thank you so much for your patience and support.

    Kind regards

  • Maliheh Aramon
    Gurobi Staff

    Hi Fikri, 

    How do I try different seeds?

    There is a parameter named Seed in Gurobi that you can use like any other parameter. The default value is 0. 
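    For example, a small sketch of re-running a fixed setting with a few different seed values (the file name is a placeholder):

    import gurobipy as gp

    for seed in (0, 1, 2):
        m = gp.read("renewableopt.mps")   # placeholder file name
        m.Params.Seed = seed              # vary the random seed
        m.Params.Method = 2
        m.Params.PreDual = 1
        m.optimize()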

    You are very welcome!

    Best regards,

    Maliheh
  • Fikri Kucuksayacigil

    Okay, thank you so much.

    Kind regards

  • Fikri Kucuksayacigil

    Hi Maliheh,

    I have a quick (and probably very beginner-level) question. You mentioned that I need to "Use double precision and avoid rounding the coefficients".

    • I have no idea what precision I am using in the default setting. Is there a way to figure this out?
    • How would I switch to double precision if the default setting uses something else?
    • I did not quite understand how to avoid rounding the coefficients (are you referring to what I did above in the Scaled (1) model? I replaced all matrix coefficient values less than 0.01 with 0.01. Is this what you mean by rounding?).

    I am not a computer scientist, and hence I have difficulty understanding the importance of precision here.

    I look forward to hearing from you.

    Kind regards

  • Fikri Kucuksayacigil

    Okay, I think I have an answer to the first and second questions. The link below states that Gurobi prefers double precision.

    https://www.gurobi.com/documentation/9.5/refman/real_numbers_are_not_real.html

    So, the default is double precision. However, the algorithm may sometimes switch to quad precision, as indicated by the warning "switch to quad precision". Hence, unless I see such a warning, Gurobi always uses double precision. Am I right?

    I still need a clarification about the third bullet point in my previous message.

    Kind regards

  • Maliheh Aramon
    Gurobi Staff

    Hi Fikri,

    Yes, you are right. Almost all modern computers use double precision (64 bits) nowadays. The Quad parameter is specific to the simplex algorithm, and the warning message "switch to quad precision" means that the solver encountered a near-singular basis matrix due to bad numerics and is switching to a higher precision.

    I did not quite understand the way of avoiding rounding the coefficients (are you referring to what I did above in Scaled (1) model? I replaced all values less than 0.01 in matrix coefficient with 0.01. Is this what you refer as rounding?).

    No, I was not referring to the replacement you did. Check the "Avoid rounding of input" section in the numerical guidelines. There is an example there that will hopefully make the point clear.
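    To paraphrase the kind of example given in that section (this is an illustration, not the exact example from the guidelines): if a coefficient is really 1/3, entering a truncated value such as 0.333 introduces an error, whereas rescaling the constraint can make all coefficients exact:

    import gurobipy as gp

    m = gp.Model()
    x = m.addVar(name="x")
    y = m.addVar(name="y")

    # Rounded input (to avoid): 1/3 truncated to a few digits
    # m.addConstr(x + 0.333 * y <= 1)

    # Better: multiply the constraint by 3 so that all coefficients are exact
    m.addConstr(3 * x + y <= 3)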

    Best regards,

    Maliheh
  • Fikri Kucuksayacigil

    Okay, thank you.

    Kind regards

