
Varying Ordering Time for similar models

Answered

9 comments

  • Jaromił Najman
    Gurobi Staff

    Hi Jacob,

    Even small changes to a model can change the optimization path significantly.

    EDIT: I re-ran it, and now the ordering times switched (model B < model A), so is the ordering time in fact somewhat arbitrary?

    Can you reproduce this behavior consistently, i.e., can you reproduce a run such that ordering time is model B < model A and vice versa? Did you change any parameter settings or the machine?

    Did you try experimenting with the BarOrder parameter and setting it to a fixed value for both models?
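
    For example, fixing the ordering algorithm for both models could look roughly like the following gurobipy sketch (the file names are placeholders):

    import gurobipy as gp

    # Minimal sketch: force the same ordering algorithm for both models.
    # BarOrder: -1 = automatic (default), 0 = Approximate Minimum Degree,
    # 1 = Nested Dissection. The file names are placeholders.
    for filename in ("model_A.mps", "model_B.mps"):
        m = gp.read(filename)
        m.Params.Method = 2    # solve with the barrier algorithm
        m.Params.BarOrder = 1  # fix the ordering algorithm for a fair comparison
        m.optimize()
        print(filename, "runtime:", m.Runtime, "s")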

    Is it possible for you to share the two models? Note that uploading files in the Community Forum is not possible but we discuss an alternative in Posting to the Community Forum.

    Best regards, 
    Jaromił

  • Jacob Mannhardt

    Hi Jaromił,

    once again thanks for your fast and comprehensive help! As always a pleasure to talk to you!

    Interestingly, it is mostly, but not always, model A < model B; however, different runs of the same model result in different ordering times (e.g., from ~4% of the solution time to ~26%).

    Could you explain a bit (or point me to an explanation of) what the ordering does? If I understand the BarOrder documentation correctly, it is about fill-in in sparse matrices, correct? I haven't found anything else during my Google search. Can I see or check which BarOrder parameter was used? Is one method faster or slower than the other?

    Thanks a lot again for your help!

    Best,
    Jacob

  • Jaromił Najman
    Gurobi Staff

    Hi Jacob,

    once again thanks for your fast and comprehensive help! As always a pleasure to talk to you!

    Thank you for the kind words.

    Could you explain a bit (or point me to an explanation of) what the ordering does? If I understand the BarOrder documentation correctly, it is about fill-in in sparse matrices, correct? I haven't found anything else during my Google search.

    The goal of a barrier ordering algorithm is to permute the rows in order to reduce the fill-in needed for the Cholesky factorization. In other words, a good ordering can reduce the overall time spent in matrix computations. Gurobi uses Approximate Minimum Degree (AMD) ordering or Nested Dissection (ND) ordering. There are many papers on both algorithms, but even Wikipedia gives quite acceptable descriptions of AMD and ND.
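
    The effect can be illustrated outside of Gurobi with SciPy's sparse LU, which exposes a COLAMD column ordering (an approximate-minimum-degree-style method). This is only a rough sketch, LU instead of Cholesky and not Gurobi's internal code, but the fill-in effect is the same idea:

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import splu

    # "Arrow" matrix: dense first row and column, diagonal elsewhere.
    n = 200
    A = sp.lil_matrix((n, n))
    A.setdiag(np.full(n, 10.0))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A = A.tocsc()

    # Natural ordering eliminates the dense row/column first -> massive fill-in.
    lu_nat = splu(A, permc_spec="NATURAL")
    # COLAMD (approximate-minimum-degree style, similar in spirit to AMD)
    # moves the dense node towards the end and avoids almost all fill-in.
    lu_amd = splu(A, permc_spec="COLAMD")

    print("nonzeros in A            :", A.nnz)
    print("factor nonzeros (natural):", lu_nat.L.nnz + lu_nat.U.nnz)
    print("factor nonzeros (COLAMD) :", lu_amd.L.nnz + lu_amd.U.nnz)

    The larger the gap between the two factor nonzero counts, the more the choice of ordering matters for the factorization time.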

    Can I see or check which BarOrder parameter was used? Is one method faster or slower than the other?

    It is not possible to see which algorithm was used, and it can be very hard to say a priori which algorithm will be faster for a given model. Usually, Gurobi chooses a fitting algorithm automatically, but depending on the model's complexity, this choice can be improved. Thus, it definitely makes sense to experiment with the BarOrder parameter if ordering times are surprisingly long.

    Interestingly, it is mostly, but not always, model A < model B; however, different runs of the same model result in different ordering times (e.g., from ~4% of the solution time to ~26%).

    Do you mean that when you run model A on the same machine without changing any parameters, the ordering time differs? Is something running in the background? Can you consistently reproduce this non-determinism? Is the rest of the optimization path the same?

    Best regards, 
    Jaromił

  • Jacob Mannhardt

    Dear Jaromił,

    Thanks for the explanations, I will look into that.

    It turns out that the ordering times for the same model with the same parameters and settings vary significantly. I started four identical optimizations of the same model:

    Model 1:

    Presolve removed 9001776 rows and 7151373 columns (presolve time = 7s) ...
    Presolve removed 9066936 rows and 7190373 columns (presolve time = 11s) ...
    Presolve removed 12051531 rows and 10174968 columns (presolve time = 15s) ...
    Presolve removed 12053031 rows and 10176468 columns (presolve time = 21s) ...
    Presolve removed 12053031 rows and 10176468 columns (presolve time = 25s) ...
    Presolve removed 12113163 rows and 10247279 columns (presolve time = 30s) ...
    Presolve removed 12113949 rows and 10247633 columns
    Presolve time: 34.39s
    Presolved: 7288417 rows, 5227012 columns, 24079584 nonzeros
    Elapsed ordering time = 5s
    Elapsed ordering time = 10s
    Elapsed ordering time = 15s
    Elapsed ordering time = 72s
    Elapsed ordering time = 75s
    Elapsed ordering time = 80s
    Elapsed ordering time = 85s
    Elapsed ordering time = 90s
    Elapsed ordering time = 95s
    Elapsed ordering time = 100s
    Ordering time: 502.78s

    Barrier statistics:
     Dense cols : 1057
     AA' NZ     : 1.435e+08
     Factor NZ  : 3.365e+09 (roughly 30.0 GB of memory)
     Factor Ops : 2.413e+13 (roughly 30 seconds per iteration)
     Threads    : 64

    Model 2:

    Presolve removed 8939746 rows and 7130603 columns (presolve time = 6s) ...
    Presolve removed 9002806 rows and 7169603 columns (presolve time = 11s) ...
    Presolve removed 11988901 rows and 10155698 columns (presolve time = 15s) ...
    Presolve removed 11988901 rows and 10155698 columns (presolve time = 20s) ...
    Presolve removed 11988901 rows and 10155698 columns (presolve time = 25s) ...
    Presolve removed 12049343 rows and 10226800 columns (presolve time = 31s) ...
    Presolve removed 12049933 rows and 10226866 columns
    Presolve time: 33.06s
    Presolved: 7117233 rows, 5130179 columns, 23542730 nonzeros
    Elapsed ordering time = 5s
    Elapsed ordering time = 10s
    Elapsed ordering time = 15s
    ...
    Elapsed ordering time = 2875s
    Elapsed ordering time = 2880s
    Elapsed ordering time = 2885s
    Ordering time: 2891.83s

    Barrier statistics:
     Dense cols : 194
     AA' NZ     : 2.691e+09
     Factor NZ  : 6.589e+09 (roughly 60.0 GB of memory)
     Factor Ops : 2.604e+13 (roughly 40 seconds per iteration)
     Threads    : 64

    Model 3:

    Presolve removed 8960473 rows and 7137068 columns (presolve time = 6s) ...
    Presolve removed 9024253 rows and 7176068 columns (presolve time = 11s) ...
    Presolve removed 12010348 rows and 10162163 columns (presolve time = 16s) ...
    Presolve removed 12010348 rows and 10162163 columns (presolve time = 21s) ...
    Presolve removed 12010348 rows and 10162163 columns (presolve time = 25s) ...
    Presolve removed 12070483 rows and 10233262 columns (presolve time = 31s) ...
    Presolve removed 12071069 rows and 10233328 columns
    Presolve time: 33.09s
    Presolved: 7176737 rows, 5164037 columns, 23729079 nonzeros
    Elapsed ordering time = 5s
    Elapsed ordering time = 10s
    Elapsed ordering time = 15s
    ...
    Elapsed ordering time = 2415s
    Elapsed ordering time = 2420s
    Elapsed ordering time = 2425s
    Elapsed ordering time = 2430s
    Elapsed ordering time = 2435s
    Ordering time: 2476.85s

    Barrier statistics:
     Dense cols : 1591
     AA' NZ     : 2.048e+09
     Factor NZ  : 5.339e+09 (roughly 50.0 GB of memory)
     Factor Ops : 2.257e+13 (roughly 30 seconds per iteration)
     Threads    : 64

    Model 4:

    Presolve removed 9012673 rows and 7153926 columns (presolve time = 6s) ...
    Presolve removed 9078253 rows and 7192926 columns (presolve time = 11s) ...
    Presolve removed 12064348 rows and 10179021 columns (presolve time = 16s) ...
    Presolve removed 12064348 rows and 10179021 columns (presolve time = 21s) ...
    Presolve removed 12064348 rows and 10179021 columns (presolve time = 26s) ...
    Presolve removed 12124820 rows and 10250120 columns (presolve time = 30s) ...
    Presolve removed 12125422 rows and 10250186 columns
    Presolve time: 32.44s
    Presolved: 7323984 rows, 5247979 columns, 24191215 nonzeros
    Elapsed ordering time = 5s
    Elapsed ordering time = 10s
    Elapsed ordering time = 15s
    ...
    Elapsed ordering time = 3170s
    Elapsed ordering time = 3175s
    Elapsed ordering time = 3180s
    Ordering time: 3184.60s

    Barrier statistics:
     Dense cols : 29
     AA' NZ     : 2.902e+09
     Factor NZ  : 6.801e+09 (roughly 60.0 GB of memory)
     Factor Ops : 2.587e+13 (roughly 30 seconds per iteration)
     Threads    : 64

    I run the models on a High Performance Cluster with Slurm Resource Allocation, but requested the same CPU and Memory. Because of Slurm, I cannot further control the execution of the code.

    Do you have an idea what could explain this behavior? I will try the same again, once with BarOrder = 0, and once with BarOrder = 1.

    EDIT: Looking at it now, it seems like the number of dense columns (Dense cols) varies significantly. Can that explain the behavior?
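
    For the BarOrder runs mentioned above, I plan to submit one Slurm job per setting with something like the following sketch (the model file name is a placeholder; the BarOrder value is passed on the command line):

    import sys
    import gurobipy as gp

    # Minimal sketch: each Slurm job tests one BarOrder value on an otherwise
    # identical run. The model file name is a placeholder.
    barorder = int(sys.argv[1])  # e.g. 0 or 1

    m = gp.read("model.mps")
    m.Params.Method = 2          # barrier
    m.Params.BarOrder = barorder
    m.Params.LogFile = f"barorder_{barorder}.log"  # keep the full log, incl. ordering time
    m.optimize()
    print(f"BarOrder={barorder}: runtime {m.Runtime:.1f}s")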

    Many thanks and best wishes,

    Jacob

  • Jaromił Najman
    Gurobi Staff

    Do you have an idea what could explain this behavior?

    Could you also show the first lines of the log files, in particular the fingerprint values? If the fingerprint values differ, then something is different.

    I run the models on a High Performance Cluster with Slurm Resource Allocation, but requested the same CPU and Memory. Because of Slurm, I cannot further control the execution of the code.

    Could you try reproducing the behavior on one particular machine, maybe your own, if it's possible?

  • Jacob Mannhardt

    You are right, the fingerprints are different. Now that you mention it: we use a time series aggregation scheme, so the models are not 100% identical. The aggregation works well enough that the solutions lie within a band of +/-2%, but the models themselves are not identical.

    However, I did make an observation regarding the different BarOrder values: when selecting BarOrder = 0, the ordering times are very short (~19s), and they are longer (440-520s) when selecting BarOrder = 1. Still, the model with BarOrder = 1 solves faster afterward, as you said before. In particular, the ordering times are stable, i.e., similar across all three runs.

    None of the models reaches an ordering time of ~2500s as seen above. Why does it often take so much longer to order when using the automatic value for BarOrder (= -1) than selecting BarOrder = 0 or BarOrder = 1? Afterward, the runs with BarOrder = 1 solve faster than those with BarOrder = -1.

    I think this might be getting a bit academic; I just observed this difference in the ordering times and did not understand it.

    Many thanks
    Jacob

  • Jaromił Najman
    Gurobi Staff

    None of the models reaches an ordering time of ~2500s as seen above. Why does it often take so much longer to order when using the automatic value for BarOrder (= -1) than selecting BarOrder = 0 or BarOrder = 1?

    Gurobi has to decide which ordering algorithm to use based on the information it has about the model at that point. Sometimes, a model is right at the edge of a decision.

    As a very simplified example, let's say that we always use BarOrder = 1 if #variables < 1000. A model with 1001 variables would then always be ordered with BarOrder = 0, but there definitely are models with 1001 variables for which BarOrder = 1 would be better. Of course, the decision making implemented in Gurobi is more complex, but I think you get the idea.
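
    In code, such a purely illustrative threshold rule (not Gurobi's actual decision logic) would look like this:

    # Purely illustrative pseudo-rule, not Gurobi's actual decision logic:
    # models just below and just above the threshold are nearly identical,
    # yet end up with different ordering algorithms.
    def choose_barorder(num_variables: int) -> int:
        return 1 if num_variables < 1000 else 0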

    In your particular case, it looks like your models sit at some edge of this decision-making process, such that some fall into the BarOrder = 1 category while others fall into the BarOrder = 0 category, and manual parameter adjustment is necessary.

    Note that, in general, just running both ordering algorithms in parallel is not a great idea, because then neither could utilize all threads for ordering and the performance would be slower overall.

    If you are interested only in a primal feasible solution, i.e., you run your models without crossover, you might also want to experiment with the Presolve, PreSparsify, and Aggregate parameters. AggFill can sometimes help as well, but it is very model-specific and can lead to overtuning.
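
    A rough sketch of such a barrier-only run with those parameters set explicitly could look like this (the file name and the particular values are placeholders to experiment with, not recommended settings):

    import gurobipy as gp

    # Sketch of the parameter experiments mentioned above; the file name and
    # the chosen values are placeholders to vary, not recommended settings.
    m = gp.read("model.mps")
    m.Params.Method = 2       # barrier
    m.Params.Crossover = 0    # disable crossover; return the barrier solution
    m.Params.BarOrder = 1     # fixed ordering, as discussed above
    m.Params.Presolve = 2     # e.g. aggressive presolve; also try 0 and 1
    m.Params.PreSparsify = 1  # try enabling constraint sparsification
    m.Params.Aggregate = 0    # try disabling aggregation
    # m.Params.AggFill = 10   # optional: limit presolve fill-in, very model-specific
    m.optimize()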

    Best regards, 
    Jaromił

  • Jacob Mannhardt

    Thanks for your hints and tips!

    So if I understand correctly, selecting either 0 or 1 is fast because Gurobi "knows what to do", whereas the automatic setting takes so much additional time because Gurobi "is not really sure what to select". Is that the layman's version of what is happening?

    For the time being I will then go with BarOrder = 1 and play with the other parameters. 

    Thanks a lot again for your time and insights!

    Best
    Jacob

  • Jaromił Najman
    Gurobi Staff

    So if I understand correctly, selecting either 0 or 1 is fast because Gurobi "knows what to do", whereas the automatic setting takes so much additional time because Gurobi "is not really sure what to select". Is that the layman's version of what is happening?

    Yes and no. When you set the parameter, Gurobi knows what to do and does not have to decide which ordering algorithm to choose. However, it is not the decision time that makes the difference but the final choice itself. You could try running Model 2 with BarOrder = -1, 0, and 1 and see that BarOrder = -1 and 0 perform slowly while BarOrder = 1 performs better, as you said. This means that Gurobi automatically decided to use BarOrder = 0, but the better decision would have been to use BarOrder = 1.

    Best regards,
    Jarek

