Performance of Gurobi on different Machines

Answered

Comments

5 comments

  • Maliheh Aramon
    Gurobi Staff
    Hi Tobias, 
     
    Have you checked out the article "Does using more threads make Gurobi faster?"?
     
    Increasing the number of threads does not necessarily speed up the optimization algorithm. How many Gurobi jobs are you solving in each subroutine? Are they independent of each other? And how long does each job take to be solved? 
     
    If you are solving many independent Gurobi jobs in each subroutine, increasing the number of threads and running those jobs in parallel would likely give a noticeable speedup, and the longer each individual job takes, the larger that speedup is likely to be. However, if the Gurobi jobs in each subroutine depend on each other, or their optimal solutions are found near the root, increasing the number of threads is unlikely to help much.
     
    I suggest looking at the log file to identify the bottleneck in each Gurobi job. For example, if the root relaxation takes a long time to solve, setting the Method parameter to 2 (barrier) and using more threads can help. As another example, if it takes a long time to find an incumbent solution, or the best bound changes very slowly, increasing the number of threads to explore more nodes in parallel can help.


    Best regards,
    Maliheh
  • Tobias Höschen

    Hi Maliheh,

    Yes, I had already read it before asking the question, and I am still kind of confused.

    I only solve one LP per iteration, but in total I solve thousands of LPs that are all independent of each other.

    Here is the log file. Can you maybe help me figure out what exactly the problem is?

    It seems that primal simplex was used in this example. Often barrier is the one that solves the problem, so I already set the Method parameter to 2, but this didn't change anything at all.

    Thanks for the answer,

    Tobias

    Gurobi Optimizer version 9.1.1 build v9.1.1rc0 (linux64)
    Thread count: 20 physical cores, 20 logical processors, using up to 20 threads
    Optimize a model with 47040 rows, 234248 columns and 421456 nonzeros
    Model fingerprint: 0x960d6838
    Coefficient statistics:
    Matrix range [1e+00, 1e+00]
    Objective range [1e+00, 1e+00]
    Bounds range [0e+00, 0e+00]
    RHS range [2e+01, 1e+04]

    Concurrent LP optimizer: primal simplex, dual simplex, and barrier
    Showing barrier log only...

    Presolve removed 0 rows and 47040 columns
    Presolve time: 0.21s
    Presolved: 47040 rows, 187208 columns, 374416 nonzeros

    Ordering time: 0.22s

    Barrier statistics:
    AA' NZ : 9.360e+04
    Factor NZ : 1.281e+06 (roughly 100 MBytes of memory)
    Factor Ops : 8.199e+07 (less than 1 second per iteration)
    Threads : 18

    Objective Residual
    Iter Primal Dual Primal Dual Compl Time
    0 2.05891230e+09 0.00000000e+00 4.28e+03 0.00e+00 1.40e+04 1s
    1 7.99199355e+08 9.57896973e+05 3.45e+03 0.00e+00 3.87e+03 1s
    2 3.76984709e+08 3.06791526e+06 1.58e+03 0.00e+00 1.81e+03 1s
    3 2.17507994e+08 5.56777488e+06 8.85e+02 0.00e+00 1.03e+03 1s
    4 1.69117220e+08 7.35356168e+06 6.71e+02 0.00e+00 7.95e+02 1s
    5 1.66107116e+08 8.10683166e+06 6.58e+02 0.00e+00 7.80e+02 1s
    6 1.42775281e+08 9.11885810e+06 5.56e+02 0.00e+00 6.64e+02 1s
    7 1.30527905e+08 1.00436052e+07 5.05e+02 0.00e+00 6.04e+02 1s
    8 1.22682719e+08 1.08259037e+07 4.70e+02 0.00e+00 5.64e+02 1s
    9 1.17732427e+08 1.21074031e+07 4.48e+02 0.00e+00 5.40e+02 1s
    10 1.01230816e+08 1.31266937e+07 3.79e+02 0.00e+00 4.56e+02 1s
    11 8.84816115e+07 1.37958396e+07 3.24e+02 0.00e+00 3.90e+02 1s
    12 8.08822943e+07 1.41970742e+07 2.91e+02 0.00e+00 3.51e+02 1s
    13 7.80735443e+07 1.46924292e+07 2.78e+02 0.00e+00 3.36e+02 1s
    14 7.54425567e+07 1.51181733e+07 2.67e+02 0.00e+00 3.22e+02 1s
    15 6.81732340e+07 1.54442178e+07 2.34e+02 0.00e+00 2.83e+02 1s
    16 6.52567699e+07 1.59344337e+07 2.21e+02 0.00e+00 2.68e+02 1s
    17 5.63537328e+07 1.63419649e+07 1.81e+02 0.00e+00 2.20e+02 1s
    18 5.45568618e+07 1.66435557e+07 1.73e+02 0.00e+00 2.10e+02 1s
    19 5.18277826e+07 1.70695697e+07 1.60e+02 0.00e+00 1.95e+02 1s
    20 4.90282837e+07 1.75061425e+07 1.46e+02 0.00e+00 1.79e+02 1s
    21 4.67290813e+07 1.78044770e+07 1.35e+02 0.00e+00 1.66e+02 2s
    22 4.26935434e+07 1.80588622e+07 1.16e+02 0.00e+00 1.43e+02 2s
    23 4.09976121e+07 1.82619013e+07 1.08e+02 0.00e+00 1.34e+02 2s
    24 4.04580421e+07 1.83067660e+07 1.06e+02 0.00e+00 1.30e+02 2s
    25 3.80796646e+07 1.85633324e+07 9.44e+01 0.00e+00 1.16e+02 2s
    26 3.76709901e+07 1.86846947e+07 9.24e+01 0.00e+00 1.14e+02 2s
    27 3.51954559e+07 1.89242514e+07 8.03e+01 0.00e+00 9.89e+01 2s
    28 3.41656973e+07 1.89954450e+07 7.52e+01 0.00e+00 9.28e+01 2s
    29 3.28912756e+07 1.92260278e+07 6.89e+01 0.00e+00 8.49e+01 2s
    30 3.20983253e+07 1.94213779e+07 6.45e+01 0.00e+00 7.99e+01 2s
    31 3.14211729e+07 1.95116652e+07 6.11e+01 0.00e+00 7.57e+01 2s
    32 2.96667304e+07 1.96751105e+07 5.20e+01 0.00e+00 6.45e+01 2s
    33 2.90271887e+07 1.98540889e+07 4.86e+01 0.00e+00 6.03e+01 2s
    34 2.84261421e+07 2.00042665e+07 4.54e+01 0.00e+00 5.62e+01 2s
    35 2.79724035e+07 2.01113819e+07 4.29e+01 0.00e+00 5.31e+01 2s
    36 2.74371507e+07 2.01864588e+07 3.97e+01 0.00e+00 4.94e+01 2s
    37 2.67412113e+07 2.02413849e+07 3.58e+01 0.00e+00 4.47e+01 2s
    38 2.63019678e+07 2.03496742e+07 3.32e+01 0.00e+00 4.15e+01 2s
    39 2.59709759e+07 2.04474068e+07 3.14e+01 0.00e+00 3.92e+01 2s
    40 2.52349514e+07 2.04955855e+07 2.72e+01 0.00e+00 3.39e+01 2s
    41 2.47606481e+07 2.05694251e+07 2.44e+01 0.00e+00 3.05e+01 2s
    42 2.45219430e+07 2.05922626e+07 2.30e+01 0.00e+00 2.88e+01 2s
    43 2.41791973e+07 2.07380977e+07 2.10e+01 0.00e+00 2.61e+01 2s
    44 2.36722657e+07 2.08341283e+07 1.76e+01 0.00e+00 2.21e+01 3s
    45 2.35410657e+07 2.09570443e+07 1.67e+01 0.00e+00 2.10e+01 3s
    46 2.32812390e+07 2.09790761e+07 1.50e+01 0.00e+00 1.89e+01 3s
    47 2.30427296e+07 2.10444627e+07 1.33e+01 0.00e+00 1.69e+01 3s
    48 2.28762555e+07 2.11398052e+07 1.21e+01 0.00e+00 1.54e+01 3s
    49 2.27474204e+07 2.11969228e+07 1.12e+01 0.00e+00 1.43e+01 3s
    50 2.25195896e+07 2.12536300e+07 9.39e+00 0.00e+00 1.21e+01 3s
    51 2.23303152e+07 2.13490227e+07 7.97e+00 0.00e+00 1.03e+01 3s
    52 2.22251215e+07 2.14193508e+07 7.13e+00 0.00e+00 9.21e+00 3s
    53 2.21864602e+07 2.14632161e+07 6.81e+00 0.00e+00 8.79e+00 3s
    54 2.20864814e+07 2.15007518e+07 5.92e+00 0.00e+00 7.62e+00 3s
    55 2.20660439e+07 2.15185550e+07 5.69e+00 0.00e+00 7.34e+00 3s
    56 2.20492259e+07 2.15459904e+07 5.52e+00 0.00e+00 7.15e+00 3s
    57 2.19645299e+07 2.16082241e+07 4.66e+00 0.00e+00 6.02e+00 3s
    58 2.19334221e+07 2.16250948e+07 4.28e+00 0.00e+00 5.53e+00 3s
    59 2.18982616e+07 2.16497986e+07 3.78e+00 0.00e+00 4.88e+00 3s
    60 2.18868179e+07 2.16447062e+07 3.60e+00 0.00e+00 4.67e+00 3s
    61 2.18673452e+07 2.16733854e+07 3.31e+00 0.00e+00 4.28e+00 3s
    62 2.18322595e+07 2.16931197e+07 2.63e+00 0.00e+00 3.40e+00 3s
    63 2.18303926e+07 2.17042422e+07 2.59e+00 0.00e+00 3.34e+00 3s
    64 2.18224530e+07 2.17201714e+07 2.39e+00 0.00e+00 3.09e+00 3s
    65 2.18206818e+07 2.17272214e+07 2.35e+00 0.00e+00 3.04e+00 4s
    66 2.18196399e+07 2.17339658e+07 2.32e+00 0.00e+00 3.01e+00 4s
    67 2.18188674e+07 2.17397969e+07 2.26e+00 0.00e+00 2.93e+00 4s
    68 2.18126091e+07 2.17348009e+07 2.08e+00 0.00e+00 2.71e+00 4s
    69 2.18126592e+07 2.17435435e+07 2.04e+00 0.00e+00 2.66e+00 4s
    70 2.18059067e+07 2.17568759e+07 1.83e+00 0.00e+00 2.38e+00 4s
    71 2.18017488e+07 2.17587278e+07 1.62e+00 0.00e+00 2.11e+00 4s
    72 2.18014928e+07 2.17688525e+07 1.61e+00 0.00e+00 2.09e+00 4s
    73 2.18013265e+07 2.18000346e+07 1.56e+00 0.00e+00 2.02e+00 4s
    74 2.18030801e+07 2.18148719e+07 1.32e+00 0.00e+00 1.71e+00 4s
    75 2.18040928e+07 2.17997729e+07 1.27e+00 0.00e+00 1.66e+00 4s
    76 2.18054092e+07 2.18168314e+07 1.22e+00 0.00e+00 1.59e+00 4s
    77 2.18064954e+07 2.18043784e+07 1.18e+00 0.00e+00 1.55e+00 4s
    78 2.18089734e+07 2.18331897e+07 1.12e+00 0.00e+00 1.46e+00 4s
    79 2.18107303e+07 2.18475845e+07 1.07e+00 0.00e+00 1.39e+00 4s
    80 2.18139975e+07 2.18565554e+07 9.91e-01 0.00e+00 1.29e+00 4s
    81 2.18147929e+07 2.18652514e+07 9.80e-01 0.00e+00 1.28e+00 4s
    82 2.18182834e+07 2.18741457e+07 9.34e-01 0.00e+00 1.22e+00 4s
    83 2.18188508e+07 2.18847518e+07 9.24e-01 0.00e+00 1.21e+00 4s
    84 2.18230492e+07 2.18859168e+07 8.56e-01 0.00e+00 1.12e+00 4s
    85 2.18269300e+07 2.19091529e+07 7.99e-01 0.00e+00 1.05e+00 4s
    86 2.18298682e+07 2.19137177e+07 7.83e-01 0.00e+00 1.04e+00 4s
    87 2.18360091e+07 2.19251598e+07 7.26e-01 0.00e+00 9.59e-01 4s
    88 2.18375323e+07 2.19197556e+07 7.12e-01 0.00e+00 9.44e-01 5s
    89 2.18442774e+07 2.19289311e+07 6.66e-01 0.00e+00 8.91e-01 5s
    90 2.18456880e+07 2.19322744e+07 6.59e-01 0.00e+00 8.82e-01 5s
    91 2.18486071e+07 2.19389188e+07 6.36e-01 0.00e+00 8.55e-01 5s
    92 2.18536361e+07 2.19549627e+07 6.09e-01 7.33e-15 8.23e-01 5s
    93 2.18546736e+07 2.19558852e+07 6.02e-01 1.22e-14 8.15e-01 5s
    94 2.18575814e+07 2.19613671e+07 5.84e-01 1.69e-14 7.90e-01 5s
    95 2.18635622e+07 2.19643155e+07 5.49e-01 1.93e-14 7.42e-01 5s

    Barrier performed 95 iterations in 4.86 seconds
    Barrier solve interrupted - model solved by another algorithm


    Solved with primal simplex
    Solved in 78302 iterations and 4.87 seconds
    Optimal objective 2.304717600e+07
  • Maliheh Aramon
    Gurobi Staff

    Hi Tobias, 

    One point before getting into your question:

    • It seems that you are using Gurobi 9.1.1. We always recommend using the latest version of Gurobi, which is 9.1.2 right now.

    The log file shows the default behaviour, where primal simplex, dual simplex, and barrier run concurrently until one of them reaches the optimal solution. Barrier typically outperforms the other two methods on very large models; for your model, however, primal simplex is the winner.
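Since which algorithm wins can differ from one LP to the next, it may help to tally the winner over many logs. The following is a minimal sketch in C, assuming each log closes with lines of the form seen in this thread ("Solved with ..." and "Solved in N iterations and T seconds"); other Gurobi versions or parameter settings may print something different. The names LpLogSummary and summarize_log_line are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Summary of the closing lines of one LP log (hypothetical helper type). */
typedef struct {
    char method[32];   /* winning algorithm, e.g. "primal simplex" */
    long iterations;   /* iteration count reported for that algorithm */
    double seconds;    /* reported solve time in seconds */
    int seen;          /* bit 0: "Solved with" seen, bit 1: "Solved in" seen */
} LpLogSummary;

/* Update *s from one log line; lines of any other shape are ignored. */
void summarize_log_line(const char *line, LpLogSummary *s) {
    assert(line != NULL && s != NULL);
    while (*line == ' ' || *line == '\t') line++;   /* skip indentation */

    if (sscanf(line, "Solved with %31[^\r\n]", s->method) == 1) {
        s->seen |= 1;
        return;
    }

    long iterations;
    double seconds;
    if (sscanf(line, "Solved in %ld iterations and %lf seconds",
               &iterations, &seconds) == 2) {
        s->iterations = iterations;
        s->seconds = seconds;
        s->seen |= 2;
    }
}
```

Feeding every line of a log through summarize_log_line and then counting the method strings over all LPs would show how often each algorithm actually wins.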

    Simplex methods are sequential, so having more cores does not help them. It therefore makes more sense to set the parameter Method = 0 to use only primal simplex and to parallelize over the independent LPs. Note that you need to implement this parallelization yourself in your application. If you are using Python, you can use multiprocessing with Gurobi to run one LP per core. On your machine with 2 cores you can run 2 LPs simultaneously, whereas on your university's machine you can run 10 LPs simultaneously. In the ideal scenario, this should give a 5x speedup.
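For a C application, the same idea could be sketched with POSIX threads: a fixed number of workers pull LP indices from a shared counter until none are left. This is only an illustration under stated assumptions. solve_one_lp is a hypothetical placeholder that returns a dummy value instead of calling Gurobi, and JobQueue and solve_all are names invented here. In a real program each worker would presumably own its own Gurobi environment (an environment should not be shared between threads) and limit each solve to one thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical placeholder for solving LP number idx. A real version would
 * build and optimize the model through the Gurobi C API and return its
 * objective value; here a dummy number keeps the sketch self-contained. */
static double solve_one_lp(int idx) {
    return 2.0 * idx;
}

typedef struct {
    int next;               /* index of the next unclaimed LP */
    int num_lps;            /* total number of independent LPs */
    double *objective;      /* objective[i] receives the result of LP i */
    pthread_mutex_t lock;   /* protects next */
} JobQueue;

/* Worker loop: claim the next LP index, solve it, repeat until none remain.
 * With Gurobi, a per-thread environment would be created before the loop
 * and released after it. */
static void *worker(void *arg) {
    JobQueue *q = arg;
    for (;;) {
        pthread_mutex_lock(&q->lock);
        int idx = (q->next < q->num_lps) ? q->next++ : -1;
        pthread_mutex_unlock(&q->lock);
        if (idx < 0) break;
        q->objective[idx] = solve_one_lp(idx);   /* each slot written once */
    }
    return NULL;
}

/* Solve num_lps independent LPs with up to num_workers threads.
 * Returns 0 if at least one worker ran (the queue is then fully drained),
 * -1 if setup failed or no worker could be started. */
int solve_all(int num_lps, int num_workers, double *objective) {
    assert(num_lps >= 0 && num_workers > 0 && objective != NULL);

    JobQueue q = { .next = 0, .num_lps = num_lps, .objective = objective };
    if (pthread_mutex_init(&q.lock, NULL) != 0) return -1;

    pthread_t *threads = malloc((size_t)num_workers * sizeof *threads);
    if (threads == NULL) {
        pthread_mutex_destroy(&q.lock);
        return -1;
    }

    int started = 0;
    while (started < num_workers &&
           pthread_create(&threads[started], NULL, worker, &q) == 0) {
        started++;
    }
    for (int i = 0; i < started; i++) pthread_join(threads[i], NULL);

    free(threads);
    pthread_mutex_destroy(&q.lock);
    return started > 0 ? 0 : -1;
}
```

Pulling indices from a shared counter, rather than splitting the LPs into fixed blocks up front, keeps all workers busy even when some LPs take much longer than others.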

    Best regards,

    Maliheh

  • Tobias Höschen

    Hi Maliheh,

    Thank you for the info, I understand the problem now. Unfortunately I use C and not Python, so I guess I have to give parallelizing a go.

    I only have one follow-up question: for some of the problem sizes I create, barrier is the fastest (this is also the case when I average the running times of all methods). If I set Method = 0, is there any speedup I could get without parallelizing the code?

    Best,

    Tobias

  • Maliheh Aramon
    Gurobi Staff

    Hi Tobias, 

    Simplex algorithms are sequential methods and are not amenable to parallelization. Therefore, increasing the number of cores does not have any impact. 

    Since barrier outperforms the simplex methods on average on your models, you can consider setting Method = 2 to use only barrier. Unlike simplex, barrier does benefit from additional threads. However, if you are looking for a noticeable speedup, parallelizing over the independent LPs is likely the more promising avenue to explore.
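To get a feeling for what parallelizing over LPs can buy in the best case, a back-of-the-envelope estimate could look like the sketch below. It assumes that all LPs take the same time and that there is no overhead at all, which real runs will not achieve. The function name ideal_wall_time and the numbers in the usage note are purely illustrative.

```c
#include <assert.h>

/* Ideal wall-clock time for num_lps equally long, independent LPs spread
 * over `workers` parallel workers, ignoring every kind of overhead. */
double ideal_wall_time(long num_lps, int workers, double seconds_per_lp) {
    assert(num_lps >= 0 && workers > 0 && seconds_per_lp >= 0.0);
    long rounds = (num_lps + workers - 1) / workers;   /* ceil(num_lps / workers) */
    return (double)rounds * seconds_per_lp;
}
```

For 1000 hypothetical LPs of one second each, this gives 500 seconds with 2 workers and 100 seconds with 10 workers, which is the ideal 5x ratio between a 2-core and a 10-core machine.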

    Best regards,

    Maliheh

     

