Same model, same parameters, but different solutions

Answered

Comments

1 comment

  • Maliheh Aramon
    • Gurobi Staff

    Hi Arthur, 

    The first inconsistency is that all runs except one ran out of memory. I suspect the concurrent LP optimization at the root node caused a memory spike, as the logs show slight differences in how the relaxation was handled right before crashing.

    Since all three runs show the same fingerprint for the model, we expect that running the model on the same machine with the same parameters leads to the same solution path. You have set both the MemLimit and SoftMemLimit parameters, with the latter set to a higher value. Setting MemLimit=30 limits the total amount of memory available to Gurobi to 30 GB. This means that if more memory is required, Gurobi will fail with an out-of-memory error. The setting SoftMemLimit=64 is irrelevant here, because the hard limit of 30 GB is always reached before the soft limit of 64 GB. Why did you set both the MemLimit and SoftMemLimit parameters?

    The ordering phase of the barrier algorithm appears to be memory-intensive, and in one of the runs the dual simplex appears to have solved the root relaxation to optimality slightly before the memory limit was hit (note that the concurrent method for solving the LP relaxation is non-deterministic). To avoid the issue for this model, you need to either increase the amount of memory available to Gurobi or set the Method parameter to 1, which forces the Gurobi Optimizer to solve the root LP relaxation with the dual simplex.
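
    A minimal gurobipy sketch of the second option, setting Method=1 before optimizing. The helper names and the model file path passed to `solve_with_dual_simplex` are hypothetical and only serve as an illustration; `Method`, `setParam`, `read`, and `optimize` are the standard Gurobi parameter and gurobipy calls.

```python
# Sketch only: helper names are illustrative, not part of Gurobi.
ROOT_LP_PARAMS = {"Method": 1}  # 1 = dual simplex for the root LP relaxation


def apply_params(model, params):
    """Set each parameter on the model through Model.setParam."""
    for name, value in params.items():
        model.setParam(name, value)
    return model


def solve_with_dual_simplex(path):
    """Read a model file (hypothetical path) and solve it with Method=1."""
    import gurobipy as gp  # imported here so the helpers above stay importable

    model = gp.read(path)
    apply_params(model, ROOT_LP_PARAMS)
    model.optimize()
    return model
```

    With Method=1 the concurrent LP solver is not used at the root, so the barrier ordering step should no longer run there.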

    The second inconsistency is that one specific run found a better solution for another instance. I think the other runs suffered from hardware delays or cluster load, as the "Node 0" log for the best run includes several lines that are completely missing from the slower ones. Since that successful run was executed on a different day, it probably got lucky and landed on a less loaded or cooler CPU core.

    The solution paths in all three runs are the same. The difference stems from the amount of time that it takes to explore the tree. For example, the 1.14% gap (see below) was reached at 501 seconds in the first run, but it took 1718 and 1741 seconds to reach the same point in the second and the third runs, respectively. Your conjecture is likely correct. For the second and third runs, the machine appears to be oversubscribed. Do you know how many Gurobi jobs were running concurrently on the same machine during the second and the third runs?

    - First run

        0     0 53166.4853    0  150 53779.0000 53166.4853  1.14%     -  501s

    - Second run

         0     0 53166.4853    0  150 53779.0000 53166.4853  1.14%     - 1718s

    - Third run

         0     0 53166.4853    0  150 53779.0000 53166.4853  1.14%     - 1741s
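
    To put a number on the slowdown, the elapsed times in the three log lines above can be compared directly. This is a small sketch; the function and variable names are made up for illustration, and it assumes the elapsed time is the last column of the log line.

```python
# The three log lines quoted above, keyed by run.
LOG_LINES = {
    "first": "0     0 53166.4853    0  150 53779.0000 53166.4853  1.14%     -  501s",
    "second": "0     0 53166.4853    0  150 53779.0000 53166.4853  1.14%     - 1718s",
    "third": "0     0 53166.4853    0  150 53779.0000 53166.4853  1.14%     - 1741s",
}


def parse_elapsed_seconds(line):
    """Return the elapsed time from the last column, e.g. '501s' -> 501."""
    return int(line.split()[-1].rstrip("s"))


baseline = parse_elapsed_seconds(LOG_LINES["first"])

# Slowdown factor of each later run relative to the first run.
slowdowns = {
    run: parse_elapsed_seconds(line) / baseline
    for run, line in LOG_LINES.items()
    if run != "first"
}

for run, factor in slowdowns.items():
    print(f"{run} run: {factor:.2f}x slower than the first run")
```

    Both later runs needed well over three times as long to reach the same node, which fits an oversubscribed machine.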

     

    Best regards,

    Maliheh
