Large variance in performance with the same random seed
I thought this would be an interesting question for the community as it could also benefit others.
We were (by accident) running the same model on two separate machines (both machines have the exact same hardware specification, same version of Python, gurobipy and Gurobi) with the same random seed. It caught my attention that one of the machines didn't find a solution at all after an hour, while the other machine found an initial solution after just a few minutes.
I decided to test whether I could replicate this. After a software update and a restart of both machines, we ran the same model several times with the same random seed on both machines. The result was somewhat surprising: both the time to find an initial solution and the MIP gap after an hour varied considerably (even on the same machine), as you can see in the attached screenshot.
I had assumed that, given the same random seed, number of threads and hardware (and nothing else running at the same time), we should be able to expect roughly the same performance (+/- a few percent).
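For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how the repeated runs could be scripted. It is not the script we actually used: the helper names (`run_once`, `repeat_runs`, `relative_spread`) and the model file path are placeholders I made up for illustration. It assumes a MIP model stored in a file that `gurobipy` can read, fixes `Seed`, `Threads` and `TimeLimit` for every run, and records the time to the first incumbent via a `MIPSOL` callback.

```python
import statistics


def run_once(model_path, seed, threads, time_limit):
    """Solve one model file; return (time_to_first_solution, final_mip_gap)."""
    # Imported lazily so relative_spread() below works without a Gurobi install.
    import gurobipy as gp
    from gurobipy import GRB

    first_solution_time = []

    def on_event(model, where):
        # MIPSOL fires for every new incumbent; keep only the first timestamp.
        if where == GRB.Callback.MIPSOL and not first_solution_time:
            first_solution_time.append(model.cbGet(GRB.Callback.RUNTIME))

    model = gp.read(model_path)  # model_path is a hypothetical file, e.g. an .mps
    model.Params.Seed = seed
    model.Params.Threads = threads
    model.Params.TimeLimit = time_limit
    model.optimize(on_event)

    # MIPGap is only available once at least one solution exists.
    gap = model.MIPGap if model.SolCount > 0 else None
    first = first_solution_time[0] if first_solution_time else None
    model.dispose()
    return first, gap


def repeat_runs(model_path, repeats, seed, threads, time_limit):
    """Run the identical configuration several times and collect the results."""
    return [run_once(model_path, seed, threads, time_limit) for _ in range(repeats)]


def relative_spread(values):
    """Return (max - min) / mean over the runs that produced a value."""
    found = [v for v in values if v is not None]
    if len(found) < 2:
        return None  # not enough data points to compare
    mean = statistics.mean(found)
    if mean == 0:
        return 0.0
    return (max(found) - min(found)) / mean
```

Calling `relative_spread` on the first-solution times and on the final gaps returned by `repeat_runs` gives one number per metric to compare against the "+/- a few percent" expectation; runs that never found a solution come back as `None` and are left out of the spread.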
Does anyone have experience with this kind of problem? Could it be a hardware problem? (The servers are about 4 years old and have consistently been pushed to their limits, so there could well be a problem with cooling/overheating.) Unfortunately I don't have physical access to the machines right now, but I did run the Intel Processor Diagnostic Tool and everything looked normal. The CPU is a Xeon E5-2690 v4 and both servers have 128 GB of memory (which should have been more than enough in this case).
We are going to discuss this in an internal ticket.
Thanks,
Matthias