How do you run grbtune on multiple machines?
From today's auto-tuning discussion, one panelist mentioned that he ran grbtune on 10 machines in parallel and was able to go through some 300+ parameter combinations vs. the 33 parameter combinations when running serially. The grbtune manual page does not say how this can be done.
What is the command line for running grbtune in parallel on multiple machines? Is this the same command if one machine is multi-core?
For example, I have one machine with 16 cores. What command line would I use to run grbtune in parallel on 4 "machines" where each "machine" would run with 4 threads? And what is the command if I want to use 16 "machines" in parallel with 1 thread each?
Hi Lance,
We do not recommend running a distributed tuning run on a single machine unless you limit the Threads parameter. Even then, the results may vary widely due to thread communication and background processes. We therefore strongly recommend running distributed tuning on multiple machines.
Nevertheless, here is a brief explanation of how to run a distributed tuning run on a single machine with 2 workers.
First, create a folder for each worker
> mkdir worker1
> mkdir worker2
Initialize each of the worker directories, i.e., copy the default configuration files and runtimes to the folders.
> cd worker1
> grb_rs init
> cd ../worker2
> grb_rs init
Next, start the workers
> cd worker1
> grb_rs --port=12345 --worker
The current terminal is now occupied by worker1 and you should see a line stating
Using data directory <path>/worker1/data
Open a second terminal and proceed
> cd worker2
> grb_rs --port=12346 --worker
The second terminal is now occupied by worker2 and you should see a line stating
Using data directory <path>/worker2/data
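The worker setup above can be sketched as a small script for more than two workers. This is only a sketch under assumptions: the function name `worker_setup_commands`, the log file name `worker.log`, and the use of consecutive ports are illustrative choices, not something from the Gurobi documentation. It only prints the commands, using the same `grb_rs init` and `grb_rs --port=... --worker` calls as in the steps above, so you can inspect them before running anything.

```shell
# Print the setup commands for N local workers on consecutive ports.
# Nothing is executed here; the commands are only written to stdout.
# Usage: worker_setup_commands N BASE_PORT
worker_setup_commands() (
  n=$1
  base=$2
  i=1
  while [ "$i" -le "$n" ]; do
    port=$((base + i - 1))
    # One folder per worker, as in the manual steps.
    printf 'mkdir -p worker%s\n' "$i"
    printf '(cd worker%s && grb_rs init)\n' "$i"
    # Start the worker in the background and keep its output in a log file
    # (worker.log is a hypothetical name chosen for this sketch).
    printf '(cd worker%s && grb_rs --port=%s --worker > worker.log 2>&1 &)\n' "$i" "$port"
    i=$((i + 1))
  done
)

# Two workers on ports 12345 and 12346, matching the example above.
worker_setup_commands 2 12345
```

If the printed commands look right, they could be piped to a shell (`worker_setup_commands 2 12345 | sh`). The workers then run in the background instead of occupying one terminal each, and the "Using data directory" line should end up in each worker's log file rather than on the terminal.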
Open a third terminal and start a tuning run
> grbtune WorkerPool=localhost:12345,localhost:12346 TuneJobs=2 Threads=1 /Library/gurobi912/mac64/examples/data/glass4.mps
You should see a line stating
Started distributed worker on <hostid>.local:12345
Started distributed worker on <hostid>.local:12346
Distributed tuning: launched 2 distributed worker jobs
When working on multiple machines, you must have access to an (academic) floating site license provided by your university or department. The process for multiple machines is the same, except that you have to provide actual IP addresses instead of just \(\texttt{localhost}\). You can find more details on cluster management in our documentation on Forming a Cluster.
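For the two layouts asked about in the question (4 workers with 4 threads each, or 16 workers with 1 thread each on a 16-core machine), the grbtune command line follows the same pattern as above. The helper below is a minimal sketch that only builds and prints the command. The function names `worker_pool` and `tune_command` and the file name `model.mps` are placeholders, and it assumes the workers are already running on consecutive ports starting at 12345 and that TuneJobs is set to the number of workers, as in the 2-worker example.

```shell
# Build a WorkerPool value for N workers on one host, using consecutive ports.
# Usage: worker_pool HOST BASE_PORT N
worker_pool() (
  host=$1
  base=$2
  n=$3
  pool=""
  i=0
  while [ "$i" -lt "$n" ]; do
    # Add a comma only between entries, not before the first one.
    pool="${pool}${pool:+,}${host}:$((base + i))"
    i=$((i + 1))
  done
  printf '%s\n' "$pool"
)

# Print (but do not run) the grbtune command for N workers with T threads each.
# Usage: tune_command HOST BASE_PORT N THREADS MODEL
tune_command() (
  printf 'grbtune WorkerPool=%s TuneJobs=%s Threads=%s %s\n' \
    "$(worker_pool "$1" "$2" "$3")" "$3" "$4" "$5"
)

# 4 workers with 4 threads each (model.mps is a placeholder file name).
tune_command localhost 12345 4 4 model.mps
# 16 workers with 1 thread each.
tune_command localhost 12345 16 1 model.mps
```

For several machines, the same WorkerPool value would list the actual IP addresses instead of `localhost`. Keep in mind the caveat above: on a single machine, timings from workers that share cores can vary a lot.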
Best regards,
Jaromił
I was trying this out today, and it is not working for me. I cannot get the workers to start.
I have a Dell XPS 15 with an Intel i9 (8 cores, 16 virtual cores) running Windows 10 Professional.
There is no "grb_rs.exe" in the Gurobi distribution, but there is a "grb_ts.exe" and a "grb.rs". grb.rs won't run, and grb_ts acts like it is an installation program or a software update.
I tried running grb_ts and grb.rs both in a Cygwin terminal and in a DOS command window.
Under Cygwin, grb_ts reports "Permission denied" when I try to run it. grb.rs reports "Exec format error".
Under DOS, grb_ts opens a system dialog saying "Do you want to allow this app from an unknown publisher to make changes to your device?" I tried both allowing and disallowing it, but there appears to be no change, and certainly a worker does not start up. grb.rs is an unknown file as far as DOS is concerned.
Note that grbtune runs fine in the Cygwin terminal on Windows. So, I can run tuning trials with that in serial, but I really want to be able to run these in parallel to explore more options.
Any thoughts on why grb_ts is not running for me?
In particular, I want to find a set of parameters that will let the primal simplex/root relaxation solve more quickly.
WRT the root relaxation, I have found that Heuristics=0.5 cuts the IP portion of the solution time roughly in half (to get to a solution with a MIPGap < 1%). Now, the root relaxation takes about half the total solution time (and the IP the other half). So, I'm guessing that I'll get more improvement in total solution time if there is something I can do about the relaxation solution time. This is where I am hoping that grbtune in parallel can help me. Given that my larger problems take 12 hours to run with 5-7 hours of that in the relaxation, chopping off time there would be very helpful. (I am using a small 6-7 minute model with grbtune in serial at the moment, where I'm only getting a small fraction of a percent improvement from the parameter combinations the serial run is able to explore.)
\( \texttt{grb_rs.exe} \) is included in the separate Gurobi Remote Services package. You will see a download link for Remote Services if you scroll down a bit on the downloads page.
\( \texttt{grb_ts.exe} \) starts a token server for use with a floating license, so it unfortunately won't help here.
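After installing the Remote Services package, a quick check like the following can confirm that the tools are visible from the terminal you are using. This is a generic shell sketch, not a Gurobi utility: `check_tool` is a made-up helper name, and it assumes a POSIX-style shell such as the Cygwin terminal mentioned above.

```shell
# Report whether an executable can be found on the PATH.
# Returns 0 if found, 1 otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s: found at %s\n' "$1" "$(command -v "$1")"
  else
    printf '%s: not found on PATH\n' "$1"
    return 1
  fi
}

# grbtune ships with the main distribution; grb_rs comes from the
# separate Remote Services package.
check_tool grbtune || true
check_tool grb_rs || true
```

If `grb_rs` is reported as not found after installation, the Remote Services `bin` folder probably still needs to be added to the PATH.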