Solve MIP using distributed computing on cluster - how to with academic license?

  • Silke Horn

    Does the cluster have an academic site license? This would also enable distributed algorithms. If not, you can ask your cluster administrator to contact us for such a license.

    In order to run distributed MIP, you need to run "grb_rs --worker" on the worker nodes and then specify the parameters WorkerPool and DistributedMIPJobs in your Gurobi code. In JuMP, you should be able to set Gurobi parameters; e.g. during model initialisation:

    model = Model(with_optimizer(Gurobi.Optimizer, WorkerPool="worker1,worker2", DistributedMIPJobs=2))
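
For readers using Python rather than JuMP, the same two parameters can be set through gurobipy. The following is only a rough sketch: the helper names `distributed_mip_params` and `solve_distributed` are hypothetical, the host names are the placeholders from the line above, and it assumes gurobipy is installed and a suitable license is in place.

```python
def distributed_mip_params(worker_hosts, jobs=None):
    """Build the two parameters named above from a list of worker host names.

    If `jobs` is not given, one distributed MIP job per listed worker is
    assumed (an illustrative default, not a Gurobi requirement).
    """
    hosts = [h.strip() for h in worker_hosts if h.strip()]
    if not hosts:
        raise ValueError("at least one worker host is required")
    return {
        "WorkerPool": ",".join(hosts),
        "DistributedMIPJobs": len(hosts) if jobs is None else int(jobs),
    }


def solve_distributed(model_path, worker_hosts):
    """Read a model file and solve it with the distributed MIP parameters set."""
    # Imported lazily so the helper above can be used without Gurobi installed.
    import gurobipy as gp

    model = gp.read(model_path)
    for name, value in distributed_mip_params(worker_hosts).items():
        model.setParam(name, value)
    model.optimize()
    return model
```

A call such as `solve_distributed("model.mps", ["worker1", "worker2"])` would then correspond to the JuMP line above, with `model.mps` standing in for your own model file.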
  • Sebastian Gonzato

    Regarding the site licence - I am able to run Gurobi on the cluster, is this the same as having an academic site licence?

    I am not 100% sure how to set up the worker nodes, but I will contact my administrator about this.

    Thank you for the speedy reply!

  • HPCInfo VSC-KUL

    Thanks Silke

    I am one of the cluster admin people contacting you.
    For the moment, we do not have a Gurobi module installed system-wide. I think Sebastian has a personal license on his own account. I will take this request to our team and get back to you if we have questions.

    Kind regards
    Ehsan

  • HPCInfo VSC-KUL

    Sebastian Gonzato: Please verify whether or not you have a distributed site license. This apparently needs to be in place before you can take any further steps.

    Silke Horn: I have a question regarding the syntax of the "Model" class attributes:

    - Does DistributedMIPJobs specify the number of slave compute nodes that are involved in a specific batch job?

    - Are "worker1" and "worker2" the hostnames of the slave compute nodes, or are they fixed terms to be defined elsewhere?

    I would be grateful if you could elaborate.

    Regards

    Ehsan

  • Silke Horn

    Hi Ehsan,

    Yes, DistributedMIPJobs specifies the number of worker/slave nodes.

    The parameter WorkerPool takes a comma-separated list of hostnames of the worker nodes. In the example, "worker1" and "worker2" should be replaced by names of the worker nodes that the master node can resolve.
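
On a batch-scheduled cluster the host names are usually not known until the job starts. As a sketch only, assuming a Torque/PBS-style scheduler that exposes the allocated nodes in the file named by the `PBS_NODEFILE` environment variable (one host name per line, repeated once per allocated core), the comma-separated list could be derived as follows. All function names are hypothetical, and the `master` argument is just an optional way to leave one host out of the list.

```python
import os


def unique_hosts(nodefile_text):
    """Return host names from a node file in first-seen order, without repeats."""
    seen = []
    for line in nodefile_text.splitlines():
        host = line.strip()
        if host and host not in seen:
            seen.append(host)
    return seen


def worker_pool_from_nodefile(nodefile_text, master=None):
    """Join the unique hosts with commas, optionally leaving out `master`."""
    hosts = [h for h in unique_hosts(nodefile_text) if h != master]
    return ",".join(hosts)


def load_worker_pool(env=os.environ, master=None):
    """Read the node file named by PBS_NODEFILE (an assumption about the scheduler)."""
    path = env.get("PBS_NODEFILE")
    if not path:
        raise RuntimeError("PBS_NODEFILE is not set; are you inside a batch job?")
    with open(path) as fh:
        return worker_pool_from_nodefile(fh.read(), master=master)
```

The resulting string is what would be passed as the WorkerPool parameter; each listed host still has to be resolvable from the master node and must be running `grb_rs --worker`.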

    Please have a look at this page for instructions on how to request an academic site license.

    - Silke

  • Sebastian Gonzato

    Hey Silke,

    Assuming I have an academic site license (I'm fairly sure this is the case) and an installation of Gurobi (obtained from here), do I need to set up anything else? The Gurobi documentation talks a lot about actually setting up a cluster, but I already have access to one (and probably also to a way of specifying workers to Gurobi).

    In any case I will try this out and get back to this.

    Seb

  • Silke Horn

    Hi Seb,

    You can find out what license type you have by checking the contents of the gurobi.lic file on the cluster machines. It should say TOKENSERVER=<your cluster's token server> for a site license client.
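
This check can be scripted. The sketch below assumes the license file consists of `KEY=value` lines with optional `#` comment lines, and only looks for the TOKENSERVER key mentioned above; the function names are hypothetical, and it makes no claim about which other keys a license file may contain.

```python
def parse_license(text):
    """Parse KEY=value lines into a dict, skipping blanks and # comments."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip().upper()] = value.strip()
    return entries


def is_token_server_client(text):
    """True if the license text names a token server, as described above."""
    return "TOKENSERVER" in parse_license(text)
```

For example, `is_token_server_client(open("gurobi.lic").read())` would report whether the file in the current directory points at a token server; the actual location of `gurobi.lic` depends on your installation.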

    You then also need the Remote Services installed and grb_rs --worker running on the cluster machines that you intend to use as worker nodes.

    The documentation you mention is mostly about setting up compute server pools, which is not relevant for your use case.

    Silke

  • Sebastian Gonzato

    Then indeed I have a site license. I will have to ask for Remote Services to be installed as well.

  • Sebastian Gonzato

    Dear Silke,

    My sysadmin and I are still struggling with this. My sysadmin installed Gurobi 9.0.2 server, and I thought that this was / included Gurobi Remote Services, but apparently it doesn't since there is no grb_rs command in the linux64/bin folder.

    1. Where is the link to the Remote Services download? This documentation doesn't provide it.
    2. Do you require an academic site license to download Gurobi Remote Services? (The sysadmin of my department obtained an academic site license, but he is not the sysadmin of the supercomputer I want to use.)
    3. The Gurobi Remote Services / Cloud documentation is a bit confusing, since most of it assumes you will install the Gurobi Cluster Manager. Would I be correct in saying that if you already have a cluster with a cluster manager (we use Torque and Moab, if I'm not mistaken) then this is not necessary, and literally all you have to do is run grb_rs --worker on each worker node and then specify the workers using the WorkerPool argument?
    4. To run a Distributed MIP job, does one node need to be the master node that communicates with the worker nodes? If so, how do you specify a master node?
    5. Are there any complete examples (e.g. PBS job script + Python files) of how to set up a MIP problem in Python and solve it using the Distributed MIP algorithm of Gurobi? I think this would help me and my sysadmin a lot, but I had a hard time finding one.

    Thank you for any possible help,

    Seb

  • Gwyneth Butera

    Hi Seb - 

    To install Remote Services, please have your system administrator download and install gurobi_server9.0.2a_linux64.tar.gz from the "Download Center". No license is required to download the package.

    More information is available here:

    https://www.gurobi.com/documentation/9.0/remoteservices/compute_servers_and_distri.html#sec:RSMNodeTypes

    https://www.gurobi.com/documentation/9.0/remoteservices/distributed_algorithms2.html

  • Sebastian Gonzato

    Thank you Gwyneth, obviously we didn't have Remote Services installed, just the Gurobi Optimizer. Hopefully I can get this to work now...

  • Sebastian Gonzato

    Dear Gwyneth,

    I tried to run the grb_rs --worker command and got the following output:

    info  : Gurobi Remote Services starting...
    info  : Platform is linux
    info  : Version is 9.0.2 (build v9.0.2rc0a)
    info  : Worker mode is limited to 1 job, no queue
    info  : Node address is r26i13n01
    info  : Node FQN is r26i13n01.genius.hpc.kuleuven.be
    info  : Node has 36 cores
    info  : Data directory data does not exist, will use default
    info  : Using data directory /vsc-hard-mounts/leuven-data/331/vsc33168/gurobi_server902/linux64/bin/data
    info  : Node ID is afcbd499-997e-4316-bc4b-4e4b085f105b
    info  : Available runtimes: [8.0.0 8.0.1 8.1.0 8.1.1 9.0.0 9.0.1 9.0.2]
    info  : Accepting worker registration on port 36005...
    info  : Public root is /vsc-hard-mounts/leuven-data/331/vsc33168/gurobi_server902/linux64/resources/grb_rs/public
    info  : Starting API server (HTTP) on port 80...
    error : Gurobi Remote Services terminated: Cannot start server, was grb_rs already started?: listen tcp4 :80: bind: permission denied

    I assume this has something to do with the fact that I don't have permission to use port 80, and that I should talk to my sysadmin about this. Would that be correct?

    Also, for a self-managed cluster, is this documentation about authenticating to the Gurobi cluster relevant: https://www.gurobi.com/documentation/9.0/remoteservices/verification2.html ? Similarly, does this part about connecting nodes in a self-managed cluster also apply: https://www.gurobi.com/documentation/9.0/remoteservices/connecting_nodes.html ?

  • Matthias Miltenberger

    Hi Sebastian,

    You can use any available port of your choice by specifying it via the command line option --port=xxxx when starting grb_rs or by specifying the port in the grb_rs.cnf configuration file. Port 80 is the default port but usually it's only available for sysadmins.
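
As a sketch of how this might look from a Python launcher script: the helpers below build the worker command with the `--port` option and a matching WorkerPool string. The function names and the port number 61000 used in the example are hypothetical, and appending `:port` to each host name in WorkerPool is an assumption that should be checked against the parameter documentation for your Gurobi version.

```python
def grb_rs_worker_command(port):
    """Return the argument list for starting a worker on a non-privileged port."""
    # Ports below 1024 normally require root on Linux, which is what the
    # "bind: permission denied" error for port 80 above indicates.
    if not 1024 <= port <= 65535:
        raise ValueError("choose a port between 1024 and 65535")
    return ["grb_rs", "--worker", f"--port={port}"]


def worker_pool_with_port(hosts, port):
    """Join host names as host:port pairs (assumed WorkerPool syntax)."""
    return ",".join(f"{host}:{port}" for host in hosts)
```

The list returned by `grb_rs_worker_command(61000)` could be passed to `subprocess.Popen` on each worker node, and `worker_pool_with_port(["worker1", "worker2"], 61000)` would then give the value to try for WorkerPool on the master side.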

    Cheers,
    Matthias
