Our Remote Services products, i.e., Compute Server and Instant Cloud, are based on the client-server paradigm. This means that model formulation and model solution do not happen on the same machine: when working with Remote Services, the interactive session with the Gurobi Optimizer is a remote session instead of a local one.
While the benefits in terms of flexibility, scalability, and modularity have been discussed elsewhere, this article focuses on the performance aspect of this setup. In particular, the client-server separation requires communication between these components over a network, which introduces latency.
The key question to answer is: How big is the impact of network latency on the overall runtime of a given approach that includes Gurobi Optimizer? The remainder of this article presents a non-exhaustive list of aspects to consider when answering this question, as well as suggestions to troubleshoot and fix potential latency issues.
Latency and Bandwidth
Latency and bandwidth are measurable properties of the network connection between two machines. The bandwidth caps the transmission rate (e.g., in MB/s), while the latency adds a delay (in milliseconds) to the Round-Trip Time (RTT) of a network packet between the client and the server. Local network connections usually have high bandwidth and low latency (>1000 MB/s and <1 ms, respectively). On the other hand, an internet connection between locations across the globe can have a much lower bandwidth and experience a latency between 50 and 200+ ms.
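As a back-of-the-envelope sketch, the wall-clock cost of a network transfer can be approximated as payload size divided by bandwidth, plus one RTT per round trip. The figures below (payload size, bandwidth, RTT, and number of round trips) are illustrative assumptions, not measurements:

```python
def transfer_time_s(payload_mb, bandwidth_mb_s, rtt_ms, round_trips=1):
    """Rough wall-clock cost of shipping `payload_mb` of model data:
    serialization/transport details are ignored on purpose."""
    return payload_mb / bandwidth_mb_s + round_trips * rtt_ms / 1000.0

# A hypothetical 50 MB model over a local network (1000 MB/s, 1 ms RTT)...
local = transfer_time_s(50, 1000, 1)     # 0.051 s
# ...versus a cross-continent internet link (10 MB/s, 150 ms RTT).
remote = transfer_time_s(50, 10, 150)    # 5.15 s
print(f"local: {local:.3f} s, remote: {remote:.3f} s")
```

The point of the sketch: for a single large transfer, bandwidth dominates; latency only matters once many round trips are involved.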
Lack of bandwidth is rarely an issue in modern network environments, but it should be noted that large models can benefit from higher bandwidth when the model data is transferred from the client to the server (usually in a consolidated network message when the optimization is started).
Low impact due to latency
Latency may be negligible with respect to the overall time needed to solve a problem when:
- There are no user-defined callback functions that modify the behavior of our solver while it is running.
- The client and server machines are either in the same physical local network (when using Compute Server) or in the same cloud provider region (when using Instant Cloud).
High impact due to latency
Latency may significantly contribute to the overall runtime when:
- Callbacks are heavily used. Note that, precisely because of this reason, some callbacks are already disabled when using Remote Services.
- Gurobi is called very frequently to solve many (usually small) models. While this does not increase the latency of any individual call, it means that in absolute terms more time is spent communicating.
- The client and server machines are running in different networks that are connected over the internet.
- The client and server machines are connected wirelessly at some point.
Of course, any one of these factors on its own may not translate into high latency. However, the more factors compound (e.g., using WiFi while solving many small models), the more likely it is that you will experience latency issues.
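The compounding effect of many small solves can be sketched with simple arithmetic: each solve pays a fixed number of network round trips regardless of how small the model is. The four round trips per solve assumed below is a hypothetical figure, chosen only to illustrate the scaling:

```python
def total_latency_overhead_s(n_solves, rtt_ms, round_trips_per_solve=4):
    """Total time spent purely on network round trips across all solves.
    `round_trips_per_solve` is an assumed, illustrative figure."""
    return n_solves * round_trips_per_solve * rtt_ms / 1000.0

# At 100 ms RTT, one solve pays a negligible 0.4 s of latency overhead...
print(total_latency_overhead_s(1, 100))       # 0.4
# ...but 10,000 small solves pay 4000 s, which may dwarf the solve time itself.
print(total_latency_overhead_s(10_000, 100))  # 4000.0
```

This is why the same latency can be invisible for one long-running optimization and dominant for a loop over many tiny models.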
How to detect latency issues
If you suspect you are experiencing high latency:
- Run your software with a local Gurobi license to compare performance side-by-side with the remote setup (i.e., measure the overhead). Please contact the Gurobi support team to obtain a temporary license for this.
- Use the grbcluster tool to measure network latency.
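As a complementary, application-agnostic check, you can get a rough latency estimate by timing TCP handshakes against the server endpoint. The sketch below uses only the Python standard library; the host name and port are placeholders for your own Compute Server endpoint:

```python
import socket
import statistics
import time

def tcp_rtt_ms(host, port, samples=5):
    """Estimate round-trip latency by timing `samples` TCP handshakes
    and returning the median, in milliseconds."""
    timings = []
    for _ in range(samples):
        t0 = time.perf_counter()
        # A successful connect() takes roughly one network round trip.
        with socket.create_connection((host, port), timeout=5):
            pass
        timings.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(timings)

# Example call (hypothetical endpoint and port):
# print(tcp_rtt_ms("server.example.com", 61000))
```

The median over several samples is used to dampen one-off spikes; a result well above a few milliseconds on a supposedly local network is worth investigating.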
How to fix latency issues
Strategies to reduce latency include:
- When using Instant Cloud, verify that the selected region is as close as possible to the machine running your application.
- Avoid any communication with Gurobi Remote Services through a wireless connection.
- Reduce the number of calls to the Gurobi library, when possible. You can check statistics about Compute Server, such as the number of network messages, at the end of the log file. Please note that Gurobi already tries to minimize the number of messages automatically through efficient caching, but feel free to contact the Gurobi support team to discuss further ideas.
- Remove unnecessary network overhead (such as proxies, VPN connections, or software firewalls), when possible.