Numerical Issues with Large QIP
Hey Gurobi Team,
I'm trying to solve a fairly large quadratic integer program (lots of binary variables), but it gets stuck at the "Root simplex log..." stage. Here is the code:
import numpy as np
import pandas as pd
import csv
import geopy.distance
import random
from itertools import product
from collections import defaultdict
import gurobipy as gp
from gurobipy import GRB
params = ...
env = gp.Env(params=params)
required_home_games = 29
required_away_games = 29
timesteps = [t for t in range(100)]
df = pd.read_csv('teams_2019.csv')
distances = {c: {ci: geopy.distance.geodesic((df.Lat[i], df.Lon[i]),
                                             (df.Lat[j], df.Lon[j])).km / 1000.
                 for j, ci in enumerate(df.Code)}
             for i, c in enumerate(df.Code)}
teams = df.Code.to_list()
random.seed(1)
unavailable_days = {team: set([0] + random.sample(timesteps, 30)) for team in teams}
required_games = {team: {opp: 2 if team != opp else 0 for opp in teams} for team in teams}
nl = set(product(teams, teams, timesteps))
nha = set(product(teams, timesteps))
m = gp.Model(name="NBA Scheduling", env=env)
m.Params.TimeLimit = 360 * 60  # time limit in seconds
# m.Params.MIPFocus = 1
m.Params.Heuristics = 0.2  # default is 5% of time
m.Params.NodeMethod = 2
m.Params.NoRelHeurTime = 5 * 60
l = m.addVars(nl, name="Locations", vtype=GRB.BINARY)
g = m.addVars(nl, name="Games", vtype=GRB.BINARY)
f = m.addVars(nl, lb=-1, ub=1, name="FTE_Score", vtype=GRB.INTEGER)
h = m.addVars(nha, name="Home", vtype=GRB.BINARY)
a = m.addVars(nha, name="Away", vtype=GRB.BINARY)
b = m.addVars(nha, name="B2B", vtype=GRB.BINARY)
m.addConstr((g.sum() == required_home_games * len(teams)), name="Total_home_games")
m.addConstrs((g.sum(i, "*", t) + g.sum("*", i, t) <= 1 for t in timesteps for i in teams), name="One_game_per_day")
for i in teams:
    m.addConstr((l[i, i, 0] == 1), name=f"{i}_Starting_location")
    m.addConstr((h.sum(i, '*') == required_home_games), name=f"{i}_Total_home_games")
    m.addConstr((a.sum(i, '*') == required_away_games), name=f"{i}_Total_away_games")
    m.addConstr((gp.quicksum(h[i, t] for t in unavailable_days[i]) == 0), name=f"{i}_Unavailable_days")
    m.addConstr((f.sum(i, '*', '*') - f.sum('*', i, '*') <= 5))
    m.addConstr((f.sum(i, '*', '*') - f.sum('*', i, '*') >= -5))
    for j in teams:
        m.addConstrs((g[i, j, t] <= (h[i, t] + a[j, t] + l[j, i, t]) / 3 for t in timesteps), name=f"{j}_plays_at_{i}")
        m.addConstr((gp.quicksum(g[i, j, t] + g[j, i, t] for t in timesteps) >= required_games[i][j]), name=f"{i}_{j}_Required_games")
        m.addConstrs((b[j, t] - b[i, t] - f[i, j, t] <= 1 - g[i, j, t] for t in timesteps))
        m.addConstrs((b[j, t] - b[i, t] - f[i, j, t] >= g[i, j, t] - 1 for t in timesteps))
        m.addConstrs((f[i, j, t] >= -g[i, j, t] for t in timesteps))
        m.addConstrs((f[i, j, t] <= g[i, j, t] for t in timesteps))
    for t in timesteps:
        m.addConstr((h[i, t] + a[i, t] <= 1), name=f"{i}_Maximum_game_timestep_{t}")
        m.addConstr((l[i, i, t] >= h[i, t]), name=f"{i}_Home_games_{t}")
        m.addConstr((gp.quicksum(l[i, j, t] for j in teams if j != i) >= a[i, t]), name=f"{i}_Away_games_{t}")
        m.addConstr((gp.quicksum(l[i, j, t] for j in teams) == 1), name=f"{i}_Unique_location_{t}")
        if t != 0:
            m.addConstr((b[i, t] >= h[i, t] + a[i, t] + h[i, t - 1] + a[i, t - 1] - 1))
            m.addConstr((b[i, t] <= (h[i, t] + a[i, t] + h[i, t - 1] + a[i, t - 1]) / 2))
        if t < timesteps[-1] - 2:
            m.addConstr((b[i, t] + b[i, t + 1] + b[i, t + 2] + b[i, t + 3] <= 1))
# objective: total travel distance between consecutive-day locations
obj = gp.quicksum(distances[k][j] * l[i, k, t - 1] * l[i, j, t]
                  for t in timesteps[1:]
                  for j in teams for k in teams for i in teams)
m.setObjective(obj, GRB.MINIMIZE)
m.optimize()
I've attached pictures of the output below and here's a link to the 'teams_2019.csv' file. Is there anything I can do to help improve performance and find feasible solutions?
Another question I had: do you think it's better to use binary variables instead of quadratic objective terms? That is, add a binary variable z[i, j, k, t] for each product l[i, k, t - 1] * l[i, j, t], add the constraint z[i, j, k, t] >= l[i, k, t - 1] + l[i, j, t] - 1, and replace each product with z[i, j, k, t] in the objective. This would turn the QIP into an IP, but I'm not sure whether performance would improve.
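For binary variables this linearization is exact, and a quick brute-force check (plain Python, no Gurobi needed) confirms it. Note that in a pure minimization with nonnegative objective coefficients (your distances), the single lower-bound constraint you propose already suffices, since the solver pushes each z down to its minimum feasible value:

```python
# Sketch: verify that the standard linearization of a product of two
# binaries is exact. For binary x, y, the constraints
#   z >= x + y - 1,   z <= x,   z <= y,   z binary
# admit exactly one value of z, namely z = x * y.
from itertools import product

def feasible_z(x, y):
    """Binary values of z satisfying all three linearization constraints."""
    return [z for z in (0, 1) if z >= x + y - 1 and z <= x and z <= y]

for x, y in product((0, 1), repeat=2):
    assert feasible_z(x, y) == [x * y]
```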
I also had an issue with memory: I want to set `timesteps = [t for t in range(177)]` (it's currently 100), but this leads to insufficient RAM. If I reformulate the program as an IP, would memory usage change significantly? Is there any way to reformulate the program to reduce the memory required?
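As a back-of-the-envelope estimate (assuming the 30 NBA teams from the CSV), both the variable count and the number of bilinear objective terms grow linearly in the number of timesteps, and the bilinear terms dominate; linearizing would trade each bilinear term for one extra binary variable plus constraints, so it would not obviously shrink the model:

```python
# Rough size estimate for the model above (assumption: 30 teams).
def model_size(T, teams=30):
    nl = teams * teams * T   # index set for l, g, f
    nha = teams * T          # index set for h, a, b
    n_vars = 3 * nl + 3 * nha
    # the objective has one bilinear term per (i, k, j, t) with t >= 1
    n_quad_terms = teams ** 3 * (T - 1)
    return n_vars, n_quad_terms

print(model_size(100))  # (279000, 2673000)
print(model_size(177))  # (493830, 4752000)
```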
Thanks for the help!
(the log continues for a long time...)
-
Hi Derek,
Could you please share the model such that we can have a closer look? You can use the write method to generate an MPS file.
Note that uploading files in the Community Forum is not possible but we discuss an alternative in Posting to the Community Forum.
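A minimal sketch, assuming `m` is the `gurobipy.Model` built in your script (no solve is needed before exporting):

```python
# Export the built model so it can be shared.
m.update()            # apply any pending modifications
m.write("model.mps")  # MPS format preserves full numerical precision
m.write("model.lp")   # LP format is human-readable, handy for inspection
```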
Best regards,
Jaromił
-
Hi Jaromił,
Thanks for your response. Here's the model with t=1,...,100 and the model with t=1,...,177 (which is the actual problem I am aiming to solve but cannot currently optimize due to insufficient RAM). If you have a chance to run the latter model, would you be able to tell me how much RAM I would need for it?
Kind regards,
Derek
-
Hi Derek,
I was able to successfully run the model with t=1,...,100 on a machine with 4 cores and 32GB of RAM. I ran it for ~2 hours with default settings and it used at most ~8GB of RAM.
I was also able to successfully run the model with t=1,...,177 on the same machine (4 cores, 32GB of RAM). I ran it for ~1 hour with default settings and it used at most ~28GB of RAM.
If you are planning to solve these models, I would recommend switching to a powerful machine, maybe a cloud machine.
Best regards,
Jaromił
-
Great, were you able to find a feasible solution to those models within that time?
Kind regards,
Derek
-
No, unfortunately not. Because each LP relaxation solve takes a very long time, I think one good approach would be to set NoRelHeurTime to a very large value, probably at least a few hours, and hope that the no-relaxation heuristic finds a feasible solution.
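Concretely, that could look like the following (the specific values are illustrative assumptions, not tuned recommendations):

```python
# Assumption: spend most of the time budget in the no-relaxation heuristic
# before branch-and-bound starts, since each LP relaxation is expensive.
m.Params.NoRelHeurTime = 4 * 3600  # e.g. 4 hours in NoRel
m.Params.MIPFocus = 1              # prioritize finding feasible solutions
m.Params.TimeLimit = 6 * 3600      # overall limit in seconds
m.optimize()
```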
An alternative would be to compute a feasible solution via a custom heuristic. However, this is most often very difficult and maybe even impossible.
Best regards,
Jaromił