Maximum coverage problem, unable to set quantity constraints
Hey guys,
My question is described as follows:
I have a collection of sets A[0], A[1], A[2], ..., A[40], and there are repeated elements between these sets.
Together, these sets form the set B of unique elements.
Now I need to select n sets from A to form a set C with no repeated elements, such that the number of elements in C exceeds 0.8 times the number of elements in B, with n as small as possible. How can I achieve this using gurobipy?
In fact, I wrote some code, but when it comes to counting the covered elements, I can't implement that part.
import gurobipy as gp
from gurobipy import GRB
import numpy as np

A = np.random.randint(0, 200, size=(50, 60))
B = np.unique(A.flatten())  # np.unique already returns a sorted array
count_point = len(B)
flag_chosedPoint_each_set = np.zeros((len(A), 200), dtype=int)
for i in range(len(A)):
    flag_chosedPoint_each_set[i][A[i]] = 1

env = gp.Env()
model = gp.Model(env=env)
x = model.addVars(len(A), vtype=GRB.BINARY, name='x')

# add constraints
selected = model.addVars(count_point, vtype=GRB.CONTINUOUS, name='selected_flag_each_point')
for i in range(count_point):
    selected[i].Start = 0
# This is the part I can't express in Gurobi: selected[i] should become 1
# if any chosen set covers point i, and 0 otherwise.
for i in range(count_point):
    selected[i] = gp.quicksum(x[j] * flag_chosedPoint_each_set[j][i] for j in range(len(A)))
    if selected[i] >= 1:
        selected[i] = 1
    else:
        selected[i] = 0
model.addConstr(sum(selected) >= 0.8 * count_point)
model.optimize()

Hi Victor,
I'm a little confused by the statement “set C with no repeated elements”, since a set contains no repeated elements by definition. I'm going to assume this means that the A sets you choose cannot contain the same element. If this is not the case, then you can ignore the following.
If you think of each of the A sets as a node in a graph, and connect nodes with an edge if the sets share a common element (and therefore cannot both be chosen), then solutions to your problem will be an independent set: https://en.m.wikipedia.org/wiki/Independent_set_(graph_theory)#Maximum_independent_sets_and_maximum_cliques
The formulation will be very similar to the one here: https://vamshij.com/blog/linearoptimization/graphsandintegerprogramming/#independentsetproblem. However, you will minimize the count of nodes chosen, not maximize it, and you will add a single constraint that weights the binary variables by the number of elements in the corresponding set and ensures the weighted sum is at least 0.8 times the number of elements in B.
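As a concrete illustration of the conflict graph (a tiny hypothetical instance in plain Python, not your actual data):

```python
import itertools

# Hypothetical small instance: five candidate sets over elements 0..9.
A = [{0, 1, 2}, {2, 3}, {4, 5}, {5, 6, 7}, {8, 9}]

# Connect sets i and j whenever they share an element and therefore
# cannot both be chosen in an independent set.
edges = [(i, j) for i, j in itertools.combinations(range(len(A)), 2)
         if A[i] & A[j]]
print(edges)  # [(0, 1), (2, 3)]
```

Any node subset avoiding these edges, such as {0, 2, 4}, is then a candidate selection.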
 Riley
Bonjour Riley @Riley Clement,
Thank you for your reply and suggestion. I looked at these solutions and they are all good. However, it's a little different from what I had in mind. In the end, I have a block of code that I don't know how to transfer into Gurobi. Can you give me some suggestions?
For the first question, I am sorry: I used a list to store the data (instead of a set), so I apologize for confusing you with that sentence.
Below I give a detailed description of my idea (the following sets are defined mathematically):
Set B = A[0] ∪ ... ∪ A[40]
Set C = A[n1] ∪ ... A[i] ... ∪ A[n2]
1. I create an index table for all elements in B, corresponding to each row of flag_chosedPoint_each_set.
2. For each subset A[i] there is a flag row flag_chosedPoint_each_set[i].
3. My constraints are: as few sets as possible, as many points as possible, and also that the number of elements in set C satisfies |C| > 0.8 |B|. (The reason is that some sets have too many repeated elements and contribute too few new ones; I don't think including them makes sense.)
4. The goal of the optimization is to minimize the number of sets:
x = model.addVars(len(A), vtype=GRB.BINARY, name='x')
One of my new constraints is: the repeated elements across all selected A[i] sets are only counted once.
I don't know how to explain it in words, so I will use code. x[i] indicates whether the set A[i] is selected (value 0 or 1); flag_chosedPoint_each_set[i][j] indicates whether point B[j] exists in the set A[i] (value 0 or 1). For all elements in B, I create a list of indicator variables:
selected = model.addVars(len(B), vtype=GRB.BINARY, name='selected')
Now I count the elements based on x. The following is what needs to be translated into Gurobi:
# x = model.addVars(len(A), vtype=GRB.BINARY, name='x')
for i in range(len(B)):  # count the elements
    t = [x[j] * flag_chosedPoint_each_set[j][i] for j in range(len(A))]
    t = sum(t)
    if t > 0:
        selected[i] = 1
    else:
        selected[i] = 0
The constraint in point 3 then becomes:
model.addConstr(sum(selected) >= 0.8 * len(B))
I don't know how to transfer this code into Gurobi. Can you give me some suggestions? I appreciate your reply.
Victor
Hi Victor,
No problem, I'm going to hand this over to my new colleague ChungKyun to answer as part of his training.
 Riley
Hey Riley @Riley Clement,
Thank you very much. I took Maliheh's suggestion and found that her method adds too many constraints, leading to too much time being consumed, because set B has too many points (almost 0.6 million). So I want an optimized method.
I have no idea how. Here is some of the log.
Thank you again.
Academic license - for non-commercial use only - expires 2024-06-06
Set parameter TimeLimit to value 7200
Set parameter Threads to value 16
Gurobi Optimizer version 10.0.2 build v10.0.2rc0 (win64)
CPU model: Intel(R) Core(TM) i9-10980HK CPU @ 2.40GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 8 physical cores, 16 logical processors, using up to 16 threads
Optimize a model with 1 rows, 190734 columns and 190692 nonzeros
Model fingerprint: 0x21041648
Model has 381384 general constraints
Variable types: 0 continuous, 190734 integer (190734 binary)
Coefficient statistics:
Matrix range [1e+00, 1e+00]
Objective range [1e+00, 1e+00]
Bounds range [1e+00, 1e+00]
RHS range [2e+05, 2e+05]
GenCon rhs range [1e+00, 1e+00]
GenCon coe range [1e+00, 1e+00]
Found heuristic solution: objective 42.0000000
Presolve removed 0 rows and 0 columns (presolve time = 5s) ...
Presolve removed 0 rows and 0 columns (presolve time = 10s) ...
Presolve removed 0 rows and 0 columns (presolve time = 15s) ...
Presolve removed 0 rows and 8336 columns (presolve time = 20s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 498s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 500s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 505s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 510s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 515s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 520s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 525s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 530s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 535s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 540s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 545s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 550s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 555s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 560s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 565s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 570s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 575s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 580s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 585s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 590s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 595s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 600s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 605s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 610s) ...
Presolve removed 0 rows and 190692 columns (presolve time = 615s) ...
Presolve removed 204603 rows and 201270 columns (presolve time = 620s) ...
Presolve removed 583742 rows and 575918 columns (presolve time = 625s) ...
Presolve added 365703 rows and 0 columns
Presolve removed 0 rows and 7857 columns
Presolve time: 629.16s
Presolved: 365704 rows, 182877 columns, 5521078 nonzeros
Found heuristic solution: objective 4.0000000
Variable types: 0 continuous, 182877 integer (182875 binary)
Deterministic concurrent LP optimizer: primal simplex, dual simplex, and barrier
Showing barrier log only...
Root barrier log...
Ordering time: 0.02s
Barrier statistics:
Dense cols : 41
AA' NZ : 1.181e+06
Factor NZ : 1.244e+06 (roughly 50 MB of memory)
Factor Ops : 3.142e+07 (less than 1 second per iteration)
Threads : 14
                  Objective                Residual
   Iter       Primal          Dual         Primal    Dual     Compl     Time
   0   1.23039803e+02  1.93591614e+04  2.78e+04 1.32e-02  2.84e+01   638s
   1   1.43947800e+02  2.38949948e+04  2.11e+04 1.06e-01  2.11e+01   638s
   2   1.63593635e+02  3.01180612e+04  1.02e+04 1.86e-02  1.07e+01   638s
   3   4.06364290e+01  3.27790781e+04  6.51e+02 4.02e-12  9.24e-01   639s
   4   2.58089510e+01  2.11142976e+04  2.20e+02 4.34e-12  3.36e-01   639s
   5   1.92638267e+01  1.12157569e+04  5.42e+01 2.11e-12  1.19e-01   639s
   6   1.76512024e+01  7.13921557e+03  2.57e+01 1.29e-12  6.80e-02   639s
   7   1.61030551e+01  2.53175811e+03  1.31e+00 2.60e-13  2.16e-02   639s
   8   1.53270522e+01  9.60781460e+02  6.10e-05 8.13e-14  8.20e-03   639s
   9   1.42073470e+01  2.42724001e+02  6.32e-07 2.80e-14  2.16e-03   639s
  10   1.26555278e+01  1.39269965e+02  2.71e-07 2.02e-14  1.28e-03   639s
  11   1.18108165e+01  8.89959140e+01  2.27e-07 1.06e-14  8.47e-04   639s
  12   1.04603145e+01  5.92301532e+01  1.61e-07 6.58e-15  5.86e-04   639s
  13   8.09159892e+00  3.49461855e+01  7.44e-08 1.00e-14  3.62e-04   639s
  14   6.53633776e+00  2.90847119e+01  4.71e-08 8.85e-15  2.99e-04   639s
  15   6.09836012e+00  1.96764925e+01  4.06e-08 9.34e-15  2.17e-04   639s
  16   5.16983698e+00  1.65367641e+01  2.92e-08 6.21e-15  1.82e-04   639s
  17   4.88000327e+00  1.17526434e+01  2.57e-08 8.17e-15  1.40e-04   640s
  18   4.78872033e+00  1.03002495e+01  2.44e-08 5.58e-15  1.27e-04   640s
  19   4.59681475e+00  8.92713995e+00  2.19e-08 6.59e-15  1.14e-04   640s
  20   4.53715515e+00  7.95186333e+00  2.07e-08 5.23e-15  1.05e-04   640s
  21   4.23473967e+00  5.28017938e+00  1.77e-08 5.05e-15  8.00e-05   640s
  22   4.03967964e+00  4.01022390e+00  1.58e-08 3.80e-15  6.76e-05   640s
  23   3.58926184e+00  3.08462233e+00  1.25e-08 2.43e-15  5.61e-05   640s
  24   3.36788220e+00  2.10169795e+00  1.07e-08 3.32e-15  4.60e-05   640s
  25   3.12092064e+00  3.12519169e-01  8.81e-09 3.44e-15  2.89e-05   640s
  26   2.90432001e+00  1.20998725e-02  7.33e-09 4.14e-15  2.43e-05   640s
  27   2.57008549e+00  6.59184800e-01  4.84e-09 3.97e-15  1.61e-05   640s
  28   2.36727147e+00  1.01661318e+00  3.59e-09 5.12e-15  1.14e-05   640s
  29   2.34498679e+00  1.03102160e+00  3.44e-09 5.01e-15  1.10e-05   640s
  30   2.28851858e+00  1.08379304e+00  3.06e-09 3.43e-15  1.01e-05   640s
  31   2.18889062e+00  1.26572654e+00  2.31e-09 4.57e-15  7.76e-06   641s
  32   2.15115698e+00  1.36123661e+00  1.89e-09 3.68e-15  6.64e-06   641s
  33   2.10586122e+00  1.41146883e+00  1.50e-09 4.11e-15  5.84e-06   641s
  34   2.09548194e+00  1.52325186e+00  1.33e-09 3.93e-15  4.81e-06   641s
  35   2.06622470e+00  1.55799673e+00  1.06e-09 2.71e-15  4.27e-06   641s
  36   2.04243125e+00  1.59714393e+00  8.62e-10 3.52e-15  3.74e-06   641s
  37   2.03532614e+00  1.62927442e+00  7.95e-10 3.20e-15  3.41e-06   641s
  38   2.03374307e+00  1.64034763e+00  7.66e-10 6.59e-15  3.31e-06   641s
  39   2.03245293e+00  1.66190936e+00  6.52e-10 4.36e-15  3.11e-06   641s
  40   2.01896350e+00  1.68401555e+00  4.87e-10 5.34e-15  2.81e-06   641s
  41   2.01373831e+00  1.73455004e+00  4.40e-10 3.63e-15  2.35e-06   641s
  42   2.00675254e+00  1.74518780e+00  3.71e-10 2.78e-15  2.20e-06   641s
  43   2.00684511e+00  1.75011280e+00  3.36e-10 3.95e-15  2.16e-06   641s
  44   1.99936616e+00  1.78967233e+00  2.91e-10 4.39e-15  1.76e-06   641s
  45   1.99755628e+00  1.80541452e+00  2.77e-10 3.90e-15  1.61e-06   642s
  46   1.98735027e+00  1.81136992e+00  2.28e-10 6.07e-15  1.48e-06   642s
  47   1.98464190e+00  1.82459217e+00  2.14e-10 5.60e-15  1.35e-06   642s
  48   1.98207837e+00  1.82863398e+00  2.07e-10 5.46e-15  1.29e-06   642s
  49   1.96378386e+00  1.83503333e+00  1.53e-10 5.99e-15  1.08e-06   642s
  50   1.94690586e+00  1.86783322e+00  1.01e-10 6.22e-15  6.64e-07   642s
  51   1.92484365e+00  1.90031480e+00  3.78e-11 6.13e-15  2.06e-07   642s
  52   1.91014217e+00  1.90950151e+00  5.15e-13 2.43e-13  5.38e-09   642s
  53   1.90971262e+00  1.90971258e+00  1.57e-10 2.33e-14  2.92e-13   642s
  54   1.90971261e+00  1.90971261e+00  2.60e-12 1.10e-14  2.26e-17   642s
Barrier solved model in 54 iterations and 642.16 seconds (668.66 work units)
Optimal objective 1.90971261e+00
Root crossover log...
21855 DPushes remaining with DInf 0.0000000e+00               642s
    0 DPushes remaining with DInf 6.4243759e-04               643s
    0 PPushes remaining with PInf 0.0000000e+00               643s
Push phase complete: Pinf 0.0000000e+00, Dinf 6.4243759e-04   643s
Root simplex log...
Iteration Objective Primal Inf. Dual Inf. Time
10039 1.9097126e+00 0.000000e+00 6.424376e-04 644s
10050 1.9097126e+00 0.000000e+00 0.000000e+00 644s
10050 1.9097126e+00 0.000000e+00 0.000000e+00 644s
Use crossover to convert LP symmetric solution to basic solution...
Root crossover log...
8323 DPushes remaining with DInf 0.0000000e+00 644s
0 DPushes remaining with DInf 0.0000000e+00 647s
2 PPushes remaining with PInf 0.0000000e+00 647s
0 PPushes remaining with PInf 0.0000000e+00 650s
Push phase complete: Pinf 0.0000000e+00, Dinf 2.1070402e-15   650s
Root simplex log...
Iteration Objective Primal Inf. Dual Inf. Time
12725 1.9097126e+00 0.000000e+00 0.000000e+00 653s
12725 1.9097126e+00 0.000000e+00 0.000000e+00 656s
Concurrent spin time: 12.46s (can be avoided by choosing Method=3)
Solved with barrier
Root relaxation: objective 1.909713e+00, 12725 iterations, 35.97 seconds (103.86 work units)
Total elapsed time = 672.94s
    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time
     0     0    1.90971    0 46456    4.00000    1.90971  52.3%     -  724s
Hi Victor, the problem can be concisely modelled with our Python matrix API. If my colleague CK doesn't find time to answer tomorrow, then I'll provide a solution. Please avoid double-posting questions on the forum.
Riley
Hi Victor,
Here is my script for the problem you are working on.
import gurobipy as gp
from gurobipy import GRB
import numpy as np

sizeB = 200
sizeAi, sizeAj = 50, 60
A = np.random.randint(0, sizeB, size=(sizeAi, sizeAj))
B = np.arange(sizeB)

# e_ij[i][j] = 1 if element j appears in set A[i]
e_ij = np.zeros((sizeAi, sizeB))
for i in range(sizeAi):
    for j in set(A[i]):
        e_ij[i][j] = 1

model = gp.Model()
x = model.addVars(sizeAi, vtype=GRB.BINARY, name='x')
y = model.addVars(sizeB, vtype=GRB.BINARY, name='y')
# y[j] can only be 1 if some selected set covers element j
for j in range(sizeB):
    model.addConstr(y[j] <= gp.quicksum(x[i] * e_ij[i][j] for i in range(sizeAi)))
model.addConstr(y.sum() >= 0.8 * sizeB)
model.setObjective(x.sum(), GRB.MINIMIZE)
model.optimize()

if model.Status == GRB.OPTIMAL:
    solX = [i for i in range(sizeAi) if x[i].X > 0.5]
    solY = [j for j in range(sizeB) if y[j].X > 0.5]
    print(solX)
    print(solY)

If you only care about variable x (whether or not a subset A[i] of A is included in a solution), this model is enough.
However, if you check the details of variable y, there is a limitation: even though A[i] is selected, an element of A[i] (in my script, y[j]) might not be counted in the final solution.
For example:
e_ij =
1 0 1 0 1
0 1 0 1 0
x = [1, 1]
y = [1, 1, 1, 1, 0]
If you add the following constraints, the issue is resolved:
model.addConstrs(
    x[i] <= y[j]
    for i in range(sizeAi)
    for j in range(sizeB)
    if e_ij[i, j]
)
If you have any further questions, please let us know.
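To see concretely what the linking constraints enforce in that example, here is a quick NumPy check (same e_ij and x as above; the check itself is just an illustration, not part of the model):

```python
import numpy as np

e_ij = np.array([[1, 0, 1, 0, 1],
                 [0, 1, 0, 1, 0]])
x = np.array([1, 1])

# With x[i] <= y[j] for every pair where e_ij[i, j] == 1, any element
# covered by a selected set is forced to have y[j] = 1, so y must
# equal the coverage indicator of the selection:
y = (x @ e_ij > 0).astype(int)
print(y.tolist())  # [1, 1, 1, 1, 1]
```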
Hey ChungKyun, @ChungKyun Han
Thank you for sharing. It gives the same result as the strong constraints (the two constraints with >= and <=), but it is faster.
However, I can't understand why it works even though you only give a weak constraint, because y[j] can be 0 or 1 when gp.quicksum(...) >= 1:
for j in range(sizeB):
    model.addConstr(y[j] <= gp.quicksum(x[i] * e_ij[i][j] for i in range(sizeAi)))
Could you explain it to me?
Have a nice day
Victor
Hi Victor,
Generally, if you add more constraints, the model becomes bigger. This usually results in longer computation times because of the extra work per iteration (though, depending on the data and the structure of the model, this is not guaranteed). Note also that the coverage constraint y.sum() >= 0.8 * sizeB means setting a covered y[j] to 1 never hurts feasibility or the objective, which is why the one-sided constraints are enough when you only care about the x solution.
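As a rough illustration of the size difference between the two formulations (hypothetical sizes and density, not your actual instance):

```python
# One y[j] <= sum(...) row per element, versus one x[i] <= y[j] row
# per (set, element) covering pair.
sizeB = 600_000                      # elements, roughly as in your log
avg_sets_per_element = 3             # assumed density of e_ij
weak_rows = sizeB
linking_rows = sizeB * avg_sets_per_element
print(weak_rows, linking_rows)  # 600000 1800000
```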
You can speed up the computation with Gurobi's Lazy constraint attribute, which tells the solver to pull the marked constraints into the model only when they are violated by a feasible solution.
The following code shows how you can mark the constraints as lazy:
cons = model.addConstrs(
    x[i] <= y[j]
    for i in range(sizeAi)
    for j in range(sizeB)
    if e_ij[i, j]
)
for con in cons.values():
    con.Lazy = 1
If you want to decrease the computation time further, please refer to the linked article, 'How do I implement lazy constraints in Gurobi?'.
Also, Riley Clement has provided a matrix form of the model:
m = gp.Model()
x = m.addMVar(sizeAi, vtype="B")
y = m.addMVar(sizeB, vtype="B")
m.addConstr(y <= x @ e_ij)
m.addConstr(y.sum() >= 0.8 * sizeB)
cons = m.addConstr(x[:, np.newaxis] * e_ij <= y)
cons.Lazy = 1
m.setObjective(x.sum(), gp.GRB.MINIMIZE)
m.optimize()
If you have any further requests, feel free to contact us again.
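For anyone puzzled by the broadcasting in that matrix form, here is a quick NumPy shape check (hypothetical sizes; gurobipy's MVar expressions follow the same broadcasting rules as NumPy):

```python
import numpy as np

# Illustrative shapes only, not the real instance.
sizeAi, sizeB = 3, 6
e_ij = np.ones((sizeAi, sizeB))
x = np.ones(sizeAi)

# x @ e_ij has shape (sizeB,): entry j counts how many selected sets
# cover element j, so "y <= x @ e_ij" matches the quicksum constraints.
print((x @ e_ij).shape)                 # (6,)
# x[:, np.newaxis] * e_ij has shape (sizeAi, sizeB); comparing it with y
# broadcasts y across rows, giving one x[i] <= y[j] pair per nonzero entry.
print((x[:, np.newaxis] * e_ij).shape)  # (3, 6)
```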
Best regards,
ChungKyun Han
Hey ChungKyun @ChungKyun,
Thank you for your help and suggestions.
It works.