Objective function coefficients don't match the data in a Quadratic problem
AnsweredHello everyone.
I have regular quadratic assignment problem with n locations and n facilities
First I read the distance between the locations and the shpements between the facilities from Excel into dataframe:
d = pd.read_excel('3.xlsx', sheet_name='Sheet1', header=None) # distance
s = pd.read_excel('3.xlsx', sheet_name='Sheet2', header=None) # Shipments
Fac= range(len(s))
Loc= range(len(d))
Then I convert the into a dictionary- I do this because eventually I want to take advantage of the keys to names the locations and facilities:
d = {(i,j): d[i][j] for i in Bay for j in Loc}
s = {(i,j): d[i][j] for i in MP for j in Fac}
Then the formulatin is straightforward:
m = Model("QAP")
m.modelSense = GRB.MINIMIZE
x = m.addVars(Fac, Loc,
vtype=GRB.BINARY,
name="assign")
m.addConstrs(
(x.sum('*',j) == 1 for j in Loc),
"assignLoc")
m.addConstrs(
(x.sum(i,'*') == 1 for i in Fac),
"assignFac")
m.setObjective(quicksum(d[k,l]*s[i,j]*x[i,k]*x[j,l] for i in Fac for k in Loc for j in Fac for l in Loc), GRB.MINIMIZE)
m.update()
m.optimize()
Everything looks fine. But when I write the model I see funny number for the distance and shipments.
For instance I tried the model for a simple case where n = 3 with the following data:
Distance dataframe:
0 65 7
65 0 6
7 6 0
Which gets us this dictionary
{(0, 0): 0, (0, 1): 65, (0, 2): 7, (1, 0): 65, (1, 1): 0, (1, 2): 6, (2, 0): 7, (2, 1): 6, (2, 2): 0}
Similarly for the shipment:
0 13 16
8 0 10
0 4 0
{(0, 0): 0, (0, 1): 65, (0, 2): 7, (1, 0): 65, (1, 1): 0, (1, 2): 6, (2, 0): 7, (2, 1): 6, (2, 2): 0}
When I write the model here is the output:
Minimize
[ 16900 assign[0,0] * assign[1,1] + 1820 assign[0,0] * assign[1,2]
+ 1820 assign[0,0] * assign[2,1] + 196 assign[0,0] * assign[2,2]
+ 16900 assign[0,1] * assign[1,0] + 1560 assign[0,1] * assign[1,2]
+ 1820 assign[0,1] * assign[2,0] + 168 assign[0,1] * assign[2,2]
+ 1820 assign[0,2] * assign[1,0] + 1560 assign[0,2] * assign[1,1]
+ 196 assign[0,2] * assign[2,0] + 168 assign[0,2] * assign[2,1]
+ 1560 assign[1,0] * assign[2,1] + 168 assign[1,0] * assign[2,2]
+ 1560 assign[1,1] * assign[2,0] + 144 assign[1,1] * assign[2,2]
+ 168 assign[1,2] * assign[2,0] + 144 assign[1,2] * assign[2,1] ] / 2
Subject To
assignLoc[0]: assign[0,0] + assign[1,0] + assign[2,0] = 1
assignLoc[1]: assign[0,1] + assign[1,1] + assign[2,1] = 1
assignLoc[2]: assign[0,2] + assign[1,2] + assign[2,2] = 1
assignFac[0]: assign[0,0] + assign[0,1] + assign[0,2] = 1
assignFac[1]: assign[1,0] + assign[1,1] + assign[1,2] = 1
assignFac[2]: assign[2,0] + assign[2,1] + assign[2,2] = 1
Bounds
Binaries
assign[0,0] assign[0,1] assign[0,2] assign[1,0] assign[1,1] assign[1,2]
assign[2,0] assign[2,1] assign[2,2]
End
All these cost- s*d, values are wrong. For instance, take the first term of the objective function:
16900 assign[0,0] * assign[1,1]
From the objective function fromulation the cost of this term is
d[0,1]*s[0,1]
Which from the data above should be: 65*13 = 845 not 16900.
I couldn't figure out the problem but I thought it might be the dictionary I'm using for the data, so I used the data straight from the dataframes. Now the objective function is- to reflect the indexing of the dataframe:
m.setObjective(quicksum(d[k][l]*s[i][j]*x[i,k]*x[j,l] for i in Fac for k in Loc for j in Fac for l in Loc), GRB.MINIMIZE)
I got different results and numbers. Now the model is written as follows:
Minimize
[ 2730 assign[0,0] * assign[1,1] + 294 assign[0,0] * assign[1,2]
+ 2080 assign[0,0] * assign[2,1] + 224 assign[0,0] * assign[2,2]
+ 2730 assign[0,1] * assign[1,0] + 252 assign[0,1] * assign[1,2]
+ 2080 assign[0,1] * assign[2,0] + 192 assign[0,1] * assign[2,2]
+ 294 assign[0,2] * assign[1,0] + 252 assign[0,2] * assign[1,1]
+ 224 assign[0,2] * assign[2,0] + 192 assign[0,2] * assign[2,1]
+ 1820 assign[1,0] * assign[2,1] + 196 assign[1,0] * assign[2,2]
+ 1820 assign[1,1] * assign[2,0] + 168 assign[1,1] * assign[2,2]
+ 196 assign[1,2] * assign[2,0] + 168 assign[1,2] * assign[2,1] ] / 2
Subject To
assignLoc[0]: assign[0,0] + assign[1,0] + assign[2,0] = 1
assignLoc[1]: assign[0,1] + assign[1,1] + assign[2,1] = 1
assignLoc[2]: assign[0,2] + assign[1,2] + assign[2,2] = 1
assignFac[0]: assign[0,0] + assign[0,1] + assign[0,2] = 1
assignFac[1]: assign[1,0] + assign[1,1] + assign[1,2] = 1
assignFac[2]: assign[2,0] + assign[2,1] + assign[2,2] = 1
Bounds
Binaries
assign[0,0] assign[0,1] assign[0,2] assign[1,0] assign[1,1] assign[1,2]
assign[2,0] assign[2,1] assign[2,2]
End
Now for the first term of the objective function
2730 assign[0,0] * assign[1,1]
The coefficient is still not 845 but it's not 16900 anymore.
So I have a few questions here:
- What do you think the reason I am not getting the coefficients correct?
- Even though they are wrong, but why the coefficients as different when using dataframes vs dictionaries? I thought the way I modeled it was just a different representation of the same data
- In both models, the objection functions are divided by 2 (/ 2) at the end. Why is that? Also if I divide the coefficients of both models by half I still don't the correct values. For instance, 16900/2 = 8450 not 845, and 2730/2 is 1365 not 845. Why is the division and why the values are still off?
Thank you for your time in advance. I tried to be very through with the details but I would happily share more if needed
-
Official comment
This post is more than three years old. Some information may not be up to date. For current information, please check the Gurobi Documentation or Knowledge Base. If you need more help, please create a new post in the community forum. Or why not try our AI Gurobot?. -
Hi Ashraf,
You define the \( \texttt{s} \) dictionary using the distances DataFrame instead of the shipments DataFrame:
s = {(i,j): d[i][j] for i in MP for j in Fac}
This explains why the first approach with your intermediate dictionaries produces the incorrect results.
For the second approach that directly uses the DataFrames, note that the coefficient for \( \texttt{assign[0,0]*assign[1,1]} \) shows up in the summation for not only \( i=k=0, j=l=1 \), but also \( i=k=1, j=l=0 \). That is, \( x_{0,0} x_{1,1} = x_{1,1} x_{0,0} \). So, we have
$$\begin{align*} s_{0,1} d_{0,1} x_{0,0} x_{1,1} + s_{1,0} d_{1,0} x_{1,1} x_{0,0} = (13)(65) x_{0,0} x_{1,1} + (8)(65) x_{1,1} x_{0,0} = 1365 x_{0,0} x_{1,1}. \end{align*}$$
This matches the coefficient shown in the model file (\( 2730/2 \)). By convention, the quadratic objective of a model is often written in the form \( \frac{1}{2} x^\top Q x + c^\top x + d \).
Thanks,
Eli
0 -
Thank you so much Eli, and sorry for the mistake I missed.
Is there any sources you are aware of about how quadratic programming is handled in Gurobi? I found few examples here and there, but the last part of your response is an eye opener to something I wasn't aware of.
One last question about data structure. I'm testing small instance of the problem, and therefore haven't notice any difference between using range, list, dictionary, tuplelist, etc. for inputting the data and feed it to the model. But is there a recommended type once I move forward to larger instances?
Again Eli, I really appreciate your help. Thank you
0 -
Hi Ashraf,
Quadratic programs are solved using either barrier or simplex. There are some resources suggested in this community post that you might find useful.
Constructing the model should be a lot quicker than solving the model. The specific choice of data structure shouldn't matter too much. Tuplelists and tupledicts can help improve the readability of some code with their built-in helper methods.
It's much more important to structure your code efficiently. This includes introducing intermediate data structures if you repeatedly make the same calculations, using Model.addVars() instead of many calls to Model.addVar(), leveraging something like quicksum() to build large LinExpr objects, etc.
Thanks,
Eli
0 -
Hello Eli,
I hope you are doing well. Currently I'm running a mid-large instance of this quadratic problem- with 100 Loc and 30 Fac. Building the objective function takes a long time. I broke down the constraints and variable creation into individual cells. When I execute all of that it happens instantaneously. BUT the objective function takes forever
m.setObjective(quicksum(d[k][l]*s[i][j]*x[i,k]*x[j,l] for i in Fac for k in Loc for j in Fac for l in Loc), GRB.MINIMIZE)
Is there a way to speed up that? I'm using dataframes to input the data.
Thank you in advance
0 -
It is probably faster to build the objective expression by creating a QuadExpr object, then adding each quadratic objective term individually with QuadExpr.addTerms(). Afterwards, you can pass this quadratic expression to Model.setObjective().
Additionally, assuming \( d_{kl} = d_{lk} \) and \( s_{ij} = s_{ji} \), you could combine the separate objective terms \( d_{kl} s_{ij} x_{ik} x_{jl} \) and \( d_{lk} s_{ji} x_{jl} x_{ik} \) into the single term \( 2 d_{kl} s_{ij} x_{ik} x_{jl} \) when building the expression. That would save you the trouble of looping over every combination of indices.
0 -
Hello Eli,
Thank you so much. One more question and hopefully the last- sorry for bugging you with this. How do I take advantage of the symmetry here? I gave it a shot but didn't work. Here is what I did:
1- To reduce the size of the input data, I input the data in dictionaries where only one direction for distance and flow is entered. E.g., d(0,1), d(0,2),...d(0,n), d(1,2), d(1,3),....,d(1,n),d(2,3),.......,d(n-1,n). I did the same thing for the flow s.
2- I created the variable the same way as above.
3- I wrote the objective function as follows:
obj=quicksum(d[k,l]*s[i,j]*x[i,k]*x[j,l] for (i,k) in x for (j,l) in x if i<j and k<l)
This results in a different answer- value and solution, than using a full matrix for distance and flow and writing the objective as
obj=quicksum(2*d[k,l]*s[i,j]*x[i,k]*x[j,l] for (i,k) in x for (j,l) in x if i!=j and k!=l)
0 -
The first approach can't be right. Let's say you have 30 facilities and 100 locations. The variable \( x_{0,99} \) would never show up in the objective function. If \( x_{0,99} \) would show up as an \( x_{ik} \) term in your summation (\( i = 0 \), \(k = 99\)), then we cannot satisfy \( k < \ell \). Likewise, if \( x_{0,99} \) would show up as an \( x_{j\ell} \) term (\( j = 0 \), \( \ell = 99 \)), then we cannot satisfy \( i < j \).
Also, note that the second approach (using the full matrix) multiplies the objective function by \( 2 \). Perhaps you misunderstood my previous post, but I don't see why this \( 2 \) should be there.
I think what you want is:
obj = quicksum(2*d[k, l]*s[i, j]*x[i, k]*x[j, l] for (i, k) in x for (j, l) in x if i < j and k != l)
It may look a bit strange, but we can prove its equivalence to your original objective function. Let \( N \) be the set of facilities and \( M \) the set of locations. The original objective is:
$$ \sum_{i \in N} \sum_{\substack{j \in N \colon \\ i \neq j}} \sum_{k \in M} \sum_{\substack{\ell \in M \colon \\ k \neq \ell}} d_{k\ell} s_{ij} x_{ik} x_{j\ell}.$$
We can rewrite the objective function by breaking up the \( i \neq j \) condition into two separate conditions, \( i < j \) and \( i > j \):
$$ \sum_{i \in N} \sum_{\substack{j \in N \colon \\ i < j}} \sum_{k \in M} \sum_{\substack{\ell \in M \colon \\ k \neq \ell}} d_{k\ell} s_{ij} x_{ik} x_{j\ell} + \sum_{i \in N} \sum_{\substack{j \in N \colon \\ i > j}} \sum_{k \in M} \sum_{\substack{\ell \in M \colon \\ k \neq \ell}} d_{k\ell} s_{ij} x_{ik} x_{j\ell}. $$
Next, we change some indexes in the second summation, which doesn't affect the summation at all. Namely, we swap the \( i \)'s with the \( j \)'s, and the \( k \)'s with the \( \ell \)'s:
$$ \sum_{i \in N} \sum_{\substack{j \in N \colon \\ i < j}} \sum_{k \in M} \sum_{\substack{\ell \in M \colon \\ k \neq \ell}} d_{k\ell} s_{ij} x_{ik} x_{j\ell} + \sum_{j \in N} \sum_{\substack{i \in N \colon \\ j > i}} \sum_{\ell \in M} \sum_{\substack{k \in M \colon \\ \ell \neq k}} d_{\ell k} s_{ji} x_{j\ell} x_{ik}. $$
We can rearrange the order of the second summation without affecting the end result:
$$ \sum_{i \in N} \sum_{\substack{j \in N \colon \\ i < j}} \sum_{k \in M} \sum_{\substack{\ell \in M \colon \\ k \neq \ell}} d_{k\ell} s_{ij} x_{ik} x_{j\ell} + \sum_{i \in N} \sum_{\substack{j \in N \colon \\ i < j}} \sum_{k \in M} \sum_{\substack{\ell \in M \colon \\ k \neq \ell}} d_{\ell k} s_{ji} x_{j\ell} x_{ik}. $$
Now, the summation order and indices for both of our big summation expressions match. Because \( d_{k\ell} = d_{\ell k} \), \(s_{ij} = s_{ji}\), and \(x_{ik}x_{j\ell} = x_{j\ell} x_{ik} \), we end up with
$$ 2 \sum_{i \in N} \sum_{\substack{j \in N \colon \\ i < j}} \sum_{k \in M} \sum_{\substack{\ell \in M \colon \\ k \neq \ell}} d_{k\ell} s_{ij} x_{ik} x_{j\ell}. $$
This is the expression used in the \( \texttt{quicksum()} \) function above. But like I mentioned, it's likely faster to build this expression with QuadExpr.addTerms(). E.g.:
obj = QuadExpr()
for (i, k) in x:
for (j, l) in x:
if i < j and k != l:
obj.addTerms(2*d[k, l]*s[i, j], x[i, k], x[j, l])
m.setObjective(obj)0 -
Hello Eli,
I hope you are doing well. I'm sorry about bugging you revisiting this. I have been trying to include a new idea in this problem and wanted to get an opinion in my approach.
So far everything works fine, but there is a condition on some of the facility and how they are placed. I get lists of facilities- they change from run to run, and the facilities within each list should be placed as close as possible to one another. For example, let say I have the following two lists:
list1 = [Fac1, Fac7]
list2 = [Fac6, Fac7, Fac11]Fac1 and Fac7 should be as close are possible to each other, and Fac6, Fac8, and Fac11 close to each other as well. My idea is to penalize the distance between the assignment of the facilities in each list. For the first list, M d ij xfac1 i xfac7 j, where M is a penalty and this term added to the objective function. For the second list we will add the following terms:
M d ij xfac6 i xfac8 j, M d ij xfac6 i xfac11 j, M d ij xfac8 i xfac11 j
I have a few questions here.
- Is this approach good to handle this situation?
- How to mathematically formulate that in Gurobi given that the number of lists and their sizes differ from run to run? Also if this approach is to be formulated, how to avoid using unnecessary terms since the distance is always symmetric. For instance in the first list only M d ij xfac6 i xfac8 j is sufficient and there is no need for the other term M d ji xfac8 i xfac6 j
0 -
Your penalty-based approach seems fine to me, though I never studied QAPs much. Another idea could be to add constraints that all facilities in the list are within some fixed distance of one another.
To formulate this in Gurobi, iterate through each combination of two facilities in your "close facilities" list and add together the corresponding penalty terms. E.g.:
import itertools
M = 1000
fac_list = ['Fac6', 'Fac7', 'Fac11']
penalty = QuadExpr()
for i, j in itertools.combinations(fac_list, 2):
for k in Loc:
for l in Loc:
if k < l:
penalty.addTerms(M*d[k, l], x[i, k], x[j, l])
# Do something with penalty expressionThe check \( \texttt{if k < l} \) avoids unnecessary (repeated) terms. The above code can be easily generalized to the case where you have multiple lists of facilities that should be close.
0 -
Thank you so much!
I really appreciate your help and I hope you have a happy Holidays.
0
Post is closed for comments.
Comments
11 comments