• Gurobi Staff

Hi Ben,

Could you please share your model file? Note that uploading files in the Community Forum is not possible but we discuss an alternative in Posting to the Community Forum.

Best regards,
Jaromił

Hi Jaromił,

The LP file is below. Hopefully this is the model file you meant? I used three additional sets of dummy variables x1, x2 and x2_abs for the conic constraints. Also, x here is the same as u in the picture above.

Thanks,

Ben

\ LP format - for model browsing. Use MPS format to capture full model detail.Maximize  nu[0] + nu[1] + 1.400853545289202 eta - 0.25 x[0,0] - 0.2 x[1,0]   - 0.05 x[1,1]Subject To R0: pi[0] + pi[1] = 1 R1: - 0.5 eta + 0.5 x[0,0] + x1[0,0] = 0 R2: - 0.5 eta + 0.5 x[0,1] + x1[0,1] = 0 R3: - 0.5 eta + 0.5 x[1,0] + x1[1,0] = 0 R4: - 0.5 eta + 0.5 x[1,1] + x1[1,1] = 0 R5: - 0.5 eta - 0.5 x[0,0] + x2[0,0] = 0 R6: - 0.5 eta - 0.5 x[0,1] + x2[0,1] = 0 R7: - 0.5 eta - 0.5 x[1,0] + x2[1,0] = 0 R8: - 0.5 eta - 0.5 x[1,1] + x2[1,1] = 0 R9: 1969.058006650649 pi[0] - nu[0] - 2 eta + z[0,0] >= 0 R10: 1958.116013412645 pi[0] - nu[0] - 2 eta + z[0,1] >= 0 R11: 1954.058006650649 pi[1] - nu[1] - 2 eta + z[1,0] >= 0 R12: 1883.116013412645 pi[1] - nu[1] - 2 eta + z[1,1] >= 0 qc0: [ z[0,0] ^2 + x1[0,0] ^2 - x2_abs[0,0] ^2 ] <= 0 qc1: [ z[0,1] ^2 + x1[0,1] ^2 - x2_abs[0,1] ^2 ] <= 0 qc2: [ z[1,0] ^2 + x1[1,0] ^2 - x2_abs[1,0] ^2 ] <= 0 qc3: [ z[1,1] ^2 + x1[1,1] ^2 - x2_abs[1,1] ^2 ] <= 0Bounds nu[0] free nu[1] free x1[0,0] free x1[0,1] free x1[1,0] free x1[1,1] free x2[0,0] free x2[0,1] free x2[1,0] free x2[1,1] freeGeneral Constraints GC0: x2_abs[0,0] = ABS ( x2[0,0] ) GC1: x2_abs[0,1] = ABS ( x2[0,1] ) GC2: x2_abs[1,0] = ABS ( x2[1,0] ) GC3: x2_abs[1,1] = ABS ( x2[1,1] )End
• Gurobi Staff

Hi Ben,

Yes, thank you for the model.

You say

If I only define u_{s, a, s'} and z_{s, a, s'} for (a, s') such that \hat{P}_{s, a, s'} > 0 then the model will solve with presolve turned on but the objective value is not what it should be.

What should the objective value be then?

I don't understand why you need the absolute value constraints. Both $$\eta, u$$ are $$\geq 0$$, so all $$x_2$$ variables should be $$\geq 0$$ anyway. Could you briefly elaborate?

Another question: Should the value $$\eta - u_{s,a,s'}$$ be non-negative? It has to be if you want that your quadratic constraints are second order cones.

Hi Jaromił,

Firstly, your comment about the $$|x_2|$$ variables is correct and that was an oversight on my part, sorry for that. After seeing your updated comment I think that it should have been $$|x_1|$$ instead of this, since $$x_1$$ can be negative.

With regards to the objective value: this is being run inside and at the end of a value iteration algorithm. In the example I am looking at, I am using the model to extract a policy for the given values $$v$$ after value iteration ends. The value iteration ended because it is no longer possible to improve the values by more than some very small amount. The objective value of the above model should therefore be very close to the value of $$v_s$$. In the results I have seen, especially for state 0, this is not the case. It seems to me that this happens because $$z_{s, a, s'}$$ is not being set to its lower bound, when it should always be set to that as long as the value is non-negative.

As an example, for the above problem I found the objective value to be 1969.05, when $$v_s = 1958.12$$.
The multipliers came out as $$\nu = (1.969\times 10^{3}, 5.6\times 10^{-4})$$ and $$\eta = 0.00647$$). The values of $$z$$ are $$(0.013, 11.11, 0.013, 0.013)$$. The lower bound of the second value (i.e.\ $$z_{s, 0, 1}$$ is $$10.955$$, so it's being set too high. I think that because of the fact that this term is not in the objective function, it is being set too high in order to increase $$\nu_0$$ and consequently the objective function.

Does that  make sense?

Ben

Update - I changed the quadratic constraint to use $$|x_1|$$ instead of $$|x_2|$$ and now the model is giving me the following result:

Solution count 2: 1958.62 1958.12 Optimal solution found (tolerance 1.00e-07)Warning: max constraint violation (2.5000e-01) exceeds toleranceBest objective 1.958616013413e+03, best bound 1.958616013413e+03, gap 0.0000%

This is looking a lot closer as the second value is $$v_s = 1958.12$$. However, it still has found a solution with higher objective value, which it should not be able to...

• Gurobi Staff

I think the following model does what you want:

\ LP format - for model browsing. Use MPS format to capture full model detail.Maximize  nu[0] + nu[1] + 1.400853545289202 eta - 0.25 x[0,0] - 0.2 x[1,0]   - 0.05 x[1,1]Subject To R0: pi[0] + pi[1] = 1 R1: - 0.5 eta + 0.5 x[0,0] + x1[0,0] = 0 R2: - 0.5 eta + 0.5 x[0,1] + x1[0,1] = 0 R3: - 0.5 eta + 0.5 x[1,0] + x1[1,0] = 0 R4: - 0.5 eta + 0.5 x[1,1] + x1[1,1] = 0 R5: - 0.5 eta - 0.5 x[0,0] + x2[0,0] = 0 R6: - 0.5 eta - 0.5 x[0,1] + x2[0,1] = 0 R7: - 0.5 eta - 0.5 x[1,0] + x2[1,0] = 0 R8: - 0.5 eta - 0.5 x[1,1] + x2[1,1] = 0 R9: 1969.058006650649 pi[0] - nu[0] - 2 eta + z[0,0] >= 0 R10: 1958.116013412645 pi[0] - nu[0] - 2 eta + z[0,1] >= 0 R11: 1954.058006650649 pi[1] - nu[1] - 2 eta + z[1,0] >= 0 R12: 1883.116013412645 pi[1] - nu[1] - 2 eta + z[1,1] >= 0 qc0: [ z[0,0] ^2 + x1[0,0] ^2 - x2[0,0] ^2 ] <= 0 qc1: [ z[0,1] ^2 + x1[0,1] ^2 - x2[0,1] ^2 ] <= 0 qc2: [ z[1,0] ^2 + x1[1,0] ^2 - x2[1,0] ^2 ] <= 0 qc3: [ z[1,1] ^2 + x1[1,1] ^2 - x2[1,1] ^2 ] <= 0Bounds nu[0] free nu[1] free x1[0,0] free x1[0,1] free x1[1,0] free x1[1,1] freeEnd

Best regards,
Jaromił

Hi Jaromił,

Isn't this the same but with $$x_2$$ instead of $$|x_2|$$? If so, didn't we decide that $$x_1$$ needs to be non-negative? I have tried the following model

\ LP format - for model browsing. Use MPS format to capture full model detail.Maximize  nu[0] + nu[1] + 1.400853545289202 eta - 0.25 x[0,0] - 0.2 x[1,0]   - 0.05 x[1,1]Subject To R0: pi[0] + pi[1] = 1 R1: - 0.5 eta + 0.5 x[0,0] + x1[0,0] = 0 R2: - 0.5 eta + 0.5 x[0,1] + x1[0,1] = 0 R3: - 0.5 eta + 0.5 x[1,0] + x1[1,0] = 0 R4: - 0.5 eta + 0.5 x[1,1] + x1[1,1] = 0 R5: - 0.5 eta - 0.5 x[0,0] + x2[0,0] = 0 R6: - 0.5 eta - 0.5 x[0,1] + x2[0,1] = 0 R7: - 0.5 eta - 0.5 x[1,0] + x2[1,0] = 0 R8: - 0.5 eta - 0.5 x[1,1] + x2[1,1] = 0 R9: 1969.058006650649 pi[0] - nu[0] - 2 eta + z[0,0] >= 0 R10: 1958.116013412645 pi[0] - nu[0] - 2 eta + z[0,1] >= 0 R11: 1954.058006650649 pi[1] - nu[1] - 2 eta + z[1,0] >= 0 R12: 1883.116013412645 pi[1] - nu[1] - 2 eta + z[1,1] >= 0 qc0: [ z[0,0] ^2 - x2[0,0] ^2 + x1_abs[0,0] ^2 ] <= 0 qc1: [ z[0,1] ^2 - x2[0,1] ^2 + x1_abs[0,1] ^2 ] <= 0 qc2: [ z[1,0] ^2 - x2[1,0] ^2 + x1_abs[1,0] ^2 ] <= 0 qc3: [ z[1,1] ^2 - x2[1,1] ^2 + x1_abs[1,1] ^2 ] <= 0Bounds nu[0] free nu[1] free x1[0,0] free x1[0,1] free x1[1,0] free x1[1,1] freeGeneral Constraints GC0: x1_abs[0,0] = ABS ( x1[0,0] ) GC1: x1_abs[0,1] = ABS ( x1[0,1] ) GC2: x1_abs[1,0] = ABS ( x1[1,0] ) GC3: x1_abs[1,1] = ABS ( x1[1,1] )End

which is the same as yours but with $$|x_1|$$ instead of $$x_1$$. It still gives me the result I posted above (i.e. two solutions and max constraint violation exceeded). Something is still not right here, no?

Thanks,

Ben