Matrix Invertibility and Some Minimums
I have two things that I want to be able to efficiently model in Gurobi for my research:
- matrix invertibility
- getting the smallest and second smallest non-zero element from a list.
I have partial solutions for both, but they both seem like hacks. I want to know if there is a better way to do them.
Matrix invertibility
Problem statement: Let T^{abc} be a constant, integral 3D array that is an input to the problem. Let x_a be an integral vector variable in the problem. The problem is to restrict x_a to take only values that make M^{ab}, defined as M^{ab} = T^{abc} x_c (summing over c), invertible. The dimensions of this matrix are small (think ~5x5).
Likely suboptimal solution: introduce a matrix variable Minv with the same shape as M, then impose the constraints M@Minv[:,i] = e_i, i.e., that the product of M with the i-th column of Minv gives the i-th standard basis vector. This has been implemented and works, but it gets a bit slow as the matrix grows to ~7x7 or so.
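Roughly, the implementation looks like the following. This is a minimal sketch, not my exact code; the sizes, bounds, and random T are placeholders.

import gurobipy as gp
from gurobipy import GRB
import numpy as np

n = 5
rng = np.random.default_rng(0)
T = rng.integers(-3, 4, size=(n, n, n))  # placeholder for the constant tensor T^{abc}

m = gp.Model()
x = m.addVars(n, lb=-10, ub=10, vtype=GRB.INTEGER, name="x")
Minv = m.addVars(n, n, lb=-100.0, ub=100.0, name="Minv")

for i in range(n):      # column i of Minv should map to e_i
    for a in range(n):  # row a of the product M @ Minv[:, i]
        # M[a, b] = sum_c T[a, b, c]*x[c] is linear in x, so each term
        # M[a, b]*Minv[b, i] is bilinear (a nonconvex quadratic)
        m.addConstr(
            gp.quicksum(
                gp.quicksum(T[a, b, c]*x[c] for c in range(n))*Minv[b, i]
                for b in range(n)
            )
            == (1.0 if a == i else 0.0)
        )

m.Params.NonConvex = 2  # needed for the bilinear equalities on Gurobi < 11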
This works, but I wanted to know if I could do better. I could probably instead get the determinant of M semi-directly, but that seems like it would be worse.
Minima
Problem statement: Let M be an n x k constant, integral 2D matrix input to the problem. In general, n will be large (~1000) while k will be small (~5). Let x be an integral vector variable in the problem. I want to find the minimum non-zero value of M@x: an entry (M@x)[i] with 0 < (M@x)[i] <= (M@x)[j] for all j with (M@x)[j] > 0. Let u be this minimum value. I then also want to find the second minimum value: an entry (M@x)[i'] with u < (M@x)[i'] <= (M@x)[j] for all j with (M@x)[j] > u. Call this value v.
For ease of future notation, let Mx := M@x. All the entries will always be non-negative.
Suboptimal solution #1: Just create a huge number of indicator constraints:
Mx_i > 0 -> u <= Mx_i for all i
and
Mx_i > u -> v <= Mx_i for all i
This would likely work, but it would take O(2000) indicator constraints, which feels somewhat expensive.
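For concreteness, here is a sketch of what I mean. The eps, bigM, bounds, and random M below are placeholders, and the indicator antecedents need their own binary flags, linked to Mx_i by big-M constraints:

import gurobipy as gp
from gurobipy import GRB
import numpy as np

n, k = 1000, 5
eps, bigM = 1e-4, 1e4
rng = np.random.default_rng(0)
M = rng.integers(0, 10, size=(n, k))  # placeholder for the real data matrix

m = gp.Model()
x = m.addVars(k, lb=0, ub=100, vtype=GRB.INTEGER, name="x")
u = m.addVar(lb=eps, name="u")  # smallest nonzero entry of Mx
v = m.addVar(lb=eps, name="v")  # second smallest entry of Mx

Mx = [gp.quicksum(M[i, j]*x[j] for j in range(k)) for i in range(n)]

pos = m.addVars(n, vtype=GRB.BINARY, name="pos")  # pos_i = 1 <=> Mx_i >= eps
gt = m.addVars(n, vtype=GRB.BINARY, name="gt")    # gt_i = 1 <=> Mx_i >= u + eps
for i in range(n):
    # big-M links (entries are non-negative, so pos_i = 0 forces Mx_i = 0)
    m.addConstr(Mx[i] >= eps*pos[i])
    m.addConstr(Mx[i] <= bigM*pos[i])
    m.addConstr((gt[i] == 1) >> (Mx[i] >= u + eps))
    m.addConstr((gt[i] == 0) >> (Mx[i] <= u))
    # the two families of indicator constraints from above
    m.addConstr((pos[i] == 1) >> (u <= Mx[i]))
    m.addConstr((gt[i] == 1) >> (v <= Mx[i]))

(As written this only makes u and v lower bounds on the qualifying entries; something extra, like an objective term or an attainment constraint, would be needed to pin them to the minima themselves.)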
Suboptimal solution #2: I want to do something akin to how Max is calculated (see description here). I do not have complete thoughts on how to do this with SOS1 constraints, though.
A note: It would be preferable to ultimately have flags for where Mx_i==u and Mx_i==v. That would allow other analyses. So, if there is a method to model such constraints with minimal extra overhead, that would be preferred.
Are there any recommendations for either problem? Thank you very much
Hi Nate,
impose constraints that M@Minv[:,i] = e_i
You could try using the fact that M@Minv[:,i] = a_i e_i for some non-zero scalar a_i also implies invertibility. I have no reason to expect this to work better, but it's worth trying and seeing how it goes:
aux = model.addMVar(
    M.shape,
    lb=np.eye(M.shape[0]) * 0.1,
    ub=np.eye(M.shape[0]) * 1e4,
    name="aux",
)
# aux has the same shape as M, but only the variables on the
# diagonal matter; the others have lb = ub = 0
model.addConstr(M @ Minv == aux)

semi-directly instead get the determinant of M
It is possible to calculate the determinant; see this post. To guarantee invertibility, the determinant would need to be non-zero. Given that MIP solvers need to work with tolerances, it is not possible to enforce det != 0 in practice, only abs(det) >= epsilon for some small value of epsilon. Whether this is an obstacle for your problem I do not know.
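For illustration, here is a toy 2x2 version of this idea, with eps and bigM as placeholder values and plain variables standing in for the entries of M:

import gurobipy as gp
from gurobipy import GRB

eps, bigM = 1e-3, 1e6

m = gp.Model()
Mv = m.addVars(2, 2, lb=-100, ub=100, name="M")  # stand-ins for the entries of M
det = m.addVar(lb=-bigM, ub=bigM, name="det")

# det(M) = M00*M11 - M01*M10: a bilinear (nonconvex) equality
m.addConstr(det == Mv[0, 0]*Mv[1, 1] - Mv[0, 1]*Mv[1, 0])

# abs(det) >= eps as a disjunction: either det >= eps or det <= -eps
sgn = m.addVar(vtype=GRB.BINARY, name="sgn")
m.addConstr(det >= eps - bigM*(1 - sgn))
m.addConstr(det <= -eps + bigM*sgn)

m.Params.NonConvex = 2  # required for the bilinear equality on Gurobi < 11

For a 5x5 matrix the determinant is a degree-5 polynomial in the variables, so you would have to chain auxiliary bilinear products, and this grows quickly.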
With regards to minima:
0<(M@x)[i]
For the same reason mentioned above, you will need to treat this as epsilon <= (M@x)[i].
My colleague Mario Ruthmair came up with a model to find the k-th smallest value from a set of variables:
x_i continuous: original variables
C continuous: k-th smallest value
s_i binary: variable x_i has a value smaller than or equal to C
l_i binary: variable x_i has a value larger than or equal to C
Exactly k variables must be <= C and exactly n-k+1 must be >= C. This implies that at least one variable must be equal to C.
For all i = 1,...,n (M here is a big-M constant, not your data matrix):
x_i <= C + M*(1 - s_i)
x_i >= C - M*(1 - l_i)
s_i + l_i >= 1
sum_i s_i = k
sum_i l_i = n - k + 1
If you introduce binary variables "nz" for each element of Mx to indicate whether the element is 0 or >= epsilon, then I think you can adapt the above approach:
sum_i s_i = k + sum_i nz_i
sum_i l_i = n - k - sum_i nz_i + 1
Yes, it's going to require many new variables and constraints. I do not expect there is a way to avoid this.
- Riley
Hi Riley,
Thanks for the great thoughts and references - your reply demonstrates a lot of functionality that I didn't know was possible! Wonderful!
The invertibility sounds really good. It looks like I have 3 main options (the one I laid out, the M@Minv = aux diagonal one, and the determinant method). I will try all of them - thanks!
The minima are also very exciting! I'm a bit nervous about applying this to lists with multiplicities (sorry for not clearly stating that in my original post!). E.g., say the list ended up being [x1, x2, ..., xN] = [0.1, 0.1, 0.2, 0.2, 0.5, ...]. I fear that this would cause issues along the lines of:
1) if I set k=1, then I expect to get 0.1
2) if I set k=2, then I expect to get 0.1
3) if I set k=3, then I expect to get 0.2
Given that I don't, a priori, know the multiplicity of the minimum value in [x1, ..., xN], this seems a bit tricky.
Here I'll state a potential fix, based on the recommendation in your post and the code for maxima here. Please let me know what you think of it!
Let nz be your "equals 0" indicator variables. I.e., nz_i = (|x_i|<eps).
Let M be a large number.
For all i = 1,...,n:
let s_i >= 0;
let z_i in {0,1};
C = x_i + M*nz_i - s_i;
SOS1(s_i, z_i);
Also let sum_i z_i >=1.
Then I believe that C is the first nonzero minimum, i.e., the smallest value excluding 0. The second minimum could then be calculated in effectively the same way, just using nz_i + z_i in all the places that previously used just nz_i.
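In gurobipy terms, I imagine it looking something like the following self-contained toy, with plain variables Mx standing in for the real M@x expressions and eps/bigM as placeholders:

import gurobipy as gp
from gurobipy import GRB

n, eps, bigM = 6, 1e-4, 1e4
m = gp.Model()
Mx = m.addVars(n, lb=0.0, ub=bigM, name="Mx")   # stand-ins for the entries of M@x

nz = m.addVars(n, vtype=GRB.BINARY, name="nz")  # nz_i = 1 <=> Mx_i == 0
C = m.addVar(name="C")                          # first nonzero minimum
s = m.addVars(n, lb=0.0, name="s")              # slacks
z = m.addVars(n, vtype=GRB.BINARY, name="z")    # z_i = 1 marks Mx_i == C

for i in range(n):
    m.addConstr(Mx[i] <= bigM*(1 - nz[i]))        # nz_i = 1 forces Mx_i = 0
    m.addConstr(Mx[i] >= eps*(1 - nz[i]))         # nz_i = 0 forces Mx_i >= eps
    m.addConstr(C == Mx[i] + bigM*nz[i] - s[i])   # zero entries get lifted above C
    m.addSOS(GRB.SOS_TYPE1, [s[i], z[i]])         # z_i = 1 forces s_i = 0

m.addConstr(gp.quicksum(z.values()) >= 1)         # C must be attained by some entry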
My fear is that finding the minimum and second minimum adds to my problem:
- 2*len(x) slack variables,
- 2*len(x) binary variables,
- 2*len(x) equality constraints, and
- 2*len(x) SOS1 constraints
(along with a handful of others). I'm no expert in such programming, so maybe adding ~4000 variables and ~4000 constraints isn't a big deal (especially because they all seem relatively simple), but I could also imagine it making the problem computationally intractable.
Do you have thoughts on this? Would this be completely out of scope for such solvers, or could it be doable? I anticipate that this will add a lot of cost to solving the model, but it will enable dramatic cuts in parameter space. The cut-out parameter space contains 'bad' solutions, so this would be very nice.
Thank you so much, Riley!
Hi Nate,
There won't be any modification needed for multiplicity. Test out the following, for example with k = 1, 2, 3:
import gurobipy as gp

k = 2
m = gp.Model()
x = m.addVars(5, ub=10)
C = m.addVar()
s = m.addVars(5, vtype="B")
l = m.addVars(5, vtype="B")
y = m.addVars(5, vtype="B")  # y_i = 0 <-> x_i >= epsilon, so sum y == # zeros
m.update()
for i in range(5):
    m.addConstr(x[i] <= C + 10*(1 - s[i]))
    m.addConstr(x[i] >= C - 10*(1 - l[i]))
    m.addConstr(s[i] + l[i] >= 1)
    m.addConstr(x[i] <= x[i].ub*(1 - y[i]))
    m.addConstr(0.0001*(1 - y[i]) <= x[i])
m.addConstr(gp.quicksum(s.values()) == k + gp.quicksum(y.values()))
m.addConstr(gp.quicksum(l.values()) == 5 - k - gp.quicksum(y.values()) + 1)
test = [0.0, 0.0, 0.1, 0.5, 0.3]
for v, t in zip(x.values(), test):
    m.addConstr(v == t)
m.optimize()
print(f"for k={k}, C={C.X}")

The only way of knowing if this approach is practical is by testing; making assumptions without testing is risky when it comes to modelling.
You could also possibly play around with VarHintVal. For example, if you know there will be only a few zero variables, then most variables will be >= C, and therefore most l = 1 and most s = 0. You could perhaps try combining this with making these two constraints lazy (perhaps Lazy=2?):
m.addConstr(x[i] <= C + 10*(1 - s[i]))
m.addConstr(x[i] >= C - 10*(1 - l[i]))

It's just a matter of testing (multiple times, with either different input values or different values of Seed) and comparing.
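For example, something like this in place of those two addConstr calls in the loop above (the hint and Lazy values are just starting points to experiment with):

for i in range(5):
    # hint: guess that x_i is not among the k smallest values
    l[i].VarHintVal = 1
    s[i].VarHintVal = 0
    # keep the big-M rows out of the initial relaxation; they are
    # pulled in only when violated
    c_up = m.addConstr(x[i] <= C + 10*(1 - s[i]))
    c_lo = m.addConstr(x[i] >= C - 10*(1 - l[i]))
    c_up.Lazy = 2
    c_lo.Lazy = 2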
- Riley