Stoichiometric constraints not working on symbolic regression program
Awaiting user inputThis is going to be a bit long, sorry in advance, but I genuinely hope someone can help me with this project.
The overall idea is to provide the program with concentration data for N species. The program will calculate the derivatives, and use both: derivatives vs. time and concentrations vs. time. The decision variables with be a matrix (N_R x 2N), with some arbitrary N_R number of reactions, each reaction will be written as a vector with 2*N elements. The first N are the coefficients of the reactants, the latter N the coefficients of the reactants. If a coefficients is 0, it means it is not present (whether in the reactant or product side); if it is 1, it means 1 molecule is being consumed or created. Coefficients can be as high as the reaction order, O, on the reactant side; and as high as O*C_max, where C_max is the max stoichiometric value to be studied. The objective is to minimize MSE(calculated_derivatives - predicted_derivatives), there the predicted derivatives are constructed using a symbolic representation:
Pre_der(t,sj) = sum_ri [C(t,ri)*K(ri)*(M(ri,sj+N) - M(ri,sj))]
Where Pre_der is the matrix containing the predicted derivative, with T time steps (t) and N species (sj); M is the reaction matrix with N_R rows (reactions) and 2*N columns (stoichiometric coefficients); K is the vector of rate constants, of length N_R (one per reaction); lastly, C is the matrix of reactant concentrations and is calculated as C(t,ri) = prod_si([si](t)^M[ri,si]), where the concentration of each species ([si](t)) is raised to the power of their coefficient ( in the reactant side and then multiplied.
The issue I am encountering is that even when I provide my system with the right reaction and rate constant as an initial guess, I get the same reaction (wrong) as when I don't. I'm also getting some warnings about my constraints being violated, which I don't get where they are coming from.
Here is my script:
# This version works:
# Produces output
# Implemented reactant concentration
# M is reaction matrix, K is rate constant vector
# Trend has been implemented but not working properly
# Corrected constraint on product coefficients
import sys
import numpy as np
import gurobipy as gp
from gurobipy import GRB
import matplotlib.pyplot as plt
# Initialize kinetic data
k_true = 0.3 # True rate constant for generating data
A0 = 1.0
n_steps = 10
N = 2 # Number of species
A_conc = {0: A0}
B_conc = {0: 1e-12}
# Generate concentration data
for i in range(1, n_steps):
A_conc[i] = A_conc[i-1] - A_conc[i-1] * k_true
B_conc[i] = 1 - A_conc[i]
# Calculate rates (calculated derivatives)
A_rate = {i: -k_true * A_conc[i] for i in range(n_steps)}
B_rate = {i: k_true * A_conc[i] for i in range(n_steps)}
# Plot concs and rates
if 1==2:
plt.figure()
plt.plot(range(n_steps),A_conc.values(), label = "[A]")
plt.plot(range(n_steps),B_conc.values(), label = "[B]")
plt.title("Concentration Plots")
plt.show()
plt.close()
plt.figure()
plt.plot(range(n_steps),A_rate.values(), label = "dA/dt")
plt.plot(range(n_steps),B_rate.values(), label = "dB/dt")
plt.title("Derivative Plots")
plt.show()
plt.close()
sys.exit()
# Gurobi model setup
# constants for the model
O = 2 # max reaction order
S = 2 # max stoich ratio
N_r = 1
k_tol = 1e-6
model = gp.Model()
model.setParam('FeasibilityTol', 1e-7)
model.setParam('Method', -1) # Use the barrier method
model.setParam('LogToConsole', 1)
model.setParam('FuncPieces', 1) # Set the number of pieces
# added mew
model.setParam('NonConvex', 2) # Solve non-convex models
#model.setParam('Heuristics', 0.01) # Spend more time on heuristics
#model.setParam('MIPFocus', 1) # Focus on finding feasible solutions
#model.setParam('ImproveStartTime', 30) # Allow time to improve initial solution
#model.setParam('Cuts', 2) # Use aggressive cutting planes
#model.setParam('Threads', 8) # Use all available CPU threads
#model.setParam('MIPGap', 1e-4) # Allow a small optimality gap
#model.setParam('OptimalityTol', 1e-9) # Tighten optimality tolerance
# Define a continuous decision variable for the rate constant
K = model.addVars(N_r, lb=0, ub=GRB.INFINITY, name="K")
# Define integer decision variables for reaction coefficients R (size 2N)
M = model.addVars(N_r, 2 * N, vtype=GRB.INTEGER, name="M")
# provide init guess
if 1==2:
# Suggested initial guess, for example [[1, 0, 0, 0]]
initial_guess = [[1, 0, 0, 1]] # Adjust this for your actual size of N_r and 2*N
# Set the initial guess for M
for ri in range(N_r):
for ci in range(2 * N):
M[ri, ci].start = initial_guess[ri][ci]
# Set the initial guess for K
K[0].start = 0.3
# Define variables for predicted rates
A_pred_rate = model.addVars(n_steps, name="A_pred_rate")
B_pred_rate = model.addVars(n_steps, name="B_pred_rate")
model.update()
# Constraints on R:
# The sum of the first N elements (reactants) must be at most 2
model.addConstr(gp.quicksum(M[0,i] for i in range(N)) <= O, "reactant_sum_max")
# Auxiliary variable for the sum of reactant coefficients
cr = model.addVar(vtype=GRB.INTEGER, name="cr")
# Set cr to be equal to the sum of the reactant coefficients
model.addConstr(cr == gp.quicksum(M[0,i] for i in range(N)), name="cr_equals_reactant_sum")
# Define binary variable is_zero
is_zero = model.addVar(vtype=GRB.BINARY, name="is_zero")
# Add a constraint using a general constraint indicator to link is_zero with cr == 0
model.addGenConstrIndicator(is_zero, True, cr == 0, name="is_cr_zero")
# The sum of the products must be at most S*Cr, but at most 1 when Cr == 0
model.addConstr(gp.quicksum(M[0,i+N] for i in range(N)) <= is_zero + S * cr, name="product_sum_max")
consider_trends = False
if consider_trends:
# Species trends must be included
# Dictionary with trends, 1 per species:
# trend 0: species A
# first elements: Does A increase?
# Second elements: Does A decrease?
# trend 2: species B
# first elements: Does B increase?
# Second elements: Does B decrease?
DT = {0: [False, True], 1: [True, False]}
# Define a matrix to calculate the difference between coefficients (shape: N_r x N x 2)
DM = model.addVars(N_r, N, 2, vtype=GRB.BINARY, name="DM")
for ri in range(N_r):
for sj in range(N):
# Constraint for increase (inc)
model.addGenConstrIndicator(DM[ri, sj, 0], True, M[ri, N+sj] - M[ri, sj] >= 1, name=f"DA[{ri},{sj},{0}]")
# Constraint for decrease (dec)
model.addGenConstrIndicator(DM[ri, sj, 1], True, M[ri, sj] - M[ri, N+sj] >= 1, name=f"DB[{ri},{sj},{0}]")
# Ensure that both inc and dec cannot be 1 at the same time
model.addConstr(DM[ri, sj, 0] + DM[ri, sj, 1] <= 1, name=f"Species_{sj}_behavior_in_reaction_{ri}")
# Iterate over species and enforce the constraint for each Ti
for si in range(N):
Ti = DT[si]
# For each species, ensure that at least one reaction matches the specified behavior in Ti
# If Ti[0] is True, we expect at least one reaction where the species increases (DM[ri, si, 0] == 1)
# If Ti[1] is True, we expect at least one reaction where the species decreases (DM[ri, si, 1] == 1)
if Ti[0]: # Species must increase in at least one reaction
model.addConstr(gp.quicksum(DM[rj, si, 0] for rj in range(N_r)) >= 1, name=f"Ti_inc_{si}")
if Ti[1]: # Species must decrease in at least one reaction
model.addConstr(gp.quicksum(DM[rj, si, 1] for rj in range(N_r)) >= 1, name=f"Ti_dec_{si}")
# Add constraints for predicted rates
for i in range(n_steps):
#print(M[0,0],M[0,1])
#print(A_conc[i],B_conc[i])
#sys.exit()
log_A_conc = np.log(A_conc[i])
log_B_conc = np.log(B_conc[i])
#print(log_A_conc,log_B_conc)
# Add new auxiliary variables for the logarithmic terms
temp_A_log = model.addVar(lb=-GRB.INFINITY, ub=GRB.INFINITY, name=f"temp_A_log_{i}")
temp_B_log = model.addVar(lb=-GRB.INFINITY, ub=GRB.INFINITY, name=f"temp_B_log_{i}")
# Add constraints for the logarithmic terms
model.addConstr(temp_A_log == M[0,0] * log_A_conc, name=f"Constr_temp_A_log_{i}")
model.addConstr(temp_B_log == M[0,1] * log_B_conc, name=f"Constr_temp_B_log_{i}")
# Add new auxiliary variables for the final concentration terms (exponentials)
temp_C_A = model.addVar(lb=-GRB.INFINITY, ub=GRB.INFINITY, name=f"temp_C_A_{i}")
temp_C_B = model.addVar(lb=-GRB.INFINITY, ub=GRB.INFINITY, name=f"temp_C_B_{i}")
# Use Gurobi's exponential function to compute the final concentrations
model.addGenConstrExp(temp_A_log, temp_C_A, name=f"ExpA_temp_C_A_{i}")
model.addGenConstrExp(temp_B_log, temp_C_B, name=f"ExpA_temp_C_B_{i}")
# Now you can calculate the concentration of reactants
C_r = model.addVar(lb=0, ub=GRB.INFINITY, name=f"Conc_reactants_t{i}")
model.addConstr(C_r == temp_C_A * temp_C_B, name=f"C_r_C_A_C_B_{i}")
temp_Cr_DM_A = model.addVar(lb=-GRB.INFINITY, ub=GRB.INFINITY, name=f"temp_Cr_A_t{i}")
temp_Cr_DM_B = model.addVar(lb=-GRB.INFINITY, ub=GRB.INFINITY, name=f"temp_Cr_B_t{i}")
model.addConstr(temp_Cr_DM_A == C_r * (M[0,N+0]-M[0,0]), name=f"A_Conc_contrib_{i}")
model.addConstr(temp_Cr_DM_B == C_r * (M[0,N+1]-M[0,1]), name=f"B_Conc_contrib_{i}")
# Define pred_rates the constraints
model.addConstr(A_pred_rate[i] == K[0] * temp_Cr_DM_A, name=f"Constr_A_pred_{i}")
model.addConstr(B_pred_rate[i] == K[0] * temp_Cr_DM_B, name=f"Constr_B_pred_{i}")
#sys.exit()
# Objective: Minimize MSE between calculated and predicted derivatives
# Initialize quadratic expressions for mse_A and mse_B
mse_A = gp.QuadExpr()
mse_B = gp.QuadExpr()
# Populate mse_A and mse_B using the predicted rates
for i in range(n_steps):
# Calculate the squared error for A
expr_A = A_rate[i] - A_pred_rate[i] # Assuming A_rate[i] is a known value
mse_A += expr_A * expr_A # Use the quadratic expression for mse_A
# Calculate the squared error for B
expr_B = B_rate[i] - B_pred_rate[i] # Assuming B_rate[i] is a known value
mse_B += expr_B * expr_B # Use the quadratic expression for mse_B
# Set the objective to minimize the sum of MSE for both species
model.setObjective((mse_A + mse_B)/n_steps, GRB.MINIMIZE)
# Output optimization
model.write("model.lp")
# Optimize the model
model.optimize()
# Function to manually calculate the MSE
def calculate_mse(K_value, R_values):
mse = 0.0
for i in range(n_steps):
# Calculate predicted rates using optimized values
C_r = A_conc[i] ** R_values[0] * B_conc[i] ** R_values[1]
A_pred = K_value * A_conc[i] * (R_values[2] - R_values[0]) # A_pred_rate
B_pred = K_value * A_conc[i] * (R_values[3] - R_values[1]) # B_pred_rate
# Calculate squared errors
mse += (A_rate[i] - A_pred) ** 2
mse += (B_rate[i] - B_pred) ** 2
return mse / n_steps # Average MSE
# Output the optimized rate constant and reaction coefficients
if model.status == GRB.OPTIMAL:
print(f"Optimized rate constant K: {K[0].X}")
R_vals = [M[0,i].X for i in range(2 * N)]
print(f"Optimized reaction vector R: {R_vals}")
calculated_mse = calculate_mse(K[0].X, R_vals)
print(f"Calculated MSE: {calculated_mse}")
elif model.status == GRB.INFEASIBLE:
print("Model is infeasible. Check your constraints.")
# Constraint check
if 1==2:
# Check the constraints after optimization
for constr in model.getConstrs():
lhs_value = constr.getAttr("slack") + constr.getAttr("rhs")
rhs_value = constr.getAttr("rhs")
constr_sense = constr.getAttr("sense")
print(f"Constraint {constr.ConstrName}:")
print(f" LHS value: {lhs_value}")
print(f" RHS value: {rhs_value}")
if constr_sense == GRB.LESS_EQUAL:
print(f" Comparison: {lhs_value} <= {rhs_value}")
elif constr_sense == GRB.GREATER_EQUAL:
print(f" Comparison: {lhs_value} >= {rhs_value}")
else:
print(f" Comparison: {lhs_value} == {rhs_value}")
print()
# Plot concs and rates
if 1==2:
# Calc pred der
R_vals = [M[0,i].X for i in range(2 * N)]
#R_vals = [1, 0, 0, 1]
#print(R_vals)
A_pred, B_pred = [],[]
for i in range(n_steps):
# Calculate predicted rates using optimized values
A_pred.append(K[0].X * A_conc[i] * (R_vals[2] - R_vals[0])) # A_pred_rate
B_pred.append(K[0].X * A_conc[i] * (R_vals[3] - R_vals[1])) # B_pred_rate
plt.figure()
plt.plot(range(n_steps),A_rate.values(), "r", label = "dA/dt")
plt.plot(range(n_steps),B_rate.values(), "k", label = "dB/dt")
plt.plot(range(n_steps),A_pred, "r.-", label = "dA/dt")
plt.plot(range(n_steps),B_pred, "k.-", label = "dB/dt")
plt.legend()
plt.title("Calculated Derivative Plots")
plt.show()
plt.close()
-
The issue I am encountering is that even when I provide my system with the right reaction and rate constant as an initial guess, I get the same reaction (wrong) as when I don't.
Gurobi searches for a global solution to the model. A MIP start may help Gurobi find a global solution more quickly, but it does not affect the set of global solutions.
When I run your code, the optimal reaction vector is [1, 0, 1, 1] and the optimal objective value is ~0.017633. Is this not the optimal solution you expect? If so, what do you expect the optimal solution to be? I don't know anything about stoichiometry, but we could figure out from a mathematical modeling perspective why Gurobi isn't returning the solution you expect. If we fix the reaction to [1, 0, 0, 1], the best achievable objective value is ~0.035266.
I'm also getting some warnings about my constraints being violated, which I don't get where they are coming from.
By default, Gurobi versions 11.0.3 and earlier model nonlinear constraints using piecewise-linear approximations. While Gurobi's solution may satisfy all of the constraints of the approximation it builds, that solution may have errors with respect to the underlying nonlinear constraints you are actually modeling. You can read more about this in the documentation for Function Constraints.
Parameters like FuncPieceError, FuncPieceLength, FuncPieceRatio, and FuncPieces give you control over how Gurobi builds these piecewise-linear approximations. I see you are setting FuncPieces=1, though you may need to build tighter approximations if you want smaller constraint violations. Alternatively, I recommend instead simply setting the FuncNonlinear parameter to 1 - this directs Gurobi to handle the nonlinear constraints directly rather than approximating them. Note that FuncNonlinear=1 will be the default behavior starting in Gurobi 12.0.0, scheduled for release in mid-November.
Additionally, the log output displays the following warning:
Warning: Model contains variables with very large bounds participating
in nonlinear terms.
Presolve was not able to compute smaller bounds for these variables.
Consider bounding these variables or reformulating the model.It's very important to provide the tightest possible upper and lower bounds on all variables participating in nonlinear constraints and bilinear terms. This will generally help improve performance. If Gurobi is building piecewise-linear approximations of your nonlinear constraints, tighter variable bounds will limit the domain over which Gurobi has to build those approximations, which in general should lead to more accurate approximations and/or better performance. Are you able to add tight bounds to your \(\texttt{temp}\) variables?
0 -
Hi Eli Towle,
thank you so much for replying.
Sorry I did not make this clearer:# provide init guess if 1==2: # Suggested initial guess, for example [[1, 0, 0, 0]] initial_guess = [[1, 0, 0, 1]] # Adjust this for your actual size of N_r and 2*N # Set the initial guess for M for ri in range(N_r): for ci in range(2 * N): M[ri, ci].start = initial_guess[ri][ci] # Set the initial guess for K K[0].start = 0.3
If I were to allow this part of the script to run I'd be giving the system the actual solution. The correct reaction mechanism is [1,0,0,1] and the correct rate constant is 0.3. Is the system were to predict this exact values the MSE and thus the objective should be very close to 0 (a few sig figs lower than 0.017...). I was trying to say that I do not get that answer even when I allow the system to have the right answer for the MIP start.
Regarding FuncPieces=1, I meant to comment that out, I was changing the value to address the constraint violation issues and I forgot to remove it. When I did increase, even to 2, it'd result on the system taking VERY long and still not finding the answer.
Regarding tighter bounds around the temp variables, I did not think of making changes there because I'm implementing constraints to define their values, I never used them as "decision variables". I made change to the script to tighten their bounds to the temp variables as well as "K", but nothing changed. I'll keep tuning that.
Please let me know if you have any other questions or suggestions.0 -
The correct reaction mechanism is [1,0,0,1] and the correct rate constant is 0.3. Is the system were to predict this exact values the MSE and thus the objective should be very close to 0 (a few sig figs lower than 0.017...).
There is no feasible solution to the model satisfying:
K[0] = 0.3
M[0,0] = 1
M[0,1] = 0
M[0,2] = 0
M[0,3] = 1I fixed these values in the model and computed an irreducible infeasible subsystem (IIS). This is a subset of model constraints and variable bounds that together form an infeasible subsystem, but that subsystem becomes feasible if any individual constraint or bound is removed. That IIS helped explain why this solution is not feasible.
The model includes the following five constraints:
C_r_C_A_C_B_7: Conc_reactants_t7 = temp_C_A_7 * temp_C_B_7
A_Conc_contrib_7: temp_Cr_A_t7 + M[0,0] * Conc_reactants_t7 = M[0,2] * Conc_reactants_t7
Constr_A_pred_7: A_pred_rate[7] = K[0] * temp_Cr_A_t7
ExpA_temp_C_A_7: temp_C_A_7 = EXP ( temp_A_log_7 )
ExpA_temp_C_B_7: temp_C_B_7 = EXP ( temp_B_log_7 )Let's assume for contradiction there exists a feasible solution with \(\texttt{M[0,0]}=1\), \(\texttt{M[0,2]}=0\), and \(\texttt{K[0]}=0.3\). We can plug these three values into the constraints above to simplify the constraints:
C_r_C_A_C_B_7: Conc_reactants_t7 = temp_C_A_7 * temp_C_B_7
A_Conc_contrib_7: Conc_reactants_t7 = - temp_Cr_A_t7
Constr_A_pred_7: A_pred_rate[7] = 0.3 * temp_Cr_A_t7
ExpA_temp_C_A_7: temp_C_A_7 = EXP ( temp_A_log_7 )
ExpA_temp_C_B_7: temp_C_B_7 = EXP ( temp_B_log_7 )Next, let's substitute the definitions of \(\texttt{temp_C_A_7}\) and \(\texttt{temp_C_B_7}\) from the last two constraints into constraint \(\texttt{C_r_C_A_C_B_7}\):
C_r_C_A_C_B_7: Conc_reactants_t7 = EXP ( temp_A_log_7 ) * EXP ( temp_B_log_7 )
A_Conc_contrib_7: Conc_reactants_t7 = - temp_Cr_A_t7
Constr_A_pred_7: A_pred_rate[7] = 0.3 * temp_Cr_A_t7It follows from constraint \(\texttt{Constr_A_pred_7}\) that \(\texttt{temp_Cr_A_t7} = (10 / 3) \cdot \texttt{A_pred_rate[7]}\). We substitute this value into constraint \(\texttt{A_Conc_contrib_7}\) to get
C_r_C_A_C_B_7: Conc_reactants_t7 = EXP ( temp_A_log_7 ) * EXP ( temp_B_log_7 )
A_Conc_contrib_7: Conc_reactants_t7 = - (10 / 3) * A_pred_rate[7]Now, we have two different expressions, both of which are equal to variable \( \texttt{Conc_reactants_t7} \). These expressions must therefore equal each other:
EXP ( temp_A_log_7 ) * EXP ( temp_B_log_7 ) = - (10 / 3) * A_pred_rate[7]
The left-hand side of this constraint is strictly positive, regardless of the values of \(\texttt{temp_A_log_7}\) and \(\texttt{temp_B_log_7}\). However, \(\texttt{A_pred_rate[7]}\) is defined to be a non-negative variable, so the right-hand side of this constraint is non-positive. This is a contradiction. Thus, no solution to the model can satisfy \(\texttt{M[0,0]}=1\), \(\texttt{M[0,2]}=0\), and \(\texttt{K[0]}=0.3\).
Did you intend for the \(\texttt{A_pred_rate}\) and \(\texttt{B_pred_rate}\) variables to be non-negative? The default lower bound on variables is 0. The fix might be as simple as adding the keyword argument \(\texttt{lb=-GRB.INFINITY}\) to the corresponding Model.addVars() calls.
0 -
Eli Towle It was indeed that simple! Thank you so so much for your help.
I did not know that IIS was a thing! Great to have another tool at my disposal.0
Please sign in to leave a comment.
Comments
4 comments