Estimate a Coefficient in the Objective Function so the Optimization Result Fits Data (Tuning a Model)
I am working on a problem that is a modified version of a two-knapsack problem. I am able to find the optimal choices using Gurobi. However, I would like to estimate a coefficient that enters my objective function linearly so that the results most closely match my data. I will describe the problem below, but first, the thoughts I have had so far:
- Brute force/grid search: try several parameter values, re-optimizing each time. This, however, would be computationally too expensive to be feasible. (A sketch of this idea appears after the problem description below.)
- Use Gurobi's multiple-scenarios feature. However, this suffers from a similar problem, as the coefficient can take any real value (although it will realistically be between roughly [-10, 10]). This option also feels too computationally expensive and would lack precision unless I could try scenarios adaptively based on the results of previous ones (i.e., focus on trying values closest to the results with the best outcomes). (Also sketched below.)
- Set up a bilevel optimization problem that minimizes the sum of squared errors, where the predicted values come from the lower-level optimization problem, and then apply a Kuhn-Tucker/Lagrangian reformulation. However, I have been having trouble with this approach.
- Any other options that people can think of would be incredibly helpful!
The problem involves many schools that need to be placed into one of two knapsacks, X or Y. By being placed in either knapsack, a school contributes its weight from the data, wi; it contributes no value by being out of the knapsacks. Knapsack Y must have an average weight (the average of the wi of the schools in the knapsack) above 0.7, and knapsack X must have an average weight below 0.7. The value a school generates from being in knapsack Y is 0.7, while the value from being in knapsack X is that knapsack's average weight (again, the average of all wi in X). This problem I can solve in Gurobi.
However, I now want to assume that a school gets some value, theta, from being in either knapsack. I am interested in the minimum theta required for it to be optimal for a school to be placed in a knapsack (or the maximum theta such that it won't be in a knapsack). I want to estimate theta such that my model's prediction of whether a school is in any knapsack most closely matches the reality in the data. I would be satisfied estimating either a theta for each school or one shared theta for the entire problem.
Below is my formulation of the single-level and bilevel problems. A is the average weight of the X knapsack, Xi and Yi are binary indicators for being placed in each knapsack, and for each school their sum must be at most 1 (a school can go into at most one knapsack). Pi represents "participating," i.e., being placed into either knapsack. Pi is observed in my data, and I am trying to minimize the squared error between my model's prediction of going into a knapsack (Xi + Yi) and the data, Pi. As Pi, Xi, and Yi are binary, equation 2a represents the sum of squared errors.
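The single-level problem, for a fixed theta:
\begin{eqnarray}
\max && \sum_{i=1}^n \left( AX_i + 0.7Y_i + (X_i + Y_i)\theta \right)\\
s.t. &&\\
&& A = \frac{\sum_{i=1}^n w_iX_i}{\sum_{i=1}^n X_i} \leq 0.7\\
&& \frac{\sum_{i=1}^n w_iY_i}{\sum_{i=1}^n Y_i} \geq 0.7\\
&& X_i + Y_i \leq 1, \quad X_i, Y_i \in \{0, 1\}, \quad \forall i
\end{eqnarray}
The bilevel problem chooses theta so the lower-level solution matches the data:
\begin{eqnarray}
\min_\theta && \sum_{i=1}^n (P_i - X_i - Y_i)^2 \qquad (2a)\\
s.t. && (X, Y) \text{ optimal for the single-level problem above, given } \theta
\end{eqnarray}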
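Note that because Pi, Xi, and Yi are binary and Xi + Yi ≤ 1, each squared-error term in 2a is in fact linear in the decision variables:
\begin{eqnarray}
(P_i - X_i - Y_i)^2 &=& P_i^2 - 2P_i(X_i + Y_i) + (X_i + Y_i)^2\\
&=& P_i + (1 - 2P_i)(X_i + Y_i),
\end{eqnarray}
using Pi^2 = Pi and (Xi + Yi)^2 = Xi + Yi (at most one of them is 1). So the upper-level objective contains no products of decision variables, which should help any Kuhn-Tucker-style reformulation.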
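For concreteness, here is a minimal sketch of the brute-force idea from the first bullet: a fixed-theta model (with the average-weight conditions written as bilinear equalities through auxiliary average variables A and, in my own added notation, B for knapsack Y, which is one way to model the ratios in Gurobi) wrapped in a grid loop that warm-starts each solve from the previous one. The w and P arrays are toy placeholders, not my real data:

```python
import numpy as np
import gurobipy as gp
from gurobipy import GRB

def build_model(theta, w):
    """Single-level model for one fixed theta.

    The average-weight conditions are modeled through auxiliary variables
    A (average of knapsack X) and B (average of knapsack Y) and the
    bilinear identities A*|X| = sum w_i X_i and B*|Y| = sum w_i Y_i.
    Assumes weights lie in [0, 1] with at least one w_i >= 0.7.
    """
    n = len(w)
    m = gp.Model("two_knapsack")
    m.Params.OutputFlag = 0
    m.Params.NonConvex = 2  # the bilinear equalities below are nonconvex

    X = m.addVars(n, vtype=GRB.BINARY, name="X")
    Y = m.addVars(n, vtype=GRB.BINARY, name="Y")
    A = m.addVar(lb=0.0, ub=0.7, name="A")     # X's average weight, below 0.7
    B = m.addVar(lb=0.7, ub=max(w), name="B")  # Y's average weight, above 0.7

    m.addConstr(A * X.sum() == gp.quicksum(w[i] * X[i] for i in range(n)))
    m.addConstr(B * Y.sum() == gp.quicksum(w[i] * Y[i] for i in range(n)))
    m.addConstrs(X[i] + Y[i] <= 1 for i in range(n))  # at most one knapsack

    # Value A per school in X, 0.7 per school in Y, theta for participating.
    m.setObjective(
        gp.quicksum(A * X[i] + 0.7 * Y[i] + theta * (X[i] + Y[i])
                    for i in range(n)),
        GRB.MAXIMIZE,
    )
    return m, X, Y

def sse_for_theta(theta, w, P, start=None):
    """Solve for one theta and score the prediction Xi + Yi against Pi."""
    m, X, Y = build_model(theta, w)
    if start is not None:
        for i in range(len(w)):  # warm start from the previous theta's solution
            X[i].Start, Y[i].Start = start[i]
    m.optimize()
    pred = [round(X[i].X) + round(Y[i].X) for i in range(len(w))]
    sse = sum((P[i] - pred[i]) ** 2 for i in range(len(w)))
    return sse, [(round(X[i].X), round(Y[i].X)) for i in range(len(w))]

# Toy placeholders for the data.
w = [0.9, 0.8, 0.6, 0.5, 0.3]
P = [1, 1, 1, 0, 0]

best_sse, best_theta, start = float("inf"), None, None
for theta in np.linspace(-10, 10, 41):  # 0.5 grid over [-10, 10]
    sse, start = sse_for_theta(theta, w, P, start)
    if sse < best_sse:
        best_sse, best_theta = sse, theta
print(f"best theta on grid: {best_theta:.2f}, SSE: {best_sse}")
```

Even with warm starts, this is one MIP solve per grid point, which is why I don't think it scales to the precision I need.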
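And a sketch of the multiple-scenarios idea from the second bullet. As far as I understand, scenarios may only differ in linear objective coefficients, variable bounds, and constraint right-hand sides, so this requires a fully linear version of the model; build_linear_model below is a hypothetical helper standing in for one. Since theta enters the objective linearly, shifting the coefficients of Xi and Yi per scenario is enough:

```python
import numpy as np

# build_linear_model is a HYPOTHETICAL helper standing in for a fully
# linear build of the model (theta = 0 in the base objective); w and P
# are the same toy placeholders as in the previous sketch.
m, X, Y = build_linear_model(w)
n = len(w)
m.update()  # make the base Obj coefficients readable below

thetas = np.linspace(-10, 10, 41)
m.NumScenarios = len(thetas)
for k, theta in enumerate(thetas):
    m.Params.ScenarioNumber = k
    for i in range(n):
        # Scenario k shifts each participation coefficient by theta.
        X[i].ScenNObj = X[i].Obj + theta
        Y[i].ScenNObj = Y[i].Obj + theta
m.optimize()  # a single solve covers all scenarios

# Score each scenario's solution against the observed participation P.
for k, theta in enumerate(thetas):
    m.Params.ScenarioNumber = k
    pred = [round(X[i].ScenNX) + round(Y[i].ScenNX) for i in range(n)]
    sse = sum((P[i] - pred[i]) ** 2 for i in range(n))
    print(f"theta={theta:+.2f}  SSE={sse}")
```

This amortizes the solves, but the grid is still fixed in advance, which is the precision problem I mentioned.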
Does anyone have any thoughts on how I could go about estimating theta to match my data? A method to reformulate the problem, a trick Gurobi can do, tips for the Kuhn-Tucker reformulation, or something different would all be greatly appreciated. Thanks ahead of time!
Hi Jack,
I don't have any advice regarding estimating theta (hopefully someone else will), but I wanted to quickly point out that your original problem can be formulated without nonlinear constraints:
\begin{eqnarray}
\max && \sum_{i=1}^n \left( w_iX_i + 0.7Y_i + (X_i + Y_i)\theta \right)\\
s.t. &&\\
&& \sum_{i=1}^n (0.7 - w_i)X_i \geq 0\\
&& \sum_{i=1}^n (w_i - 0.7)Y_i \geq 0\\
&& X_i + Y_i \leq 1, \quad \forall i
\end{eqnarray}
There aren't any tricks here, just rearranging: multiplying each average-weight condition through by the number of schools in that knapsack makes it linear, and in the objective the total value of knapsack X, which is A times the number of schools in it, is exactly \sum_{i=1}^n w_iX_i.
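For what it's worth, a quick gurobipy sketch of this formulation, with theta fixed and w your weight vector (just a sketch of the model above, not tested against your data):

```python
import gurobipy as gp
from gurobipy import GRB

def solve_fixed_theta(theta, w):
    n = len(w)
    m = gp.Model("two_knapsack_linear")
    X = m.addVars(n, vtype=GRB.BINARY, name="X")
    Y = m.addVars(n, vtype=GRB.BINARY, name="Y")

    # Average weight of X at most 0.7: sum_i (0.7 - w_i) X_i >= 0.
    m.addConstr(gp.quicksum((0.7 - w[i]) * X[i] for i in range(n)) >= 0)
    # Average weight of Y at least 0.7: sum_i (w_i - 0.7) Y_i >= 0.
    m.addConstr(gp.quicksum((w[i] - 0.7) * Y[i] for i in range(n)) >= 0)
    # Each school goes into at most one knapsack.
    m.addConstrs(X[i] + Y[i] <= 1 for i in range(n))

    m.setObjective(
        gp.quicksum(w[i] * X[i] + 0.7 * Y[i] + theta * (X[i] + Y[i])
                    for i in range(n)),
        GRB.MAXIMIZE,
    )
    m.optimize()
    return m, X, Y
```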
- Riley