Column Generation Explanation
AnsweredHello,
I have a general question about column generation for largescale systems. I understand the Master problem and the Restricted master problem for the subset of variables. After I set up the RMP, I find the duals of the system. I then set up the pricing problem and find the minimum reduced cost to determine next entering variable into the system.
My question is about the subsets of variables. The pricing problem is only selecting the next column from the subset of variables, at what point is the subset modified to cover the other variables? My problem has roughly 114 million decision variables, with 113 binary. In my initial subset I include about 230,000 DVs and can iterate through that fairly quickly.
Thanks in advance,
Taylor

I am not sure whether I understand your question correctly, but maybe the following explanation is of help.
First, Gurobi does not support branchandprice, so you cannot solve an MIP to optimality by adding variables with negative reduced costs on the fly.
What you can do is to solve the LP relaxation of your model with column generation. Here, you think of solving your big problem with the 114 million variables, but you implicitly fix many of them to zero (by just not adding them to the model). This model with those variables "fixed to zero" is called the restricted master problem (RMP). You solve this LP. Now, if all of those variables that you fixed to zero (by not having them in the model) have nonnegative (for a minimization problem) reduced costs, then the RMP solution is optimal also for the full problem.
In order to check whether there exists a variable with negative reduced costs, you have to solve the pricing problem. This is very problem specific and depends on the application. Often, this boils down to solving a shortest path problem in a graph with the edge weights being the duals of the RMP solution. But of course, there are also applications in which the pricing problem is completely different.
Now, if there is a variable with negative reduced costs, you have to add at least one of them to the RMP and solve the RMP again to find a new solution to the RMP. Then, iterate the whole process.
But of course, if you only add just one variable to the RMP in each iteration, the whole process may be very slow as you may need a lot of iterations. For this reason, it is often beneficial to add many negative reduced cost variables to the RMP in each iteration. Whether you are actually able to find multiple such variables in your pricing procedure depends on your application and the pricing algorithm that you have.
Best regards,
Tobias
1 
Tobias,
Thank you for the quick reply and excellent explanation. A couple additional questions. So my question comes up with the pricing problem then. All of these "fixed to zero" variables must now checked by solving the pricing problem correct?
My problem is a resource allocation problem. I use DW decomp to reduce the number of constraints and end up with tons of binary variables representing different portfolios of resources that can be implemented. The master problem and RMP are then essentially a set partitioning formulation where a binary variable equals 1 if that portfolio is selected to be in the solution.
The dual of the LP relaxation is straight forward enough. But having a pricing problem with 114 million variables seems impossible. Please let me know if i'm on track with the idea of the pricing problem or if i'm making it too difficult.
Thank you,
Taylor
0 
Hi Taylor,
column generation really is an art. Typically, you cannot just say "these are too many variables, let's ignore some and price them in later". This approach would be very similar to the socalled "partial pricing" strategy, that one can find in primal simplex codes. Another related technique is the socalled "sifting algorithm". So, if you just do that (solve RMP, calculate reduced costs of all fixed variables by multiplying the dual solution to the matrix), then there is usually no point in doing it yourself. Just add all variables right away to the LP and hope that the LP solver deals with this appropriately. Gurobi uses partial pricing and sifting automatically, if it thinks that these techniques will be useful given the dimensions of the constraint matrix.
The ideal case for column generation is that you have a way to find a variable of negative reduced costs by means of a very fast (usually combinatorial) algorithm. As I said, the typical case is the shortest path algorithm. In many routing or logistics column generation applications, a variable represents a path through a network. The constraints relate to the nodes and/or the edges in the network. Then, one can often represent the pricing problem as a shortest path problem with the duals of the constraints used as length of the edges in the network. From all the millions of paths that are present in the network, a shortest path algorithm can efficiently find a shortest one, which yields a variable of smallest reduced costs.
DantzigWolfe decomposition is often powerful if you can exploit the structure of the individual blocks in your pricing problem. Again, this often means to apply a shortest path or some other combinatorial algorithm. But as I said, it is very dependent on the structure of the problem. There is no allfitsone solution, it is highly problem dependent.
For this reason, I also cannot give you an answer on how to solve your pricing problem.
Regards,
Tobias
0 
Tobias,
Your explanations have been extremely helpful and I think definitely pointed me in the right direction.
Thank you!
Taylor
0 
My question is, while the RMP is a minimization problem and a new variable with its corresponding coefficients is going to be added to the model, how will the constraints and objective function be changed?
0 
Hi Shahrzad,
In gurobipy you would use Column().
Most of our other APIs have similar functionality.
 Riley
0 
Thank you Riley for your response. To clarify:
I have a model that is infeasible, and I'm trying to solve it using the TwoPhase method. At the end of Phase One, the summation of artificial variables is not zero, which I'm trying to reduce to zero by solving the pricing model and adding new variables. If the objective value of the newly generated variable in the pricing model is
sub_objective_value
, should I update the objective function of Phase One (of the TwoPhase model) as follows?If the objective function of Two phase is:
def set_phase1_objective(model, R):
phase1_obj = LinExpr()
for r in range(len(R)):
phase1_obj += R[r]
model.setObjective(phase1_obj, GRB.MINIMIZE)Updated objective function would be:
def set_phase1_objective(model, R):
phase1_obj = LinExpr()
for r in range(len(R)):
phase1_obj += R[r]
phase1_obj += new_var * sub_objective_value.getValue()model.setObjective(phase1_obj, GRB.MINIMIZE)
0 
Hi Shahrzad,
If it's a matter of updating an existing objective then I would use the obj parameter when defining a new variable. This will add the variables with the coefficient defined by obj, eg
>> import gurobipy as gp
>> m = gp.Model()
>> x = m.addVar(name="var_x")
>> m.setObjective(2*x, gp.GRB.MAXIMIZE)
>> y = m.addVar(vtype="B", obj=3, name="var_y")
>> m.update()
>> print(m.getObjective())
2.0 var_x + 3.0 var_yOr the following is more or less equivalent:
>> y = m.addVar(vtype="B", name="var_y")
>> y.obj = 3Either way though you would want to avoid recreating the objective from scratch each time.
 Riley
0 
Thank you so much Riley for your great comment and guidance.
0
Please sign in to leave a comment.
Comments
9 comments