Column Generation Explanation
AnsweredHello,
I have a general question about column generation for largescale systems. I understand the Master problem and the Restricted master problem for the subset of variables. After I set up the RMP, I find the duals of the system. I then set up the pricing problem and find the minimum reduced cost to determine next entering variable into the system.
My question is about the subsets of variables. The pricing problem is only selecting the next column from the subset of variables, at what point is the subset modified to cover the other variables? My problem has roughly 114 million decision variables, with 113 binary. In my initial subset I include about 230,000 DVs and can iterate through that fairly quickly.
Thanks in advance,
Taylor

I am not sure whether I understand your question correctly, but maybe the following explanation is of help.
First, Gurobi does not support branchandprice, so you cannot solve an MIP to optimality by adding variables with negative reduced costs on the fly.
What you can do is to solve the LP relaxation of your model with column generation. Here, you think of solving your big problem with the 114 million variables, but you implicitly fix many of them to zero (by just not adding them to the model). This model with those variables "fixed to zero" is called the restricted master problem (RMP). You solve this LP. Now, if all of those variables that you fixed to zero (by not having them in the model) have nonnegative (for a minimization problem) reduced costs, then the RMP solution is optimal also for the full problem.
In order to check whether there exists a variable with negative reduced costs, you have to solve the pricing problem. This is very problem specific and depends on the application. Often, this boils down to solving a shortest path problem in a graph with the edge weights being the duals of the RMP solution. But of course, there are also applications in which the pricing problem is completely different.
Now, if there is a variable with negative reduced costs, you have to add at least one of them to the RMP and solve the RMP again to find a new solution to the RMP. Then, iterate the whole process.
But of course, if you only add just one variable to the RMP in each iteration, the whole process may be very slow as you may need a lot of iterations. For this reason, it is often beneficial to add many negative reduced cost variables to the RMP in each iteration. Whether you are actually able to find multiple such variables in your pricing procedure depends on your application and the pricing algorithm that you have.
Best regards,
Tobias
1 
Tobias,
Thank you for the quick reply and excellent explanation. A couple additional questions. So my question comes up with the pricing problem then. All of these "fixed to zero" variables must now checked by solving the pricing problem correct?
My problem is a resource allocation problem. I use DW decomp to reduce the number of constraints and end up with tons of binary variables representing different portfolios of resources that can be implemented. The master problem and RMP are then essentially a set partitioning formulation where a binary variable equals 1 if that portfolio is selected to be in the solution.
The dual of the LP relaxation is straight forward enough. But having a pricing problem with 114 million variables seems impossible. Please let me know if i'm on track with the idea of the pricing problem or if i'm making it too difficult.
Thank you,
Taylor
0 
Hi Taylor,
column generation really is an art. Typically, you cannot just say "these are too many variables, let's ignore some and price them in later". This approach would be very similar to the socalled "partial pricing" strategy, that one can find in primal simplex codes. Another related technique is the socalled "sifting algorithm". So, if you just do that (solve RMP, calculate reduced costs of all fixed variables by multiplying the dual solution to the matrix), then there is usually no point in doing it yourself. Just add all variables right away to the LP and hope that the LP solver deals with this appropriately. Gurobi uses partial pricing and sifting automatically, if it thinks that these techniques will be useful given the dimensions of the constraint matrix.
The ideal case for column generation is that you have a way to find a variable of negative reduced costs by means of a very fast (usually combinatorial) algorithm. As I said, the typical case is the shortest path algorithm. In many routing or logistics column generation applications, a variable represents a path through a network. The constraints relate to the nodes and/or the edges in the network. Then, one can often represent the pricing problem as a shortest path problem with the duals of the constraints used as length of the edges in the network. From all the millions of paths that are present in the network, a shortest path algorithm can efficiently find a shortest one, which yields a variable of smallest reduced costs.
DantzigWolfe decomposition is often powerful if you can exploit the structure of the individual blocks in your pricing problem. Again, this often means to apply a shortest path or some other combinatorial algorithm. But as I said, it is very dependent on the structure of the problem. There is no allfitsone solution, it is highly problem dependent.
For this reason, I also cannot give you an answer on how to solve your pricing problem.
Regards,
Tobias
0 
Tobias,
Your explanations have been extremely helpful and I think definitely pointed me in the right direction.
Thank you!
Taylor
0
Please sign in to leave a comment.
Comments
4 comments