Principal component analysis in Gurobi as a non-convex problem. Inaccurate solution
I am programming PCA in Gurobi Python as a non-convex optimization problem. I know that PCA has an analytical solution that is much easier to calculate, but I am interested in programming it this way as a first step toward further research. I think I have programmed it correctly, but the solution I obtain from Gurobi is far from the true PCA solution. I am wondering whether I am doing everything correctly and, if so, whether there is a way to improve the solution.
PCA, posed as a maximization problem, solves:
$$\max_w \; w^T X^T X w \quad \text{s.t.} \quad \|w\|_2 = 1$$
So far this is my code:
import gurobipy as gp
from gurobipy import GRB
import numpy as np
from sklearn.datasets import load_boston
from itertools import product
from sklearn.decomposition import PCA
boston = load_boston()
x = boston.data
x = x[:, [0, 2, 4, 5, 6, 7, 10, 11, 12]] # select noncategorical variables
n, p = x.shape
# Comparison against actual PCA
pca = PCA(n_components=9)
t_pca = pca.fit_transform(x) # scores
p_pca = pca.components_.T # loadings
pca_gurobi = gp.Model()
weights = pca_gurobi.addVars(p, lb=-GRB.INFINITY, name="weights") # pca loadings
Quad = np.dot(x.T, x)
obj_pca = (1/n)*sum(Quad[i, j] * weights[i] * weights[j] for i, j in product(range(p), repeat=2))
pca_gurobi.addConstr(sum(weights[i]**2 for i in range(p)) == 1)
pca_gurobi.setObjective(obj_pca, GRB.MAXIMIZE)
pca_gurobi.params.NonConvex = 2 # Let gurobi know that constraint is non convex
# Try to improve the results giving the actual solution as a warm start for the algorithm.
warm_start = True
if warm_start:
    for i in range(p):
        weights[i].start = p_pca[i, 0]
pca_gurobi.optimize()
weights_pca_gurobi = np.array([weights[i].X for i in range(p)])
print(weights_pca_gurobi)
print(p_pca[:, 0])
Here I am using a well-known dataset freely available in Python. Observe that I even plug in the true solution as a warm start for Gurobi, but even so, the returned solution is much worse. This code produces the following solutions:
# Gurobi solution
[7.58726104e-03 2.71715230e-02 7.74163052e-04 1.64739499e-02
 1.74272628e-01 1.04578219e-02 4.72430072e-02 9.82475074e-01
 3.10925661e-02]
# Objective value achieved:
140361.54
# True solution
[ 3.71319330e-02 2.81315994e-02 5.06471961e-04 1.04531379e-03
 9.29430232e-02 7.19608202e-03 4.37890979e-03 9.94094915e-01
 2.99068438e-02]
# Objective value achieved:
176900.33
Additionally, I have a second question in this regard. I would like to include an L1 norm as an additional constraint; that is, the new problem should be:
$$\max_w \; w^T X^T X w \quad \text{s.t.} \quad \|w\|_2 = 1, \quad \|w\|_1 \leq t$$
where t is a parameter whose value is predefined by the researcher. I have been struggling with this for a while, trying to use the abs_ helper function, but I cannot get it to work. I do not know whether this would be better as a second post, but since both questions concern the same problem...
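One common way to express such an L1 bound in gurobipy is through auxiliary variables linked to the weights via abs_ general constraints. Below is a minimal sketch of the modeling pattern (not a complete PCA model): the dimension p = 9 matches the question, while t = 2.0 is an arbitrary illustrative value standing in for the researcher-chosen parameter.

import gurobipy as gp
from gurobipy import GRB

p, t = 9, 2.0  # t is a hypothetical illustrative budget for the L1 norm
m = gp.Model()
w = m.addVars(p, lb=-GRB.INFINITY, name="w")  # free weight variables
z = m.addVars(p, name="z")                    # auxiliaries, z[i] >= 0 by default

# Link z[i] = |w[i]| with general constraints, then bound the L1 norm by t
for i in range(p):
    m.addConstr(z[i] == gp.abs_(w[i]))
m.addConstr(z.sum() <= t)
m.update()

With this pattern in place, the quadratic objective and the unit-norm constraint from the original model can be added on top of w unchanged.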

Official comment
Dear Álvaro,
Recall that PCA is not scale invariant. It is very important to check whether a specific implementation centers and/or scales the data matrix before finding the principal components. As stated on Scikit-learn's website: "The input data is centered but not scaled for each feature before applying the SVD (Singular Value Decomposition)."
Therefore, you must subtract the sample average from each column as follows:
x = x - np.mean(x, 0)
Having this change in place, you'll see that the optimal solution found by Gurobi for the first principal component agrees with Scikit-learn's.
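The centering argument can be checked without Gurobi: after subtracting the column means, the first principal component is exactly the top eigenvector of X^T X, which is what the optimization model maximizes over. A small sketch with synthetic data (random in place of the Boston dataset, which is only an assumption for illustration):

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 9))
n = x.shape[0]
xc = x - np.mean(x, 0)  # center each column, as recommended above

# Top eigenvector of (1/n) * Xc^T Xc is the optimizer of the quadratic form
evals, evecs = np.linalg.eigh(xc.T @ xc / n)
w = evecs[:, -1]  # eigenvector of the largest eigenvalue

# Scikit-learn's first component (it centers internally before the SVD)
p1 = PCA(n_components=1).fit(x).components_[0]

# The two loadings agree up to sign
assert np.allclose(np.abs(w), np.abs(p1), atol=1e-6)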