Principal component analysis in Gurobi as a non-convex problem: inaccurate solution
Answered

I am programming PCA in Gurobi (Python) as a non-convex optimization problem. I know that PCA has an analytical solution that is much easier to compute, but I am interested in programming it this way as a first step in further research. I think I have programmed it correctly, but the solution I obtain with Gurobi is far from the true PCA solution. I am wondering whether I am doing everything correctly and, if so, whether there is a way to improve the solution.
As a maximization problem, PCA solves:
$$\max_{w} \; w^\top X^\top X w \quad \text{s.t.} \quad \|w\|_2 = 1$$
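For context, the analytical solution of this problem is the leading eigenvector of $X^\top X$, since for any symmetric matrix $A$,
$$\max_{\|w\|_2 = 1} w^\top A w = \lambda_{\max}(A),$$
with the maximum attained at the corresponding eigenvector.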
So far this is my code:
import gurobipy as gp
from gurobipy import GRB
import numpy as np
from sklearn.datasets import load_boston  # note: removed in scikit-learn >= 1.2
from itertools import product
from sklearn.decomposition import PCA

boston = load_boston()
x = boston.data
x = x[:, [0, 2, 4, 5, 6, 7, 10, 11, 12]]  # select non-categorical variables
n, p = x.shape

# Comparison against actual PCA
pca = PCA(n_components=9)
t_pca = pca.fit_transform(x)  # scores
p_pca = pca.components_.T  # loadings

pca_gurobi = gp.Model()
weights = pca_gurobi.addVars(p, lb=-GRB.INFINITY, name="weights")  # PCA loading vector (free variables)

Quad = np.dot(x.T, x)  # Gram matrix X^T X
obj_pca = (1 / n) * sum(Quad[i, j] * weights[i] * weights[j] for i, j in product(range(p), repeat=2))
pca_gurobi.addConstr(sum(weights[i] ** 2 for i in range(p)) == 1)  # unit L2 norm constraint
pca_gurobi.setObjective(obj_pca, GRB.MAXIMIZE)
pca_gurobi.params.NonConvex = 2  # let Gurobi handle the non-convex quadratic model

# Try to improve the results by giving the actual solution as a warm start.
warm_start = True
if warm_start:
    for i in range(p):
        weights[i].start = p_pca[i, 0]

pca_gurobi.optimize()
weights_pca_gurobi = np.array([weights[i].X for i in range(p)])
print(weights_pca_gurobi)
print(p_pca[:, 0])
Here I am using a well-known dataset freely available in Python. Observe that I even supply the true solution as a warm start for Gurobi, but even so, the solution is much worse. This code produces the following solutions:
# Gurobi solution
[-7.58726104e-03 -2.71715230e-02 -7.74163052e-04 -1.64739499e-02
-1.74272628e-01 -1.04578219e-02 -4.72430072e-02 -9.82475074e-01
-3.10925661e-02]
# Objective value achieved:
140361.54
# True solution
[ 3.71319330e-02 2.81315994e-02 5.06471961e-04 -1.04531379e-03
9.29430232e-02 -7.19608202e-03 4.37890979e-03 -9.94094915e-01
2.99068438e-02]
# Objective value achieved:
176900.33
Additionally, I have a second question in this regard. I would like to include an L1 norm as an additional constraint; that is, the new problem should be:
$$\max_{w} \; w^\top X^\top X w \quad \text{s.t.} \quad \|w\|_2 = 1, \quad \|w\|_1 \leq t$$
where $t$ is a parameter whose value is predefined by the researcher. I have been struggling with this for a while, trying to use the abs_ helper function, but I cannot get it to work. I do not know whether this would be better as a second post, but since both questions concern the same problem...
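For reference, this is the kind of formulation I am attempting (a sketch I have not verified; the names abs_w and t are ones I introduce here, building on the model above):

t = 2.0  # researcher-chosen bound on the L1 norm

# Auxiliary variables abs_w[i] == |weights[i]| via Gurobi's abs_ general constraint helper.
abs_w = pca_gurobi.addVars(p, name="abs_w")  # non-negative by default
for i in range(p):
    pca_gurobi.addConstr(abs_w[i] == gp.abs_(weights[i]))
pca_gurobi.addConstr(gp.quicksum(abs_w[i] for i in range(p)) <= t)  # ||w||_1 <= t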
Official comment
Dear Álvaro,
Recall that PCA is not scale invariant. It's very important to check whether a specific implementation centers and/or scales the data matrix before finding the principal components. As stated in Scikit-learn's website: "The input data is centered but not scaled for each feature before applying the SVD (Singular Value Decomposition)."
Therefore, you must subtract the sample average from each column as follows:
x = x - np.mean(x, 0)
Having this change in place, you'll see that the optimal solution found by Gurobi for the first principal component agrees with Scikit-learn.
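One caveat when comparing the two solutions: loading vectors are only defined up to sign, so Gurobi may return the negated vector. A quick sign-aligned check (a sketch using the variable names from the code above):

sign = np.sign(np.dot(weights_pca_gurobi, p_pca[:, 0]))  # align the arbitrary sign
print(np.allclose(sign * weights_pca_gurobi, p_pca[:, 0], atol=1e-4))  # True if the loadings agree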