GCP-OPT Examples with Amino Acids Dataset
Setup
We use the well-known amino acids dataset for these tests. The data contains some negative values, but the factorization itself should be nonnegative.
rng('default'); %<- Setting random seed for reproducibility of this script

% Load the data
load(fullfile(getfield(what('tensor_toolbox'),'path'),'doc','aminoacids.mat'))

clear M fit

vizopts = {'PlotCommands',{@bar,@(x,y) plot(x,y,'r'),@(x,y) plot(x,y,'g')},...
    'BottomSpace',0.1, 'HorzSpace', 0.04, 'Normalize', @(x) normalize(x,'sort',2)};
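As a quick sanity check (an addition of ours, not part of the original script), we can confirm the tensor dimensions and count the negative entries mentioned above:

% Sanity check (not in the original script): confirm the size of X and
% count the negative entries mentioned in the text.
fprintf('Size of X: %s\n', mat2str(size(X)));
fprintf('Negative entries: %d of %d\n', nnz(X.data < 0), numel(X.data));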
CP-ALS
Just a reminder of what CP-ALS does.
cnt = 1;

tic, M{cnt} = cp_als(X,3,'printitn',10); toc

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
CP_ALS:
 Iter 10: f = 9.072207e-01 f-delta = 5.0e-02
 Iter 20: f = 9.713716e-01 f-delta = 6.0e-04
 Iter 30: f = 9.742235e-01 f-delta = 1.3e-04
 Iter 32: f = 9.744253e-01 f-delta = 9.3e-05
 Final f = 9.744253e-01
Elapsed time is 0.111475 seconds.
Fit: 0.974425
GCP with Gaussian
We can instead call GCP with the Gaussian loss, which minimizes the same sum-of-squared-errors objective as CP-ALS.
cnt = 2;

M{cnt} = gcp_opt(X,3,'type','Gaussian','printitn',10);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
GCP-OPT-LBFGSB (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Generalized function Type: Gaussian
Objective function: @(x,m)(m-x).^2
Gradient function: @(x,m)2.*(m-x)
Lower bound of factor matrices: -Inf
Optimization method: lbfgsb
Max iterations: 1000
Projected gradient tolerance: 6.131

Begin Main loop
Iter 10, f(x) = 1.606769e+08, ||grad||_infty = 4.59e+06
Iter 20, f(x) = 1.688031e+06, ||grad||_infty = 1.64e+05
Iter 30, f(x) = 1.448515e+06, ||grad||_infty = 1.75e+04
Iter 40, f(x) = 1.445322e+06, ||grad||_infty = 3.71e+03
Iter 50, f(x) = 1.445120e+06, ||grad||_infty = 1.88e+03
Iter 60, f(x) = 1.445110e+06, ||grad||_infty = 1.18e+02
Iter 66, f(x) = 1.445110e+06, ||grad||_infty = 2.15e+01
End Main Loop

Final objective: 1.4451e+06
Setup time: 0.05 seconds
Main loop time: 0.21 seconds
Outer iterations: 66
Total iterations: 142
L-BFGS-B Exit message: CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH.
Fit: 0.974951
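Since the Gaussian loss minimizes the same squared error as CP-ALS, the two solutions should essentially agree. As an optional check (our addition, not part of the original example), we can compare them with the ktensor score function:

% Optional check (not in the original example): compare the GCP-Gaussian
% factors against the CP-ALS factors. A score near 1 indicates the same
% solution up to permutation and scaling of the components.
scr = score(M{2}, M{1});
fprintf('Factor match score vs. CP-ALS: %g\n', scr);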
GCP with Gaussian and Missing Data
What if some data is missing? GCP can ignore missing entries during the fit via the 'mask' option.
cnt = 3;

% Proportion of missing data
p = 0.35;

% Create a mask with the missing entries set to 0 and everything else 1
W = tensor(double(rand(size(X))>p));

% Fit the model, using the 'mask' option
M{cnt} = gcp_opt(X.*W,3,'type','Gaussian','mask',W,'printitn',10);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
GCP-OPT-LBFGSB (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Missing entries: 21604 (35%)
Generalized function Type: Gaussian
Objective function: @(x,m)(m-x).^2
Gradient function: @(x,m)2.*(m-x)
Lower bound of factor matrices: -Inf
Optimization method: lbfgsb
Max iterations: 1000
Projected gradient tolerance: 6.131

Begin Main loop
Iter 10, f(x) = 1.218345e+07, ||grad||_infty = 1.11e+06
Iter 20, f(x) = 1.215698e+06, ||grad||_infty = 9.32e+04
Iter 30, f(x) = 9.243577e+05, ||grad||_infty = 5.99e+03
Iter 40, f(x) = 9.236897e+05, ||grad||_infty = 7.31e+02
Iter 50, f(x) = 9.236578e+05, ||grad||_infty = 1.62e+02
Iter 60, f(x) = 9.236571e+05, ||grad||_infty = 1.14e+02
Iter 62, f(x) = 9.236571e+05, ||grad||_infty = 1.34e+01
End Main Loop

Final objective: 9.2366e+05
Setup time: 0.01 seconds
Main loop time: 0.20 seconds
Outer iterations: 62
Total iterations: 133
L-BFGS-B Exit message: CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH.
Fit: 0.974844
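The fit reported above is computed against the full tensor, including entries the solver never saw. A sharper check (our sketch, not part of the original example) is the relative error restricted to the held-out entries:

% Sketch (not in the original example): measure the relative error only
% on the entries hidden from the fit by the mask W.
Wmiss = tensor(1 - double(W));   % indicator of the missing entries
err_miss = norm(Wmiss .* (full(M{3}) - X)) / norm(Wmiss .* X);
fprintf('Relative error on held-out entries: %g\n', err_miss);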
GCP with ADAM
We can also use a stochastic gradient method (ADAM), though it's pretty slow for such a small tensor.
cnt = 4;

% Specify 'opt' = 'adam'
M{cnt} = gcp_opt(X,3,'type','Gaussian','opt','adam','printitn',1,'fsamp',5000,'gsamp',500);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
GCP-OPT-ADAM (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Generalized function Type: Gaussian
Objective function: @(x,m)(m-x).^2
Gradient function: @(x,m)2.*(m-x)
Lower bound of factor matrices: -Inf
Optimization method: adam
Max iterations (epochs): 1000
Iterations per epoch: 1000
Learning rate / decay / maxfails: 0.001 0.1 1
Function Sampler: uniform with 5000 samples
Gradient Sampler: uniform with 500 samples

Begin Main loop
Initial f-est: 2.376253e+09
Epoch  1: f-est = 1.562517e+09, step = 0.001
Epoch  2: f-est = 1.184782e+09, step = 0.001
Epoch  3: f-est = 9.455418e+08, step = 0.001
Epoch  4: f-est = 7.906016e+08, step = 0.001
Epoch  5: f-est = 6.720333e+08, step = 0.001
Epoch  6: f-est = 5.629400e+08, step = 0.001
Epoch  7: f-est = 4.587569e+08, step = 0.001
Epoch  8: f-est = 3.660608e+08, step = 0.001
Epoch  9: f-est = 2.845412e+08, step = 0.001
Epoch 10: f-est = 2.102683e+08, step = 0.001
Epoch 11: f-est = 1.430214e+08, step = 0.001
Epoch 12: f-est = 8.715441e+07, step = 0.001
Epoch 13: f-est = 4.704066e+07, step = 0.001
Epoch 14: f-est = 2.347157e+07, step = 0.001
Epoch 15: f-est = 1.150061e+07, step = 0.001
Epoch 16: f-est = 6.137558e+06, step = 0.001
Epoch 17: f-est = 4.016841e+06, step = 0.001
Epoch 18: f-est = 3.055556e+06, step = 0.001
Epoch 19: f-est = 2.555754e+06, step = 0.001
Epoch 20: f-est = 2.198882e+06, step = 0.001
Epoch 21: f-est = 1.909576e+06, step = 0.001
Epoch 22: f-est = 1.725434e+06, step = 0.001
Epoch 23: f-est = 1.634091e+06, step = 0.001
Epoch 24: f-est = 1.572912e+06, step = 0.001
Epoch 25: f-est = 1.558587e+06, step = 0.001
Epoch 26: f-est = 1.547261e+06, step = 0.001
Epoch 27: f-est = 1.540582e+06, step = 0.001
Epoch 28: f-est = 1.553539e+06, step = 0.001, nfails = 1 (resetting to solution from last epoch)
Epoch 29: f-est = 1.533630e+06, step = 0.0001
Epoch 30: f-est = 1.534586e+06, step = 0.0001, nfails = 2 (resetting to solution from last epoch)
End Main Loop

Final f-est: 1.5336e+06
Setup time: 0.03 seconds
Main loop time: 25.05 seconds
Total iterations: 30000
Fit: 0.974923
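The sampler sizes and learning rate trade off cost per epoch against convergence speed. Here is a hedged variation of ours (the 'rate' option name is our assumption based on the printed "Learning rate" line; 'fsamp' and 'gsamp' appear in the call above):

% Hedged sketch (not in the original example): larger gradient samples
% and a higher learning rate may reach a comparable f-est in fewer
% epochs. The 'rate' option name is an assumption on our part.
M_adam2 = gcp_opt(X,3,'type','Gaussian','opt','adam', ...
    'rate',1e-2,'fsamp',10000,'gsamp',2000,'printitn',5);
fprintf('Fit: %g\n', 1 - norm(full(M_adam2)-X)/norm(X));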
GCP with Gamma (terrible!)
We can try the Gamma loss, but it's not really the right distribution for this data and produces a terrible result. Since Gamma requires nonnegative data, we first zero out the negative entries.
cnt = 5;

% Zero out the negative entries so the data is nonnegative
Y = tensor(X(:) .* (X(:) > 0), size(X));

M{cnt} = gcp_opt(Y,3,'type','Gamma','printitn',25);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
Warning: Using 'Gamma' type but tensor X is not nonnegative

GCP-OPT-LBFGSB (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Generalized function Type: Gamma
Objective function: @(x,m)x./(m+1e-10)+log(m+1e-10)
Gradient function: @(x,m)-x./((m+1e-10).^2)+1./(m+1e-10)
Lower bound of factor matrices: 0
Optimization method: lbfgsb
Max iterations: 1000
Projected gradient tolerance: 6.131

Begin Main loop
Iter   25, f(x) = 3.083247e+05, ||grad||_infty = 2.12e+03
Iter   50, f(x) = 3.061446e+05, ||grad||_infty = 1.57e+03
Iter   75, f(x) = 3.042764e+05, ||grad||_infty = 3.33e+03
Iter  100, f(x) = 3.030117e+05, ||grad||_infty = 2.58e+03
Iter  125, f(x) = 3.020461e+05, ||grad||_infty = 2.82e+03
Iter  150, f(x) = 3.013471e+05, ||grad||_infty = 2.13e+03
Iter  175, f(x) = 2.999417e+05, ||grad||_infty = 7.65e+03
Iter  200, f(x) = 2.978982e+05, ||grad||_infty = 2.87e+04
Iter  225, f(x) = 2.963939e+05, ||grad||_infty = 5.74e+03
Iter  250, f(x) = 2.954331e+05, ||grad||_infty = 9.36e+03
Iter  275, f(x) = 2.943538e+05, ||grad||_infty = 3.05e+03
Iter  300, f(x) = 2.931734e+05, ||grad||_infty = 3.03e+03
Iter  325, f(x) = 2.924773e+05, ||grad||_infty = 3.29e+03
Iter  350, f(x) = 2.916153e+05, ||grad||_infty = 2.73e+03
Iter  375, f(x) = 2.906693e+05, ||grad||_infty = 1.14e+04
Iter  400, f(x) = 2.894365e+05, ||grad||_infty = 2.82e+03
Iter  425, f(x) = 2.886059e+05, ||grad||_infty = 3.85e+03
Iter  450, f(x) = 2.878322e+05, ||grad||_infty = 3.86e+03
Iter  475, f(x) = 2.871707e+05, ||grad||_infty = 8.92e+03
Iter  500, f(x) = 2.868003e+05, ||grad||_infty = 8.36e+03
Iter  525, f(x) = 2.863173e+05, ||grad||_infty = 6.77e+03
Iter  550, f(x) = 2.857842e+05, ||grad||_infty = 5.46e+03
Iter  575, f(x) = 2.852310e+05, ||grad||_infty = 4.58e+03
Iter  600, f(x) = 2.847018e+05, ||grad||_infty = 2.46e+03
Iter  625, f(x) = 2.844237e+05, ||grad||_infty = 2.23e+03
Iter  650, f(x) = 2.840324e+05, ||grad||_infty = 2.68e+03
Iter  675, f(x) = 2.831906e+05, ||grad||_infty = 6.25e+03
Iter  700, f(x) = 2.815585e+05, ||grad||_infty = 3.00e+03
Iter  725, f(x) = 2.806378e+05, ||grad||_infty = 3.05e+03
Iter  750, f(x) = 2.800926e+05, ||grad||_infty = 2.42e+03
Iter  775, f(x) = 2.798322e+05, ||grad||_infty = 2.42e+03
Iter  800, f(x) = 2.795942e+05, ||grad||_infty = 2.36e+03
Iter  825, f(x) = 2.793289e+05, ||grad||_infty = 2.59e+03
Iter  850, f(x) = 2.791985e+05, ||grad||_infty = 2.53e+03
Iter  875, f(x) = 2.789471e+05, ||grad||_infty = 3.60e+03
Iter  900, f(x) = 2.785491e+05, ||grad||_infty = 3.91e+03
Iter  925, f(x) = 2.781030e+05, ||grad||_infty = 4.35e+03
Iter  950, f(x) = 2.776572e+05, ||grad||_infty = 1.69e+03
Iter  975, f(x) = 2.773212e+05, ||grad||_infty = 3.80e+03
Iter 1000, f(x) = 2.769849e+05, ||grad||_infty = 1.81e+03
End Main Loop

Final objective: 2.7698e+05
Setup time: 0.02 seconds
Main loop time: 3.52 seconds
Outer iterations: 1000
Total iterations: 2156
L-BFGS-B Exit message: UNRECOGNIZED EXIT FLAG
Fit: 0.403596
GCP with Huber + Lower Bound
Huber works well. By default, Huber has no lower bound. To add one, we have to pass in the func/grad/lower information explicitly. We can use tt_gcp_fg_setup to get the func/grad parameters.
cnt = 6;

% Call helper function tt_gcp_fg_setup to get the function and gradient handles
[fh,gh] = tt_gcp_fg_setup('Huber (0.25)');

% Pass the func/grad/lower explicitly.
M{cnt} = gcp_opt(X,3,'func',fh,'grad',gh,'lower',0,'printitn',25);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
GCP-OPT-LBFGSB (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Generalized function Type: user-specified
Objective function: @(x,m)(x-m).^2.*(abs(x-m)<0.25)+(0.5.*abs(x-m)-0.0625).*(abs(x-m)>=0.25)
Gradient function: @(x,m)-2.*(x-m).*(abs(x-m)<0.25)-(0.5.*sign(x-m)).*(abs(x-m)>=0.25)
Lower bound of factor matrices: 0
Optimization method: lbfgsb
Max iterations: 1000
Projected gradient tolerance: 6.131

Begin Main loop
Iter 25, f(x) = 4.493693e+05, ||grad||_infty = 9.33e+03
Iter 50, f(x) = 8.860955e+04, ||grad||_infty = 8.62e+03
Iter 75, f(x) = 8.304119e+04, ||grad||_infty = 3.40e+03
Iter 91, f(x) = 8.303986e+04, ||grad||_infty = 3.39e+03
End Main Loop

Final objective: 8.3040e+04
Setup time: 0.03 seconds
Main loop time: 0.21 seconds
Outer iterations: 91
Total iterations: 187
L-BFGS-B Exit message: CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL.
Fit: 0.973484
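The printed objective and gradient show the exact functional form, so the same loss can also be written by hand. Here is a sketch of ours (not part of the original example) with a different Huber threshold:

% Sketch (not in the original example): hand-written Huber handles with
% threshold d = 0.5, mirroring the d = 0.25 forms printed above.
d = 0.5;
fh2 = @(x,m) (x-m).^2.*(abs(x-m)<d) + (2*d.*abs(x-m)-d^2).*(abs(x-m)>=d);
gh2 = @(x,m) -2.*(x-m).*(abs(x-m)<d) - (2*d.*sign(x-m)).*(abs(x-m)>=d);
M_huber = gcp_opt(X,3,'func',fh2,'grad',gh2,'lower',0,'printitn',25);
fprintf('Fit (Huber 0.5): %g\n', 1 - norm(full(M_huber)-X)/norm(X));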
GCP with Beta
This is also pretty bad, which illustrates the cost of choosing the wrong distribution. The fit improves a little, but it's clearly the wrong objective.
cnt = 7;

M{cnt} = gcp_opt(X,3,'type','beta (0.75)','printitn',25);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
Warning: Using 'beta' type but tensor X is not nonnegative

GCP-OPT-LBFGSB (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Generalized function Type: beta (0.75)
Objective function: @(x,m)(1.33333).*(m+1e-10).^(0.75)-(-4).*x.*(m+1e-10).^(-0.25)
Gradient function: @(x,m)(m+1e-10).^(-0.25)-x.*(m+1e-10).^(-1.25)
Lower bound of factor matrices: 0
Optimization method: lbfgsb
Max iterations: 1000
Projected gradient tolerance: 6.131

Begin Main loop
Positive dir derivative in projection
Using the backtracking step
Iter 25, f(x) = 9.861901e+06, ||grad||_infty = 9.24e+15
Iter 40, f(x) = 9.818012e+06, ||grad||_infty = 8.54e+15
End Main Loop

Final objective: 9.8180e+06
Setup time: 0.01 seconds
Main loop time: 1.21 seconds
Outer iterations: 40
Total iterations: 142
L-BFGS-B Exit message: CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH.
Fit: 0.56936
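Finally, a small summary loop of ours (not part of the original example) to compare the fits collected above:

% Summary (not in the original example): print the fit for each model,
% in the order the models were fit above.
names = {'CP-ALS','Gaussian','Gaussian+missing','ADAM','Gamma','Huber','Beta'};
for k = 1:numel(fit)
    fprintf('%-18s fit: %g\n', names{k}, fit(k));
end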