nAG数値計算ライブラリ
> 最適化アルゴリズムExample集
> 非線形データフィッティングにおけるロバスト回帰
非線形データフィッティング-NLDF
このExampleでは、非線形回帰問題を最小二乗法とロバスト回帰の両方を用いて解いています。データは \(y = f(t;A,w) := A \cdot t \cdot \sin(-w \cdot t)\) を用いて生成され、ランダムノイズが加えられています。4つのデータ点(4, 12, 16, 20番目の残差)は外れ値となっています。最小二乗法とSmoothL1損失関数を用いたロバスト回帰の結果を比較することで、ロバスト回帰の外れ値に対する頑健性を示しています。
目的関数:
| タスク | 式 |
|---|---|
| Minimize | \(\sum_{i=1}^{24} \rho(r_i(x))\) |
- \(\rho\)は損失関数で、最小二乗法の場合は\(\rho(r) = r^2\)、SmoothL1の場合は\(\rho(r) = \begin{cases} 0.5r^2, & \text{if } |r| \leq 1 \\ |r| - 0.5, & \text{otherwise} \end{cases}\)
- \(r_i(x) = y_i - f(t_i; x_1, x_2)\)は\(i\)番目のデータ点における残差
決定変数:
| 変数 | 範囲 |
|---|---|
| \(x_1\) | \([-1, \infty)\) |
| \(x_2\) | \([0, 1]\) |
- \(x_1\): モデルパラメータ\(A\)を表す
- \(x_2\): モデルパラメータ\(w\)を表す
制約条件:
| 制約 | 式 |
|---|---|
| 線形制約 | \(x_1 + x_2 \in [0.2, 1]\) |
- モデルパラメータ\(A\)と\(w\)の和が0.2以上1以下であるという線形制約条件が課されています。
Exampleの実行コマンド:
python -m naginterfaces.library.examples.opt.handle_solve_nldf_ex
ソースコード表示コマンド:
python -c "import inspect; from naginterfaces.library.examples.opt import handle_solve_nldf_ex; print(''.join(inspect.getsourcelines(handle_solve_nldf_ex)[0]))"
出力結果例:
naginterfaces.library.opt.handle_solve_nldf Python Example Results.
First solve the problem using least squares loss function
E04GN, Nonlinear Data-Fitting
Status: converged, an optimal solution found
Final objective value 4.590715E+01
Primal variables:
idx Lower bound Value Upper bound
1 -1.00000E+00 9.43732E-02 inf
2 0.00000E+00 7.74046E-01 1.00000E+00
---------------------------------------------------------
Now solve the problem using SmoothL1 loss function
E04GN, Nonlinear Data-Fitting
Status: converged, an optimal solution found
Final objective value 1.294635E+01
Primal variables:
idx Lower bound Value Upper bound
1 -1.00000E+00 9.69201E-02 inf
2 0.00000E+00 7.95110E-01 1.00000E+00
マニュアル:
ソース:
#!/usr/bin/env python3
"``naginterfaces.library.opt.handle_solve_nldf`` Python Example."
# nAG Copyright 2022.
# pylint: disable=invalid-name
import numpy as np
from naginterfaces.base import utils
from naginterfaces.library import opt
def main():
"""
Example for :func:`naginterfaces.library.opt.handle_solve_nldf`.
General nonlinear data-fitting with constraints.
Solve a nonlinear regression problem using
both least squares and robust regression.
>>> main()
naginterfaces.library.opt.handle_solve_nldf Python Example Results.
First solve the problem using least squares loss function
E04GN, Nonlinear Data-Fitting
Status: converged, an optimal solution found
Final objective value 4.590715E+01
<BLANKLINE>
Primal variables:
idx Lower bound Value Upper bound
1 -1.00000E+00 9.43732E-02 inf
2 0.00000E+00 7.74046E-01 1.00000E+00
---------------------------------------------------------
Now solve the problem using SmoothL1 loss function
E04GN, Nonlinear Data-Fitting
Status: converged, an optimal solution found
Final objective value 1.294635E+01
<BLANKLINE>
Primal variables:
idx Lower bound Value Upper bound
1 -1.00000E+00 9.69201E-02 inf
2 0.00000E+00 7.95110E-01 1.00000E+00
"""
print(
'naginterfaces.library.opt.handle_solve_nldf Python Example Results.'
)
def cb_lsqfun(x, nres, inform, data):
rx = np.zeros(nres,dtype=float)
vec1 = data["vec1"]
vec2 = data["vec2"]
for i in range(nres):
rx[i] = (vec2[i] - x[0]*vec1[i]*np.sin(-x[1]*vec1[i]))
return rx, inform
def cb_lsqgrd(x, nres, rdx, inform, data):
vec1 = data["vec1"]
nvar = len(x)
for i in range(nres):
rdx[i*nvar] = -vec1[i]*np.sin(-x[1]*vec1[i])
rdx[i*nvar+1] = vec1[i]**2*x[0]*np.cos(x[1]*vec1[i])
return inform
# Data for model y = f(t;A,w) := A*t*sin(-w*t)
# Data generated using A = 0.1 and w = 0.8 and added random noise
# Residual 4, 12, 16 and 20 are outliers
data = {}
data["vec1"] = [1.0E+00,2.0E+00,3.0E+00,4.0E+00,5.0E+00,6.0E+00,7.0E+00,
8.0E+00,9.0E+00,1.0E+01,1.1E+01,1.2E+01,1.3E+01,1.4E+01,
1.5E+01,1.6E+01,1.7E+01,1.8E+01,1.9E+01,2.0E+01,2.1E+01,
2.2E+01,2.3E+01,2.4E+01]
data["vec2"] = [0.0523,-0.1442,-0.0422,1.8106,0.3271,0.4684,0.4593,
-0.0169,-0.7811,-1.1356,-0.5343,-3.0043,1.1832,1.5153
,0.7120,-2.2923,-1.4871,-1.7083,-0.9936,-5.2873,
1.7555,2.0642,0.9499,-0.6234]
# Create a handle for the problem:
nvar = 2
handle = opt.handle_init(nvar=nvar)
# Define the residuals structure:
nres = 24
opt.handle_set_nlnls(handle, nres)
# Define the bounds on the variables:
opt.handle_set_simplebounds(handle, bl=[-1.0, 0.], bu=[1.e20, 1.0])
# Define the linear constraint
opt.handle_set_linconstr(
handle,
bl=[0.2],
bu=[1.],
irowb=[1,1],
icolb=[1,2],b=[1.,1.],
)
# Set the loss function and some printing options
for option in [
'NLDF Loss Function Type = L2',
'Print Level = 1',
'Print Options = No',
'Print solution = yes'
]:
opt.handle_opt_set(handle, option)
# Use an explicit I/O manager for abbreviated iteration output:
iom = utils.FileObjManager(locus_in_output=False)
# Call the solver
print("First solve the problem using least squares loss function")
x = [0.3, 0.7]
opt.handle_solve_nldf(
handle, cb_lsqfun, cb_lsqgrd, x, nres,
data=data, io_manager=iom,
)
print('---------------------------------------------------------')
# Use robust nonlinear regression and resolve the model
print('Now solve the problem using SmoothL1 loss function')
x = [0.3, 0.7]
opt.handle_opt_set(handle, 'NLDF Loss Function Type = smoothL1')
opt.handle_solve_nldf(
handle, cb_lsqfun, cb_lsqgrd, x, nres,
data=data, io_manager=iom,
)
if __name__ == '__main__':
import doctest
import sys
sys.exit(
doctest.testmod(
None, verbose=True, report=False,
optionflags=doctest.ELLIPSIS | doctest.REPORT_NDIFF,
).failed
)
