非線形データフィッティングにおけるロバスト回帰

nAG数値計算ライブラリ > 最適化アルゴリズムExample集 > 非線形データフィッティングにおけるロバスト回帰

非線形データフィッティング-NLDF

このExampleでは、非線形回帰問題を最小二乗法とロバスト回帰の両方を用いて解いています。データは \(y = f(t;A,w) := A \cdot t \cdot \sin(-w \cdot t)\) を用いて生成され、ランダムノイズが加えられています。4つのデータ点(4, 12, 16, 20番目の残差)は外れ値となっています。最小二乗法とSmoothL1損失関数を用いたロバスト回帰の結果を比較することで、ロバスト回帰の外れ値に対する頑健性を示しています。

目的関数:

タスク	式
Minimize	\(\sum_{i=1}^{24} \rho(r_i(x))\)

\(\rho\)は損失関数で、最小二乗法の場合は\(\rho(r) = r^2\)、SmoothL1の場合は\(\rho(r) = \begin{cases} 0.5r^2, & \text{if } |r| \leq 1 \\ |r| - 0.5, & \text{otherwise} \end{cases}\)
\(r_i(x) = y_i - f(t_i; x_1, x_2)\)は\(i\)番目のデータ点における残差

決定変数:

変数	範囲
\(x_1\)	\([-1, \infty)\)
\(x_2\)	\([0, 1]\)

\(x_1\): モデルパラメータ\(A\)を表す
\(x_2\): モデルパラメータ\(w\)を表す

制約条件:

制約	式
線形制約	\(x_1 + x_2 \in [0.2, 1]\)

モデルパラメータ\(A\)と\(w\)の和が0.2以上1以下であるという線形制約条件が課されています。

Exampleの実行コマンド:

python -m naginterfaces.library.examples.opt.handle_solve_nldf_ex

ソースコード表示コマンド:

python -c "import inspect; from naginterfaces.library.examples.opt import handle_solve_nldf_ex; print(''.join(inspect.getsourcelines(handle_solve_nldf_ex)[0]))"

出力結果例:

naginterfaces.library.opt.handle_solve_nldf Python Example Results.
First solve the problem using least squares loss function
 E04GN, Nonlinear Data-Fitting
 Status: converged, an optimal solution found
 Final objective value  4.590715E+01
 Primal variables:
   idx   Lower bound       Value       Upper bound
     1  -1.00000E+00    9.43732E-02         inf
     2   0.00000E+00    7.74046E-01    1.00000E+00
---------------------------------------------------------
Now solve the problem using SmoothL1 loss function
 E04GN, Nonlinear Data-Fitting
 Status: converged, an optimal solution found
 Final objective value  1.294635E+01
 Primal variables:
   idx   Lower bound       Value       Upper bound
     1  -1.00000E+00    9.69201E-02         inf
     2   0.00000E+00    7.95110E-01    1.00000E+00

マニュアル:

handle_solve_nldfのマニュアル

ソース:

#!/usr/bin/env python3
"``naginterfaces.library.opt.handle_solve_nldf`` Python Example."

# nAG Copyright 2022.

# pylint: disable=invalid-name

import numpy as np

from naginterfaces.base import utils
from naginterfaces.library import opt

def main():
    """
    Example for :func:`naginterfaces.library.opt.handle_solve_nldf`.

    General nonlinear data-fitting with constraints.

    Solve a nonlinear regression problem using
    both least squares and robust regression.

    >>> main()
    naginterfaces.library.opt.handle_solve_nldf Python Example Results.
    First solve the problem using least squares loss function
     E04GN, Nonlinear Data-Fitting
     Status: converged, an optimal solution found
     Final objective value  4.590715E+01
    <BLANKLINE>
     Primal variables:
       idx   Lower bound       Value       Upper bound
         1  -1.00000E+00    9.43732E-02         inf
         2   0.00000E+00    7.74046E-01    1.00000E+00
    ---------------------------------------------------------
    Now solve the problem using SmoothL1 loss function
     E04GN, Nonlinear Data-Fitting
     Status: converged, an optimal solution found
     Final objective value  1.294635E+01
    <BLANKLINE>
     Primal variables:
       idx   Lower bound       Value       Upper bound
         1  -1.00000E+00    9.69201E-02         inf
         2   0.00000E+00    7.95110E-01    1.00000E+00
    """

    print(
        'naginterfaces.library.opt.handle_solve_nldf Python Example Results.'
    )

    def cb_lsqfun(x, nres, inform, data):
        rx = np.zeros(nres,dtype=float)
        vec1 = data["vec1"]
        vec2 = data["vec2"]

        for i in range(nres):
            rx[i] = (vec2[i] - x[0]*vec1[i]*np.sin(-x[1]*vec1[i]))

        return rx, inform

    def cb_lsqgrd(x, nres, rdx, inform, data):
        vec1 = data["vec1"]
        nvar = len(x)

        for i in range(nres):
            rdx[i*nvar] = -vec1[i]*np.sin(-x[1]*vec1[i])
            rdx[i*nvar+1] = vec1[i]**2*x[0]*np.cos(x[1]*vec1[i])

        return inform

    # Data for model y = f(t;A,w) := A*t*sin(-w*t)
    # Data generated using A = 0.1 and w = 0.8 and added random noise
    # Residual 4, 12, 16 and 20 are outliers
    data = {}
    data["vec1"] = [1.0E+00,2.0E+00,3.0E+00,4.0E+00,5.0E+00,6.0E+00,7.0E+00,
                    8.0E+00,9.0E+00,1.0E+01,1.1E+01,1.2E+01,1.3E+01,1.4E+01,
                    1.5E+01,1.6E+01,1.7E+01,1.8E+01,1.9E+01,2.0E+01,2.1E+01,
                    2.2E+01,2.3E+01,2.4E+01]
    data["vec2"] = [0.0523,-0.1442,-0.0422,1.8106,0.3271,0.4684,0.4593,
                    -0.0169,-0.7811,-1.1356,-0.5343,-3.0043,1.1832,1.5153
                    ,0.7120,-2.2923,-1.4871,-1.7083,-0.9936,-5.2873,
                    1.7555,2.0642,0.9499,-0.6234]

    # Create a handle for the problem:
    nvar = 2
    handle = opt.handle_init(nvar=nvar)

    # Define the residuals structure:
    nres = 24
    opt.handle_set_nlnls(handle, nres)

    # Define the bounds on the variables:
    opt.handle_set_simplebounds(handle, bl=[-1.0, 0.], bu=[1.e20, 1.0])

    # Define the linear constraint
    opt.handle_set_linconstr(
        handle,
        bl=[0.2],
        bu=[1.],
        irowb=[1,1],
        icolb=[1,2],b=[1.,1.],
    )

    # Set the loss function and some printing options
    for option in [
            'NLDF Loss Function Type = L2',
            'Print Level = 1',
            'Print Options = No',
            'Print solution = yes'
    ]:
        opt.handle_opt_set(handle, option)

    # Use an explicit I/O manager for abbreviated iteration output:
    iom = utils.FileObjManager(locus_in_output=False)

    # Call the solver
    print("First solve the problem using least squares loss function")
    x = [0.3, 0.7]
    opt.handle_solve_nldf(
        handle, cb_lsqfun, cb_lsqgrd, x, nres,
        data=data, io_manager=iom,
    )
    print('---------------------------------------------------------')

    # Use robust nonlinear regression and resolve the model
    print('Now solve the problem using SmoothL1 loss function')
    x = [0.3, 0.7]
    opt.handle_opt_set(handle, 'NLDF Loss Function Type = smoothL1')
    opt.handle_solve_nldf(
        handle, cb_lsqfun, cb_lsqgrd, x, nres,
        data=data, io_manager=iom,
    )

if __name__ == '__main__':
    import doctest
    import sys
    sys.exit(
        doctest.testmod(
            None, verbose=True, report=False,
            optionflags=doctest.ELLIPSIS | doctest.REPORT_NDIFF,
        ).failed
    )