id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc
646	Issues with the bending angle 1dvar minimizer	Ian Culverwell	Ian Culverwell	"Chris Marquardt (EUM) has identified some issues with the 1dvar minimisers in ROPP. Here is what he says.

= Minimiser issues in ROPP 9.0 =

This note reports on some results I have obtained with the vanilla bending angle 1DVar from ROPP v9.0 (and also a modified version I prepared.) In general, all runs were performed using a 1DVar
configuration based on ROPP's default '''ecmwf_bangle_1dvar.cf'''. The only difference common to all tests I ran is that the covariance methods were chosen differently from the default settings,
mostly because RSFC was the only method supporting ECMWF's current 137 levels (and I don't remember the reason for chosing a different observation covariance method, to be honest):
{{{
    #bg_covar_method  = VSFC   # Variable Sigmas, Fixed Correlations
    bg_covar_method  = RSFC    # Relative Sigmas, Fixed Correlations

    #obs_covar_method = VSDC   # Variable Sigmas, Diagonal Correlations
    obs_covar_method = FSFC    # Fixed Sigmas, Fixed Correlations
}}}
In all experiments, shown below, I use the first profile we got from Metop-C (available in the
ROPP bending angle file '''data/eum/first.nc'''; that is also how I discovered the problem. The experiments described below can be carried out by running the scripts '''run_1dvar-00''',
'''run_1dvar-01''', '''run_1dvar-02''', '''run_1dvar-03''', and '''run_1dvar-00''', respectively.  

== MINROPP ==

Using MINROPP as minimiser and running this setup on the first Metop-C profile gives the following output (which can be reproduced with the configuration file '''ecmwf_bangle_1dvar-01.cf''':

{{{
    ---------------------------------------------------------------------
                       ROPP Bending Angle 1D-Var
    ---------------------------------------------------------------------

    INFO (from ropp_1dvar_bangle):  Reading configuration file ecmwf_bangle_1dvar.cf.

    INFO (from ropp_1dvar_bangle):  Reading observation data for profile 1 from the file
           first.nc.

    INFO (from ropp_1dvar_bangle):  Reading background data for profile 1 from the file
           ../ecmwf/2018/11/13/GRAS_1B_M03_20181113100752Z_20181113100945Z_N_C_20181113121522Z_G22_NN.fw.nc.

    INFO (from ropp_qc_cutoff):  33 points rejected for being above maximum impact height of 50.0 km.

    INFO (from ropp_qc_bgqc):  Background quality control lets all bending angle values pass.

    INFO (from ropp_1dvar_solve):  Using background error covariance matrix for preconditioning.

    INFO (from ropp_1dvar_cost):  Using absolute decrease of cost function and relative change
      of state vector as additional convergence criteria.

          n_iter =    1   J = 97.039            max(relative change in state) =  -
          n_iter =    2   J = 79.446            max(relative change in state) = 0.42320
    INFO (from ropp_1dvar_cost):  Convergence assumed to be achieved as the cost function is increasing

    INFO (from ropp_1dvar_cost):  Minimization of cost function successfully finished (according to additional
        convergence criteria). Number of required iterations: 3.

    INFO (from ropp_1dvar_bangle):  Writing 1DVar retrieval for profile 1 to the file
           first-1dvar-default.nc.
}}}

The main point in the above log is that the iteration already stops after 2 or 3 iterations. With the LEVMARQ minimiser, however, a 1DVar for the same profile results in the following output:

{{{
    ---------------------------------------------------------------------
                       ROPP Bending Angle 1D-Var
    ---------------------------------------------------------------------

    INFO (from ropp_1dvar_bangle):  Reading configuration file ecmwf_bangle_1dvar.cf.

    INFO (from ropp_1dvar_bangle):  Reading observation data for profile 1 from the file
           first.nc.

    INFO (from ropp_1dvar_bangle):  Reading background data for profile 1 from the file
           ../ecmwf/2018/11/13/GRAS_1B_M03_20181113100752Z_20181113100945Z_N_C_20181113121522Z_G22_NN.fw.nc.

    INFO (from ropp_qc_cutoff):  33 points rejected for being above maximum impact height of 50.0 km.

    INFO (from ropp_qc_bgqc):  Background quality control lets all bending angle values pass.

          n_iter =    1   J = 97.039            max(relative change in state) =  -
          n_iter =    2   J = 32.400            max(relative change in state) = 0.13905
          n_iter =    3   J = 32.382            max(relative change in state) = 0.16073
          n_iter =    4   J = 32.378            max(relative change in state) = 0.12803

    INFO (from ropp_1dvar_levmarq):  Convergence assumed to be achieved as the cost function did not change by more
        than 0.10000 for the last 2 iterations.

    INFO (from ropp_1dvar_bangle):  Writing 1DVar retrieval for profile 1 to the file
           first-1dvar-default.nc.
}}}

i.e. we get quite a different cost function compared to the achieved by MINROPP (32.4 vs 79.4) - and obviously, the MINROPP minimiser isn't even near the minimum of the cost function.

It is possible to force MINROPP to do more iterations, of course. Again running MINROPP, but this time with
{{{
    conv_check_n_previous      = 5              # Minimum number of iterations required
}}}
(where the default value is 2 iterations; see '''ecmwf_bangle_1dvar-02.cf''') results in the following

{{{
    INFO (from ropp_1dvar_cost):  Using absolute decrease of cost function and relative change
      of state vector as additional convergence criteria.

          n_iter =    1   J = 97.039            max(relative change in state) =  -
          n_iter =    2   J = 79.446            max(relative change in state) = 0.42320
          n_iter =    3   J = 16746.            max(relative change in state) = 71.630
          n_iter =    4   J = 76.892            max(relative change in state) = 71.347
          n_iter =    5   J = 76.486            max(relative change in state) = 0.11244E-01
          n_iter =    6   J = 372.96            max(relative change in state) = 9.9179
          n_iter =    7   J = 53.391            max(relative change in state) = 8.7327
          n_iter =    8   J = 49.452            max(relative change in state) = 0.30239
          n_iter =    9   J = 43.141            max(relative change in state) = 0.74608
          n_iter =   10   J = 43.110            max(relative change in state) = 0.25568
          n_iter =   11   J = 43.023            max(relative change in state) = 0.12759
          n_iter =   12   J = 43.016            max(relative change in state) = 0.75272E-02
          n_iter =   13   J = 42.915            max(relative change in state) = 0.76997E-01
          n_iter =   14   J = 43.524            max(relative change in state) = 0.47670
          n_iter =   15   J = 42.887            max(relative change in state) = 0.41171
          n_iter =   16   J = 42.879            max(relative change in state) = 0.58047E-02
          n_iter =   17   J = 42.767            max(relative change in state) = 0.10087
          n_iter =   18   J = 43.014            max(relative change in state) = 0.34790
          n_iter =   19   J = 42.726            max(relative change in state) = 0.28058
          n_iter =   20   J = 42.710            max(relative change in state) = 0.14024E-01
          n_iter =   21   J = 42.560            max(relative change in state) = 0.13914
          n_iter =   22   J = 42.400            max(relative change in state) = 0.64642E-01
    INFO (from ropp_1dvar_cost):  Convergence assumed to be achieved as the cost function is increasing
}}}

This again confirms that MINROPP in the default configuration stops the iteration prematurely, mainly because it stops the iteration as soon as the cost function increases. I agree that this
is meaningful criterion for stopping iterations, but only if the minmizer implements a reliable line search along the gradient as part of the minimisation step (which MINROPP apparently
doesn't). Forcing the number of iterations to be at least 50 (e.g., using '''ecmwf_bangle_1dvar-03.cf'''), one can see that MINROPP is approaching the minimum, albeit rather slowly, and with multiple spikes in the cost function:

{{{
    INFO (from ropp_1dvar_cost):  Using absolute decrease of cost function and relative change
      of state vector as additional convergence criteria.

          n_iter =    1   J = 97.039            max(relative change in state) =  -
          n_iter =    2   J = 79.446            max(relative change in state) = 0.42320
          n_iter =    3   J = 16746.            max(relative change in state) = 71.630
          n_iter =    4   J = 76.892            max(relative change in state) = 71.347
          n_iter =    5   J = 76.486            max(relative change in state) = 0.11244E-01
          n_iter =    6   J = 372.96            max(relative change in state) = 9.9179
          n_iter =    7   J = 53.391            max(relative change in state) = 8.7327
          n_iter =    8   J = 49.452            max(relative change in state) = 0.30239
          n_iter =    9   J = 43.141            max(relative change in state) = 0.74608
          n_iter =   10   J = 43.110            max(relative change in state) = 0.25568
          n_iter =   11   J = 43.023            max(relative change in state) = 0.12759
          n_iter =   12   J = 43.016            max(relative change in state) = 0.75272E-02
          n_iter =   13   J = 42.915            max(relative change in state) = 0.76997E-01
          n_iter =   14   J = 43.524            max(relative change in state) = 0.47670
          n_iter =   15   J = 42.887            max(relative change in state) = 0.41171
          n_iter =   16   J = 42.879            max(relative change in state) = 0.58047E-02
          n_iter =   17   J = 42.767            max(relative change in state) = 0.10087
          n_iter =   18   J = 43.014            max(relative change in state) = 0.34790
          n_iter =   19   J = 42.726            max(relative change in state) = 0.28058
          n_iter =   20   J = 42.710            max(relative change in state) = 0.14024E-01
          n_iter =   21   J = 42.560            max(relative change in state) = 0.13914
          n_iter =   22   J = 42.400            max(relative change in state) = 0.64642E-01
          n_iter =   23   J = 188.86            max(relative change in state) = 6.7741
          n_iter =   24   J = 42.325            max(relative change in state) = 6.6345
          n_iter =   25   J = 42.317            max(relative change in state) = 0.25400E-02
          n_iter =   26   J = 43.298            max(relative change in state) = 0.61042
          n_iter =   27   J = 42.012            max(relative change in state) = 0.54511
          n_iter =   28   J = 41.982            max(relative change in state) = 0.43726E-01
          n_iter =   29   J = 41.951            max(relative change in state) = 0.80449E-01
          n_iter =   30   J = 41.936            max(relative change in state) = 0.12898E-01
          n_iter =   31   J = 41.765            max(relative change in state) = 0.11462
          n_iter =   32   J = 10026.            max(relative change in state) = 55.447
          n_iter =   33   J = 41.752            max(relative change in state) = 55.395
          n_iter =   34   J = 41.748            max(relative change in state) = 0.19056E-03
          n_iter =   35   J = 1476.0            max(relative change in state) = 21.022
          n_iter =   36   J = 41.556            max(relative change in state) = 20.884
          n_iter =   37   J = 41.431            max(relative change in state) = 0.57102E-02
          n_iter =   38   J = 57.723            max(relative change in state) = 2.4647
          n_iter =   39   J = 36.069            max(relative change in state) = 2.2406
          n_iter =   40   J = 35.672            max(relative change in state) = 0.17386
          n_iter =   41   J = 35.431            max(relative change in state) = 0.25593
          n_iter =   42   J = 35.399            max(relative change in state) = 0.10832E-01
          n_iter =   43   J = 34.818            max(relative change in state) = 0.19488
          n_iter =   44   J = 35.400            max(relative change in state) = 0.47172
          n_iter =   45   J = 34.808            max(relative change in state) = 0.42144
          n_iter =   46   J = 34.807            max(relative change in state) = 0.36675E-03
          n_iter =   47   J = 34.437            max(relative change in state) = 0.23499
          n_iter =   48   J = 38.018            max(relative change in state) = 1.1354
          n_iter =   49   J = 34.405            max(relative change in state) = 1.0396
          n_iter =   50   J = 34.403            max(relative change in state) = 0.99782E-03
    INFO (from ropp_1dvar_solve):  Minimization of cost function successfully finished (according to additional
        convergence criteria). Number of required iterations: 50.
}}}

I was considering to modify the code to disable stopping the iteration in case of an increasing cost function, but found the code in MINROPP a bit convoluted; I therefore decided to stick with
the LEVMARQ minimiser.

== LEVMARQ ==

When looking into the code, I found that the LEVMARQ also stops as soon as the cost function starts to increase because of the main iteration is controlled by the statement (in line 177 of
the file '''ropp_1dvar/common/ropp_1dvar_levmarq.f90''')
{{{
    DO WHILE(J <= J_min)
}}}
Replacing this with
{{{
    DO WHILE(J <= 2.*J_min)
}}}
and reducing the convergence criteria from
{{{
    conv_check_max_delta_state = 0.1            # State vector must change less than this * bg error
    conv_check_max_delta_J     = 0.1            # Cost function must change less than this
}}}
to
{{{
    conv_check_max_delta_state = 0.0001         # State vector must change less than this * bg error
    conv_check_max_delta_J     = 0.0001         # Cost function must change less than this
}}}
results in the following iteration behaviour ('''ecmwf_bangle_1dvar-04.cf''' - **note that this requires a modified version of the LEVMARQ minimiser**, see above):

{{{
    INFO (from ropp_qc_bgqc):  Background quality control lets all bending angle values pass.

          n_iter =    1   J = 97.039            max(relative change in state) =  -
          n_iter =    2   J = 32.400            max(relative change in state) = 0.13905
          n_iter =    3   J = 32.382            max(relative change in state) = 0.16073
          n_iter =    4   J = 32.378            max(relative change in state) = 0.12803
          n_iter =    5   J = 32.376            max(relative change in state) = 0.12886
          n_iter =    6   J = 32.378            max(relative change in state) = 0.12885
          n_iter =    7   J = 32.416            max(relative change in state) = 0.12841
          n_iter =    8   J = 32.482            max(relative change in state) = 0.98499E-01
          n_iter =    9   J = 32.559            max(relative change in state) = 0.16935E-01
          n_iter =   10   J = 32.593            max(relative change in state) = 0.80768E-03
          n_iter =   11   J = 32.596            max(relative change in state) = 0.88798E-05
          n_iter =   12   J = 32.596            max(relative change in state) = 0.88887E-07
          
    INFO (from ropp_1dvar_levmarq):  Convergence assumed to be achieved as the state vector did not change by more
        than 0.10000E-03 relative to the assumed background errors for the last 2 iterations.
}}}

One may increase the minimum number of iterations required for convergence and will then find that additional iterations remain at a cost function value of 32.596. Note that this final cost
function value is not the minimum of the cost function; that appears to be more around 32.376.  The difference in this case is probably irrelevant in practice, but I still wonder why the LEVMARQ minimiser of ROPP is arriving at a false minimum if it clearly has been closer to the real one before. May that point towards an undiscovered issues in the calculation of the gradient/adjoint?

== Other observations ==

 - The last experiment requires a modification to the ROPP source code, because the minimizer loop in the released ROPP versions stops as soon as the cost function value increases. There is nothing in the output of the LEVMARQ scheme that makes the user aware, however; in case of MINROPP, the logging output tells the user the correct reason why the minimisation was terminated.

 - All four sample scripts also run a bending angle 1DVar on a second test case ('''data/eum/first-tropical.nc'''), one of the first tropical occultations observed by Metop-C. All issues are found in this example as well.  
"	defect	closed	major	11.0	ropp_1dvar	9.0	fixed	minROPP, Levenberg-Marquardt	Johannes K. Nielsen frcm