Opened 7 years ago

Closed 5 years ago

#522 closed defect (fixed)

Account for analysis/validity time issue in ROM SAF bgr files

Reported by: Ian Culverwell Owned by: Ian Culverwell
Priority: normal Milestone: 10.0
Component: ropp_1dvar Version: 9.0
Keywords: Cc:

Description

As discovered by Zhu / Julie Yang in Helpdesk enquiry 363 (http://www.romsaf.org/view_admin_enquiry.php?id=363), corresponding atm and bgr files from the ROM SAF archive (e.g. atm20170415_034828_M02_1200631416_N0023_XXXX.nc and bgr20170415_034828_M02_1200631416_N0023_XXXX.nc) cannot be run through the ROPP 1DVAR code, because the forecast range has not been accounted for in the calculation of the validity time of the background field. The quality control routines throw out the profile because

"ERROR (from ropp_qc_genqc): Temporal separation between 
observations and background profile exceeds upper limit: 1.08000E+04 s."

The observation time is 03:48 on 15/04/2017:

ncks -H -Q -vday,hour,minute,msec atm20170415_034828_M02_1200631416_N0023_XXXX.nc

day[0]=15 
hour[0]=3 
minute[0]=48 
msec[0]=217 

The background comes from a T+18 forecast from an analysis time of 12Z 14/04/2017:

ncks -H -Q -vbg_day,bg_hour,bg_minute,bg_fcperiod bgr20170415_034828_M02_1200631416_N0023_XXXX.nc

bg_day[0]=14 
bg_fcperiod[0]=18 
bg_hour[0]=12 
bg_minute[0]=0 

So the background field is valid at 06Z 15/04/2017, which is within the usual 3 hours of the observation time. But bg_fcperiod has not been accounted for, which is why the code thinks the background fields are not recent enough to be used.

There is a note about this in ropp_fm/common/ropp_fm_roprof2state.f90, the code that converts ROprof data into state vectors:

! Fix to sort out DMI variable labelling issue - VT time vs analysis time (cf.
! ROPP ticket #253). Also increased max. temporal separation to 3 hours.
! This should no longer (ROPP8.0, 2014) be needed, as DMI now set bg%hour etc to the validity time.
! IF ( isinrange(ro_data%bg%fcperiod, ro_data%bg%range%fcperiod) .AND. &
!   ro_data%processing_centre == 'DMI Copenhagen' .AND. &
!   ro_data%bg%source == 'ECMWF' ) THEN
!     x%time = x%time + 3600.0 * ro_data%bg%fcperiod
! ENDIF

DMI have confirmed that this change hasn't in fact been made to the bgr files yet, so we should temporarily reinstate it. We will also need to change

ro_data%processing_centre == 'DMI Copenhagen'

to

(ro_data%processing_centre == 'DMI Copenhagen') .OR. (ro_data%processing_centre == 'DMI (ROM SAF)')

as apparently the latter change has been made (see #346).

When the bgr time has been changed in the files, we will need to revisit the problem to make sure that the code can cope with both bgr time conventions. We might, for instance, use the day,hour etc parameters in the background file, rather than the bg_day, bg_hour etc, because the former seem to match those in the atm file:

ncks -H -Q -vday,hour,minute,msec bgr20170415_034828_M02_1200631416_N0023_XXXX.nc

day[0]=15
hour[0]=3 
minute[0]=48 
msec[0]=217 

Why has this not been considered before? (Perhaps the bgr times were only matched to the atm times recently?)

Ticket #253 also refers to this problem.

Attachments (6)

atm20170415_034828_M02_1200631416_N0023_XXXX.nc (549.1 KB ) - added by Ian Culverwell 6 years ago.
atm20170415_034828_M02_1200631416_N0023_XXXX.nc
bgr20170415_034828_M02_1200631416_N0023_XXXX.nc (25.1 KB ) - added by Ian Culverwell 6 years ago.
bgr20170415_034828_M02_1200631416_N0023_XXXX.nc
atm_20160101_000028_METB_G021_R_2305_0010.nc (196.6 KB ) - added by Ian Culverwell 6 years ago.
atm_20160101_000028_METB_G021_R_2305_0010.nc
bgf_20160101_000028_METB_G021_R_2305_0010.nc (162.7 KB ) - added by Ian Culverwell 6 years ago.
bgf_20160101_000028_METB_G021_R_2305_0010.nc
bgo_20160101_000028_METB_G021_R_2305_0010.nc (166.9 KB ) - added by Ian Culverwell 6 years ago.
bgo_20160101_000028_METB_G021_R_2305_0010.nc
test3.sh (4.8 KB ) - added by Ian Culverwell 6 years ago.
test3.sh

Download all attachments as: .zip

Change history (21)

by Ian Culverwell, 6 years ago

atm20170415_034828_M02_1200631416_N0023_XXXX.nc

by Ian Culverwell, 6 years ago

bgr20170415_034828_M02_1200631416_N0023_XXXX.nc

comment:1 by Ian Culverwell, 6 years ago

Adding the example obs and bgr file to the ticket.

comment:2 by Ian Culverwell, 6 years ago

The problem with the last idea is that the day/hour/minute times in the atm and bgr files are likely to match exactly or not at all, depending on the state of the software that generated them. In other words, a comparison of the two would not depend on the actual contemporaneity of the observation and the background, but on whether someone remembered to set a switch in the code (which would give an exact match) or forgot to (which could easily give rise to a large difference). So hold that thought for now.

comment:3 by Ian Culverwell, 6 years ago

Made a change to account for this at r5580. This assumes that ROM SAF bg files produced before 01/01/2020 use the old definition of bg time (i.e. = analysis time), while those produced after use the newer definition (i.e. bg time = validity time = analysis time + forecast length).

The ROPP Development Guide has been updated at r5581 to remind us to check this date before each release.

comment:4 by Ian Culverwell, 6 years ago

Unfortunately this change messes up some of the ropp_1dvar make tests, so these have been fixed (by setting bg_fcperiod=0 in the reference datasets, so they work with either convention) in r5584.

comment:5 by Ian Culverwell, 6 years ago

r5585 contains a small bug-fix to ropp_io_compare_fields.f90, which was discovered while investigating this problem.

comment:6 by Ian Culverwell, 6 years ago

r5586 fixes some typos.

comment:7 by Ian Culverwell, 6 years ago

This needs to be discussed by the ROPP CG, so leave open for now.

comment:8 by Ian Culverwell, 6 years ago

Milestone: 9.110.0

As agreed with DMI in some CG6 post-match discussions, a better solution might be to say something like

If |t_obs - t_bgr| > config%genqc_max_time_sep  (~ usual 3 hours) then
  
  Issue a warning " |DT| > tolerance ... testing whether adding lead time will bring obs and bgr time close enough"

  Add forecast lead time to t_bgr

  If |t_obs - t_bgr| > config%genqc_max_time_sep  (~ usual 3 hours) then

    Issue error message "|DT| still > tolerance"

  Else

    Issue a warning "|DT| < tolerance if we add lead time to bgr time .. continuing on that basis"

    Issue a diagnostic message giving t_obs, t_bgr, bg_fcperiod, tolerance, ... 

  Endif

Endif

Agreed to leave the code alone for now, but to kick the ticket up to ROPP10.0. So do so.

comment:9 by Ian Culverwell, 6 years ago

Stig wondered how the new RE1-reprocessed climate data records, unleashed on an innocent public on 19/2/2019, would fare, so he sent me two of the backgrounds and an obs file (attached, as is the test script). Results follow.

ERA-I forecast file bgf_20160101_000028_METB_G021_R_2305_0010.nc

This works OK because

bg_fcperiod[0]=-9.9999e+07

(deliberately, for time-interpolated files like this), which the range-checking converts to

ro_data%bg%fcperiod = 999.900024414062

(inaccurately because it incorrectly sets it to 999.9, not 999.9_wp).

The fix is then not applied because it's bracketed by

    IF ( isinrange(ro_data%bg%fcperiod, ro_data%bg%range%fcperiod ) .AND. &
         ( ro_data%processing_centre == 'DMI Copenhagen' .OR. &
           ro_data%processing_centre == 'DMI (ROM SAF)' ) .AND. &
         ro_data%bg%source == 'ECMWF' ) THEN

and

ro_data%bg%range%fcperiod = 0.000000000000000E+000 24.0000000000000

so the isinrange condition is never met.

Fortunately this particular dataset has closely matching obs and background times, so the QC check doesn't detect a mismatch in the times.

I'm uneasy with the use of ropp_MDFV for bg_fcperiod. I think the code generally assumes bg_fc_period is valid, i.e. we don't test it with if (bg_fcperiod > ropp_MDTV) then ... else ... endif constructions (or if (bg_fcperiod < 999.9) then ... constructions). I think I might prefer to see bg_fcperiod set to zero for interpolated fields, and bg_day/hour/minute set equal to the validity time of the forecast. (But then what's the difference between bg_day/bg_hour/... and day/hour/... ?)

comment:10 by Ian Culverwell, 6 years ago

ECMWF forecast file bgo_20160101_000028_METB_G021_R_2305_0010.nc

This one fails in ROPP9.1 because

bg_day[0]=1

bg_fcperiod[0]=12

bg_hour[0]=0

bg_minute[0]=0

bg_month[0]=1

bg_year[0]=2016

(in other words, presumably, this came from a T+12 forecast which started from an analysis at 12Z 31/12/2015)

and therefore the fix in roprof2state is applied, and we end up with

ERROR (from ropp_qc_genqc):  Temporal separation between observations and background profile of 4.31715E+04 s exceeds upper limit of 1.08000E+04 s.

(12 X 3600 = 43200)

But, as discussed at the beginning of the ticket, we need the fix for NRT files (for now), so leave in ROPP9.1.

In the long term, I don't see any way of avoiding a check on both bg_time and bg_time+fcperiod, as suggested above.

What a mess.

by Ian Culverwell, 6 years ago

atm_20160101_000028_METB_G021_R_2305_0010.nc

by Ian Culverwell, 6 years ago

bgf_20160101_000028_METB_G021_R_2305_0010.nc

by Ian Culverwell, 6 years ago

bgo_20160101_000028_METB_G021_R_2305_0010.nc

by Ian Culverwell, 6 years ago

Attachment: test3.sh added

test3.sh

comment:11 by Ian Culverwell, 5 years ago

First things first: set the default value of bg%fcperiod to its precise, correct value of 999.9_wp in ropp_io_types.f90, ropp_io_free.f90 and ropp_io_rangecheck.f90at r6020.

comment:12 by Ian Culverwell, 5 years ago

r6020 does at least have the desired effect of setting

ro_data%bg%fcperiod = 999.900000000000

rather than

ro_data%bg%fcperiod = 999.900024414062 

in roprof2state.

comment:13 by Ian Culverwell, 5 years ago

Initially ropp_1dvar_refrac failed to work on bgf_20160101_000028_METB_G021_R_2305_0010.nc, because

ERROR (from ropp_1dvar_covar_bg):  Background standard error deviations requested on a per profile basis, 
    but background data file did not contain sigmas.

This was eventually traced to a problem in the covariance file ropp_bg_ses_ecmwf_error_corr_L60.nc. It contains the values

netcdf ropp_bg_ses_ecmwf_error_corr_L60 {
  dimensions:
    n_state = 121 ;
    nbins = 1 ;

  variables:
    double sigma(nbins,n_state) ;

  data:
    sigma = 
    9.96920996838687e+36, 9.96920996838687e+36, 9.96920996838687e+36, 
...
    9.96920996838687e+36 ;

in its sigma field, and when ropp_1dvar_bgr_error acts on this, the line

      ro_data%Lev2c%press_sfc_sigma = MIN(sigma(2*ro_data%Lev2b%npoints + 1,ilat), 1000.0_wp)

results in

  variables:
    float press_sfc_sigma(dim_unlim) ;
      press_sfc_sigma:long_name = "Estimated error (1-sigma) for surface pressure" ;
      press_sfc_sigma:units = "hPa" ;
      press_sfc_sigma:valid_range = 0., 5. ;

  data:
    press_sfc_sigma = 1000 ;

appearing in the output file. This is because 1000 Pa is outside the valid range of [0, 5] hPa, so ncdf_putvar.f90 declines to carry out the appropriate rescaling from Pa to hPa (although 10 hPa would in fact still be out of range). ropp_1dvar_refrac range-checks on input, and therefore sets press_sfc_sigma to ropp_MDFV. And when ropp_fm_roprof2state.f90 reads such an out of range value, the line

    IF (ro_data%Lev2c%press_sfc_sigma > 0.0_wp) THEN

      j = 2*x%n_lev + 1
      IF (x%use_logp) THEN
        x%cov%d((j*(j+1))/2) = ( ro_data%Lev2c%press_sfc_sigma /    &
                                 ro_data%Lev2c%press_sfc )**2
      ELSE
        x%cov%d((j*(j+1))/2) = ro_data%Lev2c%press_sfc_sigma**2
      END IF

    ELSE

      x%cov_ok = .FALSE.

    END IF

results in x%cov_ok being set to .FALSE.. This, finally, results in the failure described at the head of this comment.

comment:14 by Ian Culverwell, 5 years ago

The situation is different for ROPP9.1, because before r5887 the maximum press_sfc_sigma in ropp_1dvar_bgr_error was 100 Pa, not 1000 Pa. This is within range, and resulted in

  variables:
    float press_sfc_sigma(dim_unlim) ;
      press_sfc_sigma:long_name = "Estimated error (1-sigma) for surface pressure" ;
      press_sfc_sigma:units = "hPa" ;
      press_sfc_sigma:valid_range = 0., 5. ;

  data:
    press_sfc_sigma = 1 ;

being output. 1 hPa is within the valid range, and therefore all is well.

It therefore seems prudent to increase the default maximum press_sfc_sigma to 10 hPa in ropp_io_types.f90. This has been done at r6023, and the IO UG has been updated at r6024. It doesn't cause any problems in the automatic tests, which suggests we never encounter press_sfc_sigma > 5 hPa in them. #536 has also been updated.

This is not enough to get ropp_1dvar_refrac to work on this problem, however, as the input bgr file, bgf_20160101_000028_METB_G021_R_2305_0010.nc, from which the profile-by-profile sigmas are read (because bg_covar_method = VSFC) has

variables:
    float press_sfc_sigma(dim_unlim) ;
      press_sfc_sigma:long_name = "Estimated error (1-sigma) for surface pressure" ;
      press_sfc_sigma:units = "hPa" ;
      press_sfc_sigma:valid_range = 0., 5. ;

  data:
    press_sfc_sigma = -9.9999e+07 ;

press_sfc_sigma is reset (to 100 or 1000) by ropp_1dvar_add_bgr_error.f90, but its valid range is inherited from this original file. When we correct this (e.g. with

cp bgf_20160101_000028_METB_G021_R_2305_0010.nc bgf_20160101_000028_METB_G021_R_2305_new_0010.nc

ncatted -avalid_range,press_sfc_sigma,m,f,'0, 10' bgf_20160101_000028_METB_G021_R_2305_new_0010.nc

), this results in x%cov = .TRUE., and ropp_1dvar_refrac carries out a 1dvar retrieval. (The differing values of press_sfc_sigma between ROPP9.1 (1 hPa) and ROPP10.0 (10 hPa) naturally lead to slightly different retrievals.)

We see exactly the same problem with the ECMWF forecast file bgo_20160101_000028_METB_G021_R_2305_0010.nc. The same remedy, of increasing the valid range of press_sfc_sigma to [0, 10] hPa, works - in the sense of giving the same result as ROPP9.1. Both fail to produce a 1dvar retrieval for the 'Temporal separation' reason cited above (#522#comment:10).

comment:15 by Ian Culverwell, 5 years ago

Resolution: fixed
Status: newclosed

An attempt to finally treat this issue properly, has been made at r6033.

  • The original problem, using {atm, bgr}20170415_034828_M02_1200631416_N0023_XXXX.nc, for which atm_time = 03:48Z 15/4/2017 and bgr_time = 12:00Z 14/4/2017 with fcperiod=18, still works (although I had to change valid_range of press_sfc_sigma to [0, 10] hPa to get it to run at ROPP10.0, as above.)
  • The ERA-I time-interpolated problem, using {atm, bgf}_20160101_000028_METB_G021_R_2305_0010.nc, for which atm_time = 00:28Z 01/01/2016 and bgr_time = 00:00Z 0101/2016 with fcperiod=-9.9999e+07, still works, because fcperiod is still excluded from the calculation of bg%time (although I had to change valid_range of press_sfc_sigma to [0, 10] hPa to get it to run at ROPP10.0, as above.)
  • The ECMWF forecast, using {atm, bgo}_20160101_000028_METB_G021_R_2305_0010.nc, for which atm_time = 00:28Z 01/01/2016 and bgr_time = 00:00Z 0101/2016 with fcperiod=12, now works: instead of
    ERROR (from ropp_qc_genqc):  Temporal separation between observations and background profile of 4.31715E+04 s exceeds upper limit of 1.08000E+04 s.
    
  • as we found at ROPP9.1, we now get
    INFO (from ropp_qc_genqc):  Background time:  12:00:00.000Z 01/01/2016
    INFO (from ropp_qc_genqc):  Observation time: 00:00:28.469Z 01/01/2016
     
    WARNING (from ropp_qc_genqc):  Temporal separation of 11.992 h between observations and background profile exceeds upper limit of 3.000 h.
       Subtracting bg_fcperiod of 12.000 h from background time and rechecking.
    

and it carries on to do a retrieval.

This could probably do with a bit more testing - ask DMI, and look at the results of the test folder - but for now we can close the ticket.

Note: See TracTickets for help on using tickets.