Interpolation of forcing data

Production of forcing files
Clare Enright
March 2014
Chapter 1
Introduction
This document describes the reformatting of forcing data from a variety of sources
and models into the formats required by NEMO.
1.1 Output
NEMO uses two styles of forcing files, both described in the NEMO book: CLIO and CORE. Locally we associate CLIO with earlier versions of NEMO, but this is only our convention; either style can be used with recent NEMO versions (it is not clear whether this holds for all versions or just the ones we are now using). CORE forcing expects each variable in a separate file, while CLIO expects all variables other than the wind stresses to be in a single file (it is not clear whether this is compulsory or again a local convention).
The following chapters first describe how to modify and run the basic IDL program, then describe the steps taken for several new datasets. Where these steps have resulted in the production of new and potentially useful scripts, this is emphasised.
1.2 CDO or IDL?
There is no evidence to suggest the interpolation by one program is any better than the other. The CDO interpolation is generally more convenient: it can be called from a script which extracts data from multi-year files and does other manipulation through the NCO routines. However, CDO will not rotate the wind vectors to the ORCA grid; from NEMO v3.3 (and possibly v3.2) this can be left to NEMO, but for earlier versions the rotation must be done in advance. This manual describes the use of IDL to do this, but it can also be done with the program /gpfs/data/greenocean/make_forcing/cmip5/HadGEM2-CORE/rotate.f90. As it is quite difficult to see by eye whether the rotation has been done, a note should be kept of the rare occasions when it has NOT been done. [Clare: get IDL to add an attribute recording that it HAS been done.]
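For reference, a minimal sketch of the CDO route, assuming hypothetical file names and a hypothetical target-grid file (the local scripts combine calls like these with NCO manipulation):

# Extract one year of one variable from a multi-year file, then
# interpolate it bilinearly onto the model grid. orca2_target.nc is a
# placeholder for whatever ORCA2 grid description is used locally.
cdo selyear,1990 -selname,tas tas_multiyear.nc tas_1990.nc
cdo remapbil,orca2_target.nc tas_1990.nc tas_1990_orca2.nc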
NOTE: The scripts provided try to be generally applicable but do make assumptions about the datasets. When using them on a new model or experiment, always check that the correct timesteps are being extracted and that the appropriate scaling has been applied to give the units you require.
The present status is as follows:
• CMIP5 CORE: CDO is used for scalars and to extract winds from multi-year files; IDL is used to interpolate the winds.
• CMIP5 CLIO: CDO scripts are used only to extract data from multi-year files; IDL interpolates all variables.
• ncep: The IDL programs are used to extend the ncep forcing files each year. The relevant chapter has not been updated so contains much detail of historic interest only.
• ERA Interim: The IDL programs are used to extend the ERA Interim forcing files each year.
• JPL winds: The wind speeds are converted to stresses in Fortran and the IDL program is used for the interpolation.
• IPCC: results from the earlier (2007) CMIP models were pre-processed using
Fortran to combine model results with the inter-annual variability of the ncep
values.
• Blended CORE-style forcing datasets: (1) COREv2 (Coordinated Ocean Reference Experiments version 2) and CORE-NYF (Normal Year Forcing), and (2) DFS4.3 (DRAKKAR Atmospheric Forcing Set v4.3). These data are bilinearly interpolated using CDO, except that bicubic interpolation is used for the winds, so rotation pairing (under NEMO3.3) needs to be switched on.
Chapter 2
Running the IDL program
The basic IDL program is in /gpfs/data/greenocean/make_forcing/idl_code. The routines here were written to produce CLIO-style forcing from ncep files; other subdirectories of make_forcing hold files which have been modified for particular datasets. The files which generally need modification are:
• For each variable to be interpolated, the relevant file naminterp....pro must be tailored for the input dataset. Commonly edited fields are:
1. nscal: set to 1 for vector input
2. ninterp: 0 for bilinear, 1 for bicubic interpolation (we have commonly
used bilinear for scalars and bicubic for vectors)
3. data_name: the name of the variable
The following fields are generally common to all variables in a dataset so may conveniently be separated out into a different file (see /gpfs/data/greenocean/mak...):
4. jpiatm,jpjatm,jpkatm: number of points in x,y,z
5. dx,dy: grid size in x and y directions
6. south_lat: southernmost latitude in the input
7. lon_shift: number of grid points to shift westwards to give longitudes from -180 to 180
8. north_pole: =1 if the north pole is included in the input
9. key_vari: generally this should be set to 1 to use the scale and offset from the input netCDF file. However, if it is set to zero, setting add_offset and scale_factor appropriately provides a convenient way to change units. Note that if you use this you must take care to also include the scaling from the netCDF file, and that this can change from year to year.
10. nreverse: =0 for latitudes running from south to north (CMIP5, JPL datasets) and =1 for north to south (NCEP, ECMWF datasets)
The naminterp... files contain many other flags and settings which are worth
browsing. They are messy to read; many settings are in if-then-else blocks
where only the true condition has been set appropriately.
• init_path_general.pro must be edited to set the input and output directories and the input and output filenames. It also includes flags for each variable: set interp to 1 to interpolate that variable, and output to 1 to include the interpolated data in the combined file required by CLIO forcing.
• force.pro is the main program. It loops through the variables specified in init_path_general.pro and carries out the specified operations. It also includes calls to windspeed.pro (which calculates the speed from the u and v components) and sathum.pro (which calculates the saturated fraction from interpolated values of pressure, air temperature and humidity).
The IDL code can be run on grace by the following commands:

module add idl
idl

And at the IDL prompt:

@init_interp
force
Chapter 3
CMIP5 CORE
These forcing files are in /gpfs/env/data/greenocean/forcing/cmip5 and the modified scripts are in /gpfs/data/greenocean/make_forcing/cmip5/. The force.pro script has been modified to loop through multiple years.
• cdo_extract_interp_general.csh extracts a year's data and interpolates scalars to the NEMO grid (a hedged sketch of the equivalent CDO calls follows this list).
• years_from_name.tcl extracts information from the filename and needs modifying for each new model and experiment.
• cdo_extract_vecs.sh extracts the winds from a multi-year file so that they can be interpolated by IDL.
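As a hedged sketch of what the scalar route amounts to (this is not the script itself; the variable, years, file names and grid file are placeholders):

#!/bin/csh
# Loop over the years held in a multi-year CMIP5 file, writing one
# bilinearly interpolated file per year.
set var = tas
foreach year (`seq 1990 1999`)
  cdo selyear,$year ${var}_multiyear.nc ${var}_${year}.nc
  cdo remapbil,orca2_target.nc ${var}_${year}.nc ${var}_${year}_orca2.nc
end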
Some datasets have required some special handling (the following comes from
the README files provided):
• HadGEM2-ES uses a 360-day calendar, the years start on 1/12 and end on 30/11, and precipitation was not available as monthly records so the daily files had to be averaged. For scalars:
1. cdo_extract_interp_hadgem2.csh: modified from the general case for file name format and number of records in a year.
2. cdo_extract_interp_hadgem2_split_years.csh: as above but extracts December data from one file and the rest of the year from a second file.
3. hadgem2_split_years.csh: calls cdo_extract_interp_hadgem2_split_years.csh for the split years (which occur every decade).
4. hadgem2_extra_days.csh: increases the number of days to 365 by copying every 60th day (a sketch of this padding step follows these lists).
For vectors:
1. hadgem2_cdo_extract_vecs.sh: modified from the general case for file name format and number of records in a year.
2. cdo_extract_vecs_hadgem2_split_years.csh: as above but extracts December data from one file and the rest of the year from a second file.
3. hadgem2_vecs_split_years.csh: calls cdo_extract_interp_hadgem2_split_years.csh for the split years (which occur every decade).
4. hadgem2_vecs_extra_days.csh: increases the number of days to 365 by copying every 60th day.
For precipitation (pr and prsn):
1. nc_mon_ave.sh: calculates monthly averages from daily data.
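A hedged sketch of the extra-days padding (not the production script; file names are placeholders, time indices are 0-based as NCO counts them, and time must be the record dimension for ncrcat):

#!/bin/csh
# Pad a 360-day year to 365 days by repeating the last day of each of
# the first five 60-day chunks, then appending the final chunk unchanged.
set in  = pr_1990_360day.nc
set out = pr_1990_365day.nc
set parts = ()
foreach i (0 1 2 3 4)
  @ a = $i * 60
  @ b = $a + 59
  ncks -O -d time,$a,$b $in chunk$i.nc   # one 60-day chunk
  ncks -O -d time,$b,$b $in dup$i.nc     # a copy of its last day
  set parts = ($parts chunk$i.nc dup$i.nc)
end
ncks -O -d time,300,359 $in chunk5.nc    # final chunk, no duplicate
ncrcat -O $parts chunk5.nc $out          # 5*(60+1) + 60 = 365 records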
• MPI-ESM-LR
1. cdo_extract_interp_general.csh was modified to read years_from_name.tcl from the working directory rather than from /gpfs/env/e198/Programs/cdo_interp/years_from_name.tcl, and cdo del29feb was added to remove 366-day years (Gregorian calendar - MPIM); see the one-line example after these notes.
2. years_from_name.tcl has been modified to include mpilr.
Note that:
1. huss comes from the 1000 hPa level of the 3D file (these values look reasonable) as no 2-D huss file was reported.
2. cdo_uas and cdo_vas are bicubically interpolated (CDO) winds which will work in NEMO3.3 with rotation pairing on.
3. idl_winds were interpolated by IDL. The grid dimensions are T63 so the CSIRO IDL input files (naminterpuvflx_CSIRO.pro) were used.
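The leap-day removal is a single CDO call, e.g. (file names hypothetical):

cdo del29feb tas_1992.nc tas_1992_365day.nc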
Chapter 4
CORE-style blended datasets
Coordinated Ocean Reference Experiments (CORE; Large and Yeager, 2004; Large and Yeager, 2009; Griffies et al., 2009) version 2 forcing data are in /gpfs/env/data/greenocean/forcing/core2 for the period 1948-2007. DRAKKAR Atmospheric Forcing Set v4.3 (DFS4.3; Brodeau et al. 2010; http://people.su.se/~lbrod/dir/dfs4.html) data are in /gpfs/env/data/greenocean/forcing/dfs4/orca2 for the period 1948-1996. Both datasets use a bulk formulation for calculating surface boundary conditions based upon the same set of atmospheric variables; however, they differ in the data sources used and in the corrections applied to fields to limit imbalance in model heat and freshwater budgets (see Brodeau et al. 2010). Additionally the COREv2 NYF (Normal Year Forcing) dataset is available in /gpfs/env/data/greenocean/Forcev33. This is an idealised annual climatological cycle of all fields required to force an OGCM, which can be integrated over as many years as required to investigate the approach to equilibrium.
Notes for using these data:
• The precipitation forcing data file stored in /gpfs/env/data/greenocean/Forcev33/ncar_precip.15JUNE2009_orca2.nc has temporal dimensions of 1:1460 (consistent with 6-hourly data). However, only the first 12 timesteps contain data; the rest are NaNs. A quick check is sketched after these notes.
• NEMO3.3 must be used, as the wind vectors need to be rotated to the ORCA2 grid by turning rotation pairing on in the namelist when setting the CORE bulk formulae (namsbc_core). This is because CDO has been used to interpolate the winds (remapbic).
• When using DFS4.3, ln_2m must be set to .true. as air temperature and humidity are referenced at 2 m rather than 10 m (as for COREv2 and CORE-NYF).
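A quick way to confirm the behaviour of the precipitation file noted above (a sketch; cdo infon prints per-timestep statistics, so the NaN records show up immediately):

# Timesteps 10-12 should show data; 13 onwards should be NaN/missing.
cdo infon -seltimestep,10/15 ncar_precip.15JUNE2009_orca2.nc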
Chapter 5
CMIP5 CLIO
The forcing files are in /gpfs/env/data/greenocean/forcing/nemo_v3.1/cmip5, and the modified scripts are in /gpfs/data/greenocean/make_forcing/cmip5/. They were produced using the IDL program, as this had been used previously for the production of CLIO files.
• cdo_extract_general.csh extracts the data into individual years. Note that to change the variable name (best done here) or to set the start and end years to extract, you MUST specify these on the command line or the defaults will be used.
• force.pro loops through the extracted years.
• cdo_extract_uas_to_uflx.csh calculates the wind stresses and outputs them to annual files:

cdo_extract_uas_to_uflx.csh <src_file> <first_year_required> <last_year_required> <input_variable_name> <output_variable_name>
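For example, to produce uflx files for 1990-1999 from a hypothetical multi-year uas file:

cdo_extract_uas_to_uflx.csh uas_multiyear.nc 1990 1999 uas uflx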
For the modifications necessary to produce HadGEM2-ES and MPI-ESM-LR forcing files, see the CMIP5 CORE chapter.
Chapter 6
ncep
6.1 Required data
Data is available for download from http://www.cdc.noaa.gov/cdc/data.ncep.reanalysis.html each April for the previous year. Two forcing files have been used in the past, which differ in the following ways:
• ncep_force_bulk_nnnn.nc uses the ORCA grid
• ncep_force_bulk_nnnn.nc requires downloaded pressure, sea surface temperature and specific humidity data; ncep_bulk_nnnn.nc uses the pressure and specific humidity only to calculate a different measure of humidity
• ncep_force_bulk_nnnn.nc: temperatures are in degrees Kelvin rather than Celsius
• ncep_force_bulk_nnnn.nc: total cloud cover is in per cent rather than a fraction
The following files are required:
1. air.2m.gauss.2004.nc
2. shum.2m.gauss.2004.nc
3. pres.sfc.gauss.2004.nc
4. uwnd.10m.gauss.2004.nc
5. vwnd.10m.gauss.2004.nc
6. tcdc.eatm.gauss.2004.nc
7. prate.sfc.gauss.2004.nc
8. skt.sfc.gauss.2004.nc; for ncep_force_bulk_nnnn.nc only
9. uflx.sfc.gauss.2004.nc
10. vflx.sfc.gauss.2004.nc
File       interp  mask  dx     dy     lon_shift  west_lon   north_pole  scale  offset     used?
air.2m     bilin   t     1.875  1.904  96         -178.125   1           .01    204.5      yes
prate.sfc  bilin   t     1.875  1.904  96         -178.125   1           1e-7   .0032765   file
pres.sfc   bilin   t     1.875  1.904  96         -178.125   1           10     367650     file
tcdc.eatm  bilin   t     1.875  1.904  96         -178.125   1           .001   32.765     yes
wspd.10m   bilin   t     1.875  1.904  96         -178.125   1           1      0          file
uflx.sfc   bicub   u,v   1.875  1.904  96         -178.125   1           -.001  28.765     yes
vflx.sfc   bicub   u,v   1.875  1.904  96         -178.125   1           -.001  28.765     yes

Table 6.1: Input parameters for interpolation

Note: If the used? column = "yes" then the scale and offset values from the table are used. The air temperature units should be changed from Kelvin to Celsius and the total cloud cover from percentage to fraction; each year check that the scale and offset in the above table are correct. If used? = "file" then the values in the input netCDF file are used. For ncep_force_bulk_nnnn.nc temperatures should be in degrees Kelvin, so the offset should be 477.65 (as in the input file); the total cloud cover should be scaled as in the file and NOT as in the above table. Note that the windspeed scale and offset are applied in windspeed.pro whatever the value of key_vari in naminterpwind.pro.
6.2 Processing required; present status
• Windspeed is calculated from uwnd and vwnd (= (u^2 + v^2)^0.5). This must be done on the regular grid, i.e. before the interpolation step (IDL program windspeed.pro, called as the first step in force.pro). Note that it takes a minute or so to run, so if it is necessary to run force again it is worth commenting out the line calling windspeed. Another minor problem is that windspeed writes its output file to output_dir - not unreasonably. However, the interpolation expects its input files to be in input_dir - which is equally reasonable! So copy (or move) the file wspd.10m.gauss.nnnn.nc from the output directory to the input directory while the first interpolation is running.
• Times need converting from hours since 1/1/1 to a count of seconds from zero. This is done by outputting times to the final file counting up from zero in steps of 86400 (a one-line equivalent is sketched at the end of this list).
• All files need interpolating to the new grid (ORCA R2 for 2004). Input parameters to the IDL program INTERP (available from http://www.lodyc.jussieu.fr/opa) are summarised in Table 6.1; any scaling and offset calculations are carried out here. For consistency with previous years it is not anticipated that these will be changed, except for the differences between the requirements to produce ncep_force_bulk and ncep_bulk detailed above and:
– nitend changed from 365 to 366 for leap years
• For ncep_bulk_nnnn.nc the humidity is calculated from the air temperature, specific humidity and pressure values from the first three files above (IDL program sathum.pro, called from force.pro when all interpolation has been completed):

humidity = (shum * pres) / (0.622 + 0.378 * shum) / dewpt          (6.1)

dewpt = 611 * 10^(7.5 (T - 273.15) / (T - 35.86))                  (6.2)

where
0.622 = mean molecular weight of water / mean molecular weight of dry air
611 = saturation vapour pressure (in Pa) at 0 C
• All scalar files must be combined to form ncep_bulk_nnnn.nc. Not yet done.
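A hedged one-line equivalent of the time-axis step, using NCO's ncap2 (the IDL code simply writes the new values directly, so this is an illustration, not the actual mechanism; file names are placeholders):

# Replace the time axis with seconds counting up from zero in steps of 86400.
ncap2 -O -s 'time=array(0,86400,$time)' in.nc out.nc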
Notes:
• Of the two input files Keith has provided, he used a bilinear interpolation for the scalar example and bicubic for the vector. I've used the bilinear interpolation for all scalars.
• The value for west_lon of 178.125: I don't understand this. The original data range from 0 to 358.125°. The value of 96 for lon_shift gives the best fit between the data and the land mask, so it is correct, and 96 * 1.875 = 180, so I'd have thought west_lon should be 180; 178.125 was used to be consistent with Keith's results.
6.3 Process
The intention was to combine the above steps into a single IDL run while leaving the INTERP package unchanged. This has largely been possible, but the interpolation now includes a new file containing common variables, common_force, and does not run init_path or naminterp.
The new IDL program is called force. To set up a new run, init_path_general.pro must be edited; a structure is set up for each variable which sets the file names for input and output and sets flags to request interpolation and/or output to the ncep file. For each variable to be interpolated a file naminterpvarname.pro must be written; in general these will remain the same from year to year except that nitend=365 must be set to 366 for leap years. Note that setting the interp flag to zero means that individual files can be excluded from the interpolation in later runs if their results are satisfactory.
If the humidity is to be calculated then the line calling sathum in force must
not be commented out.
When the interpolation is complete the results required (i.e. those with the
output flag set) are copied to the ncep file.
Chapter 7
ECMWF Interim
CLIO forcing files have been prepared for the ECMWF Interim analysis data. Data can be downloaded from http://data-portal.ecmwf.int/data/d/interim_daily (to be replaced by http://apps.ecmwf.int/datasets/data/interim_full_daily).
From the Surface level analysis data (all times and step 0), request 1.5° resolution:

Variable                     name        units
10 metre U wind component    10u or u10  m s^-1
10 metre V wind component    10v or v10  m s^-1
2 metre temperature          2t or t2m   degK
Surface pressure             sp          Pa
Total cloud cover            tcc         fraction

From the Surface level data, times 00:00 and 12:00, steps 3, 6, 9, 12, request 1.5° resolution:

Variable                        name   units
Instantaneous X surface stress  iewss  N m^-2
Instantaneous Y surface stress  insss  N m^-2
Total precipitation             tp     m

From the Pressure level data, level 1000, all times, request 1.5° resolution:

Variable           name  units
Specific humidity  q     kg kg^-1
It is convenient to download the data in three steps so that you have one file containing the first five variables, the next file containing the next three, etc. The data can then be extracted into one file per variable, as expected by the IDL interpolation program, using e.g.:

ncks -v t2m -o airtemp2m_2008.nc five.nc

Note that some variable names changed in 2008: air temperature from 2t to t2m, u wind from 10u to u10 and v wind from 10v to v10.
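So, for the post-2008 names, the file holding the first five variables might be split with a loop like the following (a sketch; names are hypothetical apart from the example file five.nc above):

foreach v (u10 v10 t2m sp tcc)
  ncks -O -v $v -o ${v}_2008.nc five.nc
end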
7.1 Processing data
windspeed.pro must be run to calculate the windspeed from the u and v components.
The precipitation values downloaded are the accumulated precipitation in metres over a 3-hour period. prate.pro is run to calculate the average daily rate in kg m^-2 s^-1 by:

Prate(day) = [ (tp[l=4] - tp[l=1]) / (9*3600) + (tp[l=8] - tp[l=5]) / (9*3600) ] / 2 * ρ_water          (7.1)
The daily mean is calculated for windspeed, air temperature, pressure, cloud
cover and humidity after interpolation to the ORCA grid.
These IDL routines are combined into a single program force.pro. To run this for a new year the following steps must be taken:
• set the year correctly throughout init_path_general.pro
• check that the number of time-steps is correct in the naminterp files
• set scale_factor and add_offset in naminterpair.pro: scale_factor should be the value in the air temperature file; subtract 278.16 from the value in the file for add_offset to convert to degC
• check that the variable names in force.pro, windspeed.pro and naminterpair.pro are correct for the year you are running, as these have changed at least once in the ECMWF Interim files
Chapter 8
JPL
These winds can be downloaded via anonymous ftp from podaac.jpl.nasa.gov; browse http://podaac.jpl.nasa.gov/ to see what is available. Data are in allData/ccmp/L3.0/flk (equivalent to OceanWinds/ccmp/L3.0/flk). The dataset appears to end in 2012; flk stands for "first look" and there is mention of a "last look" becoming available, but in March 2014 I couldn't find any estimate of when this might be.
• /gpfs/data/greenocean/make_forcing/jpl/setlink.csh sets links to all the data files.
• jpltoyear reads the four wind speeds for each day of the year from the files linked above and takes the mean. This is converted to wind stress and output to ustress.nnnn.nc or vstress.nnnn.nc. jpltoyear_in must be edited for each year and the number of days set correctly.
• The IDL program is used to interpolate to the ORCA grid. In /gpfs/data/greenocean/make_forcing/jpl are versions of init_path_general.pro and force.pro which have been simplified from the general versions, and naminterpuvflx.pro which has been edited for this dataset. init_path_general.pro needs the year set correctly and the number of days is required in naminterpuvflx.pro.
Chapter 9
IPCC data
9.1 Introduction
It was required to produce files equivalent to the ncep forcing files for the years 2007 to 2065. Data from an IPCC model was used as the basis for the new files. The predicted change in the values (the anomaly) was calculated by subtracting the monthly mean for years 2000-2015 from the running mean calculated over 15 years for each year. To introduce a natural inter-annual variability, the ncep values from 1948 to 2006 were used. See Section 9.3, "Anomaly calculation", below for full details.
9.1.1 Required data
This data was downloaded from http://www-pcmdi.llnl.gov/ipcc/info-for-analysts.php. The data chosen was from HadGEM1, simulation sresa2, run1, as this includes changes to stratospheric ozone (the only other model I could find including this was ECHAM5/MPI-OM).
The files downloaded are:
1. hus_A1_2000_Jan_to_2049_Dec.nc
2. hus_A1_2050_Jan_to_2099_Nov.nc
3. pr_A1_2000_Jan_to_2099_Nov.nc
4. psl_A1_2000_Jan_to_2099_Nov.nc
5. tas_A1_2000_Jan_to_2099_Nov.nc
6. tauu_A1_2000_Jan_to_2099_Nov.nc
7. tauv_A1_2000_Jan_to_2099_Nov.nc
8. uas_A1_2000_Jan_to_2099_Nov.nc
9. vas_A1_2000_Jan_to_2099_Nov.nc
There is no 2-D specific humidity data available. The two 3D files have specific humidity values at 16 different pressure levels. I wrote shscale.pro to interpolate/extrapolate humidity values at the pressure values in the psl file from the bottom two humidity values in the hus files. Note that the grid for the humidity data is different from that for the air temperature and pressure; I have ignored this in shscale. This method was NOT used in the production of the final files. The alternative is to calculate the saturation fraction using the lowest humidity value and pressure = 100000 Pa: instead of shscale.pro, run sh2d.pro to output the data you require into a single file for input to the interpolation program (sh2d.pro must be run twice, as explained below). sathum needed minor modification to use a pressure of 100,000 Pa instead of reading values from file; this is the present version of sathum in the ipcc directory (the old lines are commented out).
sh2d.pro must be run twice:
• copy namsh2d1.pro to namsh2d
• run sh2d (idl, @init_interp, sh2d)
• copy the output file sph_A1_2000_Jan_to_2049_Dec.nc to sph_A1_2000_Jan_to_2065_Dec.nc
• copy namsh2d2.pro to namsh2d
• run sh2d (idl, @init_interp, sh2d); this concatenates the results for the last 15 years onto the output file
The output file from sh2d can be interpolated with the other variables. Then the saturation fraction is calculated with sathum.pro. Note that the variable output from sh2d and the interpolation program is called humidity; this is NOT the same humidity measure as in the ncep files. The output from program sathum is called satfrac; this IS the same humidity measure as in the ncep files!
9.2 Interpolation
There are two slightly different grids used in these files, so care must be taken with the input parameters to the interpolation; see Table 9.1. Note that the two grids are used for tauu and tauv; the interpolation program uses that from tauv.
File   west    east      south    north   nx   ny   scale  offset
clt    0       358.125   -90      90      192  145  1      0
hus    .9375   359.0625  -89.375  89.375  192  144  1      0
pr     0       358.125   -90      90      192  145  1      0
psl    0       358.125   -90      90      192  145  1      0
tas    0       358.125   -90      90      192  145  1      0
tauu   .9375   359.0625  -90      90      192  145  1      0
tauv   0       358.125   -89.375  89.375  192  144  1      0
uas    .9375   359.0625  -89.375  89.375  192  144  1      0
vas    .9375   359.0625  -89.375  89.375  192  144  1      0

Table 9.1: Description of input grids
The scaling factors required are summarised in Table 9.2. The latitudes in the input file are in the same order as in meshmask so the flag nreverse is set to 0 (and not 1 as for the ncep data).
File   scale            offset
clt    .001             0
hus    use T in deg C
pr     1                0
psl    1                0
tas    1                273.15
tauu   -1.              0
tauv   -1.              0
uas    1                0
vas    1                0

Table 9.2: Scale and offsets to use to match the output to NCEP data; all of these must be set in the naminterp and windspeed files.
9.3 Anomaly calculation
The anomaly calculation programs are written in Fortran and run on the cluster at
UEA.
• ipcc_yearly calculates the running 15-year mean for years 2007 to 2065 and interpolates daily values for each year. (The first 15 days of 2007 and the last 15 days of 2065 are extrapolated from the first and last two points respectively.) A CDO analogue of the running-mean step is sketched below.
• ipcc_anomaly
1. calculates an offset from the difference between the mean for years 2001 to 2006 and the mean for years 1948 to 1963. Offsets are calculated for air temperature, humidity, wind speed and wind fluxes. As no trend is observed for precipitation or cloud cover, no offset is calculated for these.
2. calculates the predicted change in the values (the anomaly) by subtracting the monthly mean for years 2000-2015 from the running mean calculated over 15 years for each year (the IPCC predicted change from 2000 to 2065).
3. adds the anomaly to the offset plus the values from the ncep files for years 1948 to 2006 (giving the predicted value with a "natural" inter-annual variability superimposed). Values for humidity, windspeed and cloud cover are not set to zero if the calculated value is negative.
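As a point of reference, the running-mean step of ipcc_yearly can be reproduced with CDO (a sketch, not the production Fortran; the input file name follows the download list above, the output names are placeholders):

# Annual means, then a 15-year running mean centred on each year.
cdo yearmean tas_A1_2000_Jan_to_2099_Nov.nc tas_yearly.nc
cdo runmean,15 tas_yearly.nc tas_runmean15.nc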