Open
22 commits
13cd040
deprecate cdo/python-cdo: replace CDO usage with xarray, add deprecat…
Copilot Apr 1, 2026
8f5552e
revert unrelated JSON file changes
Copilot Apr 1, 2026
5439a08
address code review: use xarray .dt accessor and context managers
Copilot Apr 1, 2026
0443782
remove CDO, add xarray averager, rename frepytools to NumpyTimeAverag…
Copilot Apr 3, 2026
4c91d82
fix: handle non-numeric vars and cftime time_bnds in xarrayTimeAverag…
Copilot Apr 3, 2026
e7f465b
test: add comprehensive unit tests for xarrayTimeAverager and add wei…
Copilot Apr 3, 2026
164c394
test: address code review — rename bnds_dtype → time_bnds_encoding, f…
Copilot Apr 3, 2026
c8fdd85
fix: NumpyTimeAverager now handles arbitrary dimensions (scalar, 3D, …
Copilot Apr 9, 2026
61babc7
some additional logging is helpful
ilaflott Apr 10, 2026
80ecff9
put default back to cdo to check for smooth transition behavior. add …
ilaflott Apr 10, 2026
5e603f3
refactor: move NumpyTimeAverager to numpyTimeAverager.py, add monthly…
Copilot Apr 10, 2026
7689a02
revert unrelated CMOR test file changes from test runs
Copilot Apr 10, 2026
381ba00
fix: typo in numpyTimeAverager.py comment
Copilot Apr 10, 2026
50482c9
fix: strip stale unlimited_dims encoding from xarrayTimeAverager outp…
Copilot Apr 10, 2026
8563837
feat: add rigorous numerical accuracy tests and fre-python-tools mont…
Copilot Apr 10, 2026
473849f
refactor: derive unlimited_dims from encoding rather than hardcoding …
Copilot Apr 10, 2026
09c34e4
feat: add test_numpyTimeAverager.py covering uncovered exception/erro…
Copilot Apr 10, 2026
b5dcf5a
feat: add cross-pkg weighted averaging bitwise reproducibility tests …
Copilot Apr 10, 2026
f88e5e6
fix: correct rtol value in cross-pkg test docstring from 1e-12 to 1e-6
Copilot Apr 10, 2026
b18aadc
fix: correctly reduce time-metadata variables (time_bnds, time, avera…
Copilot Apr 10, 2026
2333bbb
update frenctools timeaverager tests
ilaflott Apr 10, 2026
99729f4
docs: update time averager documentation — README.md, docs/tools/app.…
Copilot Apr 10, 2026
74 changes: 73 additions & 1 deletion docs/tools/app.rst
@@ -1 +1,73 @@
`fre app` tools are intended to be a collection of single-purpose tools.
``fre app`` tools are a collection of single-purpose postprocessing utilities.

``gen-time-averages``
~~~~~~~~~~~~~~~~~~~~~
Compute time-averaged NetCDF files from input history or timeseries files.
Supports weighted and unweighted averages across the full time dimension,
by month (monthly climatology), or by season.

.. code-block:: console

   fre app gen-time-averages -i INPUT.nc -o OUTPUT.nc -p xarray [-v VAR] [-u] [-a all|month|seas]

Options:

* ``-i / --inf`` — input NetCDF file (required)
* ``-o / --outf`` — output NetCDF file (required)
* ``-p / --pkg`` — backend package. One of:

  - ``xarray`` (default) — uses xarray/dask. Supports ``all``, ``seas``, ``month``.
  - ``fre-python-tools`` or ``numpy`` — pure numpy + netCDF4. Supports ``all``, ``month``.
  - ``fre-nctools`` — wraps the Fortran ``timavg.csh`` from fre-nctools. Supports ``all``, ``month``.
  - ``cdo`` — **deprecated stub** that redirects to xarray with a ``FutureWarning``.
    CDO/python-cdo has been removed from fre-cli.

* ``-v / --var`` — target variable name (auto-detected from filename if omitted)
* ``-u / --unwgt`` — compute unweighted (simple) mean instead of ``time_bnds``-weighted
* ``-a / --avg_type`` — averaging mode: ``all`` (default), ``month``, or ``seas``

**Average types**

+----------+-------------------------------------------------------------------+
| Type     | Description                                                       |
+==========+===================================================================+
| ``all``  | Average over all timesteps → single output timestep               |
+----------+-------------------------------------------------------------------+
| ``month``| Monthly climatology → one file per calendar month (``.01.nc`` …)  |
+----------+-------------------------------------------------------------------+
| ``seas`` | Seasonal climatology (xarray only) → DJF / MAM / JJA / SON        |
+----------+-------------------------------------------------------------------+


``gen-time-averages-wrapper``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Workflow-level wrapper that generates climatologies for all variables across
a set of history sources, date ranges, and grids. Called internally by
``fre pp`` Cylc workflows.

.. code-block:: console

   fre app gen-time-averages-wrapper --cycle-point YYYY --dir DIR --sources SRC1,SRC2 \
       --output-interval P10Y --input-interval P2Y --grid native --frequency yr -p xarray

``combine-time-averages``
~~~~~~~~~~~~~~~~~~~~~~~~~
Merge per-variable climatology shards into combined files, used downstream
of ``gen-time-averages-wrapper`` by ``fre pp`` workflows.

.. code-block:: console

   fre app combine-time-averages --in-dir /path/to/av --out-dir /path/to/pp \
       --component atmos --begin 1979 --end 1988 --frequency yr --interval P10Y

``regrid``
~~~~~~~~~~
Regrid target NetCDF files to a specified output grid.

``mask-atmos-plevel``
~~~~~~~~~~~~~~~~~~~~~
Mask diagnostic pressure-level output below surface pressure.

``remap``
~~~~~~~~~
Remap NetCDF files to an updated output directory structure.
2 changes: 0 additions & 2 deletions environment.yml
@@ -7,7 +7,6 @@ dependencies:
- noaa-gfdl::analysis_scripts==0.0.1
- noaa-gfdl::catalogbuilder==2025.01.01
# - noaa-gfdl::fre-nctools==2022.02.01
- conda-forge::cdo>=2
- conda-forge::cftime
- conda-forge::click>=8.2
- conda-forge::cmor>=3.14
@@ -22,7 +21,6 @@ dependencies:
- conda-forge::pytest
- conda-forge::pytest-cov
- conda-forge::pylint
- conda-forge::python-cdo
- conda-forge::pyyaml
- conda-forge::xarray>=2024.*
- conda-forge::netcdf4>=1.7.*
20 changes: 11 additions & 9 deletions fre/app/freapp.py
@@ -138,9 +138,12 @@ def mask_atmos_plevel(infile, psfile, outfile, warn_no_ps):
required = True,
help = "Output file name")
@click.option("-p", "--pkg",
type = click.Choice(["cdo","fre-nctools","fre-python-tools"]),
default = "cdo",
help = "Time average approach")
type = click.Choice(["cdo","fre-nctools","fre-python-tools","xarray","numpy"]),
default = "xarray",
help = "Time average backend. 'xarray' supports all/seas/month; "
"'fre-python-tools'/'numpy' support all/month; "
"'fre-nctools' wraps Fortran timavg.csh (all/month). "
"'cdo' is deprecated and redirects to xarray.")
@click.option("-v", "--var",
type = str,
default = None,
@@ -152,9 +155,8 @@ def mask_atmos_plevel(infile, psfile, outfile, warn_no_ps):
@click.option("-a", "--avg_type",
type = click.Choice(["month","seas","all"]),
default = "all",
help = "Type of time average to generate. \n \
currently, fre-nctools and fre-python-tools pkg options\n \
do not support seasonal and monthly averaging.\n")
help = "Type of time average to generate. "
"'seas' is only supported by the xarray backend.")
def gen_time_averages(inf, outf, pkg, var, unwgt, avg_type):
"""
generate time averages for specified set of netCDF files.
@@ -192,9 +194,9 @@ def gen_time_averages(inf, outf, pkg, var, unwgt, avg_type):
required = True,
help = "Frequency of desired climatology: 'mon' or 'yr'")
@click.option("-p", "--pkg",
type = click.Choice(["cdo","fre-nctools","fre-python-tools"]),
default = "cdo",
help = "Time average approach")
type = click.Choice(["cdo","fre-nctools","fre-python-tools","xarray","numpy"]),
default = "xarray",
help = "Time average backend. 'cdo' is deprecated and redirects to xarray.")
def gen_time_averages_wrapper(cycle_point, dir_, sources, output_interval, input_interval, grid, frequency, pkg):
"""
Wrapper for climatology tool.
81 changes: 75 additions & 6 deletions fre/app/generate_time_averages/README.md
@@ -1,8 +1,77 @@
From an input netCDF file, output a time-averaged netCDF file.
# generate_time_averages

Will average all available information on a given variable across time,
but can also handle monthly+seasonal averaging.
Compute time-averaged NetCDF files from input history or timeseries files.
Supports weighted and unweighted averages across the full time dimension,
by month (monthly climatology), or by season.

To run time-averaging tests, return to root directory and call just those
tests with `python -m pytest tests/test_generate_time_averages.py`, or run
all tests with `python -m pytests tests/`
## Available backends (`--pkg`)

| `--pkg` value | Backend class | Notes |
|---|---|---|
| `xarray` | `xarrayTimeAverager` | Default. Uses xarray/dask. Supports `all`, `seas`, `month`. |
| `fre-python-tools` / `numpy` | `NumpyTimeAverager` | Pure numpy + netCDF4. Supports `all`, `month`. |
| `fre-nctools` | `frenctoolsTimeAverager` | Wraps the Fortran `timavg.csh` from fre-nctools. Supports `all`, `month`. |
| `cdo` | `cdoTimeAverager` | **Deprecated stub** that redirects to `xarrayTimeAverager` and emits a `FutureWarning`. CDO/python-cdo has been removed. |
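
The deprecation path for `cdo` can be observed from calling code. The sketch below is illustrative only: `ModernAverager` and `LegacyAverager` are hypothetical stand-ins, not the fre-cli classes, and the only behavior assumed is that the stub warns with `FutureWarning` and then delegates to its parent.

```python
import warnings


class ModernAverager:
    """Hypothetical stand-in for the backend that does the real work."""

    def generate_timavg(self, infile=None, outfile=None):
        return 0  # placeholder: a real backend would write outfile here


class LegacyAverager(ModernAverager):
    """Hypothetical stand-in for a deprecated stub that only warns and delegates."""

    def generate_timavg(self, infile=None, outfile=None):
        warnings.warn(
            "pkg='cdo' is deprecated; using the xarray averager instead.",
            FutureWarning,
            stacklevel=2,
        )
        return super().generate_timavg(infile=infile, outfile=outfile)


# Record warnings instead of printing them, so a caller or test can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    status = LegacyAverager().generate_timavg("in.nc", "out.nc")

messages = [str(w.message) for w in caught if issubclass(w.category, FutureWarning)]
print(status, messages)
```

Running Python with `-W error::FutureWarning` turns such warnings into exceptions, which may help locate remaining `pkg='cdo'` callers.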

### Key differences between backends

- **xarray** handles non-numeric time-dependent variables (e.g. `average_T1`,
`average_T2`) by retaining their first value; numeric variables are weighted
by `time_bnds` durations.
- **numpy** uses explicit per-variable reduction for time metadata
(`time_bnds`, `time`, `average_T1`, `average_T2`, `average_DT`) to correctly
span the full averaging period. Other numeric variables are weighted via
numpy vectorised operations.
- The xarray and numpy backends agree to within ~6e-8 relative tolerance
  (float32 ULP); the residual difference comes from different floating-point
  accumulation order. Rerunning the same backend on the same input is
  bitwise reproducible.
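
The weighting described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code of either backend: the function names are hypothetical, the series is one-dimensional, and `time_bnds` is given as `(lower, upper)` pairs in days.

```python
import math


def weighted_time_mean(values, time_bnds):
    """Mean of values weighted by the length of each time_bnds interval."""
    weights = [upper - lower for lower, upper in time_bnds]
    return math.fsum(v * w for v, w in zip(values, weights)) / math.fsum(weights)


def unweighted_time_mean(values):
    """Simple mean, the analogue of the -u / --unwgt option."""
    return math.fsum(values) / len(values)


# January, February and March of a non-leap year, in days since the start.
time_bnds = [(0.0, 31.0), (31.0, 59.0), (59.0, 90.0)]
values = [1.0, 2.0, 4.0]

weighted = weighted_time_mean(values, time_bnds)
unweighted = unweighted_time_mean(values)

# A second accumulation strategy (plain left-to-right sums) stands in for
# "another backend"; results are compared with a tolerance, not bitwise.
weights = [upper - lower for lower, upper in time_bnds]
naive = sum(v * w for v, w in zip(values, weights)) / sum(weights)
agree = math.isclose(naive, weighted, rel_tol=1e-6)
print(weighted, unweighted, agree)
```

February's shorter interval pulls the weighted mean away from the simple mean, which is why the two modes can differ even for monthly data.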

## Average types (`--avg_type`)

| Type | Description |
|---|---|
| `all` | Average over all timesteps → single output timestep |
| `month` | Monthly climatology → one output file per calendar month (`.01.nc` … `.12.nc`) |
| `seas` | Seasonal climatology (xarray only) → grouped by DJF/MAM/JJA/SON |
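
As a rough illustration of the `month` mode, the sketch below groups samples by calendar month and derives one output path per month. The helper names are hypothetical, the average shown is unweighted for brevity, and the `.MM.nc` naming is inferred from the table above; the backends may construct paths differently.

```python
from collections import defaultdict
from datetime import date


def monthly_output_paths(outfile):
    """Map calendar month (1..12) to a per-month path such as root.01.nc."""
    root = outfile.removesuffix(".nc")  # requires Python 3.9+
    return {month: f"{root}.{month:02d}.nc" for month in range(1, 13)}


def monthly_climatology(dates, values):
    """Unweighted mean of values for each calendar month present in dates."""
    groups = defaultdict(list)
    for when, value in zip(dates, values):
        groups[when.month].append(value)
    return {month: sum(vals) / len(vals) for month, vals in sorted(groups.items())}


paths = monthly_output_paths("clim.nc")
clim = monthly_climatology(
    [date(1979, 1, 16), date(1980, 1, 16), date(1979, 2, 15)],
    [1.0, 3.0, 5.0],
)
print(paths[1], paths[12], clim)
```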

## CLI usage

```
fre app gen-time-averages -i INPUT.nc -o OUTPUT.nc -p xarray [-v VAR] [-u] [-a all|month|seas]
```

Options:
- `-i / --inf` — input NetCDF file (required)
- `-o / --outf` — output NetCDF file (required)
- `-p / --pkg` — backend package (default: `xarray`)
- `-v / --var` — target variable name (auto-detected if omitted)
- `-u / --unwgt` — compute unweighted (simple) mean instead of `time_bnds`-weighted
- `-a / --avg_type` — averaging mode: `all` (default), `month`, or `seas`

## Architecture

```
timeAverager (abstract base)
├── xarrayTimeAverager ──── cdoTimeAverager (deprecated stub)
└── NumpyTimeAverager ──── frepytoolsTimeAverager (alias stub)
```

Supporting modules:
- `generate_time_averages.py` — steering function (`generate_time_average`)
that dispatches to the correct backend
- `wrapper.py` — workflow-level wrapper (`generate_wrapper`) that loops over
sources, variables, and date ranges for climatology generation
- `combine.py` — merges per-variable climatology shards into combined files
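
The dispatch pattern described above might look roughly like the following. Every name here (`BaseAverager`, `XarrayLikeAverager`, `NumpyLikeAverager`, `BACKENDS`, `pick_averager`) is hypothetical, and rejecting an unsupported `avg_type` at construction time is an assumption made for the sketch, not a statement about how `generate_time_average` behaves.

```python
from abc import ABC, abstractmethod


class BaseAverager(ABC):
    """Hypothetical stand-in for the abstract base class."""

    supported_avg_types = ()

    def __init__(self, avg_type="all", unwgt=False, var=None):
        if avg_type not in self.supported_avg_types:
            raise ValueError(
                f"{type(self).__name__} does not support avg_type {avg_type!r}"
            )
        self.avg_type = avg_type
        self.unwgt = unwgt
        self.var = var

    @abstractmethod
    def generate_timavg(self, infile=None, outfile=None):
        """Write the time average of infile to outfile; return 0 on success."""


class XarrayLikeAverager(BaseAverager):
    supported_avg_types = ("all", "seas", "month")

    def generate_timavg(self, infile=None, outfile=None):
        return 0  # placeholder for the real computation


class NumpyLikeAverager(BaseAverager):
    supported_avg_types = ("all", "month")

    def generate_timavg(self, infile=None, outfile=None):
        return 0  # placeholder for the real computation


# --pkg value -> backend class; two names may share one implementation.
BACKENDS = {
    "xarray": XarrayLikeAverager,
    "numpy": NumpyLikeAverager,
    "fre-python-tools": NumpyLikeAverager,
}


def pick_averager(pkg="xarray", **kwargs):
    """Look up the backend class for pkg and instantiate it."""
    try:
        cls = BACKENDS[pkg]
    except KeyError:
        raise ValueError(f"unknown pkg {pkg!r}") from None
    return cls(**kwargs)


averager = pick_averager("xarray", avg_type="seas")
print(type(averager).__name__, averager.generate_timavg("in.nc", "out.nc"))
```

Keeping the name-to-class mapping in one table means an alias such as `numpy` needs only one extra entry rather than another subclass.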

## Running tests

```bash
# full test suite
pytest -v fre/app/generate_time_averages/tests/

# specific test files
pytest -v fre/app/generate_time_averages/tests/test_numpyTimeAverager.py
pytest -v fre/app/generate_time_averages/tests/test_xarrayTimeAverager.py
pytest -v fre/app/generate_time_averages/tests/test_generate_time_averages.py
pytest -v fre/app/generate_time_averages/tests/test_cross_pkg_bitwise.py
```
3 changes: 2 additions & 1 deletion fre/app/generate_time_averages/__init__.py
@@ -1,3 +1,4 @@
'''required for generate_time_averages module import functionality'''
__all__ = ['generate_time_averages', 'timeAverager', 'wrapper', 'combine',
'frenctoolsTimeAverager', 'cdoTimeAverager', 'frepytoolsTimeAverager']
'frenctoolsTimeAverager', 'cdoTimeAverager', 'frepytoolsTimeAverager',
'numpyTimeAverager', 'xarrayTimeAverager']
96 changes: 21 additions & 75 deletions fre/app/generate_time_averages/cdoTimeAverager.py
@@ -1,89 +1,35 @@
''' class using (mostly) cdo functions for time-averages '''
''' stub that redirects pkg='cdo' requests to the xarray time averager '''

import logging
import warnings

from netCDF4 import Dataset
import numpy as np

import cdo
from cdo import Cdo

from .timeAverager import timeAverager
from .xarrayTimeAverager import xarrayTimeAverager

fre_logger = logging.getLogger(__name__)

class cdoTimeAverager(timeAverager):

class cdoTimeAverager(xarrayTimeAverager): # pylint: disable=invalid-name
'''
class inheriting from abstract base class timeAverager
generates time-averages using cdo (mostly, see weighted approach)
Legacy entry-point kept for backward compatibility.
CDO/python-cdo has been removed. All work is now done by xarrayTimeAverager.
'''

def generate_timavg(self, infile = None, outfile = None):
def generate_timavg(self, infile=None, outfile=None):
"""
use cdo package routines via python bindings
Emit a loud warning then delegate to the xarray implementation.

:param self: This is an instance of the class cdoTimeAverager
:param infile: path to history file, or list of paths, default is None
:type infile: str, list
:param outfile: path to where output file should be stored, default is None
:param infile: path to input NetCDF file
:type infile: str
:param outfile: path to output file
:type outfile: str
:return: 1 if the instance variable self.avg_typ is unsupported, 0 if function has a clean exit
:return: 0 on success
:rtype: int
"""

if self.avg_type not in ['all', 'seas', 'month']:
fre_logger.error('requested unknown avg_type %s.', self.avg_type)
raise ValueError(f'requested unknown avg_type {self.avg_type}')

if self.var is not None:
fre_logger.warning('WARNING: variable specification not twr supported for cdo time averaging. ignoring!')

fre_logger.info('python-cdo version is %s', cdo.__version__)

_cdo = Cdo()

wgts_sum = 0
if not self.unwgt: #weighted case, cdo ops alone don't support a weighted time-average.

nc_fin = Dataset(infile, 'r')

time_bnds = nc_fin['time_bnds'][:].copy()
# Ensure float64 precision for consistent results across numpy versions
# NumPy 2.0 changed type promotion rules (NEP 50), so explicit casting
# is needed to avoid precision differences
time_bnds = np.asarray(time_bnds, dtype=np.float64)
# Transpose once to avoid redundant operations
time_bnds_transposed = np.moveaxis(time_bnds, 0, -1)
wgts = time_bnds_transposed[1] - time_bnds_transposed[0]
# Use numpy.sum for consistent dtype handling across numpy versions
wgts_sum = np.sum(wgts, dtype=np.float64)

fre_logger.debug('wgts_sum = %s', wgts_sum)

if self.avg_type == 'all':
fre_logger.info('time average over all time requested.')
if self.unwgt:
_cdo.timmean(input = infile, output = outfile, returnCdf = True)
else:
_cdo.divc( str(wgts_sum), input = "-timsum -muldpm "+infile, output = outfile)
fre_logger.info('done averaging over all time.')

elif self.avg_type == 'seas':
fre_logger.info('seasonal time-averages requested.')
_cdo.yseasmean(input = infile, output = outfile, returnCdf = True)
fre_logger.info('done averaging over seasons.')

elif self.avg_type == 'month':
fre_logger.info('monthly time-averages requested.')
outfile_str = str(outfile)
_cdo.ymonmean(input = infile, output = outfile_str, returnCdf = True)
fre_logger.info('done averaging over months.')

fre_logger.warning(" splitting by month")
outfile_root = outfile_str.removesuffix(".nc") + '.'
_cdo.splitmon(input = outfile_str, output = outfile_root)
fre_logger.debug('Done with splitting by month, outfile_root = %s', outfile_root)

fre_logger.info('done averaging')
fre_logger.info('output file created: %s', outfile)
return 0
msg = (
"WARNING *** CDO/python-cdo has been REMOVED from fre-cli. "
"pkg='cdo' now uses the xarray time-averager under the hood. "
"Please switch to pkg='xarray' or pkg='fre-python-tools'. ***"
)
warnings.warn(msg, FutureWarning, stacklevel=2)
fre_logger.warning(msg)
return super().generate_timavg(infile=infile, outfile=outfile)
4 changes: 3 additions & 1 deletion fre/app/generate_time_averages/combine.py
@@ -62,7 +62,9 @@ def merge_netcdfs(input_file_glob: str, output_file: str) -> None:
raise FileExistsError(f"Output file '{output_file}' already exists")

ds = xr.open_mfdataset(input_files, compat='override', coords='minimal')
ds.to_netcdf(output_file, unlimited_dims=['time'])
# only declare 'time' unlimited if it is actually a dimension in the merged dataset
unlim = ['time'] if 'time' in ds.dims else []
ds.to_netcdf(output_file, unlimited_dims=unlim)


def combine( root_in_dir: str,