ccdtools.loaders

Default and custom loader functions for datasets contained in the catalog.

Functions

default(self, row[, resolution, static])

Default loader function to load data based on file extension.

measures_velocity(self, row[, resolution, ...])

Custom loader for MEaSUREs Velocity datasets.

racmo(self, row, **kwargs)

Custom loader for RACMO datasets.

ccdtools.loaders.default(self, row, resolution=None, static=True, **kwargs)

Default loader function to load data based on file extension.

Load data from various file formats (CSV, GeoPackage, Shapefile, GeoTIFF, NetCDF) and return the appropriate data structure.

Parameters:
  • self (object) – The class instance.

  • row (dict) – A dictionary containing dataset metadata including file path, extension, and loading parameters.

  • resolution (str, optional) – The resolution to filter files by. Required if the dataset defines multiple resolutions. Default is None.

  • static (bool, optional) – Whether to load static files. Must be True, because the default loader only supports static files. Default is True.

  • **kwargs (dict) – Additional keyword arguments passed to the underlying loading functions (e.g., pd.read_csv, gpd.read_file, rxr.open_rasterio, xr.open_mfdataset).

Returns:

The loaded data in the appropriate format based on file extension:

  • 'csv': pandas.DataFrame

  • 'gpkg' or 'shp': geopandas.GeoDataFrame

  • 'tif': xarray.Dataset

  • 'nc': xarray.Dataset

Return type:

pd.DataFrame or gpd.GeoDataFrame or xr.Dataset

Raises:
  • FileNotFoundError – If no files matching the criteria are found.

  • ValueError – If the file extension is not supported or if the static flag is not True.
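
The extension-based dispatch described above can be pictured with a minimal sketch. The names READERS and pick_reader are hypothetical and not part of ccdtools, and the extension-to-reader pairing is an assumption inferred from the formats and functions listed in the parameters. The sketch only returns the name of the reader it would call, so it runs without any geospatial libraries installed.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the reader it is assumed to use.
# The actual ccdtools implementation may dispatch differently.
READERS = {
    "csv": "pd.read_csv",
    "gpkg": "gpd.read_file",
    "shp": "gpd.read_file",
    "tif": "rxr.open_rasterio",
    "nc": "xr.open_mfdataset",
}


def pick_reader(path, static=True):
    """Return the name of the reader assumed to handle this file."""
    # The documented loader rejects anything other than static=True.
    if static is not True:
        raise ValueError("the default loader only supports static files")
    # Normalise the extension: drop the leading dot and ignore case.
    ext = Path(path).suffix.lstrip(".").lower()
    if ext not in READERS:
        raise ValueError(f"unsupported file extension: {ext!r}")
    return READERS[ext]
```

Both documented ValueError cases (unsupported extension, non-static request) surface here before any file is opened.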

ccdtools.loaders.measures_velocity(self, row, resolution=None, static=None, **kwargs)

Custom loader for MEaSUREs Velocity datasets.

Loads MEaSUREs ice velocity NetCDF files, optionally filtering by resolution and static/annual mode, and adds a time dimension to annual files based on the year(s) encoded in the filename.

Parameters:
  • self (object) – The class instance.

  • row (dict) – Dictionary containing dataset metadata, including file path, extension, and loading parameters.

  • resolution (str, optional) – The resolution to filter files by. Required if the dataset defines multiple resolutions.

  • static (bool, optional) – Flag indicating whether to load static files (True) or annual files (False).

  • **kwargs – Additional keyword arguments passed to xr.open_mfdataset.

Returns:

Loaded MEaSUREs velocity data as an xarray Dataset. For annual files, a time dimension is added based on the year(s) in the filename (set to July 2nd of the middle year).

Return type:

xr.Dataset

Raises:
  • FileNotFoundError – If no files matching the criteria are found.

  • ValueError – If required parameters are missing or filtering criteria are not met.

Notes

  • For annual files, the time dimension is set to July 2nd of the middle year (or the single year if only one is present).

  • Assumes the filename is stored in ds.encoding['source'].

  • Uses parallel loading via xr.open_mfdataset.
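
As an illustration of the time-stamping rule in the notes above, the sketch below derives a July 2nd timestamp from the year(s) in a filename. The helper name mid_year_timestamp, the year regular expression, and the example filenames are all hypothetical. How the loader picks the "middle" year for an even-length span is not documented here, so the sketch assumes the floor of the mean of the first and last year.

```python
import re
from datetime import datetime
from pathlib import Path

# Four-digit years from 1900-2099 that are not part of a longer digit run.
YEAR = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def mid_year_timestamp(source):
    """Return July 2nd of the middle year found in a filename."""
    # In the loader, `source` would come from ds.encoding['source'].
    years = sorted(int(y) for y in YEAR.findall(Path(source).name))
    if not years:
        raise ValueError(f"no year found in filename: {source!r}")
    # Assumption: the middle year is the floor of the mean of the first
    # and last year; a single year is returned unchanged.
    middle = (years[0] + years[-1]) // 2
    return datetime(middle, 7, 2)
```

With a hypothetical name such as 'vel_2014_1km.nc' the helper returns 2 July 2014; with 'vel_2005_2007.nc' it returns 2 July 2006.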

Examples

>>> ds = loaders.measures_velocity(row, resolution='1km', static=False)

ccdtools.loaders.racmo(self, row, **kwargs)

Custom loader for RACMO datasets.

Loads RACMO NetCDF files, preprocesses them by dropping unnecessary variables and setting coordinates, then combines them into a single xarray Dataset.

Parameters:
  • self (object) – The class instance.

  • row (dict) – Dictionary containing dataset metadata, including file path, extension, and loading parameters.

  • **kwargs – Additional keyword arguments passed to xr.open_mfdataset.

Returns:

Loaded RACMO data as an xarray Dataset with preprocessed variables and coordinates.

Return type:

xr.Dataset

Raises:

FileNotFoundError – If no files matching the criteria are found.

Warning

UserWarning

For the ‘racmo2.3p2_monthly_27km_1979-2022’ dataset, a warning is issued noting that timesteps vary between variables/files, which may introduce all-NaN arrays for timesteps where a variable lacks data.
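
To see why differing timesteps can produce all-NaN entries, consider a simplified pure-Python analogue of an outer join along time. The function outer_join and the variable names var_a and var_b are hypothetical, and each timestep holds a single number rather than a full array.

```python
import math


def outer_join(series_by_var):
    """Align {variable: {timestep: value}} onto the union of all timesteps."""
    # The outer join keeps every timestep that appears in any variable.
    all_steps = sorted({t for series in series_by_var.values() for t in series})
    # A variable with no data at a given timestep is padded with NaN.
    aligned = {
        var: [series.get(t, math.nan) for t in all_steps]
        for var, series in series_by_var.items()
    }
    return all_steps, aligned


steps, aligned = outer_join({
    "var_a": {"1979-01": 1.0, "1979-02": 2.0},
    "var_b": {"1979-02": 5.0, "1979-03": 6.0},
})
```

Here var_a has no value for '1979-03' and var_b has none for '1979-01', so both gain a NaN entry after alignment. In the gridded case the padded entry is a whole all-NaN array for that timestep.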

Notes

  • Drops variables ‘block1’ and ‘block2’ if present.

  • Sets ‘rlat’ and ‘rlon’ as coordinates.

  • Uses parallel loading via xr.open_mfdataset.

  • Default merge behavior: combine by coordinates, outer join, override compatibility.
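
The notes above map fairly directly onto xr.open_mfdataset options. The following is a sketch under stated assumptions, not the ccdtools implementation: the function names preprocess and open_racmo are hypothetical, and parallel=True is assumed from the "parallel loading" note (it requires dask to be installed).

```python
import numpy as np
import xarray as xr


def preprocess(ds):
    """Drop 'block1'/'block2' if present and promote 'rlat'/'rlon' to coordinates."""
    ds = ds.drop_vars(["block1", "block2"], errors="ignore")
    # Only promote names that actually exist in this file.
    present = [name for name in ("rlat", "rlon") if name in ds.variables]
    return ds.set_coords(present)


def open_racmo(paths, **kwargs):
    """Open several RACMO files with the documented default merge behaviour."""
    options = dict(
        preprocess=preprocess,
        combine="by_coords",  # combine by coordinates
        join="outer",         # outer join
        compat="override",    # override compatibility
        parallel=True,        # assumed from the note on parallel loading
    )
    options.update(kwargs)    # caller-supplied kwargs take precedence
    return xr.open_mfdataset(paths, **options)


# Small in-memory dataset showing what preprocess does to a single file.
example = xr.Dataset({
    "smb": (("y", "x"), np.zeros((2, 3))),
    "rlat": ("y", np.arange(2.0)),
    "rlon": ("x", np.arange(3.0)),
    "block1": ("y", np.zeros(2)),
})
cleaned = preprocess(example)
```

After preprocessing, cleaned keeps 'smb' as its only data variable, with 'rlat' and 'rlon' as coordinates.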