ccdtools.loaders
Default and custom loader functions for datasets contained within the catalog
Functions
default – Default loader function to load data based on file extension.
measures_velocity – Custom loader for MEaSUREs Velocity datasets.
racmo – Custom loader for RACMO datasets.
- ccdtools.loaders.default(self, row, resolution=None, static=True, **kwargs)
Default loader function to load data based on file extension.
Load data from various file formats (CSV, GeoPackage, Shapefile, GeoTIFF, NetCDF) and return the appropriate data structure.
- Parameters:
self (object) – The class instance.
row (dict) – A dictionary containing dataset metadata including file path, extension, and loading parameters.
resolution (str, optional) – The resolution to filter files by. Required if the dataset defines multiple resolutions. Default is None.
static (bool, optional) – Flag indicating whether to load static files (True). Default is True. The default loader only supports static files.
**kwargs (dict) – Additional keyword arguments passed to the underlying loading functions (e.g., pd.read_csv, gpd.read_file, rxr.open_rasterio, xr.open_mfdataset).
- Returns:
The loaded data in the appropriate format based on file extension:
'csv': pandas.DataFrame
'gpkg' or 'shp': geopandas.GeoDataFrame
'tif': xarray.Dataset
'nc': xarray.Dataset
- Return type:
pd.DataFrame or gpd.GeoDataFrame or xr.Dataset
- Raises:
FileNotFoundError – If no files matching the criteria are found.
ValueError – If the file extension is not supported or if the static flag is not True.
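The extension-based dispatch described above can be sketched as a lookup table. This is a minimal illustration, not the ccdtools implementation: the names READERS and pick_reader are hypothetical, the pairing of each extension with a reader is inferred from the formats and functions listed in the **kwargs description, and the normalization of the extension (lowercasing, stripping a leading dot) is an assumption.

```python
# Hypothetical lookup table: which reader would handle which extension.
# The pairing is inferred from the documentation, not taken from ccdtools.
READERS = {
    "csv": "pd.read_csv",
    "gpkg": "gpd.read_file",
    "shp": "gpd.read_file",
    "tif": "rxr.open_rasterio",
    "nc": "xr.open_mfdataset",
}


def pick_reader(extension, static=True):
    """Return the name of the reader for a file extension.

    Mirrors the documented error cases: ValueError when static is not
    True, and ValueError when the extension is not supported.
    """
    if static is not True:
        raise ValueError("the default loader only supports static files")
    # Assumption: extensions may arrive as '.CSV' or 'csv'.
    key = extension.lower().lstrip(".")
    if key not in READERS:
        raise ValueError(f"unsupported file extension: {extension!r}")
    return READERS[key]
```

A real loader would call the selected function with the file path and forward **kwargs to it; the sketch returns only the reader name so that it runs without pandas, geopandas, rioxarray, or xarray installed.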
- ccdtools.loaders.measures_velocity(self, row, resolution=None, static=None, **kwargs)
Custom loader for MEaSUREs Velocity datasets.
Loads MEaSUREs ice velocity NetCDF files, optionally filtering by resolution and static/annual mode, and adds a time dimension to annual files based on the year(s) encoded in the filename.
- Parameters:
self (object) – The class instance.
row (dict) – Dictionary containing dataset metadata, including file path, extension, and loading parameters.
resolution (str, optional) – The resolution to filter files by. Required if the dataset defines multiple resolutions.
static (bool, optional) – Flag indicating whether to load static files (True) or annual files (False).
**kwargs – Additional keyword arguments passed to xr.open_mfdataset.
- Returns:
Loaded MEaSUREs velocity data as an xarray Dataset. For annual files, a time dimension is added based on the year(s) in the filename (set to July 2nd of the middle year).
- Return type:
xr.Dataset
- Raises:
FileNotFoundError – If no files matching the criteria are found.
ValueError – If required parameters are missing or filtering criteria are not met.
Notes
For annual files, the time dimension is set to July 2nd of the middle year (or the single year if only one is present).
Assumes the filename is stored in ds.encoding['source'].
Uses parallel loading via xr.open_mfdataset.
Examples
>>> ds = loaders.measures_velocity(row, resolution='1km', static=False)
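The time-stamping rule in the Notes (July 2nd of the middle year, or of the single year) can be illustrated with a small helper. This is a sketch under stated assumptions, not the loader's actual code: the function name midyear_timestamp and the filenames in the usage lines are hypothetical, years are assumed to appear in the filename as standalone four-digit numbers starting with 19 or 20, and the middle year is taken as the floor of the mean of the first and last year, which the documentation does not specify.

```python
import re
from datetime import datetime


def midyear_timestamp(filename):
    """Return July 2nd of the middle year encoded in a filename.

    Assumptions (not confirmed by the ccdtools documentation):
    years are standalone four-digit numbers starting with 19 or 20,
    and the middle year is floor((first + last) / 2).
    """
    years = [
        int(match)
        for match in re.findall(r"(?<!\d)(?:19|20)\d{2}(?!\d)", filename)
    ]
    if not years:
        raise ValueError(f"no year found in filename: {filename!r}")
    middle_year = (min(years) + max(years)) // 2
    return datetime(middle_year, 7, 2)


# Hypothetical filenames, for illustration only.
single = midyear_timestamp("velocity_2014_1km.nc")
span = midyear_timestamp("velocity_2000_2010_1km.nc")
```

In the loader, the filename would come from ds.encoding['source'], and the resulting timestamp would become the value of the added time dimension.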
- ccdtools.loaders.racmo(self, row, **kwargs)
Custom loader for RACMO datasets.
Loads RACMO NetCDF files, preprocesses them by dropping unnecessary variables and setting coordinates, then combines them into a single xarray Dataset.
- Parameters:
self (object) – The class instance.
row (dict) – Dictionary containing dataset metadata, including file path, extension, and loading parameters.
**kwargs – Additional keyword arguments passed to xr.open_mfdataset.
- Returns:
Loaded RACMO data as an xarray Dataset with preprocessed variables and coordinates.
- Return type:
xr.Dataset
- Raises:
FileNotFoundError – If no files matching the criteria are found.
Warning
- UserWarning
For the ‘racmo2.3p2_monthly_27km_1979-2022’ dataset, a warning is issued noting that timesteps vary between variables/files, which may introduce all-NaN arrays for timesteps where a variable lacks data.
Notes
Drops variables ‘block1’ and ‘block2’ if present.
Sets ‘rlat’ and ‘rlon’ as coordinates.
Uses parallel loading via xr.open_mfdataset.
Default merge behavior: combine by coordinates, outer join, override compatibility.
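The all-NaN timesteps mentioned in the warning follow from the outer join: the combined time axis is the union of every file's timesteps, so a variable has no data wherever its own file lacks a step. The sketch below reproduces that effect with plain Python dictionaries rather than xarray. The function name, variable names, and values are invented for illustration and do not come from the RACMO files.

```python
import math


def outer_join_on_time(series_by_variable):
    """Align variables on the union of their timesteps.

    Each variable maps timestep -> value. Timesteps missing from a
    variable are filled with NaN, as an outer join would do.
    """
    all_times = sorted(
        {t for series in series_by_variable.values() for t in series}
    )
    aligned = {
        name: [series.get(t, math.nan) for t in all_times]
        for name, series in series_by_variable.items()
    }
    return all_times, aligned


# Made-up variables covering different, partly overlapping timesteps.
times, aligned = outer_join_on_time(
    {
        "var_a": {"1979-01": 1.0, "1979-02": 2.0},
        "var_b": {"1979-02": 0.5, "1979-03": 0.7},
    }
)
```

Here var_a is NaN at the last timestep and var_b is NaN at the first. When working with the combined dataset, dropping or masking such timesteps per variable avoids treating the gaps as real values.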