Python API

Cube I/O

xcube.core.dsio.open_cube(input_path: str, format_name: Optional[str] = None, **kwargs) → xarray.Dataset[source]

Open a xcube dataset from input_path. If format_name is not provided, it will be guessed from input_path.

Parameters
  • input_path (str) – input path

  • format_name (str) – format, e.g. “zarr” or “netcdf4”

  • kwargs – format-specific keyword arguments

Returns

xcube dataset

xcube.core.dsio.write_cube(cube: xarray.Dataset, output_path: str, format_name: Optional[str] = None, cube_asserted: bool = False, **kwargs) → xarray.Dataset[source]

Write a xcube dataset to output_path. If format_name is not provided, it will be guessed from output_path.

Parameters
  • cube – xcube dataset to be written.

  • output_path (str) – output path

  • format_name (str) – format, e.g. “zarr” or “netcdf4”

  • kwargs – format-specific keyword arguments

  • cube_asserted (bool) – If False, cube will be verified, otherwise it is expected to be a valid cube.

Returns

The written xcube dataset cube

Cube generation

xcube.core.gen.gen.gen_cube(input_paths: Optional[Sequence[str]] = None, input_processor_name: Optional[str] = None, input_processor_params: Optional[Dict] = None, input_reader_name: Optional[str] = None, input_reader_params: Optional[Dict[str, Any]] = None, output_region: Optional[Tuple[float, float, float, float]] = None, output_size: Tuple[int, int] = [512, 512], output_resampling: str = 'Nearest', output_path: str = 'out.zarr', output_writer_name: Optional[str] = None, output_writer_params: Optional[Dict[str, Any]] = None, output_metadata: Optional[Dict[str, Any]] = None, output_variables: Optional[List[Tuple[str, Optional[Dict[str, Any]]]]] = None, processed_variables: Optional[List[Tuple[str, Optional[Dict[str, Any]]]]] = None, profile_mode: bool = False, no_sort_mode: bool = False, append_mode: Optional[bool] = None, dry_run: bool = False, monitor: Optional[Callable[[], None]] = None) → bool[source]

Generate a xcube dataset from one or more input files.

Return type

bool

Parameters
  • no_sort_mode (bool) –

  • input_paths – The input paths.

  • input_processor_name (str) – Name of a registered input processor (xcube.core.gen.inputprocessor.InputProcessor) to be used to transform the inputs.

  • input_processor_params – Parameters to be passed to the input processor.

  • input_reader_name (str) – Name of a registered input reader (xcube.core.util.dsio.DatasetIO).

  • input_reader_params – Parameters passed to the input reader.

  • output_region – Output region as tuple of floats: (lon_min, lat_min, lon_max, lat_max).

  • output_size – The spatial dimensions of the output as tuple of ints: (width, height).

  • output_resampling (str) – The resampling method for the output.

  • output_path (str) – The output directory.

  • output_writer_name (str) – Name of an output writer (xcube.core.util.dsio.DatasetIO) used to write the cube.

  • output_writer_params – Parameters passed to the output writer.

  • output_metadata – Extra metadata passed to output cube.

  • output_variables – Output variables.

  • processed_variables – Processed variables computed on-the-fly.

  • profile_mode (bool) – Whether profiling should be enabled.

  • append_mode (bool) – Deprecated. The function will always either insert, replace, or append new time slices.

  • dry_run (bool) – If True, does not write any data. For testing only.

  • monitor – A progress monitor.

Returns

True for success.

xcube.core.new.new_cube(title='Test Cube', width=360, height=180, x_name='lon', y_name='lat', x_dtype='float64', y_dtype=None, x_units='degrees_east', y_units='degrees_north', x_res=1.0, y_res=None, x_start=-180.0, y_start=-90.0, inverse_y=False, time_name='time', time_dtype='datetime64[s]', time_units='seconds since 1970-01-01T00:00:00', time_calendar='proleptic_gregorian', time_periods=5, time_freq='D', time_start='2010-01-01T00:00:00', drop_bounds=False, variables=None)[source]

Create a new empty cube. Useful for creating cube templates with predefined coordinate variables and metadata. The function is also heavily used by xcube’s unit tests.

The values of the variables dictionary can be either constants, array-like objects, or functions that compute their return value from passed coordinate indexes. The expected signature is::

def my_func(time: int, y: int, x: int) -> Union[bool, int, float]

Parameters
  • title (str) – A title. Defaults to ‘Test Cube’.

  • width (int) – Horizontal number of grid cells. Defaults to 360.

  • height (int) – Vertical number of grid cells. Defaults to 180.

  • x_name (str) – Name of the x coordinate variable. Defaults to ‘lon’.

  • y_name (str) – Name of the y coordinate variable. Defaults to ‘lat’.

  • x_dtype (str) – Data type of x coordinates. Defaults to ‘float64’.

  • y_dtype – Data type of y coordinates. Defaults to ‘float64’.

  • x_units (str) – Units of the x coordinates. Defaults to ‘degrees_east’.

  • y_units (str) – Units of the y coordinates. Defaults to ‘degrees_north’.

  • x_start (float) – Minimum x value. Defaults to -180.

  • y_start (float) – Minimum y value. Defaults to -90.

  • x_res (float) – Spatial resolution in x-direction. Defaults to 1.0.

  • y_res – Spatial resolution in y-direction. Defaults to 1.0.

  • inverse_y (bool) – Whether to create an inverse y axis. Defaults to False.

  • time_name (str) – Name of the time coordinate variable. Defaults to ‘time’.

  • time_periods (int) – Number of time steps. Defaults to 5.

  • time_freq (str) – Duration of each time step. Defaults to ‘D’.

  • time_start (str) – First time value. Defaults to ‘2010-01-01T00:00:00’.

  • time_dtype (str) – Numpy data type for time coordinates. Defaults to ‘datetime64[s]’.

  • time_units (str) – Units for time coordinates. Defaults to ‘seconds since 1970-01-01T00:00:00’.

  • time_calendar (str) – Calendar for time coordinates. Defaults to ‘proleptic_gregorian’.

  • drop_bounds (bool) – If True, coordinate bounds variables are not created. Defaults to False.

  • variables – Dictionary of data variables to be added. None by default.

Returns

A cube instance

Cube computation

xcube.core.compute.compute_cube(cube_func: Callable[[], Union[xarray.DataArray, numpy.ndarray, Sequence[Union[xarray.DataArray, numpy.ndarray]]]], *input_cubes: xarray.Dataset, input_cube_schema: Optional[xcube.core.schema.CubeSchema] = None, input_var_names: Optional[Sequence[str]] = None, input_params: Optional[Dict[str, Any]] = None, output_var_name: str = 'output', output_var_dtype: Any = numpy.float64, output_var_attrs: Optional[Dict[str, Any]] = None, vectorize: Optional[bool] = None, cube_asserted: bool = False) → xarray.Dataset[source]

Compute a new output data cube with a single variable named output_var_name from variables named input_var_names contained in zero, one, or more input data cubes in input_cubes using a cube factory function cube_func.

cube_func is called concurrently for each of the chunks of the input variables. It is expected to return a chunk block which is of type np.ndarray.

If input_cubes is not empty, cube_func receives variables as specified by input_var_names. If input_cubes is empty, input_var_names must be empty too, and input_cube_schema must be given, so that a new cube can be created.

The full signature of cube_func is::

def cube_func(*input_vars: np.ndarray,
              input_params: Dict[str, Any] = None,
              dim_coords: Dict[str, np.ndarray] = None,
              dim_ranges: Dict[str, Tuple[int, int]] = None) -> np.ndarray:
    pass

The arguments are:

  • input_vars: the variables according to the given input_var_names;

  • input_params: is this call’s input_params, a mapping from parameter name to value;

  • dim_coords: a mapping from dimension names to the current chunk’s coordinate arrays;

  • dim_ranges: a mapping from dimension names to the current chunk’s index ranges.

Only the input_vars argument is mandatory. The keyword arguments input_params, dim_coords, and dim_ranges need not be present at all.

Parameters
  • cube_func – The cube factory function.

  • input_cubes – An optional sequence of input cube datasets, must be provided if input_cube_schema is not.

  • input_cube_schema (CubeSchema) – An optional input cube schema, must be provided if input_cubes is not.

  • input_var_names – A sequence of variable names

  • input_params – Optional dictionary with processing parameters passed to cube_func.

  • output_var_name (str) – Optional name of the output variable, defaults to 'output'.

  • output_var_dtype – Optional numpy datatype of the output variable, defaults to 'float64'.

  • output_var_attrs – Optional metadata attributes for the output variable.

  • vectorize (bool) – Whether all input_cubes have the same variables which are concatenated and passed as vectors to cube_func. Not implemented yet.

  • cube_asserted (bool) – If False, cube will be verified, otherwise it is expected to be a valid cube.

Returns

A new dataset that contains the computed output variable.

xcube.core.evaluate.evaluate_dataset(dataset: xarray.Dataset, processed_variables: Optional[List[Tuple[str, Optional[Dict[str, Any]]]]] = None, errors: str = 'raise') → xarray.Dataset[source]

Compute a dataset from another dataset by evaluating expressions provided as variable attributes.

New variables are computed according to the value of an expression attribute which, if given, must be a valid Python expression that can reference any other preceding variables by name. The expression can also reference any flags defined by another variable according to their CF attributes flag_meanings and flag_values.

Invalid values may be masked out using the value of an optional valid_pixel_expression attribute that forms a boolean Python expression. The value of the _FillValue attribute or NaN will be used in the new variable where the expression returns zero or false.

Other attributes will be stored as variable metadata as-is.

Parameters
  • dataset – A dataset.

  • processed_variables – Optional list of variables that will be loaded or computed in the order given. Each variable is either identified by name or by a name to variable attributes mapping.

  • errors (str) – How to deal with errors while evaluating expressions. May be one of “raise”, “warn”, or “ignore”.

Returns

new dataset with computed variables

Cube data extraction

xcube.core.extract.get_cube_values_for_points(cube: xarray.Dataset, points: Union[xarray.Dataset, pandas.DataFrame, Mapping[str, Any]], var_names: Optional[Sequence[str]] = None, include_coords: bool = False, include_bounds: bool = False, include_indexes: bool = False, index_name_pattern: str = '{name}_index', include_refs: bool = False, ref_name_pattern: str = '{name}_ref', method: str = 'nearest', cube_asserted: bool = False) → xarray.Dataset[source]

Extract values from cube variables at given coordinates in points.

Returns a new dataset with values of variables from cube selected by the coordinate columns provided in points. All variables will be 1-D and have the same order as the rows in points.

Parameters
  • cube – The cube dataset.

  • points – Dictionary that maps dimension name to coordinate arrays.

  • var_names – An optional list of names of data variables in cube whose values shall be extracted.

  • include_coords (bool) – Whether to include the cube coordinates for each point in return value.

  • include_bounds (bool) – Whether to include the cube coordinate boundaries (if any) for each point in return value.

  • include_indexes (bool) – Whether to include computed indexes into the cube for each point in return value.

  • index_name_pattern (str) – A naming pattern for the computed index columns. Must include “{name}” which will be replaced by the index’ dimension name.

  • include_refs (bool) – Whether to include point (reference) values from points in return value.

  • ref_name_pattern (str) – A naming pattern for the computed point data columns. Must include “{name}” which will be replaced by the point’s attribute name.

  • method (str) – “nearest” or “linear”.

  • cube_asserted (bool) – If False, cube will be verified, otherwise it is expected to be a valid cube.

Returns

A new data frame whose columns are values from cube variables at given points.

xcube.core.extract.get_cube_point_indexes(cube: xarray.Dataset, points: Union[xarray.Dataset, pandas.DataFrame, Mapping[str, Any]], dim_name_mapping: Optional[Mapping[str, str]] = None, index_name_pattern: str = '{name}_index', index_dtype=numpy.float64, cube_asserted: bool = False) → xarray.Dataset[source]

Get indexes of given point coordinates points into the given dataset.

Parameters
  • cube – The cube dataset.

  • points – A mapping from column names to column data arrays, which must all have the same length.

  • dim_name_mapping – A mapping from dimension names in cube to column names in points.

  • index_name_pattern (str) – A naming pattern for the computed index columns. Must include “{name}” which will be replaced by the dimension name.

  • index_dtype – Numpy data type for the indexes. If it is a floating point type (default), then indexes will contain fractions, which may be used for interpolation. For out-of-range coordinates in points, indexes will be -1 if index_dtype is an integer type, and NaN if it is a floating point type.

  • cube_asserted (bool) – If False, cube will be verified, otherwise it is expected to be a valid cube.

Returns

A dataset containing the index columns.

xcube.core.extract.get_cube_values_for_indexes(cube: xarray.Dataset, indexes: Union[xarray.Dataset, pandas.DataFrame, Mapping[str, Any]], include_coords: bool = False, include_bounds: bool = False, data_var_names: Optional[Sequence[str]] = None, index_name_pattern: str = '{name}_index', method: str = 'nearest', cube_asserted: bool = False) → xarray.Dataset[source]

Get values from the cube at given indexes.

Parameters
  • cube – A cube dataset.

  • indexes – A mapping from column names to index and fraction arrays for all cube dimensions.

  • include_coords (bool) – Whether to include the cube coordinates for each point in return value.

  • include_bounds (bool) – Whether to include the cube coordinate boundaries (if any) for each point in return value.

  • data_var_names – An optional list of names of data variables in cube whose values shall be extracted.

  • index_name_pattern (str) – A naming pattern for the computed index columns. Must include “{name}” which will be replaced by the dimension name.

  • method (str) – “nearest” or “linear”.

  • cube_asserted (bool) – If False, cube will be verified, otherwise it is expected to be a valid cube.

Returns

A new data frame whose columns are values from cube variables at given indexes.

xcube.core.extract.get_dataset_indexes(dataset: xarray.Dataset, coord_var_name: str, coord_values: Union[xarray.DataArray, numpy.ndarray], index_dtype=numpy.float64) → Union[xarray.DataArray, numpy.ndarray][source]

Compute the indexes and their fractions into a coordinate variable coord_var_name of a dataset for the given coordinate values coord_values.

The coordinate variable’s labels must be monotonic increasing or decreasing, otherwise the result will be nonsense.

For any value in coord_values that is out of the bounds of the coordinate variable’s values, the index depends on the value of index_dtype. If index_dtype is an integer type, invalid indexes are encoded as -1 while for floating point types, NaN will be used.

Returns a tuple of indexes as int64 array and fractions as float64 array.

Parameters
  • dataset – A cube dataset.

  • coord_var_name (str) – Name of a coordinate variable.

  • coord_values – Array-like coordinate values.

  • index_dtype – Numpy data type for the indexes. If it is a floating point type (default), then indexes contain fractions, which may be used for interpolation. Out-of-range coordinates are indicated by index -1 if index_dtype is an integer type, and by NaN if it is a floating point type.

Returns

The indexes and their fractions as a tuple of numpy int64 and float64 arrays.

xcube.core.timeseries.get_time_series(cube: xarray.Dataset, geometry: Optional[Union[shapely.geometry.base.BaseGeometry, Dict[str, Any], str, Sequence[Union[float, int]]]] = None, var_names: Optional[Sequence[str]] = None, start_date: Optional[Union[numpy.datetime64, str]] = None, end_date: Optional[Union[numpy.datetime64, str]] = None, agg_methods: Union[str, Sequence[str], AbstractSet[str]] = 'mean', include_count: bool = False, include_stdev: bool = False, use_groupby: bool = False, cube_asserted: bool = False) → Optional[xarray.Dataset][source]

Get a time series dataset from a data cube.

geometry may be provided as a (shapely) geometry object, a valid GeoJSON object, a valid WKT string, a sequence of box coordinates (x1, y1, x2, y2), or point coordinates (x, y). If geometry covers an area, i.e. is not a point, the function aggregates the variables to compute a mean value and if desired, the number of valid observations and the standard deviation.

start_date and end_date may be provided as a numpy.datetime64 or an ISO datetime string.

Returns a time-series dataset whose data variables have a time dimension but no longer have spatial dimensions; hence the resulting dataset’s variables will only have N-2 dimensions. A global attribute max_number_of_observations will be set to the maximum number of observations that could have been made in each time step. If the given geometry does not overlap the cube’s boundaries, or if no output variables remain, the function returns None.

Parameters
  • cube – The xcube dataset

  • geometry – Optional geometry

  • var_names – Optional sequence of names of variables to be included.

  • start_date – Optional start date.

  • end_date – Optional end date.

  • agg_methods – Aggregation methods. May be single string or sequence of strings. Possible values are ‘mean’, ‘median’, ‘min’, ‘max’, ‘std’, ‘count’. Defaults to ‘mean’. Ignored if geometry is a point.

  • include_count (bool) – Deprecated. Whether to include the number of valid observations for each time step. Ignored if geometry is a point.

  • include_stdev (bool) – Deprecated. Whether to include standard deviation for each time step. Ignored if geometry is a point.

  • use_groupby (bool) – Use group-by operation. May increase or decrease runtime performance and/or memory consumption.

  • cube_asserted (bool) – If False, cube will be verified, otherwise it is expected to be a valid cube.

Returns

A new dataset with time-series for each variable.

Cube manipulation

xcube.core.resample.resample_in_time(cube: xarray.Dataset, frequency: str, method: Union[str, Sequence[str]], offset=None, base: int = 0, tolerance=None, interp_kind=None, time_chunk_size=None, var_names: Optional[Sequence[str]] = None, metadata: Optional[Dict[str, Any]] = None, cube_asserted: bool = False) → xarray.Dataset[source]

Resample a xcube dataset in the time dimension.

The argument method may be one or a sequence of 'all', 'any', 'argmax', 'argmin', 'count', 'first', 'last', 'max', 'min', 'mean', 'median', 'percentile_<p>', 'std', 'sum', 'var'.

The value 'percentile_<p>' is a placeholder, where '<p>' must be replaced by an integer percentage value, e.g. 'percentile_90' is the 90%-percentile.

Important note: As of xarray 0.14 and dask 2.8, the methods 'median' and 'percentile_<p>' cannot be used if the variables in cube comprise chunked dask arrays. In this case, use the compute() or load() method to convert dask arrays into numpy arrays.

Parameters
  • cube – The xcube dataset.

  • frequency (str) – Temporal aggregation frequency. Use format “<count><offset>” where <offset> is one of ‘H’, ‘D’, ‘W’, ‘M’, ‘Q’, ‘Y’.

  • method – Resampling method or sequence of resampling methods.

  • offset – Offset used to adjust the resampled time labels. Uses same syntax as frequency.

  • base (int) – For frequencies that evenly subdivide 1 day, the “origin” of the aggregated intervals. For example, for ‘24H’ frequency, base could range from 0 through 23.

  • time_chunk_size – If not None, the chunk size to be used for the “time” dimension.

  • var_names – Variable names to include.

  • tolerance – Time tolerance for selective upsampling methods. Defaults to frequency.

  • interp_kind – Kind of interpolation if method is ‘interpolation’.

  • metadata – Output metadata.

  • cube_asserted (bool) – If False, cube will be verified, otherwise it is expected to be a valid cube.

Returns

A new xcube dataset resampled in time.

xcube.core.vars2dim.vars_to_dim(cube: xarray.Dataset, dim_name: str = 'var', var_name='data', cube_asserted: bool = False)[source]

Convert data variables into a dimension.

Parameters
  • cube – The xcube dataset.

  • dim_name (str) – The name of the new dimension and coordinate variable. Defaults to ‘var’.

  • var_name (str) – The name of the new, single data variable. Defaults to ‘data’.

  • cube_asserted (bool) – If False, cube will be verified, otherwise it is expected to be a valid cube.

Returns

A new xcube dataset with data variables turned into a new dimension.

xcube.core.chunk.chunk_dataset(dataset: xarray.Dataset, chunk_sizes: Optional[Dict[str, int]] = None, format_name: Optional[str] = None) → xarray.Dataset[source]

Chunk dataset using chunk_sizes and optionally update encodings for given format_name.

Parameters
  • dataset – input dataset

  • chunk_sizes – mapping from dimension name to new chunk size

  • format_name (str) – optional format, e.g. “zarr” or “netcdf4”

Returns

the (re)chunked dataset

xcube.core.unchunk.unchunk_dataset(dataset_path: str, var_names: Optional[Sequence[str]] = None, coords_only: bool = False)[source]

Unchunk dataset variables in-place.

Parameters
  • dataset_path (str) – Path to ZARR dataset directory.

  • var_names – Optional list of variable names.

  • coords_only (bool) – Un-chunk coordinate variables only.

xcube.core.optimize.optimize_dataset(input_path: str, output_path: Optional[str] = None, in_place: bool = False, unchunk_coords: Union[bool, str, Sequence[str]] = False, exception_type: Type[Exception] = <class 'ValueError'>)[source]

Optimize a dataset for faster access.

Reduces the number of metadata and coordinate data files in the xcube dataset given by input_path. Consolidated cubes open much faster from remote locations, e.g. in object storage, because far fewer HTTP requests are required to fetch the initial cube metadata. That is, the function merges all metadata files into a single top-level JSON file “.zmetadata”.

If unchunk_coords is given, it also removes any chunking of coordinate variables so they comprise a single binary data file instead of one file per data chunk. The primary usage of this function is to optimize data cubes for cloud object storage. The function currently works only for data cubes using Zarr format. unchunk_coords can be a name, or a list of names, of the coordinate variable(s) to be consolidated. If boolean True is used, all coordinate variables will be consolidated.

Parameters
  • input_path (str) – Path to input dataset with ZARR format.

  • output_path (str) – Path to output dataset with ZARR format. May contain “{input}” template string, which is replaced by the input path’s file name without file name extension.

  • in_place (bool) – Whether to modify the dataset in place. If False, a copy is made and output_path must be given.

  • unchunk_coords – The name of a coordinate variable or a list of coordinate variables whose chunks should be consolidated. Pass True to consolidate chunks of all coordinate variables.

  • exception_type – Type of exception to be used on value errors.

Cube subsetting

xcube.core.select.select_variables_subset(dataset: xarray.Dataset, var_names: Optional[Collection[str]] = None) → xarray.Dataset[source]

Select data variables from the given dataset and create a new dataset.

Parameters
  • dataset – The dataset from which to select variables.

  • var_names – The names of data variables to select.

Returns

A new dataset. It is empty, if var_names is empty. It is dataset, if var_names is None.

xcube.core.geom.clip_dataset_by_geometry(dataset: xarray.Dataset, geometry: Union[shapely.geometry.base.BaseGeometry, Dict[str, Any], str, Sequence[Union[float, int]]], save_geometry_wkt: Union[str, bool] = False) → Optional[xarray.Dataset][source]

Spatially clip a dataset according to the bounding box of a given geometry.

Parameters
  • dataset – The dataset

  • geometry – A geometry-like object, see py:function:convert_geometry.

  • save_geometry_wkt – If the value is a string, the effective intersection geometry is stored as a Geometry WKT string in the global attribute named by save_geometry_wkt. If the value is True, the name “geometry_wkt” is used.

Returns

The spatial subset of the dataset, or None if the bounding box of the dataset has no or a zero-area intersection with the bounding box of the geometry.

Cube masking

xcube.core.geom.mask_dataset_by_geometry(dataset: xarray.Dataset, geometry: Union[shapely.geometry.base.BaseGeometry, Dict[str, Any], str, Sequence[Union[float, int]]], excluded_vars: Optional[Sequence[str]] = None, no_clip: bool = False, save_geometry_mask: Union[str, bool] = False, save_geometry_wkt: Union[str, bool] = False) → Optional[xarray.Dataset][source]

Mask a dataset according to the given geometry. The cells of variables of the returned dataset will have NaN-values where their spatial coordinates are not intersecting the given geometry.

Parameters
  • dataset – The dataset

  • geometry – A geometry-like object, see py:function:convert_geometry.

  • excluded_vars – Optional sequence of names of data variables that should not be masked (but still may be clipped).

  • no_clip (bool) – If True, the function will not clip the dataset before masking, that is, the returned dataset will have the same dimension sizes as the given dataset.

  • save_geometry_mask – If the value is a string, the effective geometry mask array is stored as a 2D data variable named by save_geometry_mask. If the value is True, the name “geometry_mask” is used.

  • save_geometry_wkt – If the value is a string, the effective intersection geometry is stored as a Geometry WKT string in the global attribute named by save_geometry_wkt. If the value is True, the name “geometry_wkt” is used.

Returns

The spatial subset of the dataset, or None if the bounding box of the dataset has no or a zero-area intersection with the bounding box of the geometry.

class xcube.core.maskset.MaskSet(flag_var: xarray.DataArray)[source]

A set of mask variables derived from a variable flag_var with CF attributes “flag_masks” and “flag_meanings”.

Each mask is represented by an xarray.DataArray, has the name of the flag, is of type numpy.uint8, and has the dimensions of the given flag_var.

Parameters

flag_var – an xarray.DataArray that defines flag values. The CF attributes “flag_masks” and “flag_meanings” are expected to exist and be valid.

classmethod get_mask_sets(dataset: xarray.Dataset) → Dict[str, xcube.core.maskset.MaskSet][source]

For each “flag” variable in the given dataset, turn it into a MaskSet and store it in a dictionary.

Parameters

dataset – The dataset

Returns

A mapping of flag names to MaskSet. Will be empty if there are no flag variables in dataset.

Rasterisation of Features

xcube.core.geom.rasterize_features(dataset: xarray.Dataset, features: Union[geopandas.geodataframe.GeoDataFrame, Sequence[Mapping[str, Any]]], feature_props: Sequence[str], var_props: Dict[str, Mapping[str, Mapping[str, Any]]] = None, in_place: bool = False) → Optional[xarray.Dataset][source]

Rasterize feature properties given by feature_props of vector-data features as new variables into dataset.

dataset must have two spatial 1-D coordinates, either lon and lat in degrees, reprojected coordinates, x and y, or similar.

feature_props is a sequence of names of feature properties that must exist in each feature of features.

features may be passed as a geopandas.GeoDataFrame or as an iterable of GeoJSON features.

Using the optional var_props, the properties of newly created variables from feature properties can be specified. It is a mapping of feature property names to mappings of variable properties. Here is an example variable properties mapping::

{
    'name': 'land_class',  # (str) the variable's name, default is the feature property name
    'dtype': np.int16,     # (str|np.dtype) the variable's dtype, default is np.float64
    'fill_value': -1,      # (bool|int|float|np.ndarray) the variable's fill value, default is np.nan
    'attrs': {},           # (Mapping[str, Any]) the variable's attributes, default is {}
    'converter': int,      # (Callable[[Any], Any]) a converter function used to convert
                           # from feature property value to variable value, default is float
}

Currently, the coordinates of the geometries in the given features must use the same CRS as the given dataset.

Parameters
  • dataset – The xarray dataset.

  • features – A geopandas.GeoDataFrame instance or a sequence of GeoJSON features.

  • feature_props – Sequence of names of numeric feature properties to be rasterized.

  • var_props – Optional mapping of feature property name to a name or a 5-tuple (name, dtype, fill_value, attributes, converter) for the new variable.

  • in_place (bool) – Whether to add new variables to dataset. If False, a copy will be created and returned.

Returns

The dataset with the rasterized feature properties.

Cube metadata

xcube.core.edit.edit_metadata(input_path: str, output_path: Optional[str] = None, metadata_path: Optional[str] = None, update_coords: bool = False, in_place: bool = False, monitor: Optional[Callable[[...], None]] = None, exception_type: Type[Exception] = <class 'ValueError'>)[source]

Edit the metadata of an xcube dataset.

The metadata may need editing because it is incorrect, inconsistent, or incomplete. The metadata attributes to be edited should be given in a YAML file. The function currently works only for data cubes using ZARR format.

Parameters
  • input_path (str) – Path to input dataset with ZARR format.

  • output_path (str) – Path to output dataset with ZARR format. May contain “{input}” template string, which is replaced by the input path’s file name without file name extension.

  • metadata_path (str) – Path to the metadata file, which will edit the existing metadata.

  • update_coords (bool) – Whether to update the metadata about the coordinates.

  • in_place (bool) – Whether to modify the dataset in place. If False, a copy is made and output_path must be given.

  • monitor – A progress monitor.

  • exception_type – Type of exception to be used on value errors.

xcube.core.update.update_dataset_attrs(dataset: xarray.Dataset, global_attrs: Optional[Dict[str, Any]] = None, update_existing: bool = False, in_place: bool = False) → xarray.Dataset[source]

Update the spatio-temporal CF/THREDDS attributes of the given dataset according to the spatio-temporal coordinate variables time, lat, and lon.

Parameters
  • dataset – The dataset.

  • global_attrs – Optional global attributes.

  • update_existing (bool) – If True, any existing attributes will be updated.

  • in_place (bool) – If True, dataset will be modified in place and returned.

Returns

A new dataset, if in_place is False (default), else the passed and modified dataset.

xcube.core.update.update_dataset_spatial_attrs(dataset: xarray.Dataset, update_existing: bool = False, in_place: bool = False) → xarray.Dataset[source]

Update spatial CF/THREDDS attributes of given dataset.

Parameters
  • dataset – The dataset.

  • update_existing (bool) – If True, any existing attributes will be updated.

  • in_place (bool) – If True, dataset will be modified in place and returned.

Returns

A new dataset, if in_place is False (default), else the passed and modified dataset.

xcube.core.update.update_dataset_temporal_attrs(dataset: xarray.Dataset, update_existing: bool = False, in_place: bool = False) → xarray.Dataset[source]

Update temporal CF/THREDDS attributes of given dataset.

Parameters
  • dataset – The dataset.

  • update_existing (bool) – If True, any existing attributes will be updated.

  • in_place (bool) – If True, dataset will be modified in place and returned.

Returns

A new dataset, if in_place is False (default), else the passed and modified dataset.

Cube verification

xcube.core.verify.assert_cube(dataset: xarray.Dataset, name=None)xarray.Dataset[source]

Assert that the given dataset is a valid xcube dataset.

Parameters
  • dataset – The dataset to be validated.

  • name – Optional parameter name.

Raise

ValueError, if dataset is not a valid xcube dataset

xcube.core.verify.verify_cube(dataset: xarray.Dataset) → List[str][source]

Verify the given dataset for being a valid xcube dataset.

The tool verifies that the dataset

  • defines two spatial x,y coordinate variables that are 1-D, non-empty, and use correct units;

  • defines a time coordinate variable that is 1-D, non-empty, and uses correct units;

  • has valid bounds variables for the spatial x,y and time coordinate variables, if any;

  • has at least one data variable and that all data variables are valid, e.g. at least 3-D, all have the same dimensions, and have at least the dimensions dim(time), dim(y), dim(x) in that order.

Returns a list of issues, which is empty if dataset is a valid xcube dataset.

Parameters

dataset – A dataset to be verified.

Returns

List of issues or empty list.
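One of the checks described above, the required leading dimension order of data variables, can be sketched in plain Python as follows. The dimension names ('time', 'lat', 'lon') and the exact rule are illustrative assumptions, not xcube's implementation.

```python
# Minimal sketch of one verify_cube-style check: a data variable's
# dimensions must start with the expected ('time', 'lat', 'lon') order.
# Returns an issue string on mismatch, mirroring the list-of-issues API.

def check_dims_order(var_dims, expected=("time", "lat", "lon")):
    """Return an issue string if dims don't start with the expected order."""
    if tuple(var_dims[: len(expected)]) != expected:
        return f"unexpected dimension order: {var_dims}"
    return None

issues = [i for i in (
    check_dims_order(("time", "lat", "lon")),   # valid
    check_dims_order(("lat", "time", "lon")),   # wrong order
) if i is not None]
```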

Multi-resolution pyramids

xcube.core.level.compute_levels(dataset: xarray.Dataset, spatial_dims: Optional[Tuple[str, str]] = None, spatial_shape: Optional[Tuple[int, int]] = None, spatial_tile_shape: Optional[Tuple[int, int]] = None, var_names: Optional[Sequence[str]] = None, num_levels_max: Optional[int] = None, post_process_level: Optional[Callable[[xarray.Dataset, int, int], Optional[xarray.Dataset]]] = None, progress_monitor: Optional[Callable[[xarray.Dataset, int, int], Optional[xarray.Dataset]]] = None) → List[xarray.Dataset][source]

Transform the given dataset into the levels of a multi-level pyramid with spatial resolution decreasing by a factor of two in both spatial dimensions.

It is assumed that the spatial dimensions of each variable are the inner-most, that is, the last two elements of a variable’s shape provide the spatial dimension sizes.

Parameters
  • dataset – The input dataset to be turned into a multi-level pyramid.

  • spatial_dims – If given, only variables are considered whose last two dimension elements match the given spatial_dims.

  • spatial_shape – If given, only variables are considered whose last two shape elements match the given spatial_shape.

  • spatial_tile_shape – If given, chunking will match the provided spatial_tile_shape.

  • var_names – Variables to consider. If None, all variables with at least two dimensions are considered.

  • num_levels_max (int) – If given, the maximum number of pyramid levels.

  • post_process_level – If given, the function will be called for each level and must return a dataset.

  • progress_monitor – If given, the function will be called for each level.

Returns

A list of dataset instances representing the multi-level pyramid.
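A rough sketch of how many levels the factor-of-two scheme yields for a given spatial shape. The stopping rule (halve until a tile-sized grid remains) and the min_size default are assumptions for illustration; xcube's actual level count may differ.

```python
# Sketch: count pyramid levels when spatial resolution halves per level
# until the grid is no larger than one tile. The 256-pixel threshold is
# an assumed example value, not xcube's default.

def num_levels(width, height, min_size=256):
    levels = 1
    while width > min_size and height > min_size:
        width //= 2
        height //= 2
        levels += 1
    return levels

n = num_levels(2048, 1024)  # 2048x1024 -> 1024x512 -> 512x256
```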

xcube.core.level.read_levels(dir_path: str, progress_monitor: Optional[Callable[[xarray.Dataset, int, int], Optional[xarray.Dataset]]] = None) → List[xarray.Dataset][source]

Read the levels of a multi-level pyramid with spatial resolution decreasing by a factor of two in both spatial dimensions.

Parameters
  • dir_path (str) – The directory path.

  • progress_monitor – An optional progress monitor.

Returns

A list of dataset instances representing the multi-level pyramid.

xcube.core.level.write_levels(output_path: str, dataset: Optional[xarray.Dataset] = None, input_path: Optional[str] = None, link_input: bool = False, progress_monitor: Optional[Callable[[xarray.Dataset, int, int], Optional[xarray.Dataset]]] = None, **kwargs) → List[xarray.Dataset][source]

Transform the given dataset given by a dataset instance or input_path string into the levels of a multi-level pyramid with spatial resolution decreasing by a factor of two in both spatial dimensions and write them to output_path.

One of dataset and input_path must be given.

Parameters
  • output_path (str) – Output path

  • dataset – Dataset to be converted and written as levels.

  • input_path (str) – Input path to a dataset to be transformed and written as levels.

  • link_input (bool) – Just link the dataset at level zero instead of writing it.

  • progress_monitor – An optional progress monitor.

  • kwargs – Keyword-arguments accepted by the compute_levels() function.

Returns

A list of dataset instances representing the multi-level pyramid.

Utilities

xcube.core.geom.convert_geometry(geometry: Optional[Union[shapely.geometry.base.BaseGeometry, Dict[str, Any], str, Sequence[Union[float, int]]]]) → Optional[shapely.geometry.base.BaseGeometry][source]

Convert a geometry-like object into a shapely geometry object (shapely.geometry.BaseGeometry).

A geometry-like object may be

  • any shapely geometry object,

  • a dictionary that can be serialized to valid GeoJSON,

  • a WKT string,

  • a box given by a string of the form “<x1>,<y1>,<x2>,<y2>” or by a sequence of four numbers x1, y1, x2, y2,

  • a point given by a string of the form “<x>,<y>” or by a sequence of two numbers x, y.

Handling of geometries crossing the antimeridian:

  • If box coordinates are given, it is allowed to pass x1, x2 where x1 > x2, which is interpreted as a box crossing the antimeridian. In this case the function splits the box along the antimeridian and returns a multi-polygon.

  • In all other cases, 2D geometries are assumed to _not cross the antimeridian at all_.

Parameters

geometry – A geometry-like object

Returns

Shapely geometry object or None.
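The coordinate-string handling described above, including the antimeridian split, can be sketched without shapely. The tuple-based return values are an assumption for illustration; convert_geometry itself returns shapely geometry objects (a multi-polygon in the antimeridian case).

```python
# Sketch of the coordinate-string parsing convert_geometry describes:
# "<x>,<y>" becomes a point, "<x1>,<y1>,<x2>,<y2>" a box, and a box
# with x1 > x2 is split at the antimeridian into two boxes.

def parse_geometry_text(text):
    values = tuple(float(v) for v in text.split(","))
    if len(values) == 2:          # point "<x>,<y>"
        return ("point", values)
    if len(values) == 4:          # box "<x1>,<y1>,<x2>,<y2>"
        x1, y1, x2, y2 = values
        if x1 > x2:               # crosses the antimeridian: split in two
            return ("multibox", ((x1, y1, 180.0, y2), (-180.0, y1, x2, y2)))
        return ("box", values)
    raise ValueError(f"invalid geometry text: {text!r}")

kind, geom = parse_geometry_text("170,-10,-170,10")
```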

class xcube.core.schema.CubeSchema(shape: Sequence[int], coords: Mapping[str, xarray.DataArray], x_name: str = 'lon', y_name: str = 'lat', time_name: str = 'time', dims: Optional[Sequence[str]] = None, chunks: Optional[Sequence[int]] = None)[source]

A schema that can be used to create new xcube datasets. The given shape, dims, chunks, and coords apply to all data variables.

Parameters
  • shape – A tuple of dimension sizes.

  • coords – A dictionary of coordinate variables. Must have values for all dims.

  • dims – A sequence of dimension names. Defaults to ('time', 'lat', 'lon').

  • chunks – A tuple of chunk sizes in each dimension.

property ndim

Number of dimensions.

property dims

Tuple of dimension names.

property x_name

Name of the spatial x coordinate variable.

property y_name

Name of the spatial y coordinate variable.

property time_name

Name of the time coordinate variable.

property x_var

Spatial x coordinate variable.

property y_var

Spatial y coordinate variable.

property time_var

Time coordinate variable.

property x_dim

Name of the spatial x dimension.

property y_dim

Name of the spatial y dimension.

property time_dim

Name of the time dimension.

property x_size

Size of the spatial x dimension.

property y_size

Size of the spatial y dimension.

property time_size

Size of the time dimension.

property shape

Tuple of dimension sizes.

property chunks

Tuple of dimension chunk sizes.

property coords

Dictionary of coordinate variables.

classmethod new(cube: xarray.Dataset)xcube.core.schema.CubeSchema[source]

Create a cube schema from given cube.

Plugin Development

class xcube.util.extension.ExtensionRegistry[source]

A registry of extensions. Typically used by plugins to register extensions.

has_extension(point: str, name: str)bool[source]

Test if an extension with given point and name is registered.

Return type

bool

Parameters
  • point (str) – extension point identifier

  • name (str) – extension name

Returns

True, if extension exists

get_extension(point: str, name: str) → Optional[xcube.util.extension.Extension][source]

Get registered extension for given point and name.

Parameters
  • point (str) – extension point identifier

  • name (str) – extension name

Returns

the extension, or None if no such extension exists

get_component(point: str, name: str) → Any[source]

Get extension component for given point and name. Raises a ValueError if no such extension exists.

Parameters
  • point (str) – extension point identifier

  • name (str) – extension name

Returns

extension component

find_extensions(point: str, predicate: Optional[Callable[[xcube.util.extension.Extension], bool]] = None) → List[xcube.util.extension.Extension][source]

Find extensions for point and optional filter function predicate.

The filter function is called with an extension and should return a truth value to indicate a match or mismatch.

Parameters
  • point (str) – extension point identifier

  • predicate – optional filter function

Returns

list of matching extensions

find_components(point: str, predicate: Optional[Callable[[xcube.util.extension.Extension], bool]] = None) → List[Any][source]

Find extension components for point and optional filter function predicate.

The filter function is called with an extension and should return a truth value to indicate a match or mismatch.

Parameters
  • point (str) – extension point identifier

  • predicate – optional filter function

Returns

list of matching extension components

add_extension(point: str, name: str, component: Optional[Any] = None, loader: Optional[Callable[[xcube.util.extension.Extension], Any]] = None, **metadata)xcube.util.extension.Extension[source]

Register an extension component or an extension component loader for the given extension point, name, and additional metadata.

Either component or loader must be specified, but not both.

A given loader must be a callable with one positional argument extension of type Extension and is expected to return the actual extension component, which may be of any type. The loader will only be called once and only when the actual extension component is requested for the first time. Consider using the function import_component() to create a loader that lazily imports a component from a module and optionally executes it.

Return type

Extension

Parameters
  • point (str) – extension point identifier

  • name (str) – extension name

  • component – extension component

  • loader – extension component loader function

  • metadata – extension metadata

Returns

a registered extension
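The lazy-loading behaviour described above can be sketched in a few lines: the loader runs once, on first access to the component. This is a minimal stand-in mimicking the documented semantics, not xcube's ExtensionRegistry implementation.

```python
# Minimal sketch of lazy extension loading: the loader is called only
# once, when the component is first requested, and the result is cached.

class LazyExtension:
    def __init__(self, loader):
        self._loader = loader
        self._component = None
        self.load_count = 0  # for illustration: how often the loader ran

    @property
    def component(self):
        if self._component is None:
            self.load_count += 1
            self._component = self._loader(self)
        return self._component

ext = LazyExtension(lambda extension: {"name": "my_processor"})
_ = ext.component
_ = ext.component  # loader is not called again
```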

remove_extension(point: str, name: str)[source]

Remove registered extension name from given point.

Parameters
  • point (str) – extension point identifier

  • name (str) – extension name

to_dict()[source]

Get a JSON-serializable dictionary representation of this extension registry.

class xcube.util.extension.Extension(point: str, name: str, component: Optional[Any] = None, loader: Optional[Callable[[xcube.util.extension.Extension], Any]] = None, **metadata)[source]

An extension that provides a component of any type.

Extensions are registered in a ExtensionRegistry.

Extension objects are not meant to be instantiated directly. Instead, ExtensionRegistry.add_extension() is used to register extensions.

Parameters
  • point – extension point identifier

  • name – extension name

  • component – extension component

  • loader – extension component loader function

  • metadata – extension metadata

property is_lazy

Whether this is a lazy extension that uses a loader.

property component

Extension component.

property point

Extension point identifier.

property name

Extension name.

property metadata

Extension metadata.

to_dict() → Dict[str, Any][source]

Get a JSON-serializable dictionary representation of this extension.

xcube.util.extension.import_component(spec: str, transform: Optional[Callable[[Any, xcube.util.extension.Extension], Any]] = None, call: bool = False, call_args: Optional[Sequence[Any]] = None, call_kwargs: Optional[Mapping[str, Any]] = None) → Callable[[xcube.util.extension.Extension], Any][source]

Return a component loader that imports a module or module component from spec. To import a module, spec should be the fully qualified module name. To import a component, spec must also append the component name to the fully qualified module name, separated by a colon (“:”) character.

An optional transform callable may be used to transform the imported component. If given, a new component is computed:

component = transform(component, extension)

If the call flag is set, the component is expected to be a callable which will be called using the given call_args and call_kwargs to produce a new component:

component = component(*call_args, **call_kwargs)

Finally, the component is returned.

Parameters
  • spec (str) – String of the form “module_path” or “module_path:component_name”

  • transform – callable that takes two positional arguments, the imported component and the extension of type Extension

  • call (bool) – Whether to finally call the component with given call_args and call_kwargs

  • call_args – arguments passed to a callable component if call flag is set

  • call_kwargs – keyword arguments passed to callable component if call flag is set

Returns

a component loader
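The “module_path:component_name” spec handling can be sketched with the standard library alone. This is an illustrative re-implementation of the described behaviour, not xcube's own code; it omits the transform and call steps.

```python
# Sketch of the spec format import_component describes: "math" imports
# the module itself, "math:sqrt" imports the module and resolves the
# named component via getattr.

import importlib

def load_from_spec(spec):
    module_name, _, component_name = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, component_name) if component_name else module

sqrt = load_from_spec("math:sqrt")
```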

xcube.constants.EXTENSION_POINT_INPUT_PROCESSORS = 'xcube.core.gen.iproc'

The extension point identifier for input processor extensions

xcube.constants.EXTENSION_POINT_DATASET_IOS = 'xcube.core.dsio'

The extension point identifier for dataset I/O extensions

xcube.constants.EXTENSION_POINT_CLI_COMMANDS = 'xcube.cli'

The extension point identifier for CLI command extensions

xcube.util.plugin.get_extension_registry()xcube.util.extension.ExtensionRegistry[source]

Get the populated extension registry.

xcube.util.plugin.get_plugins() → Dict[str, Dict][source]

Get mapping of “xcube_plugins” entry point names to JSON-serializable plugin meta-information.