geographer.utils package

Submodules

geographer.utils.cluster_rasters module

Cluster rasters.

Given a dataset and an optional list of rasters, partition the rasters into equivalence classes (‘clusters’) that need to be respected when generating the train-validation split.

geographer.utils.cluster_rasters.get_raster_clusters(connector, clusters_defined_by, raster_names=None, preclustering_method='y then x-axis')[source]

Return clusters of rasters.

Return type:

list[set[str]]

Parameters:
  • connector (Connector | Path | str) – connector or path or str to data dir containing connector

  • clusters_defined_by (Literal['rasters_that_share_vectors', 'rasters_that_share_vectors_or_overlap']) – relation between rasters defining clusters

  • raster_names (list[str] | None) – optional list of raster names

  • preclustering_method (Literal['x then y-axis', 'y then x-axis', 'x-axis', 'y-axis'] | None) – optional preclustering method to speed up clustering

Returns:

(names of rasters defining) clusters
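
The 'rasters_that_share_vectors' relation can be pictured as connected components: two rasters land in the same cluster if a chain of shared vectors links them. The following is a minimal sketch of that idea using a union-find structure on a plain dict. The function name cluster_by_shared_vectors and the raster_to_vectors mapping are hypothetical and chosen for illustration; this is not the library's implementation and it does not use a Connector.

```python
from collections import defaultdict


def cluster_by_shared_vectors(raster_to_vectors: dict[str, set[str]]) -> list[set[str]]:
    """Group raster names into clusters of rasters linked by shared vectors."""
    parent = {name: name for name in raster_to_vectors}

    def find(name: str) -> str:
        # Follow parent links to the root, shortening the path on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Invert the mapping: for each vector, list the rasters that contain it.
    vector_to_rasters = defaultdict(list)
    for raster, vectors in raster_to_vectors.items():
        for vector in vectors:
            vector_to_rasters[vector].append(raster)

    # Rasters that share a vector are merged into the same cluster.
    for rasters in vector_to_rasters.values():
        for other in rasters[1:]:
            parent[find(other)] = find(rasters[0])

    clusters = defaultdict(set)
    for name in raster_to_vectors:
        clusters[find(name)].add(name)
    return list(clusters.values())


# Hypothetical toy data: raster_a and raster_b share vector "v2".
example = {
    "raster_a.tif": {"v1", "v2"},
    "raster_b.tif": {"v2"},
    "raster_c.tif": {"v3"},
    "raster_d.tif": set(),
}
clusters = cluster_by_shared_vectors(example)
```

Assigning whole clusters, rather than individual rasters, to the train or validation set keeps a vector from appearing on both sides of the split.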

geographer.utils.connector_utils module

Utilities used in the Connector class.

geographer.utils.connector_utils.empty_gdf(index_name, cols_and_types, crs_epsg_code=4326)[source]

Return an empty GeoDataFrame.

Return an empty GeoDataFrame with specified index and column names and types and crs.

Parameters:
  • index_name (str) – name of the index of the new empty GeoDataFrame

  • cols_and_types (dict) – dict mapping the names of the index and columns of the GeoDataFrame to the types of the index/column entries.

  • crs_epsg_code (int) – EPSG code of the crs the empty GeoDataFrame should have.

Returns:

the empty vectors_df GeoDataFrame.

Return type:

GeoDataFrame

geographer.utils.connector_utils.empty_gdf_same_format_as(df)[source]

Create an empty df of the same format as df.

Create an empty df of the same format (index name, columns, column types) as the df argument.

Return type:

GeoDataFrame

Parameters:

df (GeoDataFrame) – input GeoDataFrame.

Returns:

empty GeoDataFrame of same format as input.

geographer.utils.connector_utils.empty_graph()[source]

Return an empty bipartite graph to be used by Connector.

Return type:

BipartiteGraph

Returns:

empty graph

geographer.utils.connector_utils.empty_rasters_same_format_as(rasters)[source]

Create an empty rasters GeoDataFrame of the same format.

Create an empty rasters GeoDataFrame of the same format (index name, columns, column types) as the rasters argument.

Return type:

GeoDataFrame

Parameters:

rasters (GeoDataFrame) – Example rasters dataframe

Returns:

New empty rasters dataframe

geographer.utils.connector_utils.empty_vectors_same_format_as(vectors)[source]

Create an empty vectors GeoDataFrame of the same format.

Create an empty vectors GeoDataFrame of the same format (index name, columns, column types) as the vectors argument.

Return type:

GeoDataFrame

Parameters:

vectors (GeoDataFrame) – Example polygon dataframe

Returns:

New empty dataframe

geographer.utils.merge_datasets module

Utility functions for merging datasets.

geographer.utils.merge_datasets.merge_datasets(source_data_dir, target_data_dir, delete_source=True)[source]

Merge datasets.

Return type:

None

Parameters:
  • source_data_dir (Path | str) – data dir of source dataset

  • target_data_dir (Path | str) – data dir of target dataset

  • delete_source (bool) – Whether to delete source dataset after merging. Defaults to True.

geographer.utils.merge_datasets.merge_dirs(root_src_dir, root_dst_dir)[source]

Recursively merge two folders including subfolders.

(Shamelessly copied from Stack Overflow)

Return type:

None

Parameters:
  • root_src_dir (Path | str) – root source directory

  • root_dst_dir (Path | str) – root target directory
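
A recursive folder merge of this kind can be sketched with os.walk and shutil.move. The function below is an illustrative approximation, not the library's code: the name merge_dirs_sketch is hypothetical, and the choice to overwrite files that already exist in the destination is an assumption the documentation above does not confirm.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Union


def merge_dirs_sketch(root_src_dir: Union[Path, str], root_dst_dir: Union[Path, str]) -> None:
    """Move every file under root_src_dir to the matching place under root_dst_dir."""
    root_src_dir, root_dst_dir = Path(root_src_dir), Path(root_dst_dir)
    for src_dir, _dirs, files in os.walk(root_src_dir):
        # Mirror the relative folder structure in the destination.
        dst_dir = root_dst_dir / Path(src_dir).relative_to(root_src_dir)
        dst_dir.mkdir(parents=True, exist_ok=True)
        for file_name in files:
            dst_file = dst_dir / file_name
            # Assumption: a file already present in the destination is replaced.
            if dst_file.exists():
                dst_file.unlink()
            shutil.move(str(Path(src_dir) / file_name), str(dst_file))


# Usage on throwaway directories with a hypothetical "rasters" subfolder.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp) / "src", Path(tmp) / "dst"
    (src / "rasters").mkdir(parents=True)
    (dst / "rasters").mkdir(parents=True)
    (src / "rasters" / "a.tif").write_text("a")
    (dst / "rasters" / "b.tif").write_text("b")
    merge_dirs_sketch(src, dst)
    merged_names = sorted(p.name for p in (dst / "rasters").iterdir())
    source_file_left = (src / "rasters" / "a.tif").exists()
```

In this sketch the (now empty) source folders are left in place; removing them would be a separate step.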

geographer.utils.rasters_from_tif_dir module

Create associator rasters from a directory containing GeoTiff rasters.

geographer.utils.rasters_from_tif_dir.default_read_in_raster_for_raster_df_function(raster_path)[source]

Read in crs and bbox defining a GeoTIFF raster.

Return type:

tuple[int, Polygon]

Parameters:

raster_path (Path) – location of the raster

Returns:

crs code of the raster, bounding rectangle of the raster

geographer.utils.rasters_from_tif_dir.rasters_from_rasters_dir(rasters_dir, rasters_crs_epsg_code=None, raster_names=None, rasters_datatype='tif', read_in_raster_for_raster_df_function=<function default_read_in_raster_for_raster_df_function>)[source]

Return rasters from a directory of GeoTiffs.

Build and return an associator rasters GeoDataFrame from a directory of rasters (or from a data directory). Only the index (rasters_index_name, defaults to raster_name), the geometry column (coordinates of the raster bounding rectangle), and the orig_crs_epsg_code column (EPSG code of the crs the raster is in) will be populated; custom columns have to be populated by a custom-written function.

Return type:

GeoDataFrame

Parameters:
  • rasters_dir (Path | str) – path of the directory that the rasters are in (assumes the dir has no rasters subdir), or path to a data_dir with a rasters subdir.

  • rasters_crs_epsg_code (int | None) – epsg code of rasters crs to be returned.

  • raster_names (list[str] | None) – optional list of raster names. Defaults to None, i.e. all rasters in rasters_dir.

  • rasters_datatype (str) – datatype suffix of the rasters

  • read_in_raster_for_raster_df_function (Callable[[Path], tuple[int, Polygon]]) – function that reads in the crs code and the bounding rectangle for the rasters

Returns:

rasters conforming to the associator rasters format with index rasters_index_name and columns geometry and orig_crs_epsg_code

geographer.utils.utils module

Utility functions.

transform_shapely_geometry(geometry, from_epsg, to_epsg): Transforms a shapely geometry from one crs to another.

round_shapely_geometry(geometry, ndigits=1): Rounds the coordinates of a shapely vector geometry. Useful in some cases for testing the coordinate conversion of raster bounding rectangles.

geographer.utils.utils.concat_gdfs(objs, **kwargs)[source]

Return concatenation of a list of GeoDataFrames.

The crs and index name of the returned concatenated GeoDataFrame will be the crs and index name of the first GeoDataFrame in the list.

Return type:

GeoDataFrame

Parameters:
  • objs (list[GeoDataFrame])

  • kwargs (Any)

geographer.utils.utils.create_kml_all_geodataframes(data_dir, out_path)[source]

Create KML file from a dataset’s rasters and vectors.

Can be used to visualize data in Google Earth Pro.

Return type:

None

Parameters:
  • data_dir (Path | str)

  • out_path (Path | str)

geographer.utils.utils.create_logger(app_name, level=20)[source]

Create a logger.

Serves as a unified way to instantiate a new logger. Creates a new logging instance with the name app_name. The logging output is sent to the console via a logging.StreamHandler() instance and is formatted with the logging time, the logger name, the level at which the logger was called, and the logging message. Because the root logger threshold is set to WARNING, a logger instantiated via logging.getLogger(__name__) has a console handler whose threshold is also WARNING. The console handler level therefore has to be set to the desired level as well, which this function does.

Note

Function might be adapted for more specialized usage in the future

Parameters:
  • app_name (str) – Name of the logger. Will appear in the console output

  • level (int) – threshold level for the new logger. Defaults to 20 (logging.INFO).

Returns:

new logging instance

Return type:

logging.Logger
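
The behaviour described above (a named logger, a console handler, and matching thresholds on both) can be approximated with the standard logging module alone. This is a sketch under assumptions rather than the function's actual source: the name create_logger_sketch, the exact format string, and the guard against adding a second handler are all illustrative choices.

```python
import logging


def create_logger_sketch(app_name: str, level: int = logging.INFO) -> logging.Logger:
    """Return a console logger whose logger and handler both use `level`."""
    logger = logging.getLogger(app_name)
    logger.setLevel(level)

    # Assumption: avoid stacking duplicate handlers on repeated calls.
    if not logger.handlers:
        console_handler = logging.StreamHandler()
        # The handler has its own threshold, so it is set explicitly too.
        console_handler.setLevel(level)
        # Time, logger name, level name, and message.
        console_handler.setFormatter(
            logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(console_handler)
    return logger


logger = create_logger_sketch("demo_app", logging.DEBUG)
logger.debug("visible because both thresholds are DEBUG")
```

The default of 20 in the signature corresponds to logging.INFO.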

Examples:

>>> import logging
>>> from geographer.utils.utils import create_logger
>>> logger = create_logger(__name__, logging.DEBUG)

geographer.utils.utils.deepcopy_gdf(gdf)[source]

Return deepcopy of GeoDataFrame.

Return type:

GeoDataFrame

Parameters:

gdf (GeoDataFrame)

geographer.utils.utils.map_dict_values(fun, dict_arg)[source]

Apply function to all values of a dict.

Return type:

dict

Parameters:
  • fun (Callable)

  • dict_arg (dict)
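
Applying a function to every value of a dict is commonly written as a dict comprehension. The sketch below shows that pattern; the name map_dict_values_sketch and the sample data are hypothetical, and returning a new dict (rather than modifying dict_arg in place) is an assumption.

```python
from typing import Callable


def map_dict_values_sketch(fun: Callable, dict_arg: dict) -> dict:
    """Return a new dict with the same keys and fun applied to each value."""
    return {key: fun(value) for key, value in dict_arg.items()}


# Hypothetical example: count the rasters assigned to each split.
rasters_per_split = {"train": ["a.tif", "b.tif", "c.tif"], "val": ["d.tif"]}
sizes = map_dict_values_sketch(len, rasters_per_split)
```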

geographer.utils.utils.removeprefix(input_str, prefix)[source]

Remove prefix from string.

Return type:

str

Parameters:
  • input_str (str)

  • prefix (str)
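
For illustration, prefix removal can be sketched as below. The name removeprefix_sketch is hypothetical, and returning the input unchanged when it does not start with the prefix is an assumption modelled on the built-in str.removeprefix available from Python 3.9.

```python
def removeprefix_sketch(input_str: str, prefix: str) -> str:
    """Return input_str without the leading prefix, if it is present."""
    if prefix and input_str.startswith(prefix):
        return input_str[len(prefix):]
    # Assumption: strings that lack the prefix are returned unchanged.
    return input_str


stripped = removeprefix_sketch("rasters/a.tif", "rasters/")
unchanged = removeprefix_sketch("labels/a.tif", "rasters/")
```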

geographer.utils.utils.round_shapely_geometry(geometry, ndigits=1)[source]

Round the coordinates of a shapely geometry.

Round the coordinates of a shapely geometry (e.g. Polygon or Point). Useful in some cases for testing the coordinate conversion of raster bounding rectangles.

Return type:

Polygon | Point

Parameters:
  • geometry (Point | Polygon | MultiPoint | MultiPolygon | MultiLineString | LinearRing | LineString | GeometryCollection) – shapely geometry to be rounded

  • ndigits – number of significant digits to round to. Defaults to 1.

Returns:

geometry with all coordinates rounded to ndigits number of significant digits.
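
The testing use case mentioned above can be illustrated without shapely: after converting a bounding rectangle to another crs and back, coordinates usually differ by small numerical noise, and rounding makes them comparable. The sketch works on plain coordinate tuples; round_coords is a hypothetical helper, the coordinates are made up, and ndigits is interpreted as in Python's built-in round (digits after the decimal point), which may differ from how the library interprets it.

```python
def round_coords(coords, ndigits=1):
    """Recursively round a nested sequence of coordinates."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(item, ndigits) for item in coords]


# Exterior ring of a bounding rectangle before and after a hypothetical
# round trip through another crs, which introduces small numerical noise.
original = [(8.5, 47.3), (8.6, 47.3), (8.6, 47.4), (8.5, 47.4), (8.5, 47.3)]
round_trip = [
    (8.5000004, 47.2999998),
    (8.6000001, 47.3000003),
    (8.5999997, 47.4000002),
    (8.5000002, 47.3999999),
    (8.4999999, 47.3000001),
]
same_after_rounding = round_coords(original) == round_coords(round_trip)
```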

geographer.utils.utils.transform_shapely_geometry(geometry, from_epsg, to_epsg)[source]

Transform a shapely geometry from one crs to another.

Return type:

Union[Point, Polygon, MultiPoint, MultiPolygon, MultiLineString, LinearRing, LineString, GeometryCollection]

Parameters:
  • geometry (Point | Polygon | MultiPoint | MultiPolygon | MultiLineString | LinearRing | LineString | GeometryCollection) – shapely geometry to be transformed.

  • from_epsg (int) – EPSG code of crs to be transformed from.

  • to_epsg (int) – EPSG code of crs to be transformed to.

Returns:

transformed shapely geometry

Module contents

Utils for handling remote sensing datasets.

  • convert_connector_dataset_tif2npy converts a dataset of GeoTiffs to .npys.

  • rasters_from_tif_dir generates a rasters GeoDataFrame from a directory of GeoTiffs.