geographer.utils package¶
Submodules¶
geographer.utils.cluster_rasters module¶
Cluster rasters.
Given a dataset and an optional list of rasters, partition the rasters into equivalence classes (‘clusters’) that need to be respected when generating the train-validation split.
- geographer.utils.cluster_rasters.get_raster_clusters(connector, clusters_defined_by, raster_names=None, preclustering_method='y then x-axis')[source]¶
Return clusters of rasters.
- Return type:
list[set[str]]
- Parameters:
connector (Connector | Path | str) – a Connector, or the path (Path or str) to a data dir containing the connector
clusters_defined_by (Literal['rasters_that_share_vectors', 'rasters_that_share_vectors_or_overlap']) – relation between rasters defining clusters
raster_names (list[str] | None) – optional list of raster names
preclustering_method (Literal['x then y-axis', 'y then x-axis', 'x-axis', 'y-axis'] | None) – optional preclustering method to speed up clustering
- Returns:
(names of rasters defining) clusters
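The idea of partitioning rasters into equivalence classes can be sketched with a small union-find. This is only an illustration and not geographer's implementation: the helper `cluster_by_shared_vectors` and the `vectors_in_raster` mapping (raster name to the set of vector names it contains) are hypothetical stand-ins for the information a connector would supply under `'rasters_that_share_vectors'`.

```python
from collections import defaultdict


def cluster_by_shared_vectors(vectors_in_raster: dict[str, set[str]]) -> list[set[str]]:
    """Group raster names so that rasters sharing a vector end up in one cluster."""
    parent = {name: name for name in vectors_in_raster}

    def find(name: str) -> str:
        # Follow parent links to the cluster representative, halving the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Link each raster to the first raster seen for every vector it contains.
    first_raster_with: dict[str, str] = {}
    for raster, vectors in vectors_in_raster.items():
        for vector in vectors:
            if vector in first_raster_with:
                parent[find(raster)] = find(first_raster_with[vector])
            else:
                first_raster_with[vector] = raster

    clusters: dict[str, set[str]] = defaultdict(set)
    for raster in vectors_in_raster:
        clusters[find(raster)].add(raster)
    return list(clusters.values())


# Hypothetical toy data: r1, r2, r3 are chained through v1 and v2.
example = {
    "r1": {"v1"},
    "r2": {"v1", "v2"},
    "r3": {"v2"},
    "r4": {"v3"},
    "r5": set(),
}
clusters = cluster_by_shared_vectors(example)
```

In this toy input, r1 and r3 share no vector directly but still belong together because both are related to r2. Assigning whole clusters to either the training or the validation set keeps a shared vector from appearing on both sides of the split.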
geographer.utils.connector_utils module¶
Utilities used in the Connector class.
- geographer.utils.connector_utils.empty_gdf(index_name, cols_and_types, crs_epsg_code=4326)[source]¶
Return an empty GeoDataFrame.
Return an empty GeoDataFrame with specified index and column names and types and crs.
- Parameters:
index_name (str) – name of the index of the new empty GeoDataFrame
cols_and_types (dict) – dict whose keys are the names of the index and columns of the GeoDataFrame and whose values are the types of the index/column entries.
crs_epsg_code (int) – EPSG code of the crs the empty GeoDataFrame should have.
- Returns:
the new empty GeoDataFrame.
- Return type:
GeoDataFrame
- geographer.utils.connector_utils.empty_gdf_same_format_as(df)[source]¶
Create an empty df of the same format as df.
Create an empty df of the same format (index name, columns, column types) as the df argument.
- Return type:
GeoDataFrame
- Parameters:
df (GeoDataFrame) – input GeoDataFrame.
- Returns:
empty GeoDataFrame of same format as input.
- geographer.utils.connector_utils.empty_graph()[source]¶
Return an empty bipartite graph to be used by Connector.
- Return type:
- Returns:
empty graph
- geographer.utils.connector_utils.empty_rasters_same_format_as(rasters)[source]¶
Create an empty rasters GeoDataFrame of the same format.
Create an empty rasters GeoDataFrame of the same format (index name, columns, column types) as the rasters argument.
- Return type:
GeoDataFrame
- Parameters:
rasters (GeoDataFrame) – Example rasters dataframe
- Returns:
New empty rasters dataframe
- geographer.utils.connector_utils.empty_vectors_same_format_as(vectors)[source]¶
Create an empty vectors GeoDataFrame of the same format.
Create an empty vectors GeoDataFrame of the same format (index name, columns, column types) as the vectors argument.
- Return type:
GeoDataFrame
- Parameters:
vectors (GeoDataFrame) – Example polygon dataframe
- Returns:
New empty dataframe
geographer.utils.merge_datasets module¶
Utility functions for merging datasets.
- geographer.utils.merge_datasets.merge_datasets(source_data_dir, target_data_dir, delete_source=True)[source]¶
Merge datasets.
- Return type:
None
- Parameters:
source_data_dir (Path | str) – data dir of source dataset
target_data_dir (Path | str) – data dir of target dataset
delete_source (bool) – Whether to delete source dataset after merging. Defaults to True.
- geographer.utils.merge_datasets.merge_dirs(root_src_dir, root_dst_dir)[source]¶
Recursively merge two folders including subfolders.
(Shamelessly copied from stackoverflow)
- Return type:
None
- Parameters:
root_src_dir (Path | str) – root source directory
root_dst_dir (Path | str) – root target directory
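A recursive folder merge of this kind can be sketched with the standard library. The function `merge_dirs_sketch` below is a hypothetical illustration, not the code geographer uses. It assumes that files are copied (not moved) and that a file already present in the target is overwritten, neither of which is stated in the documentation above.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def merge_dirs_sketch(root_src_dir: Path | str, root_dst_dir: Path | str) -> None:
    """Copy everything under root_src_dir into root_dst_dir, keeping subfolders."""
    src_root, dst_root = Path(root_src_dir), Path(root_dst_dir)
    for src_path in src_root.rglob("*"):
        dst_path = dst_root / src_path.relative_to(src_root)
        if src_path.is_dir():
            dst_path.mkdir(parents=True, exist_ok=True)
        else:
            dst_path.parent.mkdir(parents=True, exist_ok=True)
            # Assumption: an existing file with the same name is overwritten.
            shutil.copy2(src_path, dst_path)


# Demo on throwaway directories with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp, "src"), Path(tmp, "dst")
    (src / "rasters").mkdir(parents=True)
    (dst / "rasters").mkdir(parents=True)
    (src / "rasters" / "a.tif").write_text("a")
    (dst / "rasters" / "b.tif").write_text("b")
    merge_dirs_sketch(src, dst)
    merged = sorted(p.name for p in (dst / "rasters").iterdir())
```

After the call, the target `rasters` subfolder holds the files from both directories.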
geographer.utils.rasters_from_tif_dir module¶
Create associator rasters from a directory containing GeoTiff rasters.
- geographer.utils.rasters_from_tif_dir.default_read_in_raster_for_raster_df_function(raster_path)[source]¶
Read in crs and bbox defining a GeoTIFF raster.
- Return type:
tuple[int, Polygon]
- Parameters:
raster_path (Path) – location of the raster
- Returns:
crs code of the raster, bounding rectangle of the raster
- geographer.utils.rasters_from_tif_dir.rasters_from_rasters_dir(rasters_dir, rasters_crs_epsg_code=None, raster_names=None, rasters_datatype='tif', read_in_raster_for_raster_df_function=<function default_read_in_raster_for_raster_df_function>)[source]¶
Return rasters from a directory of GeoTiffs.
Build and return an associator rasters GeoDataFrame from a directory of rasters (or from a data directory). Only the index (rasters_index_name, defaults to raster_name), the geometry column (coordinates of the raster bounding rectangle), and the orig_crs_epsg_code column (EPSG code of the crs the raster is in) will be populated. Custom columns have to be populated by a custom-written function.
- Return type:
GeoDataFrame
- Parameters:
rasters_dir (Path | str) – path of the directory that the rasters are in (assumes the dir has no rasters subdir), or path to a data_dir with a rasters subdir.
rasters_crs_epsg_code (int | None) – epsg code of rasters crs to be returned.
raster_names (list[str] | None) – optional list of raster names. Defaults to None, i.e. all rasters in rasters_dir.
rasters_datatype (str) – datatype suffix of the rasters
read_in_raster_for_raster_df_function (Callable[[Path], tuple[int, Polygon]]) – function that reads in the crs code and the bounding rectangle for the rasters
- Returns:
rasters conforming to the associator rasters format with index rasters_index_name and columns geometry and orig_crs_epsg_code
geographer.utils.utils module¶
Utility functions.
transform_shapely_geometry(geometry, from_epsg, to_epsg): Transforms a shapely geometry from one crs to another.
round_shapely_geometry(geometry, ndigits=1): Rounds the coordinates of a shapely vector geometry. Useful in some cases for testing the coordinate conversion of raster bounding rectangles.
- geographer.utils.utils.concat_gdfs(objs, **kwargs)[source]¶
Return the concatenation of a list of GeoDataFrames.
The crs and index name of the returned concatenated GeoDataFrame will be the crs and index name of the first GeoDataFrame in the list.
- Return type:
GeoDataFrame
- Parameters:
objs (list[GeoDataFrame])
kwargs (Any)
- geographer.utils.utils.create_kml_all_geodataframes(data_dir, out_path)[source]¶
Create KML file from a dataset’s rasters and vectors.
Can be used to visualize data in Google Earth Pro.
- Return type:
None
- Parameters:
data_dir (Path | str)
out_path (Path | str)
- geographer.utils.utils.create_logger(app_name, level=20)[source]¶
Create a logger.
Serves as a unified way to instantiate a new logger. Creates a new logging instance with the name app_name. The logging output is sent to the console via a logging.StreamHandler() instance and is formatted using the logging time, the logger name, the level at which the logger was called, and the logging message. Because the root logger threshold is set to WARNING, instantiating a logger via logging.getLogger(__name__) yields a logger instance whose console handler also has its threshold set to WARNING. The console handler level therefore has to be set to the desired level as well, which this function does.
Note
Function might be adapted for more specialized usage in the future.
- Parameters:
app_name (str) – Name of the logger. Will appear in the console output
level (int) – threshold level for the new logger.
- Returns:
new logging instance
- Return type:
logging.Logger
Examples:
>>> import logging
>>> from geographer.utils.utils import create_logger
>>> logger = create_logger(__name__, logging.DEBUG)
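The behaviour described for create_logger can be approximated with the standard logging module alone. The sketch below is an assumption-laden illustration, not geographer's source: the name `create_logger_sketch`, the exact format string, and the guard against adding a second handler are all hypothetical choices.

```python
import logging


def create_logger_sketch(app_name: str, level: int = logging.INFO) -> logging.Logger:
    """Return a console logger whose logger and handler thresholds are both `level`."""
    logger = logging.getLogger(app_name)
    logger.setLevel(level)
    if not logger.handlers:  # assumption: avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        # The handler has its own threshold, so it is set explicitly as well.
        handler.setLevel(level)
        # Assumed format: time, logger name, level name, message.
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = create_logger_sketch("demo_app", logging.DEBUG)
logger.debug("shown on the console because both thresholds are DEBUG")
```

The default `level=20` in the documented signature corresponds to `logging.INFO`.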
- geographer.utils.utils.deepcopy_gdf(gdf)[source]¶
Return deepcopy of GeoDataFrame.
- Return type:
GeoDataFrame
- Parameters:
gdf (GeoDataFrame)
- geographer.utils.utils.map_dict_values(fun, dict_arg)[source]¶
Apply function to all values of a dict.
- Return type:
dict
- Parameters:
fun (Callable)
dict_arg (dict)
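Applying a function to every value of a dict amounts to a dict comprehension. The following is a minimal sketch under that assumption. `map_dict_values_sketch` and the sample data are hypothetical, and the library's actual implementation may differ.

```python
from typing import Any, Callable


def map_dict_values_sketch(fun: Callable[[Any], Any], dict_arg: dict) -> dict:
    """Return a new dict with the same keys and fun applied to each value."""
    return {key: fun(value) for key, value in dict_arg.items()}


# Hypothetical example: convert raster areas from square metres to square kilometres.
areas_m2 = {"raster_a": 1_500_000.0, "raster_b": 250_000.0}
areas_km2 = map_dict_values_sketch(lambda area: area / 1_000_000, areas_m2)
```

The input dict is left unchanged in this sketch, and the keys keep their original order.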
- geographer.utils.utils.removeprefix(input_str, prefix)[source]¶
Remove prefix from string.
- Return type:
str
- Parameters:
input_str (str)
prefix (str)
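Prefix removal can be sketched as follows. `removeprefix_sketch` is a hypothetical stand-in, and returning the string unchanged when the prefix is absent is an assumption modelled on the built-in `str.removeprefix` (available since Python 3.9), not something the documentation above states.

```python
def removeprefix_sketch(input_str: str, prefix: str) -> str:
    """Strip prefix from the start of input_str if it is present."""
    if prefix and input_str.startswith(prefix):
        return input_str[len(prefix):]
    # Assumption: strings without the prefix are returned unchanged.
    return input_str


# Hypothetical file name used only for illustration.
stripped = removeprefix_sketch("raster_0001.tif", "raster_")
unchanged = removeprefix_sketch("0001.tif", "raster_")
```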
- geographer.utils.utils.round_shapely_geometry(geometry, ndigits=1)[source]¶
Round the coordinates of a shapely geometry.
Round the coordinates of a shapely geometry (e.g. Polygon or Point). Useful in some cases for testing the coordinate conversion of raster bounding rectangles.
- Return type:
Polygon | Point
- Parameters:
geometry (Point | Polygon | MultiPoint | MultiPolygon | MultiLineString | LinearRing | LineString | GeometryCollection) – shapely geometry to be rounded
ndigits – number of significant digits to round to. Defaults to 1.
- Returns:
geometry with all coordinates rounded to ndigits number of significant digits.
- geographer.utils.utils.transform_shapely_geometry(geometry, from_epsg, to_epsg)[source]¶
Transform a shapely geometry from one crs to another.
- Return type:
Union[Point, Polygon, MultiPoint, MultiPolygon, MultiLineString, LinearRing, LineString, GeometryCollection]
- Parameters:
geometry (Point | Polygon | MultiPoint | MultiPolygon | MultiLineString | LinearRing | LineString | GeometryCollection) – shapely geometry to be transformed.
from_epsg (int) – EPSG code of crs to be transformed from.
to_epsg (int) – EPSG code of crs to be transformed to.
- Returns:
transformed shapely geometry
Module contents¶
Utils for handling remote sensing datasets.
- convert_connector_dataset_tif2npy converts a dataset of GeoTiffs to .npys.
- rasters_from_tif_dir generates a rasters GeoDataFrame from a directory of GeoTiffs.