Cutting datasets: basics¶
The DSCutter classes are used for cutting a dataset.
GeoGrapher has two general customizable DSCutter classes in
geographer.cutters: DSCutterIterOverRasters and DSCutterIterOverVectors.
There are two helper functions that return DSCutters customized for the
following two common use cases:
get_cutter_every_raster_to_grid, described in Cutting every raster to a grid of rasters, and
get_cutter_rasters_around_every_vector, described in Cutting rasters around vectors.
Cutting every raster to a grid of rasters¶
To create a new dataset in target_data_dir from a source dataset in
source_data_dir by cutting every raster in the dataset to a grid of
rasters, use the geographer.cutters.get_cutter_every_raster_to_grid()
function:
from geographer.cutters import get_cutter_every_raster_to_grid

cutter = get_cutter_every_raster_to_grid(
    new_raster_size=512,
    source_data_dir=<SOURCE_DATA_DIR>,
    target_data_dir=<TARGET_DATA_DIR>,
    name=<OPTIONAL_NAME_FOR_SAVING>)
cutter.cut()
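To make the grid idea concrete, here is a minimal sketch of how a raster of a given size could be divided into square windows. It is independent of GeoGrapher: the function name grid_windows is hypothetical, and dropping partial tiles at the right and bottom edges is an assumption of this sketch, not a statement about how the library treats edges.

```python
from itertools import product


def grid_windows(raster_height, raster_width, tile_size):
    """Return (row_off, col_off, height, width) tuples for full tiles.

    Hypothetical helper for illustration only. Partial tiles at the
    right and bottom edges are dropped (an assumption of this sketch).
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    n_rows = raster_height // tile_size  # number of full tiles vertically
    n_cols = raster_width // tile_size   # number of full tiles horizontally
    return [
        (i * tile_size, j * tile_size, tile_size, tile_size)
        for i, j in product(range(n_rows), range(n_cols))
    ]


# A 10980 x 10980 tile cut into 512 x 512 windows
windows = grid_windows(10980, 10980, 512)
print(len(windows))  # number of full 512 x 512 windows
```

Under these assumptions a 10980 × 10980 raster yields 21 × 21 full windows, with a strip of 228 pixels left over along two edges.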
The geographer.cutters.get_cutter_every_raster_to_grid() function returns
a geographer.cutters.DSCutterIterOverRasters instance. The cut()
method will save the cutter to a JSON file in connector.connector_dir. To
update the target dataset after the source dataset has grown, first read the
JSON file and then run update():
from geographer.cutters import DSCutterIterOverRasters
dataset_cutter = DSCutterIterOverRasters.from_json_file(<path/to/saved.json>)
dataset_cutter.update()
Warning
The update method assumes that no vectors or rasters that remain in the
target dataset have been removed from the source dataset.
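One way to picture the assumption behind an incremental update is as a set difference between the rasters in the source dataset and the rasters that have already been cut. The sketch below is not GeoGrapher's implementation; the function rasters_still_to_cut and the file names are hypothetical and only illustrate why removing data from the source breaks the assumption.

```python
def rasters_still_to_cut(source_rasters, already_cut_rasters):
    """Return source rasters that have not been cut yet, sorted by name.

    Hypothetical helper for illustration only. Raises ValueError if a
    raster that was already cut is no longer present in the source,
    which is the situation the warning above rules out.
    """
    source = set(source_rasters)
    already_cut = set(already_cut_rasters)
    removed = already_cut - source
    if removed:
        raise ValueError(f"removed from source dataset: {sorted(removed)}")
    return sorted(source - already_cut)


# The source dataset has grown by two rasters since the last cut
todo = rasters_still_to_cut(["a.tif", "b.tif", "c.tif"], ["a.tif"])
print(todo)
```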
Cutting rasters around vectors¶
To cut rasters around vector features (e.g. to create 512 × 512 pixel cutouts around vector features from 10980 × 10980 Sentinel-2 tiles), use the geographer.cutters.get_cutter_rasters_around_every_vector() function:
from geographer.cutters import get_cutter_rasters_around_every_vector

cutter = get_cutter_rasters_around_every_vector(
    source_data_dir=<SOURCE_DATA_DIR>,
    target_data_dir=<TARGET_DATA_DIR>,
    name=<OPTIONAL_NAME_FOR_SAVING>,
    new_raster_size=512,
    target_raster_count=2,
    mode="random")
cutter.cut()
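As a rough illustration of what a cutout around a feature involves, the following sketch computes a square pixel window centered on a feature's pixel position and shifts it so that it stays inside the raster. The function centered_window is hypothetical and does not reproduce GeoGrapher's behavior; in particular, it shows a centered placement rather than the "random" mode used in the call above.

```python
def centered_window(center_row, center_col, tile_size, raster_height, raster_width):
    """Return (row_off, col_off, height, width) for a square window.

    Hypothetical helper for illustration only. The window is centered
    on (center_row, center_col) where possible and shifted inward when
    it would otherwise extend past the raster bounds.
    """
    if tile_size > raster_height or tile_size > raster_width:
        raise ValueError("tile_size must not exceed the raster dimensions")
    half = tile_size // 2
    # Clamp the upper-left corner so the whole window lies inside the raster
    row_off = min(max(center_row - half, 0), raster_height - tile_size)
    col_off = min(max(center_col - half, 0), raster_width - tile_size)
    return (row_off, col_off, tile_size, tile_size)


# Feature in the middle of a 10980 x 10980 tile, and one near a corner
middle = centered_window(5000, 5000, 512, 10980, 10980)
corner = centered_window(10, 10970, 512, 10980, 10980)
print(middle, corner)
```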
The geographer.cutters.get_cutter_rasters_around_every_vector() function
returns a geographer.cutters.DSCutterIterOverVectors instance. The
cut() method will save the cutter to a JSON file in
connector.connector_dir. To update the target dataset after the source
dataset has grown, first read the JSON file and then run update():
from geographer.cutters import DSCutterIterOverVectors
dataset_cutter = DSCutterIterOverVectors.from_json_file(<path/to/saved.json>)
dataset_cutter.update()
Warning
The update method assumes that no vectors or rasters that remain in
the target dataset have been removed from the source dataset.