Remove GDAL and RichDEM dependency from tests #675

Open
wants to merge 6 commits into
base: main
1 change: 1 addition & 0 deletions .gitignore
@@ -146,6 +146,7 @@ xdem/_version.py

# Example data downloaded/produced during tests
examples/data/
tests/test_data/

doc/source/basic_examples/
doc/source/advanced_examples/
32 changes: 9 additions & 23 deletions Makefile
@@ -11,17 +11,18 @@ ifndef VENV
VENV = "venv"
endif

# Python version requirement
PYTHON_VERSION_REQUIRED = 3.10

# Python global variables definition
PYTHON_VERSION_MIN = 3.10
# Set PYTHON if not defined in command line
# Example: PYTHON="python3.10" make venv to use python 3.10 for the venv
# By default the default python3 of the system.
ifndef PYTHON
# Try to find python version required
PYTHON = "python$(PYTHON_VERSION_REQUIRED)"
PYTHON = "python3"
endif
PYTHON_CMD=$(shell command -v $(PYTHON))

PYTHON_VERSION_CUR=$(shell $(PYTHON_CMD) -c 'import sys; print("%d.%d" % sys.version_info[0:2])')
PYTHON_VERSION_OK=$(shell $(PYTHON_CMD) -c 'import sys; req_ver = tuple(map(int, "$(PYTHON_VERSION_REQUIRED)".split("."))); cur_ver = sys.version_info[0:2]; print(int(cur_ver == req_ver))')
PYTHON_VERSION_CUR=$(shell $(PYTHON_CMD) -c 'import sys; print("%d.%d"% sys.version_info[0:2])')
PYTHON_VERSION_OK=$(shell $(PYTHON_CMD) -c 'import sys; cur_ver = sys.version_info[0:2]; min_ver = tuple(map(int, "$(PYTHON_VERSION_MIN)".split("."))); print(int(cur_ver >= min_ver))')

############### Check python version supported ############

@@ -30,7 +31,7 @@ ifeq (, $(PYTHON_CMD))
endif

ifeq ($(PYTHON_VERSION_OK), 0)
$(error "Requires Python version == $(PYTHON_VERSION_REQUIRED). Current version is $(PYTHON_VERSION_CUR)")
$(error "Requires Python version >= $(PYTHON_VERSION_MIN). Current version is $(PYTHON_VERSION_CUR)")
endif

################ MAKE Targets ######################
@@ -45,19 +46,6 @@ venv: ## Create a virtual environment in 'venv' directory if it doesn't exist
@touch ${VENV}/bin/activate
@${VENV}/bin/python -m pip install --upgrade wheel setuptools pip

.PHONY: install-gdal
install-gdal: ## Install GDAL version matching the system's GDAL via pip
@if command -v gdalinfo >/dev/null 2>&1; then \
GDAL_VERSION=$$(gdalinfo --version | awk '{print $$2}'); \
echo "System GDAL version: $$GDAL_VERSION"; \
${VENV}/bin/pip install gdal==$$GDAL_VERSION; \
else \
echo "Warning: GDAL not found on the system. Proceeding without GDAL."; \
echo "Try installing GDAL by running the following commands depending on your system:"; \
echo "Debian/Ubuntu: sudo apt-get install -y gdal-bin libgdal-dev"; \
echo "Red Hat/CentOS: sudo yum install -y gdal gdal-devel"; \
echo "Then run 'make install-gdal' to proceed with GDAL installation."; \
fi

.PHONY: install
install: venv ## Install xDEM for development (depends on venv)
@@ -66,8 +54,6 @@ install: venv ## Install xDEM for development (depends on venv)
@test -f .git/hooks/pre-commit || echo "Installing pre-commit hooks"
@test -f .git/hooks/pre-commit || ${VENV}/bin/pre-commit install -t pre-commit
@test -f .git/hooks/pre-push || ${VENV}/bin/pre-commit install -t pre-push
@echo "Attempting to install GDAL..."
@make install-gdal
@echo "xdem installed in development mode in virtualenv ${VENV}"
@echo "To use: source ${VENV}/bin/activate; xdem -h"
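
The one-line check embedded in `PYTHON_VERSION_OK` is dense, so here is the same "at least the minimum version" comparison written out as a small Python function. This is only a readable sketch of the logic in the Makefile above; the function name `version_ok` is an assumption for illustration and does not exist in the repository.

```python
import sys


def version_ok(min_version: str, current: tuple = sys.version_info[0:2]) -> bool:
    """Return True if `current` (major, minor) is at least `min_version` (e.g. "3.10")."""
    # Compare as integer tuples: as strings, "3.9" would wrongly sort after "3.10".
    min_ver = tuple(map(int, min_version.split(".")))
    return tuple(current) >= min_ver


# Check the interpreter running this script against the Makefile's minimum.
print(version_ok("3.10"))
```

Comparing integer tuples rather than strings is the part that matters here, since a string comparison would accept Python 3.9 as newer than 3.10.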

4 changes: 1 addition & 3 deletions dev-environment.yml
@@ -9,7 +9,7 @@ dependencies:
- matplotlib=3.*
- pyproj>=3.4,<4
- rasterio>=1.3,<2
- scipy=1.*
- scipy<1.15.0
Review comment (Member):

Why <1.15.0? It is not great to keep that pin. If it is really needed, please open an issue so it can be corrected in another PR. An upper bound like this can become difficult to reconcile with other packages over time. I would prefer scipy >=, !=1.15.0, which lets the next scipy release be picked up automatically so we can test whether the bugs are resolved.
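
To make the difference between the two pins concrete, here is a minimal sketch in plain Python. It compares an upper bound (`<1.15.0`) with an exclusion of the single problematic release (`!=1.15.0`). The helper names, the lower bound `1.0.0`, and the later version `1.15.1` used in the loop are assumptions for illustration only, and the tuple comparison ignores pre-release tags that real dependency resolvers handle.

```python
from __future__ import annotations


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.15.0' into (1, 15, 0). Pre-release suffixes are not supported in this sketch."""
    return tuple(int(part) for part in version.split("."))


def allowed_by_upper_bound(version: str, bound: str = "1.15.0") -> bool:
    """Mimic a `scipy<1.15.0` pin: everything from the bound onwards is rejected."""
    return parse_version(version) < parse_version(bound)


def allowed_by_exclusion(version: str, minimum: str = "1.0.0", excluded: str = "1.15.0") -> bool:
    """Mimic a `>=minimum,!=excluded` pin: only the one excluded release is rejected."""
    parsed = parse_version(version)
    return parsed >= parse_version(minimum) and parsed != parse_version(excluded)


# Columns: version, accepted by the upper bound, accepted by the exclusion.
for candidate in ["1.14.1", "1.15.0", "1.15.1"]:
    print(candidate, allowed_by_upper_bound(candidate), allowed_by_exclusion(candidate))
```

Under these assumptions, a release after 1.15.0 stays blocked by the upper bound until someone edits the pin, while the exclusion lets it through for testing.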

- tqdm
- scikit-image=0.*
- scikit-gstat>=1.0.18,<1.1
@@ -28,13 +28,11 @@ dependencies:
- scikit-learn

# Test dependencies
- gdal # To test against GDAL
- pytest
- pytest-xdist
- pyyaml
- flake8
- pylint
- richdem # To test against richdem

# Doc dependencies
- sphinx
2 changes: 1 addition & 1 deletion environment.yml
@@ -9,7 +9,7 @@ dependencies:
- matplotlib=3.*
- pyproj>=3.4,<4
- rasterio>=1.3,<2
- scipy=1.*
- scipy<1.15.0
Review comment (Member):

Same review comment as for the scipy pin in dev-environment.yml.

- tqdm
- scikit-image=0.*
- scikit-gstat>=1.0.18,<1.1
2 changes: 1 addition & 1 deletion requirements.txt
@@ -7,7 +7,7 @@ numpy==1.*
matplotlib==3.*
pyproj>=3.4,<4
rasterio>=1.3,<2
scipy==1.*
scipy<1.15.0
Review comment (Member):

Same review comment as for the scipy pin in dev-environment.yml.

tqdm
scikit-image==0.*
scikit-gstat>=1.0.18,<1.1
1 change: 0 additions & 1 deletion setup.cfg
@@ -61,7 +61,6 @@ test =
flake8
pylint
scikit-learn
richdem
doc =
sphinx
sphinx-book-theme
194 changes: 64 additions & 130 deletions tests/conftest.py
@@ -1,142 +1,76 @@
from typing import Callable, List, Union
import os
import tarfile
import tempfile
import urllib
from distutils.dir_util import copy_tree
from typing import Callable

import geoutils as gu
import numpy as np
import pytest
import richdem as rd
from geoutils.raster import RasterType

from xdem._typing import NDArrayf
_TESTDATA_DIRECTORY = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "tests", "test_data"))

# Define a URL to the xdem-data repository's test data
_TESTDATA_REPO_URL = "https://github.com/vschaffn/xdem-data/tarball/2-richdem_gdal"
_COMMIT_HASH = "31a7159c982cec4b352f0de82bd4e0be61db3afe"
Review comment (Member):

I quickly checked whether downloading is the best solution, and it seems fine and easy to maintain. It might be better, though, to download a tarball from a release. Is that what you had in mind? We will also need to update this PR once the xdem-data PR is merged and a release is created, so that we get the correct tarball/zip at a fixed version. It doesn't seem like you're handling a version here. Maybe it's better to use a release, something like:

https://github.com/glaciohack/xdem-data/releases/download/v1.0.0/xdem-data.zip

Reply (Contributor Author):

@duboise-cnes I used the same download system as the one that imports the example data (Longyearbyen DEMs), for consistency and ease of maintenance. The example data are downloaded from the main xdem-data repository, which is why the aim is to merge the changes made to xdem-data, so that the test data can also be downloaded from the main repository (i.e. eventually replacing the URL with https://github.com/GlacioHack/xdem-data/tarball/main).
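
On the versioning question, one detail worth illustrating: the part of a URL after `#` is a fragment, which clients strip before sending the request, so a `#commit=...` suffix is unlikely to pin anything. The sketch below shows this with the standard library and builds a link where the commit hash replaces the branch name. `tarball_url` is a hypothetical helper, and the sketch assumes GitHub's tarball links accept a commit hash in place of a branch name.

```python
from urllib.parse import urldefrag


def tarball_url(owner: str, repo: str, ref: str) -> str:
    """Build a GitHub tarball link for a branch, tag, or commit hash (`ref`)."""
    return f"https://github.com/{owner}/{repo}/tarball/{ref}"


COMMIT = "31a7159c982cec4b352f0de82bd4e0be61db3afe"

# URL as built in this PR: the fragment is split off and never reaches the server.
current = f"https://github.com/vschaffn/xdem-data/tarball/2-richdem_gdal#commit={COMMIT}"
requested, fragment = urldefrag(current)
print(requested)

# Pinned alternative (assumption: a commit hash is accepted where the branch name was).
print(tarball_url("vschaffn", "xdem-data", COMMIT))
```

A tagged release, as suggested in the review, would give the same reproducibility with a more readable identifier.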

Reply (Member):

OK, I understand the rationale now. You know the xdem code better than I do; I didn't know the example download code existed. I would say that requests is better than urllib for maintenance, but switching would mean changing the example download code as well.

Also, couldn't the download function be factorized with the one for the example data? There is some code duplication here, which is not needed in my opinion.
Maybe add a global xdem download function to avoid the duplication (and so ease maintenance and later evolution of the download logic). @adebardo @adehecq, your opinion?
My 2 cents.
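
One way to avoid the duplication raised above is a single helper that both the example-data and the test-data code could call. The sketch below is only an illustration: `extract_subdir` and `download_and_extract_subdir` are hypothetical names, not existing xdem functions, and it stays on the standard library rather than switching to requests. It assumes the archive has a single top-level directory, as the code in this PR does, and it uses `shutil.copytree` instead of `distutils.dir_util.copy_tree`, since distutils was removed in Python 3.12.

```python
import os
import shutil
import tarfile
import tempfile
import urllib.request


def extract_subdir(tar_path: str, subdir: str, target_dir: str) -> None:
    """Extract `tar_path` and copy `<top-level dir>/<subdir>` into `target_dir`."""
    with tempfile.TemporaryDirectory() as tmp:
        with tarfile.open(tar_path) as tar:
            # Only use this on archives from a trusted source.
            tar.extractall(tmp)
        # Assumption: the archive contains exactly one top-level directory.
        top_level = [d for d in os.listdir(tmp) if os.path.isdir(os.path.join(tmp, d))][0]
        shutil.copytree(os.path.join(tmp, top_level, subdir), target_dir, dirs_exist_ok=True)


def download_and_extract_subdir(url: str, subdir: str, target_dir: str) -> None:
    """Download a tarball from `url`, then copy its `subdir` into `target_dir`."""
    with tempfile.TemporaryDirectory() as tmp:
        tar_path = os.path.join(tmp, "data.tar.gz")
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses,
        # so no manual status-code check is needed here.
        with urllib.request.urlopen(url) as response, open(tar_path, "wb") as outfile:
            shutil.copyfileobj(response, outfile)
        extract_subdir(tar_path, subdir, target_dir)
```

Keeping the extraction step separate from the download step also makes it possible to test the copy logic against a locally built tarball, without network access.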


@pytest.fixture(scope="session") # type: ignore
def raster_to_rda() -> Callable[[RasterType], rd.rdarray]:
def _raster_to_rda(rst: RasterType) -> rd.rdarray:
"""
Convert geoutils.Raster to richDEM rdarray.
"""
arr = rst.data.filled(rst.nodata).squeeze()
rda = rd.rdarray(arr, no_data=rst.nodata)
rda.geotransform = rst.transform.to_gdal()
return rda

return _raster_to_rda
def download_test_data(overwrite: bool = False) -> None:
"""
Review comment (Member):

I think you could use requests and zipfile (or the tarfile equivalent already used here) if there is a release?

For example, from a well-known AI ;) not used directly, just for inspiration:


import io
import zipfile
from pathlib import Path

import requests

def download_reference_data(data_dir="tests/data", url="https://github.com/nom_utilisateur/xdem-data/releases/download/v1.0.0/xdem-data.zip"):
    """Download and extract the reference data if it is not already present."""
    data_path = Path(data_dir)

    # Check whether the data already exists
    if data_path.exists() and any(data_path.iterdir()):
        print("Reference data is already available.")
        return

    # Create the directory if needed
    data_path.mkdir(parents=True, exist_ok=True)

    # Download the ZIP archive
    print(f"Downloading reference data from {url}...")
    response = requests.get(url)
    response.raise_for_status()

    # Extract the ZIP archive from memory (this is why io is imported)
    with zipfile.ZipFile(io.BytesIO(response.content)) as zip_ref:
        zip_ref.extractall(data_dir)

    print(f"Reference data extracted to {data_dir}.")

Does it help?

Download the entire test_data directory from the xdem-data repository.

:param overwrite: If True, re-downloads the data even if it already exists.
"""
if not overwrite and os.path.exists(_TESTDATA_DIRECTORY) and os.listdir(_TESTDATA_DIRECTORY):
return # Test data already exists

@pytest.fixture(scope="session") # type: ignore
def get_terrainattr_richdem(raster_to_rda: Callable[[RasterType], rd.rdarray]) -> Callable[[RasterType, str], NDArrayf]:
def _get_terrainattr_richdem(rst: RasterType, attribute: str = "slope_radians") -> NDArrayf:
"""
Derive terrain attribute for DEM opened with geoutils.Raster using RichDEM.
"""
rda = raster_to_rda(rst)
terrattr = rd.TerrainAttribute(rda, attrib=attribute)
terrattr[terrattr == terrattr.no_data] = np.nan
return np.array(terrattr)
# Clear the directory if overwrite is True
if overwrite and os.path.exists(_TESTDATA_DIRECTORY):
for root, dirs, files in os.walk(_TESTDATA_DIRECTORY, topdown=False):
for name in files:
os.remove(os.path.join(root, name))
for name in dirs:
os.rmdir(os.path.join(root, name))

# Create a temporary directory to download the tarball
temp_dir = tempfile.TemporaryDirectory()
tar_path = os.path.join(temp_dir.name, "test_data.tar.gz")

# Construct the URL with the commit hash
url = f"{_TESTDATA_REPO_URL}#commit={_COMMIT_HASH}"

response = urllib.request.urlopen(url)
if response.getcode() == 200:
with open(tar_path, "wb") as outfile:
Review comment (Member):

Would this not be needed with requests?

outfile.write(response.read())
else:
raise ValueError(f"Failed to download test data: {response.status_code}")

return _get_terrainattr_richdem
# Extract the tarball
with tarfile.open(tar_path) as tar:
tar.extractall(temp_dir.name)
Review comment (Member):

Check the archive format once there is a versioned release in xdem-data. If it is a tarball, this is fine; otherwise use the zip equivalent.
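
If the release format is not yet settled, the standard library's `shutil.unpack_archive` can extract either a tarball or a zip, picking the format from the file extension. A minimal sketch, assuming the downloaded file is saved under a name carrying its real extension (`.tar.gz` or `.zip`); `unpack_release` is a hypothetical helper name:

```python
import shutil


def unpack_release(archive_path: str, extract_dir: str) -> None:
    """Extract a .zip, .tar, or .tar.gz archive into `extract_dir`.

    The format is inferred from the file extension, so the archive must be
    saved under a name that reflects its real format.
    """
    shutil.unpack_archive(archive_path, extract_dir)
```

With this, the extraction step in `download_test_data` would not need to change if the release ends up being a zip rather than a tarball.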


# Copy the test_data directory to the target directory
extracted_dir = os.path.join(
temp_dir.name,
[dirname for dirname in os.listdir(temp_dir.name) if os.path.isdir(os.path.join(temp_dir.name, dirname))][0],
"test_data",
)

copy_tree(extracted_dir, _TESTDATA_DIRECTORY)


@pytest.fixture(scope="session") # type: ignore
def get_terrain_attribute_richdem(
get_terrainattr_richdem: Callable[[RasterType, str], NDArrayf]
) -> Callable[[RasterType, Union[str, list[str]], bool, float, float, float], Union[RasterType, list[RasterType]]]:
def _get_terrain_attribute_richdem(
dem: RasterType,
attribute: Union[str, List[str]],
degrees: bool = True,
hillshade_altitude: float = 45.0,
hillshade_azimuth: float = 315.0,
hillshade_z_factor: float = 1.0,
) -> Union[RasterType, List[RasterType]]:
"""
Derive one or multiple terrain attributes from a DEM using RichDEM.
"""
if isinstance(attribute, str):
attribute = [attribute]

if not isinstance(dem, gu.Raster):
raise ValueError("DEM must be a geoutils.Raster object.")

terrain_attributes = {}

# Check which products should be made to optimize the processing
make_aspect = any(attr in attribute for attr in ["aspect", "hillshade"])
make_slope = any(
attr in attribute
for attr in [
"slope",
"hillshade",
"planform_curvature",
"aspect",
"profile_curvature",
"maximum_curvature",
]
)
make_hillshade = "hillshade" in attribute
make_curvature = "curvature" in attribute
make_planform_curvature = "planform_curvature" in attribute or "maximum_curvature" in attribute
make_profile_curvature = "profile_curvature" in attribute or "maximum_curvature" in attribute

if make_slope:
terrain_attributes["slope"] = get_terrainattr_richdem(dem, "slope_radians")

if make_aspect:
# The aspect of RichDEM is returned in degrees, we convert to radians to match the others
terrain_attributes["aspect"] = np.deg2rad(get_terrainattr_richdem(dem, "aspect"))
# For flat slopes, RichDEM returns a 90° aspect by default, while GDAL return a 180° aspect
# We stay consistent with GDAL
slope_tmp = get_terrainattr_richdem(dem, "slope_radians")
terrain_attributes["aspect"][slope_tmp == 0] = np.pi

if make_hillshade:
# If a different z-factor was given, slopemap with exaggerated gradients.
if hillshade_z_factor != 1.0:
slopemap = np.arctan(np.tan(terrain_attributes["slope"]) * hillshade_z_factor)
else:
slopemap = terrain_attributes["slope"]

azimuth_rad = np.deg2rad(360 - hillshade_azimuth)
altitude_rad = np.deg2rad(hillshade_altitude)

# The operation below yielded the closest hillshade to GDAL (multiplying by 255 did not work)
# As 0 is generally no data for this uint8, we add 1 and then 0.5 for the rounding to occur between
# 1 and 255
terrain_attributes["hillshade"] = np.clip(
1.5
+ 254
* (
np.sin(altitude_rad) * np.cos(slopemap)
+ np.cos(altitude_rad) * np.sin(slopemap) * np.sin(azimuth_rad - terrain_attributes["aspect"])
),
0,
255,
).astype("float32")

if make_curvature:
terrain_attributes["curvature"] = get_terrainattr_richdem(dem, "curvature")

if make_planform_curvature:
terrain_attributes["planform_curvature"] = get_terrainattr_richdem(dem, "planform_curvature")

if make_profile_curvature:
terrain_attributes["profile_curvature"] = get_terrainattr_richdem(dem, "profile_curvature")

# Convert the unit if wanted.
if degrees:
for attr in ["slope", "aspect"]:
if attr not in terrain_attributes:
continue
terrain_attributes[attr] = np.rad2deg(terrain_attributes[attr])

output_attributes = [terrain_attributes[key].reshape(dem.shape) for key in attribute]

if isinstance(dem, gu.Raster):
output_attributes = [
gu.Raster.from_array(attr, transform=dem.transform, crs=dem.crs, nodata=-99999)
for attr in output_attributes
]

return output_attributes if len(output_attributes) > 1 else output_attributes[0]

return _get_terrain_attribute_richdem
def get_test_data_path() -> Callable[[str], str]:
def _get_test_data_path(filename: str, overwrite: bool = False) -> str:
"""Get file from test_data"""
download_test_data(overwrite=overwrite) # Ensure the test data is downloaded
file_path = os.path.join(_TESTDATA_DIRECTORY, filename)

if not os.path.exists(file_path):
if overwrite:
raise FileNotFoundError(f"The file {filename} was not found in the test_data directory.")
file_path = _get_test_data_path(filename, overwrite=True)

return file_path

return _get_test_data_path