Spatial Coordinate Variable Support #7
Comments
Some thoughts / questions:
I don't know if that needs to be supported, but I've seen something exotic lately, where the geospatial variable was indexed by a single dimension "node", and the "lat" and "lon" arrays were themselves indexed by "node". So this is scattered point (ungridded) data. As far as I read it, the current draft of GeoZarr excludes such a scenario, since it mandates that each dimension be indexed by a 1D variable of the same name (https://github.com/zarr-developers/geozarr-spec/blob/main/geozarr-spec.md#geozarr-coordinates). It would also exclude the "2D array" scenario. I would say that if GeoZarr wants to support many different use cases, close to what netCDF CF allows (and netCDF CF allows pretty much anything), then it might be best that GeoZarr == netCDF CF (or a subset of it) translated to JSON without any semantic change. Reinventing something similar to but different from netCDF CF would just be a loss of time IMHO. This is an important decision that must be taken early in the process:
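To make the constraint discussed above concrete, here is a minimal sketch in plain Python of the rule "each dimension is indexed by a 1D variable of the same name". The dictionaries are hypothetical, simplified stand-ins for dataset metadata (variable name mapped to its list of dimension names), and `missing_dimension_coordinates` is an illustrative helper, not part of any GeoZarr or Zarr API.

```python
def missing_dimension_coordinates(variables):
    """Return the sorted dimension names that lack a 1D variable of the same name.

    `variables` maps each variable name to its list of dimension names
    (a hypothetical, simplified view of the dataset metadata).
    """
    dims = {d for var_dims in variables.values() for d in var_dims}
    # A dimension passes only if a variable with that name exists
    # and is indexed by exactly that one dimension.
    return sorted(d for d in dims if variables.get(d) != [d])


# COARDS-like grid: "lat" and "lon" are 1D and named after their dimensions.
gridded = {"temperature": ["lat", "lon"], "lat": ["lat"], "lon": ["lon"]}

# Scattered points: "lat" and "lon" are both indexed by "node".
scattered = {"temperature": ["node"], "lat": ["node"], "lon": ["node"]}

# 2D coordinate arrays: "lat" and "lon" are indexed by ("y", "x").
curvilinear = {"temperature": ["y", "x"], "lat": ["y", "x"], "lon": ["y", "x"]}

print(missing_dimension_coordinates(gridded))      # []
print(missing_dimension_coordinates(scattered))    # ['node']
print(missing_dimension_coordinates(curvilinear))  # ['x', 'y']
```

Under this reading of the rule, only the first layout passes; the scattered and 2D layouts each leave at least one dimension without a same-named 1D variable.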
Not adding much to this thread; I just wanted to point to https://docs.ogc.org/per/21-032.html#toc23, which I see @rouault contributed to and which I found super interesting.
From my understanding, tools supporting Zarr are currently all based on 2D array coordinates. Note that this has been well supported by xarray for very large datacubes. I assume this should be the baseline, even if I believe vectors and offset-based coordinates might be great.
Great discussion points here. My opinion on the question above is the first option. That is already how any other standard works: it puts the onus on data producers.
I'm not sure what the actual assumption behind "simple for consumers" is (in particular for the present topic of variables). More generally, I agree with the statement "oblige them to do processing to fit their data" and with being generic. But I would expect GeoZarr to easily support a wide range of data (across domains):
Moreover, if the user doesn't know about the specificities of the dataset, we must provide recommendations for how various types of data should be encoded, based on standard conventions. For example, formats supporting multispectral bands might be encoded in a single DataArray with a dimension for band (subsidiary questions: how to encode such data if the bands are not all provided at the same resolution? How to discover the various parts from a parent Zarr dataset?).
There is a real trade-off here, but it's maybe not quite as simple as the dichotomy that @rouault laid out. I agree with @christophenoel that "I would expect Geozarr easily support a wide range of data (across domains)". The extension to that statement should be that it [Geozarr] should do that with a minimum set of design patterns chosen from those that are readily available in software with both read and write functionality. Apologies for losing the thread, @rouault; I see you had a question about my original issue.
That should have been COARDS, which is what the CF convention was largely based on. I had not included discrete geometry in my original list. That could certainly be considered in scope, so the potential list would be:
I tend to agree with @rouault that if we were to try to encompass all of that scope, which is supported by CF, we would probably want to adopt more or less all of CF. I could see an argument for a CF clone that used WKT/PROJJSON and had support for origin/offset coordinates.
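For readers unfamiliar with the origin/offset idea mentioned above, the following is a rough sketch in plain Python of how such an encoding relates to an explicit 1D coordinate array. The function names and the `(origin, offset)` pair are assumptions made for illustration; nothing here is taken from the GeoZarr draft or from CF.

```python
import math


def expand_origin_offset(origin, offset, size):
    """Expand a hypothetical (origin, offset) pair into an explicit 1D coordinate list."""
    return [origin + i * offset for i in range(size)]


def to_origin_offset(values, rel_tol=1e-9):
    """Return (origin, offset) if `values` is regularly spaced, else None."""
    if len(values) < 2:
        return None
    origin = values[0]
    offset = values[1] - values[0]
    for i, value in enumerate(values):
        # Compare each value with the one predicted by the regular spacing.
        if not math.isclose(value, origin + i * offset, rel_tol=rel_tol, abs_tol=1e-12):
            return None
    return origin, offset


lon = expand_origin_offset(10.0, 0.5, 4)
print(lon)                                # [10.0, 10.5, 11.0, 11.5]
print(to_origin_offset(lon))              # (10.0, 0.5)
print(to_origin_offset([0.0, 1.0, 3.0]))  # None (irregular spacing)
```

The round trip only works for regularly spaced axes, which is why irregular 1D and 2D coordinate arrays would still need an explicit encoding.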
I think I will call this issue overcome by events and close it in favor of #17. We may want to support auxiliary coordinate variables as is done in CF, but that should be brought up in a separate, more specific issue.
We have use cases for coordinate variables encoded as:
Are there any other spatial coordinate variable types that need to be supported?
All three will need to be supported in GeoZarr. The CF style for COARDS and the 2D style seem to be clear initial candidates. Is there a Zarr encoding for raster coordinates to consider?
(edit: the original post had "coords" instead of "COARDS"; see https://cfconventions.org/cf-conventions/cf-conventions.html#coards-relationship for more on COARDS and CF)