Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to pydantic #51

Open
sergei-maertens opened this issue Sep 6, 2022 · 5 comments
Open

Switch to pydantic #51

sergei-maertens opened this issue Sep 6, 2022 · 5 comments
Assignees

Comments

@sergei-maertens
Copy link
Member

Currently our API models are build on top of dataclasses, but pydantic has much more functionality and is better tested while serving the same purpose.

We probably still need to:

  • Derive UUID from URL
  • Convert camel case into snake_case before passing it to the model factories
@sergei-maertens sergei-maertens self-assigned this Sep 6, 2022
@CharString
Copy link

Case juggling comes built in in pydantic: https://pydantic-docs.helpmanual.io/usage/model_config/#alias-generator

@CharString
Copy link

I've been using msgspec in a different project. I could replace a handrolled parser and drf serialisers with a few msgspec Structs.

But it could have used the existing dataclasses too. It's a nice design. you have a json.decode(buf: bytelike, type: targettype) which parses the incoming bytes as json into a value of the given type.
https://jcristharif.com/msgspec/supported-types.html
If you need some other type, you can register a function that builds that type from the given JSON value.

Like pydantic, it supports case juggling:

rename (str, mapping, callable, or None, default None) – Controls renaming the field names used when encoding/decoding the struct. May be one of "lower", "upper", "camel", "pascal", or "kebab" to rename in lowercase, UPPERCASE, camelCase, PascalCase, or kebab-case respectively. May also be a mapping from field names to the renamed names (missing fields are not renamed). Alternatively, may be a callable that takes the field name and returns a new name or None to not rename that field. Default is None for no field renaming.

Unlike pydantic it doesn't validate on each and every model instantiation (because you can decode into any type, not just Structs). Validation happens in de decode function from arbitrary bytes. It assumes your type-checker will catch errors when you construct values in your own code.

"Parse, don't validate" or rather "json.decode, don't rewrite your shotgun validation in Rust, so its runtime becomes acceptable"

@sergei-maertens
Copy link
Member Author

cc @swrichards ^

@sergei-maertens
Copy link
Member Author

sergei-maertens commented Dec 19, 2024

Adding to this - this is a broader discussion on how to deal with client libraries. The current state of zgw-consumers is that it's more of a generic service configuration library than specifically dealing with ZGW concepts and that's also how we use it.

So, the discussion is still relevant, however, I'd like to move zgw_consumers.api_models out of this lib into standalone projects, so we can also provide proper centralized support/types/client interaction for things like the objects/objecttypes API, klaninteracties etc.

And then, as our final act, we will have to rename the library because it's not doing anything from its original purpose any longer :)

@swrichards
Copy link
Collaborator

More on this after the holidays, but first thoughts:

I agree that runtime validation is overkill (though I would wonder how much this overhead is non-negligible for the average API endpoint). I would say the main problem we're trying to solve is that server-side contracts such as OAS are too under specified to capture all edge cases, and even then the theoretical tooling that would always keep your clients in sync with the server spec seemed rickety the last time I checked (which was a few months ago). To this we can add the problem of dialects and slight variations between deployed versions (and the difficulty in detecting this version for APIs that don't expose good versioning metadata in their headers).

From this I think it follows that we should be able to validate responses, but not that we actually need to do that on an ongoing basis during runtime. The thing is that you want a specification of your expected client contract so that you can intermittently validate it against actual server responses to detect drift.

This is more or less what we've done in our embedded openklant2 client in OIP. The actual client types are simple TypedDict objects, and we make use of Pydantic's ability to generate validators for TypedDict specs to define validators. Importantly, these validators are not actually invoked in the client, but instead we use them in our tests in combination with some Docker/VCR tooling to easily run a testsuite against a live Docker server, using the validators to warn us if there's been any breaking changes in the contract.

For example, the Partij type validators are exported, but they are only used in these tests which run against VCR fixtures recorded from spinning up a fresh, clean-state Docker container and seeding with API calls in our pytest fixtures.

This pattern has worked rather well so far, but of course it depends on some tooling which can quickly grow slow and complex, and it presumes you actually have access to Docker images (which is true for FOSS components such as ours, but not necessarily true for closed-source services that in theory speak the same standard, but in practice speak some kind of dialect).

Typing in general is important so the client can declare its expected contract, in a world in which there are still too many edge cases which are not fully captured by server-side OAS Specs and the like. But that need not be validated on every request, because the happy flow is that our expected client contract matches the server contract, and what we want to be able to detect is drift.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants