Dataflows and metadataflows #214
Comments
For data, maybe the mechanism for this could be a "skip_invalid_data" setting. This would depend on something like #20 (being worked on in #190). What I'm thinking is that, when outputting the data, if this setting is true, any row with a disaggregation/unit/series value that is not part of the data schema would be skipped. For example, take the case of SDMX for global usage: the data schema would be imported from the global SDMX DSD, and any data row that uses custom disaggregations (like sub-national REF_AREAs, etc.) would be omitted from the output.
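To make the idea concrete, here is a rough sketch of what such a setting might do. Everything in it is hypothetical: the `schema` dictionary, the `row_is_valid` and `filter_rows` helpers, and the sample codes are made up for illustration and are not existing parts of the library. Rows are plain dictionaries here only to keep the sketch self-contained.

```python
# Hypothetical sketch: none of these names come from the project.
# The schema maps each column to the set of codes it allows.
schema = {
    "REF_AREA": {"KE"},
    "SEX": {"F", "M", "_T"},
}

rows = [
    {"REF_AREA": "KE", "SEX": "F", "OBS_VALUE": 12.5},
    {"REF_AREA": "KE-30", "SEX": "F", "OBS_VALUE": 9.1},  # custom sub-national code
    {"REF_AREA": "KE", "SEX": "_T", "OBS_VALUE": 11.0},
]


def row_is_valid(row, schema):
    """Return True if every schema column present in the row holds an allowed code."""
    return all(
        row[column] in allowed
        for column, allowed in schema.items()
        if column in row
    )


def filter_rows(rows, schema, skip_invalid_data=True):
    """Drop rows that use codes outside the schema when the setting is on."""
    if not skip_invalid_data:
        return list(rows)
    return [row for row in rows if row_is_valid(row, schema)]


valid_rows = filter_rows(rows, schema)
```

With the setting on, the row using the sub-national code is skipped and the other two are kept. With the setting off, all rows pass through unchanged.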
@brockfanning is this done/partly done? Sounds familiar.
@LucyGwilliamAdmin Partly, I'd say. What I describe in the example above we definitely already have - with the "constrain_data" and "constrain_metadata" parameters. We also have the "global_content_constraints" which similarly drops rows of data that don't comply with the global content constraints (like that certain series have to be female, etc.). A couple of things, I think, still need to be done, regarding that "global_content_constraints" parameter:
Thoughts?
@brockfanning thanks, that makes sense
In SDMX there is the concept of a "dataflow" or "metadataflow", which (as I understand it) is a way to filter the output according to some constraints. We may be able to implement something like that here.
One use-case that definitely exists is in our SDMX output. Many countries may be interested in using the SDMX output to submit their data to the UNSD's database. However, this is not possible if the data uses any non-global codes/dimensions. So it would be useful to have a "dataflow" which filters the output to include only global codes/dimensions.
Ideally this filtering would be applied to the data in its internal DataFrame form, so that the feature could be used regardless of whether the output is going to be SDMX, GeoJSON, etc.
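As a rough illustration of filtering at the DataFrame level, the sketch below drops any row that uses a non-global code or has a value in a non-global dimension. The `apply_dataflow` function, the `allowed_codes` mapping, and the `Year`/`Value` column names are assumptions made for this example, not an existing API.

```python
import pandas as pd


def apply_dataflow(df, allowed_codes, value_columns=("Year", "Value")):
    """Keep only rows that comply with the allowed codes (hypothetical helper).

    - For a column listed in allowed_codes, a value must be empty or allowed.
    - For any other dimension column, the value must be empty.
    - Columns named in value_columns are never checked.
    """
    keep = pd.Series(True, index=df.index)
    for column in df.columns:
        if column in value_columns:
            continue
        values = df[column]
        if column in allowed_codes:
            ok = values.isna() | values.isin(list(allowed_codes[column]))
        else:
            ok = values.isna()
        keep &= ok
    return df[keep]


df = pd.DataFrame({
    "Year": [2020, 2020, 2020],
    "REF_AREA": ["KE", "KE-30", "KE"],      # "KE-30" stands in for a custom code
    "Custom group": [None, None, "A"],      # a dimension with no global equivalent
    "Value": [1.0, 2.0, 3.0],
})

global_df = apply_dataflow(df, {"REF_AREA": ["KE"]})
```

Because the filter works on the DataFrame before any output class sees it, the same `global_df` could in principle be handed to an SDMX writer, a GeoJSON writer, or any other output.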