Add an optional data transformer to gafferpy results #4

Open
t92549 opened this issue Jul 28, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

Contributor

t92549 commented Jul 28, 2022

Currently, gafferpy returns results either as the raw JSON from the Gaffer API or as the equivalent gafferpy objects. This is fine for some use cases, but a user who wants to run a simple, fast query can get bogged down in Java-related type boilerplate.
This is an example output from the road-traffic example:

{'class': 'uk.gov.gchq.gaffer.data.element.Edge',
  'destination': 'M32:M4 (19)',
  'directed': True,
  'group': 'RoadUse',
  'matchedVertex': 'SOURCE',
  'properties': {'count': {'java.lang.Long': 841303},
                 'countByVehicleType': {'uk.gov.gchq.gaffer.types.FreqMap': {'AMV': 407034,
                                                                             'BUS': 1375,
                                                                             'CAR': 320028,
                                                                             'HGV': 27234,
                                                                             'HGVA3': 1277,
                                                                             'HGVA5': 5964,
                                                                             'HGVA6': 4817,
                                                                             'HGVR2': 11369,
                                                                             'HGVR3': 2004,
                                                                             'HGVR4': 1803,
                                                                             'LGV': 55312,
                                                                             'PC': 1,
                                                                             'WMV2': 3085}},
                 'endDate': {'java.util.Date': 1431543599999},
                 'startDate': {'java.util.Date': 1034319600000}},
  'source': 'M32:1'}

It would be great if gafferpy could optionally return an object that lets you read results directly, without the nested types:

>>> print(result.source)
M32:1
>>> print(result.properties.count)
841303
>>> print(result.properties.countByVehicleType.CAR)
320028

This could be implemented as a generator that takes JSON input and creates these result objects lazily. Dictionaries can be mapped to objects easily in Python (see munch).
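The lazy mapping described above could be sketched roughly as follows. This is only an illustration, not existing gafferpy code: `Result`, `unwrap` and `iter_results` are hypothetical names, and the rule "a dictionary with a single key containing a dot is a Java type wrapper" is a heuristic assumption that would misfire on, say, a FreqMap whose only key contains a dot. A small dict subclass stands in for munch so the sketch has no dependencies.

```python
from typing import Any, Iterable, Iterator


class Result(dict):
    """Dictionary whose keys can also be read as attributes (munch-like)."""

    def __getattr__(self, name: str) -> Any:
        # Only called when normal attribute lookup fails, so dict methods
        # such as .items() and .keys() still take precedence over keys.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


def unwrap(value: Any) -> Any:
    """Recursively strip wrappers such as {'java.lang.Long': 841303}."""
    if isinstance(value, dict):
        if len(value) == 1:
            key, inner = next(iter(value.items()))
            # Assumption: a lone dotted key is a Java class name.
            if "." in key:
                return unwrap(inner)
        return Result({k: unwrap(v) for k, v in value.items()})
    if isinstance(value, list):
        return [unwrap(v) for v in value]
    return value


def iter_results(elements: Iterable[dict]) -> Iterator[Any]:
    """Yield one unwrapped result at a time instead of building a list."""
    for element in elements:
        yield unwrap(element)


# Trimmed-down version of the road-traffic edge shown earlier.
raw = [{
    "class": "uk.gov.gchq.gaffer.data.element.Edge",
    "group": "RoadUse",
    "source": "M32:1",
    "destination": "M32:M4 (19)",
    "properties": {
        "count": {"java.lang.Long": 841303},
        "countByVehicleType": {
            "uk.gov.gchq.gaffer.types.FreqMap": {"CAR": 320028, "BUS": 1375}
        },
    },
}]

result = next(iter_results(raw))
print(result.source)                              # M32:1
print(result.properties.count)                    # 841303
print(result.properties.countByVehicleType.CAR)   # 320028
```

Because `iter_results` is a generator, nothing is unwrapped until the caller asks for the next element. One caveat of attribute access on a dict subclass is that property names clashing with dict methods (`items`, `keys`, `values`, ...) still have to be read with `result["items"]`.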

When creating this generator, users should be able to easily add transform functions to the result, such as removing fields, renaming fields and applying functions to them. Gaffer itself already provides much of this functionality (renaming, ignoring and transforming fields), so perhaps it could be added to the OperationChain rather than executed in Python.
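If the transforms were done on the Python side, one possible shape is a generator that wraps another iterable of results. The function name `transform_results` and its `drop`, `rename` and `apply` parameters are made up for this sketch and are not part of gafferpy; it only handles top-level fields.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, Mapping, Optional


def transform_results(
    results: Iterable[Mapping[str, Any]],
    drop: Iterable[str] = (),
    rename: Optional[Mapping[str, str]] = None,
    apply: Optional[Mapping[str, Callable[[Any], Any]]] = None,
) -> Iterator[Dict[str, Any]]:
    """Lazily drop, transform and rename top-level fields of each result."""
    dropped = set(drop)
    rename = rename or {}
    apply = apply or {}
    for result in results:
        out: Dict[str, Any] = {}
        for key, value in result.items():
            if key in dropped:
                continue
            if key in apply:
                # Functions are looked up by the ORIGINAL field name.
                value = apply[key](value)
            out[rename.get(key, key)] = value
        yield out


rows = [
    {"class": "uk.gov.gchq.gaffer.data.element.Edge",
     "group": "RoadUse",
     "source": "M32:1"},
]

transformed = list(
    transform_results(
        rows,
        drop=["class"],
        rename={"source": "from"},
        apply={"group": str.lower},
    )
)
print(transformed)  # [{'group': 'roaduse', 'from': 'M32:1'}]
```

Since both this function and the result generator are lazy, they can be chained without holding the full result set in memory. Pushing the same operations into the OperationChain would avoid transferring the dropped fields at all, which is the trade-off raised above.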

Contributor Author

t92549 commented Aug 4, 2022

As well as better output handling, it would be great to have an option to stream the output using execute/chunked.
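On the client side, consuming a streamed response could look something like the sketch below. It assumes the stream delivers one complete JSON document per line, which should be checked against the actual execute/chunked wire format; `iter_chunked` is a hypothetical helper, not a gafferpy function. It accepts any iterable of lines, so it could be fed from a streaming HTTP response (for example the line iterator of a `requests` response opened with `stream=True`) or, as here, from a list for testing.

```python
import json
from typing import Any, Iterable, Iterator, Union


def iter_chunked(lines: Iterable[Union[bytes, str]]) -> Iterator[Any]:
    """Decode results one at a time from a line-delimited JSON stream.

    Assumption: each non-empty line is one complete JSON document.
    """
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line:
            # Skip blank keep-alive or separator lines.
            continue
        yield json.loads(line)


# Stand-in for the lines of a streamed HTTP response body.
fake_stream = [
    b'{"group": "RoadUse", "source": "M32:1"}\r\n',
    b"\r\n",
    b'{"group": "RoadUse", "source": "M32:2"}\r\n',
]

for element in iter_chunked(fake_stream):
    print(element["source"])
# M32:1
# M32:2
```

Each element becomes available as soon as its line arrives, so the output of this generator could be passed straight into a lazy result mapper rather than waiting for the whole response.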

@t92549 t92549 transferred this issue from gchq/gaffer-tools Oct 6, 2023