Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Endpoint to conflate the submission with osm data #1594

Merged
merged 6 commits into from
Jul 11, 2024

Conversation

Sujanadh
Copy link
Collaborator

What type of PR is this? (check all applicable)

  • πŸ• Feature
  • πŸ› Bug Fix
  • πŸ“ Documentation
  • πŸ§‘β€πŸ’» Refactor
  • βœ… Test
  • πŸ€– Build or CI
  • ❓ Other (please specify)

Related Issue

Describe this PR

This PR creates a new endpoint in order to conflate the submission data with osm features.

  • if both geometries (buildings for now) are equal or almost equal, then only their tags/properties are appended and returned for further validation
  • if both geometries overlaps to some extent then in that case both geometries are returned along with their properties and tags appended
  • is_duplicate and is_partial_overlap boolean flags (True or False) are appended, respectively.

Response

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              85.297789,
              27.723315
            ],
            [
              85.297842,
              27.723297
            ],
            [
              85.297856,
              27.723329
            ],
            [
              85.297908,
              27.723312
            ],
            [
              85.297924,
              27.723351
            ],
            [
              85.297819,
              27.723386
            ],
            [
              85.297789,
              27.723315
            ]
          ]
        ]
      },
      "properties": {
        "start": "2024-06-11T14:01:37.472+05:45",
        "end": "2024-06-11T14:02:11.013+05:45",
        "today": "2024-06-11",
        "phonenumber": null,
        "deviceid": "collect:N9J5nVWtA9ApE60K",
        "username": "localadmin",
        "email": "NOT IMPLEMENTED",
        "instructions": null,
        "task_id": null,
        "existing": "be3d424a-a5a4-4544-a0c5-3b38dd522dd7",
        "point": null,
        "digitisation_correct": "yes",
        "digitisation_problem": null,
        "digitisation_problem_other": null,
        "image": null,
        "form_category": "Unknown",
        "xid": "650020468",
        "status": "2",
        "category": "housing",
        "name": null,
        "building_material": null,
        "building_levels": null,
        "religion": null,
        "denomination": null,
        "muslim": null,
        "hindu": null,
        "religious": null,
        "service": null,
        "Shop": null,
        "tourism": null,
        "education_details": null,
        "link": null,
        "government": null,
        "medical": null,
        "food": null,
        "cuisine": null,
        "heritage": null,
        "emergency": null,
        "operator": null,
        "inscription": null,
        "housing": null,
        "provider": null,
        "rooms": null,
        "ref": null,
        "building_neighbor": null,
        "power": null,
        "generator_source": null,
        "generator": null,
        "water": null,
        "water_source": null,
        "pump_unit": null,
        "capacity": null,
        "age": null,
        "building_prefab": null,
        "building_floor": null,
        "building_roof": null,
        "condition": null,
        "access_roof": null,
        "levels_underground": null,
        "comment": null,
        "osm_id": 650020468,
        "osm_type": "ways_poly",
        "version": 2,
        "tags": {
          "building": "yes"
        },
        "changeset": 98649258,
        "timestamp": "2021-02-03T14:09:58",
        "is_duplicate": true,
        "is_partial_overlap": false
      }
    }
  ]
}

Alternative Approaches Considered

Did you attempt any other approaches that are not documented in code?

Review Guide

Notes for the reviewer. How to test this change?

Checklist before requesting a review

[optional] What gif best describes this PR or how it makes you feel?

@Sujanadh Sujanadh requested review from nrjadkry and spwoodcock June 21, 2024 07:18
@Sujanadh Sujanadh self-assigned this Jun 21, 2024
@github-actions github-actions bot added the backend Related to backend code label Jun 21, 2024
@@ -777,3 +785,160 @@ def parse_featcol(features: Union[Feature, FeatCol, MultiPolygon, Polygon]):
elif isinstance(features, Feature):
feat_col = geojson.FeatureCollection([feat_col])
return feat_col


def request_snapshot(geometry):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function and the poll_task_status function can be replaced with osm_fieldwork.PostgresClient.
There is already functionality in place to get a data extract using provided config πŸ‘
Example:

pg = PostgresClient(

@@ -524,3 +530,45 @@ async def get_submission_detail(
odk_form.getSubmissions(project.odkid, db_xform.odk_form_id, submission_id)
)
return submission.get("value", [])[0]


async def get_submission_geojson(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed we already have central_crud.convert_odk_submission_json_to_geojson and submission_routes.download_submission_geojson that all basically do the same thing.

Could we refactor to only have the logic once, then reuse the code?

Copy link
Collaborator Author

@Sujanadh Sujanadh Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if i want geojson only for the conflation, then this function won't work since it returns submission in bytes. like you said, we can refactor the download_submission endpoint to use convert_odk_submission_json_to_geojson. Or we can update this function to return geojson and convert it into bytes wherever necessary instead of returning bytes from this function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second option sounds best in the spirit of DRY code.

A single function to return a geojson, then encode as bytes where specifically needed.

If absolutely necessary it's also possible to decode bytes back to geojson using .decode("utf--8") πŸ‘

json.load would probably also do that

)


def conflate_features(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@spwoodcock
Copy link
Member

spwoodcock commented Jul 2, 2024

We can merge this for now as initial functionality πŸ‘

In the long term we should look into integrating this (after some good testing): osm-merge/osm-merge#9

@Sujanadh
Copy link
Collaborator Author

Sujanadh commented Jul 4, 2024

updates:

  • used PostgresClient to extract features from osm.
  • used task outline instead of bbox of submitted feature.
  • updated the convert_odk_submission_json_to_geojson to return geojson and respective calls to convert geojson to bytes

geometry,
extra_params={
"fileName": (
f"fmtm/{settings.FMTM_DOMAIN}/data_extract"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor thing - the folder structure starting with fmtm/... should only be used if we want to keep the data extract.

Raw data api has a rule if we use auth and a set structure, the file is retained for the project.

In this case we could remove this param as we only need a temp extract πŸ‘

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay

if settings.RAW_DATA_API_AUTH_TOKEN
else "fmtm_extract"
),
"outputType": "geojson",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe bind zip true and geojson are defaults on raw data api, but no harm being explicit πŸ‘

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its better to mention, as we won't know what are the default values.

@Sujanadh
Copy link
Collaborator Author

Sujanadh commented Jul 11, 2024

Updates:

  • bypass check for public project in wrap_check_access
  • removed filename parameter in extra params when requesting data extracts as we don't need it since it is only temp extracts and we don't have to save it in fmtm

@@ -242,6 +242,12 @@ async def wrap_check_access(
role: ProjectRole,
) -> ProjectUserDict:
"""Wrap check_access call with HTTPException."""
# If project is public, skip permission check
if project.visibility == ProjectVisibility.PUBLIC:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this logic here means that all users have access to all endpoints if the project is public.

So they could delete the project, modify anything, etc.

It's currently only present in the mapper permission, as we only want to bypass the auth if the endpoint requires mapper role only:

if project.visibility == ProjectVisibility.PUBLIC:

Was this logic needed for the code in this PR to work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything else is all set to merge!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a minor update, fixing the return type on the mapper role if ProjectVisibility.PUBLIC.

I'm going to merge as this PR has been open a while - let me know if any issues!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh I totally missed that. Yes , you are right. I changed that in order to get ProjectUserDict.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh I totally missed that. Yes , you are right. I changed that in order to get ProjectUserDict.

@spwoodcock spwoodcock merged commit 477f810 into development Jul 11, 2024
5 checks passed
@spwoodcock spwoodcock deleted the feat/data-conflation branch July 11, 2024 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend Related to backend code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants