code refactor docs
pszufe committed Oct 12, 2023
1 parent d1dbec4 commit f7f7c3b
Showing 18 changed files with 23,594 additions and 174,531 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/TagBot.yml
@@ -6,7 +6,7 @@ on:
workflow_dispatch:
inputs:
lookback:
default: 3
default: 7
permissions:
actions: read
checks: read
1 change: 1 addition & 0 deletions Project.toml
@@ -7,6 +7,7 @@ version = "0.1.0"
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
EzXML = "8f5d6c58-4d21-5cfd-889c-e3ad7ee6a615"
NamedTupleTools = "d9ec5142-1e00-5aa0-9d6a-321866360f50"
OpenStreetMapX = "86cd37e6-c0ff-550b-95fe-21d72c8d4fc9"
Parameters = "d96e819e-fc66-5662-9728-84c9c7592b0a"
Parsers = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0"
130 changes: 129 additions & 1 deletion README.md
@@ -1,9 +1,137 @@


# Tools for OSM parsing and POI extraction
# Tools for OpenStreetMap (OSM) Point-of-Interest (POI) extraction and tiling/slicing of map files

[![Dev](https://img.shields.io/badge/docs-dev-blue.svg)](https://pszufe.github.io/OSMToolset.jl/)

The goal of the package is to provide a fast and convenient interface for extracting data from the OpenStreetMap project and for constructing walkability indexes based on map data.

The package offers the following functionalities:
1. Export points of interest (POIs) from an OSM XML map file to a `DataFrame`.
2. A spatial attractiveness index for analyzing location attractiveness across maps (useful, for example, in research on a city's walkability index).
3. OSM map tiling/slicing: functionality to tile a large OSM file into smaller tiles without losing connections on the tile edges. The map tiling works directly on XML files.

This toolset has been built with performance in mind, for large-scale scraping of spatial data.
Hence, the package should work sufficiently well with datasets the size of entire states or countries.

## Exporting points of interest

The examples assume that the bundled sample file is used:
```
using OSMToolset
file = sample_osm_file()
```
Let us use the default configuration for parsing.
```
julia> df1 = find_poi(file)
78×10 DataFrame
Row │ elemtype elemid nodeid lat lon key value ⋯
│ Symbol Int64 Int64 Float64 Float64 String String ⋯
─────┼─────────────────────────────────────────────────────────────────────────────────────
1 │ node 69487440 69487440 42.3649 -71.1029 public_transport stop_positi ⋯
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
78 │ relation 7943642 2913461577 42.3624 -71.0847 leisure park ⋯
4 columns and 76 rows omitted
```
The default configuration file can be found at `OSMToolset.__builtin_config_path`. This configuration has metadata columns that appear in the results of the parsing process. You can create your own configuration based on it and pass it explicitly:
```
myconfig = ScrapePOIConfig{AttractivenessMetaPOI}(OSMToolset.__builtin_config_path)
df1 = find_poi(file;scrape_config=myconfig)
```

Suppose instead that you want to configure manually what is scraped. Perhaps we just want parking spaces,
which can be defined in an OSM file either as `amenity=parking` or under the `parking` key with any value:
```
julia> config = DataFrame(key=["parking", "amenity"], values=["*", "parking"])
2×2 DataFrame
Row │ key values
│ String String
─────┼──────────────────
1 │ parking *
2 │ amenity parking
```
Note that, contrary to the previous example, this time we do not have metadata columns, and hence we will use the `NoneMetaPOI` configuration.

Now the data can be scraped as follows:
```
julia> df2 = find_poi(file; scrape_config=ScrapePOIConfig{NoneMetaPOI}(config))
12×7 DataFrame
Row │ elemtype elemid nodeid lat lon key value
│ Symbol Int64 Int64 Float64 Float64 String String
─────┼───────────────────────────────────────────────────────────────────────
1 │ way 187565434 1982207088 42.3603 -71.0866 amenity parking
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
12 │ way 1052438049 9672086211 42.3624 -71.0878 parking surface
10 rows omitted
```
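
The `key`/`values` rules above can be read as a simple matching procedure. The sketch below is only an illustration of that reading: `matches_rule` and `is_poi` are hypothetical helpers, not part of OSMToolset, and the assumptions are that `*` matches any value and that several values may be given as a comma-separated list (as in the bundled configuration file).

```julia
# Hypothetical helper, not part of OSMToolset: decides whether an OSM tag
# (tag_key = tag_value) matches one configuration row.
function matches_rule(rule_key::AbstractString, rule_values::AbstractString,
                      tag_key::AbstractString, tag_value::AbstractString)
    rule_key == tag_key || return false      # the key must match exactly
    rule_values == "*" && return true        # assumed: wildcard accepts any value
    return tag_value in split(rule_values, ',')  # assumed: comma-separated alternatives
end

# The two rules from the DataFrame above, as (key, values) pairs.
rules = [("parking", "*"), ("amenity", "parking")]

# A tag is treated as a POI when at least one rule matches it.
is_poi(k, v) = any(matches_rule(rk, rv, k, v) for (rk, rv) in rules)

is_poi("amenity", "parking")   # true
is_poi("parking", "surface")   # true (wildcard rule)
is_poi("amenity", "cafe")      # false
```

Under this reading, both `amenity=parking` and `parking=surface` rows in the result above are explained by one rule each.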
This data can be further processed in many ways. For example, [here](TODO) is sample code that performs a visualization.

## Spatial attractiveness processing

Suppose we have the `df1` data from the previous example. We can now build a spatial attractiveness index in the following way:
```
ix = AttractivenessSpatIndex(df1)
```
Note that the default configuration works with the `AttractivenessMetaPOI` data format. If you want a different data structure for this index, you need to create a subtype of `MetaPOI` and use it in the constructor.

Let us consider some point on the map:
```
using Statistics  # provides mean
lat, lon = mean(df1.lat), mean(df1.lon)
```
We can use the API to calculate attractiveness of that location:
```
julia> attractiveness(ix, lat, lon)
(education = 42.73746118854219, entertainment = 30.385266049775055, healthcare = 12.491783858701343, leisure = 134.5949900134078, parking = 7.310719949554132, restaurants = 25.200347106553586, shopping = 6.89416203789267, transport = 12.090409181473555)
```
If, for debugging purposes, we want to understand which data has been used to calculate that attractiveness, we can use the `explain=true` parameter:
```
julia> attractiveness(ix, lat, lon ;explain=true).explanation
68×7 DataFrame
Row │ group influence range attractiveness poidistance lat lon
│ Symbol Float64 Float64 Float64 Float64 Float64 Float64
─────┼─────────────────────────────────────────────────────────────────────────────────
1 │ education 20.0 10000.0 16.9454 1527.31 42.3553 -71.105
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
68 │ shopping 5.0 500.0 0.618922 438.108 42.3625 -71.0834
66 rows omitted
```
The way the attractiveness is calculated is fully configurable.
The available parameters can be used to define the attractiveness dimension, the aggregation function,
the attractiveness function, and how the distance on the map is calculated.
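
The two visible rows of the explanation table above are consistent with a linear decay of `influence` over `range`. The sketch below reproduces those numbers; `linear_decay` is a hypothetical name, and the formula is inferred from the printed values, not taken from the package source.

```julia
# Assumed default, inferred from the explanation table (not from the package source):
# the contribution falls linearly from `influence` at distance 0 to 0 at `range`.
linear_decay(influence, range, dist) =
    dist >= range ? 0.0 : influence * (1 - dist / range)

linear_decay(20.0, 10000.0, 1527.31)  # ≈ 16.9454, matches row 1 (education)
linear_decay(5.0, 500.0, 438.108)     # ≈ 0.61892, matches row 68 (shopping)
linear_decay(5.0, 500.0, 600.0)       # 0.0, the POI is out of range
```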

As an example, let us take the maximum influence value in each group rather than summing the values:
```
julia> att = attractiveness(ix, lat, lon, aggregator = x -> length(x)==0 ? 0 : maximum(x))
(education = 19.245381074958622, entertainment = 17.69295158791498, healthcare = 6.245891929350671, leisure = 4.723681042516024, parking = 2.9623334286775806, restaurants = 4.596901824773207, shopping = 2.0103741801865715, transport = 6.407028429850689)
```
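
To see why the aggregator above guards against empty input, here is a self-contained sketch of per-group aggregation. The data and the `aggregate` and `safe_max` helpers are hypothetical and only illustrate the idea; they are not the package's implementation. In Julia, `sum` of an empty `Float64` vector returns `0.0`, while `maximum` of an empty collection throws, so a group with no nearby POIs needs an explicit fallback.

```julia
# Hypothetical per-POI contributions for one location, as (group, value) pairs.
contributions = [(:education, 16.5), (:education, 3.5), (:shopping, 0.5)]

# Collect the values of each group and reduce them with `aggregator`.
function aggregate(contribs, groups, aggregator)
    vals = [aggregator(Float64[v for (g, v) in contribs if g == grp]) for grp in groups]
    return NamedTuple{Tuple(groups)}(Tuple(vals))
end

# Same guard as in the README call: an empty group yields 0 instead of an error.
safe_max(x) = isempty(x) ? 0.0 : maximum(x)

groups = [:education, :parking, :shopping]
aggregate(contributions, groups, sum)       # (education = 20.0, parking = 0.0, shopping = 0.5)
aggregate(contributions, groups, safe_max)  # (education = 16.5, parking = 0.0, shopping = 0.5)
```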

We could also use the custom-scraped `df2` for the attractiveness:
```
ix2 = AttractivenessSpatIndex{NoneMetaPOI}(df2; get_range=a->300, get_group=a->:parking);
```
Note that since we did not have metadata we have manually provided `300` meters for the range and `:parking` for the group.

Now we can use this custom index to query the attractiveness:
```
julia> attractiveness(ix2, lat, lon; aggregator = sum, calculate_attractiveness = (a,dist) -> dist > 300 ? 0 : 300/dist )
(parking = 13.200370032301507,)
```
Note that for this code to work we had to specify how the attractiveness is calculated with respect to the metadata `a` (here an empty `struct`, since this is `NoneMetaPOI`).

## OSM map tiling/slicing

The native format for OSM files is XML. The files are often huge, and for many processing scenarios it makes sense to slice them into smaller portions. That is where this functionality comes in handy.

The file tiling can be executed as follows:
```
outfiles = tile_osm_file("file.osm", nrow=2, ncol=3, out_dir="some/target/directory")
```
After the execution, `outfiles` will be a matrix with the file names of all tiles.


File tiling limitations
-----------------------
The OSM tiler simultaneously opens a file writer for each output file. The operating system might limit the number of simultaneously open file descriptors. If you want to create a large number of tiles, you need to either change the operating system setting accordingly or use a recursive approach to file tiling.

## Acknowledgments

<sup>This research was funded in whole or in part by [National Science Centre, Poland][2021/41/B/HS4/03349].</sup>
31 changes: 0 additions & 31 deletions config/Attractiveness.csv

This file was deleted.

31 changes: 31 additions & 0 deletions config/ScrapePOIconfig.csv
@@ -0,0 +1,31 @@
key values group influence range
amenity kindergarten education 3 2000
amenity school,music_school,language_school education 5 3000
amenity university,college education 20 10000
amenity library education 5 1000
amenity restaurant,fast_food restaurants 5 750
amenity food_court restaurants 15 1000
amenity pub,bar entertainment 5 750
amenity cafe,ice_cream restaurants 5 750
amenity bank,atm shopping 1 750
amenity parking parking 5 250
parking * parking 5 250
amenity bus_station transport 5 300
public_transport station transport 5 300
railway station transport 10 700
aeroway aerodrome,terminal transport 15 1000
public_transport stop_position transport 5 300
railway tram_stop transport 5 300
amenity clinic,doctors,dentist healthcare 10 500
healthcare * healthcare 10 500
amenity pharmacy healthcare 5 500
amenity hospital healthcare 20 1000
amenity cinema,theatre,arts_centre entertainment 20 1000
amenity nightclub entertainment 10 800
shop * shopping 5 500
amenity marketplace shopping 10 800
leisure garden,park,dog_park leisure 5 500
leisure sports_centre,sports_hall,stadium,track,pitch,horse_riding,swimming_pool,fitness_centre,fitness_station leisure 5 800
sport fitness leisure 5 800
landuse recreation_ground,winter_sports leisure 5 1500
tourism * leisure 5 1500
Binary file added docs/src/poiviz.png
9 changes: 7 additions & 2 deletions docs/src/reference.md
@@ -8,11 +8,16 @@ DocTestSetup = quote
end
```

Measuring Attractiveness Spatial Index
Scraping points-of-interest (POI)
---------------------
```@docs
find_poi(::AbstractString; ::AbstractString)
AttractivenessConfig
ScrapePOIConfig
```

Measuring Attractiveness Spatial Index
--------------------------------------
```@docs
AttractivenessSpatIndex
attractiveness(::AttractivenessSpatIndex, ::ENU; ::Function; ::Bool)
attractiveness(::AttractivenessSpatIndex, ::Float64, ::Float64; ::Function; ::Bool)
36 changes: 36 additions & 0 deletions docs/src/vizualize.md
@@ -0,0 +1,36 @@
### How to visualize the data

```
using PyCall
using Colors
using OSMToolset
file = sample_osm_file()
df = find_poi(file)
ix = AttractivenessSpatIndex(df);
flm = pyimport("folium");
colrs = distinguishable_colors(length(ix.measures), [RGB(0.1,0.2,0.4)])
class2col = Dict(ix.measures .=> colrs);
m = flm.Map(tiles = "Stamen Toner")
line = 0
for row in eachrow(df)
line += 1
info = "$(row.group):$(row.key)=$(row.value)"
k = findfirst(==(Symbol(row.group)), ix.measures)
flm.Circle((row.lat, row.lon), color="#$(hex(colrs[k]))",radius=row.influence,
fill_color="#$(hex(colrs[k]))", fill_opacity=0.06, tooltip=info).add_to(m)
end
bb = getbounds(file)
bounds = [(bb.minlat, Float64(bb.minlon)), (bb.maxlat, Float64(bb.maxlon))]
m.fit_bounds(bounds)
flm.Rectangle(bounds, color="blue",weight=2).add_to(m)
m
```

11 changes: 9 additions & 2 deletions src/OSMToolset.jl
@@ -3,7 +3,7 @@ module OSMToolset
using CSV, DataFrames
using SpatialIndexing
using StatsBase

using NamedTupleTools
using Parsers, EzXML, Parameters
import OpenStreetMapX
import OpenStreetMapX: OSMData, LLA, ENU, distance
@@ -17,10 +17,17 @@ include("attractiveness.jl")

export tile_osm_file
export FloatLon
export AttractivenessConfig
export AttractivenessSpatIndex
export attractiveness
export find_poi
export calc_tiling
export getbounds, Bounds
export ScrapePOIConfig
export MetaPOI
export NoneMetaPOI
export AttractivenessMetaPOI
export sample_osm_file
export calculate_attractiveness, get_attractiveness_group
export clean_pois_by_group

end # module OSMToolset