Merge branch 'main' into update/data-pipeline
Merge.
ianalexmac committed May 22, 2024
2 parents a0f8fc6 + c52eea4 commit e39cf38
Showing 15 changed files with 7,838 additions and 10,753 deletions.
31 changes: 16 additions & 15 deletions README.md
@@ -1,27 +1,21 @@
### Welcome
This repository contains code and data regarding locations of solar installations in the Anchorage Alaska area between 2017 and 2023. The maps and visuals generated by this repo are intended for public display, either in publications or presentations, and represent a general portrait of the residential solar sector in Anchorage.

To go straight to the animated heat map, [click here](https://acep-uaf.github.io/sw_anchorage_solar_locations/)

<br>

### Getting Started
You are welcome to interact with this repo on several levels:
* Download the static maps and visuals for use in publications or presentations
* Embed or link the dynamic Mapbox maps to your website
* Run new/updated data through the geocoding scripts

Questions or comments? If you have a GitHub account, please open a new issue in this repo. Otherwise, contact [email protected].
<br>

### Overview
This repository can be divided into roughly two parts:
1. Convert partial string addresses (ex: "2439 DOUGLAS") into lat/long coordinates
2. Build a heat map interpolation from the lat/long coordinates

<br>

### Part 1: Convert Addresses to Coordinates

#### Address Cleaning Loop
The source data comes from the City of Anchorage Building Permits. The address strings in this source data did not include common nouns (ex: "Street", "Road", "Circle"), which caused errors during the geocode conversion to coordinates. Approximately 160 addresses failed to parse, returning only "Anchorage, AK". These problem addresses were fed into maps.google.com, and the results were written into `address_changes.csv` as `clean_address`. Pre-cleaned addresses were written as `raw_address`. These address changes were iterated through in the script `address_extraction_cleaning.R`, and the cleaned permit data was written to `clean_2017_2023_SolarPermits.csv`. The addresses alone were written to `input_addresses.csv`.

#### API Call
Clean addresses from `input_addresses.csv` were read by the script `api_call.js`, which iterated through each address, pinged Google's Geocode API, caught the response, and wrote the collected responses to a JSON file `api_output.json`.
@@ -32,19 +26,26 @@ In order to get the JSON results pared down and converted to CSV, the script `ou
#### Joining API output and Permit Data
The flow came full circle via the script `join_output_source.R`, where the output data was loaded from `output.csv` and combined with `clean_2017_2023_SolarPermits.csv`. The combined dataset was written to file as `permits_lat_long.csv`.

#### Conversion Back to GeoJSON
It was useful to have the output data in CSV format in order to join it back to the source data, but our end product is a web map, so it was best to have the data in GeoJSON format. A short script, `csv_to_geojson.js`, was written to make the conversion. The data was saved as `permits_lat_long.geojson`. Perhaps in the future, this JSON -> CSV -> GeoJSON dance could be eliminated.


#### Heat Map
With the source data combined with location coordinates and saved as a GeoJSON file, we were able to place points on a map and build a heat map of solar installations from 2017 to 2023. An animation of the heat map was added, which loops through the years 2017-2023, advancing 1 year every second. Densities are shown as cumulative installations.
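The year-by-year animation amounts to re-filtering the heat map layer once per second. A minimal sketch using the Mapbox GL JS filter API (the layer name `solar-heat` and the `year` property name are assumptions, not necessarily those used in `heatmap.js`):

```javascript
// Build the Mapbox GL filter expression that shows every installation
// up to and including a given year, so densities read as cumulative.
// The layer name ('solar-heat') used below is an assumption.
function cumulativeYearFilter(year) {
  return ['<=', ['get', 'year'], year];
}

// In the browser, the animation would advance one year per second:
//   let year = 2017;
//   setInterval(() => {
//     map.setFilter('solar-heat', cumulativeYearFilter(year));
//     year = year < 2023 ? year + 1 : 2017; // loop 2017-2023
//   }, 1000);
```

Filtering on `year <=` the current frame, rather than `year ==`, is what makes each frame show cumulative installations.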

[Click here](https://acep-uaf.github.io/sw_anchorage_solar_locations/) to view the heat map on GitHub Pages

<br>
<br>

### Workflow Diagram
![Diagram of Workflow](/images/pipeline.jpg?raw=true "Workflow")

<br>
<br>

22 changes: 21 additions & 1 deletion code/README.md
@@ -47,9 +47,29 @@ After that, a new column `year` was created by parsing date_time from the column
**Address Cleaning**
With special characters removed and year extracted, the next step was to clean the address strings. The source data did not include common nouns (such as "Street", "Road", "Circle"), which caused a few errors during the geocode API call. Approximately 160 addresses failed to parse, returning only "Anchorage, AK". These problem addresses were fed into maps.google.com, and the results were written into `address_changes.csv` under the column `clean_address`. Pre-cleaned addresses were written under the column `raw_address`. These address changes were looped back into the script, and the cleaned addresses were written to `clean_2017_2023_SolarPermits.csv`.

Finally, the column `Address` was extracted and written to file as `input_addresses.csv`.
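The correction pass boils down to a lookup from `raw_address` to `clean_address`. A minimal sketch of that logic, shown in JavaScript rather than the R used by `address_extraction_cleaning.R` (the example rows below are hypothetical):

```javascript
// Minimal sketch of the correction pass in address_extraction_cleaning.R.
// `changes` mirrors address_changes.csv, one object per row:
// { raw_address, clean_address }. Example data here is hypothetical.
function applyAddressChanges(addresses, changes) {
  const lookup = new Map(changes.map(c => [c.raw_address, c.clean_address]));
  // Swap in the hand-cleaned address when one exists; otherwise keep as-is
  return addresses.map(a => lookup.get(a) ?? a);
}
```

Addresses without an entry in `address_changes.csv` pass through untouched, so only the ~160 problem addresses are rewritten.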

<br>

## `api_call.js`
This script:
* imports addresses from `input_addresses.csv`
* sends them to Google's Geocode API
* catches the responses
* and writes the resulting JSON to file as `api_output.json`.

<br>

## `output_to_csv.js`
This script imports the JSON API results `api_output.json` and saves them as CSV `output.csv` in order to facilitate wrangling with R.
In the future, this script and the two that follow could be simplified or reworked (*keep everything in JSON? Maybe use [this package?](https://cran.r-project.org/web/packages/jsonlite/vignettes/json-aaquickstart.html) Or maybe wrangle in Javascript*).
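The flattening step can be sketched as below. The field paths follow the shape of a Google Geocoding response; the CSV column names are illustrative, not necessarily those in `output.csv`:

```javascript
// Sketch of the JSON -> CSV flattening in output_to_csv.js.
// Field paths follow Google's Geocoding response shape; the column
// names chosen here are assumptions.
function responsesToCsv(responses) {
  const header = 'formatted_address,lat,lng';
  const rows = responses.map(r => {
    const best = r.results[0]; // take the top-ranked match
    const { lat, lng } = best.geometry.location;
    return `"${best.formatted_address}",${lat},${lng}`;
  });
  return [header, ...rows].join('\n');
}
```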

<br>

## `join_output_source.R`
This is a short script that imports both `output.csv` and the cleaned source data `clean_2017_2023_SolarPermits.csv`, joins them together, and writes the result to file as `permits_lat_long.csv`.
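The join itself can be sketched as follows, shown in JavaScript rather than R. Both tables are assumed to share an address column; the actual key column in the R script may be named differently:

```javascript
// The join performed by join_output_source.R, sketched in JavaScript.
// The `address` key column name is an assumption.
function joinOnAddress(permits, geocoded) {
  const byAddress = new Map(geocoded.map(g => [g.address, g]));
  // Left join: keep every permit row, attach lat/lng when a match exists
  return permits.map(p => ({ ...p, ...(byAddress.get(p.address) ?? {}) }));
}
```

A left join keeps every permit, so rows whose addresses failed to geocode still survive into `permits_lat_long.csv` (just without coordinates).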

<br>

## `csv_to_geojson.js`
In order to map the data, we need it in GeoJSON format. This script imports `permits_lat_long.csv`, converts it to GeoJSON, and saves it as `permits_lat_long.geojson`. This is the source data for `heatmap.js`, the Mapbox heat map.
110 changes: 0 additions & 110 deletions code/contourmap.js

This file was deleted.

12 changes: 6 additions & 6 deletions code/csv_to_geojson.js
@@ -7,12 +7,12 @@ const data = [];
fs.createReadStream('data/permits_lat_long.csv')
.pipe(csv())
.on('data', (row) => {
// Convert latitude and longitude to numbers
row.lat = parseFloat(row.lat);
row.lng = parseFloat(row.lng);

// Add the row directly to the data array
data.push(row);
})
.on('end', () => {
const geoJson = GeoJSON.parse(data, {Point: ['lat', 'lng']});
2 changes: 1 addition & 1 deletion code/heatmap.js
@@ -71,7 +71,7 @@ function jsonCallback(err, data) {
}

data.features = data.features.map((d) => {
d.properties.year = Number(d.properties.year);
return d;
});

56 changes: 0 additions & 56 deletions code/value_interpolation.js

This file was deleted.

