
Tip Sheet: Shapefiles

Hope Johnson edited this page Feb 13, 2020 · 7 revisions

To make any digital map you'll need either a file that defines the outlines of the shapes in the map (state boundaries are shapes, as are zip codes -- they're both "polygons") or the latitude and longitude of each point on your map. If you need to convert addresses to lat/lon points, see the Geocoding Tip Sheet.

There are two file types that you'll encounter if you're mapping polygons:

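  • A "Shapefile" is actually a set of related files (at minimum .shp, .shx and .dbf, often with a .prj), usually distributed together as a .zip. Many web mapping tools expect the .zip as-is, so usually you do not want to unzip your shapefile before you use it.
  • A "KML" is a single XML file in the format popularized by Google Earth. It began as a Google format and is now an open standard maintained by the Open Geospatial Consortium.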
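If you want to check what a zipped shapefile contains without unpacking it to disk, a minimal Python sketch like the one below can help. It uses only the standard library. The function names and the demo file names are hypothetical; the one fact it relies on is that the shapefile format requires the .shp, .shx and .dbf components.

```python
import io
import zipfile
from pathlib import PurePosixPath

# The three components the shapefile format requires.
# A .prj (projection) file is common but optional.
REQUIRED_PARTS = {".shp", ".shx", ".dbf"}


def shapefile_parts(zip_source):
    """Return the set of lowercase file extensions found inside the archive."""
    with zipfile.ZipFile(zip_source) as archive:
        return {
            PurePosixPath(name).suffix.lower()
            for name in archive.namelist()
            if not name.endswith("/")  # skip directory entries
        }


def missing_parts(zip_source):
    """Return the required extensions that the archive does not contain."""
    return REQUIRED_PARTS - shapefile_parts(zip_source)


def make_demo_zip(names):
    """Build a tiny in-memory .zip of empty files (hypothetical demo data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, b"")
    buffer.seek(0)
    return buffer
```

In practice you would pass the path of a downloaded file, for example `missing_parts("nycd.zip")`, and an empty set would mean the three required pieces are all there.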

Finding Shapefiles (and KML files)

Where to Start:

CUNY's research center maintains some shapefiles, though that database is not kept up to date, so always look for the source URL.

International

Geocommons includes a wide range of data sets including things like Iraqi pipelines -- just be sure you understand the provenance of any boundary files you find there!

The World Bank maintains a data catalog as well as an open data portal. I haven't quite sussed out which has GIS data or what the difference is.

The United Nations maintains some geospatial information.

National

The US Census provides shapefiles for state boundaries, congressional districts and counties as well as for Census-specific regions such as Metropolitan Statistical Areas and ZIP Code Tabulation Areas.

The National Historical Geographic Information System (NHGIS) provides, free of charge, aggregate census data and GIS-compatible boundary files for the United States between 1790 and 2013.

A word about zipcodes

Lyzi "Bonecrusher" Diamond (@lyzidiamond):

Y'all. ZIP codes are not defined areas. Addresses have ZIP codes. The definition of a "ZIP code" is a list of addresses. And there are lots of different ways to take a bunch of points and turn them into a polygon.

More on that: So you want to map zipcodes.
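To see why "a list of addresses" does not pin down a single shape, here is a small sketch in plain Python that turns the same points into two different polygons: a bounding box and a convex hull. The coordinates are made up and treated as flat x/y values, which is an assumption for illustration only. Real ZIP code boundaries such as the Census tabulation areas are built with their own methods, not this one.

```python
def bounding_box_area(points):
    """Area of the smallest axis-aligned rectangle containing all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive = left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    total = 0
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Hypothetical address locations on a flat grid (not real coordinates).
addresses = [(0, 0), (4, 0), (0, 4), (1, 1), (2, 1)]

hull = convex_hull(addresses)
print(polygon_area(hull), bounding_box_area(addresses))
```

Both polygons contain every address, yet the box covers twice the area of the hull. Any map of "the ZIP code" depends on which of these (or many other) constructions someone chose.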

Voting Precincts and Political Geographies

MGGG has high-quality precinct-level shapefiles matched with various other political geographies for a decent number of states. OpenPrecincts has precinct-level shapefiles, some matched with election results, for a handful of states too. The Voting and Election Science Team updates its data often with new shapefiles, although these sometimes have topology errors and missing data for certain counties.

Every 10 years, the Census releases shapefiles of VTDs (voter tabulation districts). These are the authoritative source on precinct shapes. But because precinct boundaries can change between elections, the Census's VTDs can become fairly inaccurate a few years after they are released.

City and County

Who maintains these boundaries? The city? The county? A community board? Do you know (or can you guess) which agency is responsible for the boundary files? Check their websites.

Some, but not all, jurisdictions have open data portals that include shapefiles:

SF Bay Area

Transbase is a great roundup of San Francisco data, provided by the Department of Public Health.

NYC

NYC's data portal includes lots of curious and obscure data sets, including things like shapefiles for NYC BIDs. John Weir also keeps a nice list of every NYC data set available, which might be more searchable.

Using Shapefiles in Carto

In Carto (formerly CartoDB), you have to use some SQL to join a data table to a table with polygons in it.

Say you have two tables: one called births2010 that contains birth records by community district, and another called nycd that was imported from a shapefile. Both tables include a column called borocd. I would use this query to UPDATE my births2010 table with geometry from my nycd table:

-- Copy each district's polygon onto the birth rows with the same borocd
UPDATE births2010
SET the_geom = nycd.the_geom
FROM nycd
WHERE nycd.borocd = births2010.borocd;
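Rows in births2010 whose borocd has no match in nycd are not touched by a query like this, so they end up without a geometry. Before running it, it can be worth checking which keys would fail to match. The sketch below does that in plain Python on a few made-up rows; the function name and the sample values are hypothetical, and in practice the rows would come from your own exports of the two tables.

```python
def unmatched_keys(data_rows, shape_rows, key="borocd"):
    """Return key values present in data_rows but absent from shape_rows."""
    shape_keys = {row[key] for row in shape_rows}
    data_keys = {row[key] for row in data_rows}
    return sorted(data_keys - shape_keys)


# Hypothetical sample rows standing in for the two tables.
births2010 = [
    {"borocd": 101, "births": 750},
    {"borocd": 102, "births": 1200},
    {"borocd": 999, "births": 3},  # no such district in the shape table
]
nycd = [{"borocd": 101}, {"borocd": 102}, {"borocd": 103}]

print(unmatched_keys(births2010, nycd))
```

Note that this comparison is type-sensitive: a borocd stored as the text "101" in one table and the number 101 in the other would show up as unmatched here, which is a common reason joins come back empty.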

See Also

Where to Find Data | Search Strategies

Jump in!

This page is a wiki. All you need is a Github account and you can add to it!