Where to Find Data
Where should you look for reliable data?
This is a wiki -- all you need to contribute is a GitHub account -- so please feel free to chime in!
If you can't find what you need in public data, you might have to ask for it. And when you do, it helps to know your rights. Reporters Committee for Freedom of the Press has lots of information on US laws. The Global Investigative Journalism Network maintains a list of freedom of information laws around the world.
If you can't find public data and the agency that keeps it won't just hand it over, you might need to turn to FOIA.
- A portal of portals -- this is a swamp, though.
- Data.World is another swamp, but a rich one.
- New Media @ UCB includes a few tutorials on finding specific types of data (scroll to the bottom to find those).
- Open Data Charter's Anti-Corruption Guide is a resource for key data sources to examine.
- Search Systems maintains a database of available public records by US states and Canadian provinces. The site itself is vague about who runs it -- it is registered to a Tim Kostner of CA. Use your judgement!
- US Census is the first place to start. Census Reporter is perhaps a more accessible census data portal.
- United Nations databases
- International census data
- General Social Survey gathers data on contemporary American society in order to monitor and explain trends and constants in attitudes, behaviors, and attributes. Some questions go back 70+ years. GSS is compiled by NORC, a research organization at University of Chicago.
- FedStats
- FRED collates 240,000 US and international data sets related to economic indicators, such as state-by-state employment and population.
- National Center for Children in Poverty
- I put together a great rundown of juvenile justice data sources.
- Child Welfare Statistics
- AFCARS or Adoption and Foster Care Analysis and Reporting System
- The Center for State Foster Care and Adoption Data
- HHS Administration for Children & Families Child Welfare Outcomes Report is a PDF, but you can get state-by-state data as well.
- OSHA
- Bureau of Labor Statistics maintains data on workplace injuries and fatalities
- National Institutes of Health
- Medicare
- CDC
- World Health Organization, World Bank, NGOs, UN
- Charles Ornstein at ProPublica compiled an excellent guide to Covering the Opioid Epidemic With Data -- it is a rich round up of health care data sources.
- PACER provides electronic access to U.S. Federal District and Bankruptcy Courts; RECAP can give you free access to court filings that RECAP users have previously accessed.
- TRAC keeps excellent data on US immigration courts, as well as the IRS, DEA, ATF and FBI. They maintain a great series on judicial caseloads.
- I put together a great rundown of juvenile justice data sources for a project with JJIE.
- Center for Investigative Reporting maintains an extensive database of VA backlog data
- Hoovers
- [Investigative Dashboard](http://www.investigativedashboard.org)
- Securities and Exchange Commission
- OSHA
- BLS, especially https://www.bls.gov/oes/current/naics5_511110.htm
- Corporation Wiki aggregates public records related to US corporations. They're more than a little vague about who is behind the site, but it is registered to one Michael Prince of Sagewire Research LLC. As with Wikipedia, this is not the final say, but a good place to start.
- US Economics and Statistics Administration is an agency of the Department of Commerce.
- The US Department of Labor publishes extensive data on workplace enforcement and extensive data on employment of women and veterans
- Department of Commerce
- TRAC keeps excellent data on US immigration courts, backlogs, proceedings and deportations.
- IRE's tips for covering Superstorm Sandy are full of great resources
- Weather Underground
- Earthquakes
- International Red Cross
- NOAA maintains loads of historical temperature data as well as State of the Climate, a collection of monthly summaries recapping climate-related occurrences on both a global and national scale.
In spring 2013, we focused the first few weeks on data about guns and gun deaths, and came up with some great resources.
- The Law Center to Prevent Gun Violence
- NRA Institute for Legislative Action
- National Conference of State Legislatures
- Brady Campaign to Prevent Gun Violence
- https://reportcards.nysed.gov/databasedownload.php
- US Department of Education collects data annually from state education departments and combines it into its Common Core of Data. SPLC has good context.
CUNY Journalism's Research Center publishes excellent research guides.
IRE maintains an impressive collection of databases, most of which are not free of charge.
The Knight Center for Journalism in the Americas is offering an online course on data journalism this fall, and students there have been assembling a long list of datasets: https://datajmoocreading.hackpad.com/Datasets-EL3czksysiB
Google's Public Data Explorer is interesting, but you'll want to make sure you understand the provenance of any material you find there.
Data.gov is the home of the US Government's open data. Here you will find data, tools, and resources to conduct research, develop web and mobile applications, design data visualizations, and more.
The following are some roundups that I haven't had a chance to go through:
http://publicrecords.onlinesearches.com/UnitedStates.htm
http://publicrecords.searchsystems.net/United-States/
http://enigma.io/
http://www.reporter.org/desktop/
http://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public
http://www.reddit.com/r/datasets
http://databib.org/
http://dirtdiggersdigest.org/enforcement
http://tinyurl.com/mrrokph
http://openprism.thomaslevine.com
Sources and tools for data journalism is a nice roundup of reliable UK data sources.