There’s a lot of data out there. But where do you start to find what you need?
Some basic strategies that work pretty well:
- Google it. That’s never a bad place to start and it only takes a second. And use Google in a smart way: use keywords specific to your data, and use the filetype: operator to narrow your search to specific file types. For example, add filetype:csv (no space after the colon) to return only CSV files. Use the results to dig deeper and discover related agencies that may have the data.
- Figure out who should have the data. Who might have it? Is this information that only the NYPD or the IRS can collect? The Departments of City Planning, Buildings, Housing, Finance and Taxation all keep tabs on who owns property in New York City, where that property is located and what it can be used for. If you know who ought to have the numbers you’re looking for, you can start your search by asking them.
- Look at recent reporting about the subject. Who has been releasing reports? Who has been cited in stories? Go ask them for data, or ask them for help finding it.
- Wikipedia is a fantastic resource. Don’t be afraid of it. Most information there comes with a citation — don’t take some Wikipedia author’s word for it, but do look at the source they cited and confirm that the numbers are there.
- Look for think tanks and aid organizations that specialize in the issue you’re interested in.
- Ask a librarian.
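If you run a lot of these searches, the filetype: tip from the Google strategy above can be sketched in a few lines of Python. This is only a rough illustration, not an official Google API: the function name build_search_url and the example keywords are made up here, and the site: operator (which limits results to one domain) is an extra that isn't mentioned above.

```python
from urllib.parse import urlencode


def build_search_url(keywords, filetype=None, site=None):
    """Build a Google search URL from keywords plus optional operators.

    build_search_url is a hypothetical helper written for this guide.
    """
    terms = list(keywords)
    if filetype:
        # No space after the colon, e.g. filetype:csv
        terms.append(f"filetype:{filetype}")
    if site:
        # Assumption: you also want to limit results to one domain.
        terms.append(f"site:{site}")
    # urlencode escapes spaces and colons so the URL is safe to paste.
    return "https://www.google.com/search?" + urlencode({"q": " ".join(terms)})


if __name__ == "__main__":
    # Example keywords are placeholders; swap in terms specific to your data.
    print(build_search_url(["nyc", "property", "owners"], filetype="csv"))
```

Paste the printed URL into a browser. Typing the same terms straight into the search box does exactly the same thing; the script just saves retyping when you are trying many combinations.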
Know your sources
You can get data anywhere, so it is up to you to decide whether or not you’re working with reliable data. You should know where your sources are coming from. Do they have an agenda that can help you understand how they’re framing the data they put out? You can roughly guess who is behind the NRA Institute for Legislative Action, but what about the Law Center to Prevent Gun Violence? Don’t assume that a think tank is reliable just because it kind of feels professional.
A famous example is the misleading website www.martinlutherking.org. Though it appears to be an informational site about the civil rights leader Martin Luther King, Jr., it is actually a mouthpiece for the white supremacist group Stormfront.org. You can look up who owns a domain using www.betterwhois.com.
Provenance
It is also up to you to know where your data is coming from. Did the organization hire a research firm to conduct a comprehensive study? Or did they post a little box on their website asking visitors how they feel?
Be skeptical: an advocate (or government agency) insisting that these numbers mean something doesn’t make it so.
Where to look?
The Journalism School’s Research Center maintains an excellent roundup of guides, many of which will point you to great data sets. Check out the census, business and crime guides in particular.
NICAR’s database library is a great resource. So is the “data sources” tag on Amanda’s Tumblr.
Here’s a working guide from last semester: https://github.com/amandabee/cunyjdata/wiki/Where-to-Find-Data