But gathering interesting data makes you want to pull your hair out, and not everyone has the resources to gather data on a large scale.
Luckily, there are enough people in this world who believe data and data sets should be shared as much as possible and have created ample resources to simplify things.
Where Can I Find Free Data Sets?
We’ve scoured the Internet and found 500 of the most interesting data sets out there. To keep it short and sweet, here are 50 to start you off.
To make it easier for you, we’ve divided the dataset examples into a few categories. You can skip the Google dataset search and click on the links below to jump straight to your preferred section.
Before you dig deep into the sources: if you want to create visuals with your data, sign up for a Piktochart account. It’s free to get started, and you can use one of the infographic, report, poster, brochure, or presentation templates to make it easier.
Business and Employment Data Sets
1. Crunchbase – Find business information about private and public companies. You can look up how many investments they had, who the founding members were, and whether they had any mergers or acquisitions.
2. Glassdoor Research – Glassdoor offers data related to employment. You can, for example, figure out how much you can save by retaining employees.
3. Open Corporates – Open Corporates is the largest open database of companies and company data in the world. Used by banks and governments, Open Corporates prides itself on having the most accurate data.
Crime/Conflict/Drugs Data Sets
4. FBI Uniform Crime Reporting – The Uniform Crime Reporting (UCR) program compiles statistical crime reports, publications, and data points from thousands of city, university, state, and federal law enforcement agencies.
5. Uppsala Conflict Data Program – The Uppsala Conflict Data Program (UCDP) provides data on organized violence and armed conflict, including civil wars, around the world.
6. National Institute on Drug Abuse – The National Institute on Drug Abuse (NIDA) monitors the prevalence and trends regarding drug abuse in the United States.
Internet Data Sets
7. DBpedia – DBpedia aims to make Wikipedia’s information easily searchable via SPARQL queries or by downloading their information directly. For instance, you can search for NBA players born in the 80s in cities with more than 1M inhabitants.
8. Google Trends – Google Trends allows you to look at what’s going on in the world. It gives you data about what’s becoming popular and how much people are searching for a particular term, making this useful for exploratory data analysis.
9. Instagram API – Facebook allows you to use Instagram’s API to quickly access comments, metadata, and metrics.
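As a rough illustration of the DBpedia example above (item 7), here is a minimal Python sketch that builds a SPARQL query for NBA players born in the 80s in cities with more than 1M inhabitants. It only assembles the query and the request URL; it does not send anything. The helper names (build_query, build_request_url) are made up for this sketch, and the ontology terms used (dbo:BasketballPlayer, dbo:league, dbo:birthPlace, dbo:populationTotal) are assumptions about how DBpedia models this data, so they may need adjusting before the query returns what you expect.

```python
from urllib.parse import urlencode

# DBpedia's public SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_query(min_population=1_000_000, start_year=1980, end_year=1989, limit=20):
    """Return a SPARQL query string (hypothetical helper for this sketch).

    The class and property names are assumptions about the DBpedia ontology.
    """
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?player ?city ?birthDate WHERE {{
  ?player a dbo:BasketballPlayer ;
          dbo:league dbr:National_Basketball_Association ;
          dbo:birthDate ?birthDate ;
          dbo:birthPlace ?city .
  ?city dbo:populationTotal ?population .
  FILTER (?population > {min_population})
  FILTER (?birthDate >= "{start_year}-01-01"^^xsd:date &&
          ?birthDate <= "{end_year}-12-31"^^xsd:date)
}}
LIMIT {limit}
""".strip()


def build_request_url(query):
    """Encode the query as a GET URL that asks for JSON results."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return f"{DBPEDIA_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL; paste it into a browser or fetch it with any HTTP client.
    print(build_request_url(build_query()))
```

You can also paste the query text on its own into the query editor on the DBpedia endpoint page, which is a handy way to tweak the filters before automating anything.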
Finance Data Sets
If you’re looking to find economic and financial data, look no further than these top sources housing a plethora of historical data sets.
10. Comtrade – Official trade in goods and services data sets managed by the UN COMTRADE database. Data visualization tools, an API, and other extraction tools are available.
11. Datahub – Stock Market – From gold prices and NASDAQ listings to S&P 500 companies, you’ll find it all on datahub.io.
12. Global Financial Data – Global Financial Data gives you exactly what it says on the tin: data about the finances of the world. It ranges from real estate and global macro data to market data.
13. IMF Data – The IMF, or International Monetary Fund, is an organization that aims to foster monetary collaboration between countries. You can find data on trade, government finance, and financial development.
14. The Atlas of Economic Complexity – The Atlas of Economic Complexity provides data about global trade dynamics over time. Want to know the number of textiles China exported to South Korea? Easy.
15. World Bank – Not only does the World Bank provide financial data about countries, but it also provides data on education and health.
16. Financial Times Data – Here, you’ll find cold, hard numbers about the different markets in the world. Data include fluctuations in currency, yield rates of bonds, and commodity prices.
Health Data Sets
17. Centers for Disease Control (CDC) – The CDC provides quantitative data on a wide variety of health-related topics like diabetes, life expectancy, cancer, and obesity. They also provide other resources you can use to find more data.
18. Enigma Public – health – Enigma Public calls itself “the world’s broadest collection of public data.” The mostly US-centric site provides data on foodborne outbreaks, Medicare drug spending, and OSHA. It also provides data about other subjects like transportation and immigration.
19. Health Data – Health portal with 3,000+ valuable data sets about epidemiology and population statistics, managed by the U.S. Department of Health & Human Services. API available.
20. NHS Digital – Provides data about the health and social care system in the UK. Want to know what drugs are prescribed by doctors in the UK? Well, now you can find out.
21. US Food & Drug Administration – The FDA provides data about what drugs are currently approved in the US. The data is updated every week. You might have to brush up on your Excel skills since the data is only available in database or CSV form.
22. World Health Organization – As the name suggests, the WHO provides data about different health-related topics, ranging from road safety, water, and sanitation to mental health.
Entertainment Data Sets
23. BFI – Film Forever – Here, you can find data about the film industry in the UK. You can find data about how a film has influenced UK culture and how much Avengers: Endgame made every other film irrelevant the week it came out.
24. Football Data – Want to know who the referee was in a particular football (or soccer, depending on where you’re from) game in Scotland? Well, you’re in luck. Football Data provides just that and much more. The site is heavily focused on betting, but you can find a lot of info about past football matches.
25. Statista – Video Games – Statista’s sub-catalog, where you can find statistics, facts, and market data on the video game industry worldwide, such as the number of games and gaming revenue.
Government Data Sets
26. Australian Government Catalogue – As you might have guessed from reading the name, this dataset is focused on the Australian government. You can find data about soil quality, marine life, or environmental planning.
27. Data.gov – The US counterpart of the AGC. It holds loads of data on about 14 different topics, from agriculture and public safety to local government. The data sets are older, but still accurate and good to use.
28. Data.gov.uk – With over 50,000 data sets, you’ll have no trouble finding what you need to know about the UK government.
29. data.europa.eu – Open data portal by the European Commission and other institutions of the European Union, covering 14,000+ data sets on energy, agriculture, or economics.
30. London Datastore – Data about life in London. Want to know how much the population has increased in five years? Or maybe you want to know how many tourists they had compared to last quarter? You’ll find it in the London datastore.
31. NYC Open Data – If London isn’t your thing, you can look up the data for New York City. You can find data about corruption, elections, and media.
32. Open Data Canada – The official government portal sharing public data sets in Canada. It is much like the Australian Government Catalogue and Data.gov.
33. UK Data Service – UK Data Service’s vision is to “strengthen society and improve people’s lives by informing quality research through unlocking the power of data.” They work with different institutions and agencies to gather data about a wide variety of subjects.
Transportation Data Sets
34. National Travel & Tourism Office – The site might look like it was made in the 90s, but it does a good job of providing data about international tourism in the US.
35. NYC Taxi Trip Data – Here, you can find detailed data from the NYC Taxi and Limousine Commission. Data includes pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.
36. Statista – Travel – Here, you’ll find data about different tourism-related topics like hotels, holiday destinations, and more.
37. US Travel Association – The U.S. Travel Association is a non-profit organization representing all components of the travel industry and provides high-quality research data on tourism and transportation.
Weather & Environment Data Sets
38. Africa Climate – Environment & climate change data in African countries, reported by major international organizations such as the World Bank, WHO, and The Global Fund.
39. Open AQ – Open AQ’s mission is to fight air pollution. They aggregate physical air quality data from public data sources, including government, research-grade, and other sources.
40. Weather.gov – Provides weather, water, and climate data, forecasts, and warnings for the protection of life and enhancement of the national economy. This source provides historical weather data from the US.
Miscellaneous Data Sets
41. Amazon AWS – Amazon provides an open registry of all open data on AWS, from satellite images to web crawl and IRS data.
42. Data.World – Biology – Here, you can find open data about biology contributed by thousands of users and organizations across the world.
43. NASA Earth Data System – Since 1994, NASA’s Earth science data has been free and open to all users for any purpose. It provides near real-time data from cool-sounding measuring instruments like a Moderate Resolution Imaging Spectroradiometer or an Atmospheric Infrared Sounder. Neat.
44. FiveThirtyEight – FiveThirtyEight uses hard data and statistical analysis to tell stories about politics, sports, economics, and culture. In the name of transparency, they share the data used in their articles.
45. Google Public Data – More like a search engine for data.
46. Kaggle – A data science community that regularly shares data sets on a wide range of topics and categories, including the complete FIFA 19 player dataset, wine reviews, and chest X-ray images.
47. Pew Internet – Pew Research Center is a non-partisan fact tank aggregating the most varied data sources. They also offer the results of their own survey research and analysis for free, but only two years after reports are issued.
48. Reddit – Datasets – A subreddit for datasets. Some of the top ones this past year are 480,000 Rotten Tomatoes critic reviews, UC Berkeley’s Self-Driving dataset, and 1,340 coffee bean reviews.
49. Reeep Data – Free-to-use clean energy data sets including actors, project outcome documents, country policy reports, and more than 3,000 clean energy terms.
50. USDA – Food Composition – The United States Department of Agriculture provides data about the composition and nutrient values of different foods.
Put Those Interesting Data Sets to Good Use
The right data presented in the right way can be the difference between a typical data visualization project and an amazing one.
Some publications, like the Economist Intelligence Unit or Bellingcat, have built their entire reputation on their great use of data and data sets in reporting. And while you don’t need to go as far as them or their data scientists, you can certainly learn from them for your data processing projects.
Make Your Data Set Findings Interesting and Understandable
Whether you want to use your own personal data or publicly accessible data (from the sources above) for your data visualization projects, make sure you do it in a way other people can understand and learn from. A data set is only useful if you can present it in a way that is digestible for your audience.
How to Ace Your Data Visualization Project
Whether you’re putting together a school project on one data set or completing a data science project or portfolio, you first need to do your research and explore the data.
Once you’ve found your focus, home in on the single point you want to get across, use a simple design, and visualize your data so that it is easy to understand.
How to Get Started Turning Your Data Set Into a Visual
Take a look at these free templates you can start with to create your own visuals using data pulled from credible data sets. Piktochart’s free infographic maker makes it easy for you to turn data into beautiful visuals.