DATASETS AND PUBLIC INFORMATION

30 Links that can help you in research and data curation for your content

Investing time in curating data for your content is never a futile effort. Readers today are smarter and savvier than before. They have learned to filter content based on its data and sources, because all of us are overwhelmed with information every day. Hence, content writers today have to work extra hard to attract and retain readers.

Where do you go when researching information for your content?

There are many databases available online, and it is important to know that they fall into two distinct types. Knowing the difference will cut your research time considerably.

The two distinct types of data and resources are (1) validated datasets from reputable sources, and (2) unvalidated datasets from reputable sources. Note the keyword: reputable sources. Data is the bread and butter of your content. You will not want to cite a source that is not reputable, whether its data is validated or unvalidated.

There is one more matter to be aware of. While most data is available for free in Web databases, for some datasets you may need to pay a fee to obtain the full dataset.

Reputable validated datasets

These databases or repositories are made public by governments, public organizations, e.g. WHO and UN, or reputable media outlets, e.g. Reuters and CNN.

  1. World Health Organization (WHO) provides access to data and analyses for monitoring the global health situation.
  2. UN Data is a UN statistical database with a search engine. You can get world economic, development, and environmental statistics. It offers an API for you to access the database as well.
  3. The Guardian Data Blog is a news blog that regularly posts visualizations and makes cleaned data available in Google Docs format.
  4. Microsoft Azure Marketplace offers data feeds from many data sources, including demographic, environmental, financial, retail and sports data. Many of the sources are free, while some require fees.
  5. UNICEF holds all kinds of data, from mortality rates to world hunger statistics.
  6. Data.gov is the US government's database, providing data pertaining to the US.
  7. The Department of Statistics Malaysia is the repository of data and statistics pertaining to Malaysia.
  8. Data Market is a good place to explore data related to economics, healthcare, food and agriculture, and the automotive industry.
  9. Wunderground provides detailed weather info and lets you search historical data by zip code or city.
  10. Weatherbase provides detailed weather stats on temperature, rain and humidity for nearly 27,000 cities.
  11. WorldBank: Where else to look for financial data of the world but the WorldBank? You can get virtually any country’s financial and economic standing here. Some other topics included are:
  12. Google Public Data: The Google Public Data Explorer makes large datasets easy to explore, visualize and communicate.
  13. Google Scholar is a free search engine that indexes all kinds of academic literature. Citing journal publishers, university research papers, and other scholarly materials makes your content look not just smarter, but also more trustworthy.

How to deal with unvalidated datasets from reputable sources

Sometimes, when you are researching data on your topic, public or government databases may not be sufficient. You will need data from unvalidated datasets. You are advised to tread carefully when dealing with them.

Verify the data by checking the following points with the dataset owners, or by comparing the data against validated sources:

  • The methodology in which the data was collected
  • When the data was collected
  • The sample size and the sample data used
  • The person behind the analysis
  • The research sponsor
  • The motives or objectives of the analysis

These are crucial steps to be taken seriously, as some datasets may be biased. Biased not necessarily in a negative sense, but because of the objective of the analysis performed.
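The verification points above can be tracked as a simple checklist. The following is a minimal sketch only, not a prescribed method: the class name `DatasetProvenance`, its field names, and the helper `unanswered_checks` are hypothetical names chosen for illustration, and the sample values are made up.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DatasetProvenance:
    """What is known about a dataset; None means 'not verified yet'."""
    methodology: Optional[str] = None   # how the data was collected
    collected_on: Optional[str] = None  # when the data was collected
    sample_size: Optional[int] = None   # size of the sample used
    analyst: Optional[str] = None       # the person behind the analysis
    sponsor: Optional[str] = None       # who sponsored the research
    objective: Optional[str] = None     # motive or objective of the analysis


def unanswered_checks(record: DatasetProvenance) -> List[str]:
    """Return the checklist items that still have no answer."""
    return [
        f.name
        for f in fields(record)
        if getattr(record, f.name) in (None, "")
    ]


# Illustrative values only: a dataset where three points are confirmed.
survey = DatasetProvenance(
    methodology="online survey",
    collected_on="2014-03",
    sample_size=1200,
)
print(unanswered_checks(survey))  # ['analyst', 'sponsor', 'objective']
```

Any item left in the returned list is a question to put to the dataset owner before citing the data.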

So, here are some unvalidated datasets from public or user-submitted sources.

  1. Factual is an open data platform developed to maximize data accuracy, transparency, and accessibility. It has datasets about local place information, entertainment and information derived from government sources. You can access the datasets through web service APIs and reusable, customisable web applications.
  2. Freebase is a free, community-curated graph database of well-known people, places, and things. It is a collection of structured metadata harvested from many sources, including individual wiki contributions. Programmers will be delighted with the open API.
  3. Crunchbase is a free database of the startup ecosystem. It provides an invaluable source of information about major companies, startups, investors and executives in the tech ecosystem.
  4. Socrata is a company that provides social data discovery services for opening government data. It democratizes access to government data.
  5. Datamob is a public data repository. It highlights the connection between public data sources, e.g. governments and public institutions, and the interfaces people are building for them. There is a wide selection of innovative maps, datasets, and data visualization interfaces drawing on a variety of topics.
  6. Quandl is a search engine for numerical data. It contains financial, economic, and social datasets. You can easily search, download, and share the data, or access it via API.
  7. DataMarket provides access to and visually displays data from public and, to a lesser extent, private institutions and companies. It helps you to find and understand data by bringing complex and diverse data together in one place and one format.
  8. Infochimps is the place for big, fast and complex data. It’s a data marketplace that offers thousands of public and proprietary datasets for download and API access. The datasets come in a wide range of categories, from historical data to geolocation data, and in different formats.
  9. Datawrangling provides a long list of URLs to check out for datasets. However, you have to curate the list sensibly, as some links are outdated or point to data that has expired.

Apps for retrieving live information

Besides Web repositories, there are also apps and tools available for you to curate your data. Here are some of the apps and tools that can help you in your research.

  1. Web analytics: Google Analytics
  2. Social networks: Facebook, Twitter, Pinterest, LinkedIn
  3. Project management tool: Basecamp
  4. Sales management tool: Salesforce
  5. Survey tools: SurveyMonkey
  6. Photo sharing tool: Flickr
  7. Email marketing: MailChimp
  8. Content management tool: Buzzsumo is a search tool that makes content hunting easy. You can search for topics that are buzzing on the social web. Use this to analyse what type of content performs best for you or your client.

What do you think of our suggested sources? Do you have any to recommend?

Let us know what you use!