Finding datasets
Datasets are held in many repositories worldwide. Universities and research organizations often maintain their own repositories, and there are also discipline-specific repositories.
Datasets are often assigned a Digital Object Identifier (DOI), a unique alphanumeric string that persistently identifies the dataset. If you know the DOI, you can search for it using a regular search engine.
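A DOI can also be turned into a web address directly, by appending it to the public resolver at https://doi.org/. The following is a minimal Python sketch of that step, not an official tool. The function name `doi_to_url`, the loose validation pattern, and the sample DOI `10.1234/example.dataset.v1` are illustrative assumptions and do not refer to a real dataset.

```python
import re
from urllib.parse import quote

# Loose, assumed pattern for a DOI: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver URL for a DOI string (hypothetical helper)."""
    doi = doi.strip()
    # Accept common prefixes such as "doi:" or a full resolver URL.
    doi = re.sub(r"^(?:doi:\s*|https?://(?:dx\.)?doi\.org/)", "", doi, flags=re.IGNORECASE)
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/" intact.
    return "https://doi.org/" + quote(doi, safe="/")


# Made-up DOI, used only to show the shape of the result.
print(doi_to_url("doi:10.1234/example.dataset.v1"))
```

Opening the resulting address in a browser should redirect to the dataset's landing page in whichever repository holds it, provided the DOI is registered.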
There are many options for discovering datasets and data repositories.
General sources
- DataCite: An organization that assigns Digital Object Identifiers (DOIs) to datasets. To date it has assigned more than 16 million DOIs for datasets across all disciplines.
- Google Dataset Search: A search engine for datasets. It searches data repositories across the Web, finding datasets with a simple keyword search. It is currently in Beta.
- Figshare: A repository that allows researchers to share datasets and other research outputs. It also assigns DOIs. You can search by keyword or browse categories.
- Dataverse: An open source web repository application for archiving datasets, developed at Harvard University. Scroll down the Dataverse page to see the locations of Dataverse repositories worldwide.
Finding data repositories
Many datasets are held in subject-specific repositories.
- re3data.org: The Registry of Research Data Repositories includes information on more than 2000 research data repositories worldwide.
- Open Access Directory (OAD): Maintains a list of discipline-specific data repositories.