Google recently launched a new search engine that aids journalists, policymakers, scientists and other users by granting access to datasets needed for research. The Dataset Search platform sifts through digital libraries, publishers’ sites and authors’ personal websites, among other places, to display open data repositories.
The search engine relies on publishers to label their data correctly and with appropriate information, so discrepancies in that labeling will not be easy to catch. Addressing the potential spread of misinformation, Google research scientist Natasha Noy stated:
“We [collect] and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset. Our approach is based on an open standard for describing this information (schema.org) and anybody who publishes data can describe their dataset this way. We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem.”
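For publishers wondering what describing a dataset with schema.org might look like in practice, the following is a minimal sketch in Python that builds a schema.org `Dataset` description and serializes it as JSON-LD. The property names (`name`, `description`, `creator`, `datePublished`, `license`, `distribution`) come from the schema.org vocabulary; the helper function `build_dataset_markup`, the dataset itself and every example.org URL are hypothetical placeholders, and the sketch is not Google's own tooling.

```python
import json


def build_dataset_markup(name, description, url, creator,
                         date_published, license_url, download_url):
    """Return a schema.org Dataset description as a plain dict.

    Hypothetical helper for illustration only; the keys are
    schema.org property names, the values are supplied by the caller.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "creator": {"@type": "Organization", "name": creator},
        "datePublished": date_published,
        "license": license_url,
        # One downloadable file belonging to the dataset.
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": "text/csv",
                "contentUrl": download_url,
            }
        ],
    }


# All values below are invented placeholders, not a real dataset.
markup = build_dataset_markup(
    name="Daily weather observations (example)",
    description="Hypothetical daily temperature and rainfall readings.",
    url="https://example.org/datasets/daily-weather",
    creator="Example Weather Office",
    date_published="2018-09-05",
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
    download_url="https://example.org/datasets/daily-weather.csv",
)

# Serialize to JSON-LD text that can be embedded in a web page.
json_ld = json.dumps(markup, indent=2)
print(json_ld)
```

A publisher would typically place the resulting JSON inside a `<script type="application/ld+json">` element on the page that hosts the dataset, where crawlers can read it.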
The search engine supports shortcuts similar to those of Google’s usual search tools: typing ‘weather site:weather.gov’, for example, retrieves information solely from the National Weather Service. For general weather trends, users may simply search ‘daily weather trends’ to be directed to the appropriate datasets.
The new search engine is one of several initiatives Google has taken this year to promote journalistic integrity. In July, the company improved the representation of tabular datasets in search results and launched a program that helps journalists in India identify misinformation.