To get the big picture, data needs to be shareable
Today products, services, decisions, even reputations and identities depend on data. Data was used to guide the transport you took to work this morning, and to decide and design the adverts you will have seen along the way. Data is used to decide whether you’re capable of paying a mortgage or if you’re at risk of developing Alzheimer’s disease. With personalized medicine and self-driving cars, we can see the benefits of analysing big data and creating intelligent systems, but in order for these processes to function we constantly need to share data effectively between people, algorithms and machines.
Connecting disparate data sources is difficult – “like crawling through broken glass,” to quote entrepreneur Howard Look – if the data in question is not prepared in such a way as to enable those connections to be made. Even if data is made available to others, or published as open data, its presentation, format and structure usually mean that significant effort is required to piece data together and get the big picture. Exposing a problem like corruption is usually hindered not by the lack of data but by the difficulty in understanding the true meaning and connections in the data as a whole. The answer can be hidden in plain sight.
At Thomson Reuters, we’ve been working on and with ecosystems that facilitate data sharing for a long time. We’ve found that there are common key elements that make data shareable, no matter what kind of data you’re working on or what you aim to use that data for. Together with the Open Data Institute, we’ve summarised these elements as a guide to making data Shareable by Default: unlocking its potential value by preparing it to be more easily shared, discovered, understood, used and connected.
A checklist for shareable data
One of the main challenges in sharing data is in communicating its meaning. Data must be well-described so that users can understand exactly what that data is about. Permanent machine-readable identifiers (such as Thomson Reuters Open PermIDs) can be used to describe entities such as products or companies in data and used as anchors to link between different data sets. It is also important to understand where data comes from and what authorities created or influenced the data – its provenance – as these characteristics can significantly affect its suitability for the use at hand. In addition to the technical challenges, success depends on adopting an open and collaborative culture that promotes the expectation that data must be prepared to be shareable.
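To illustrate how a shared identifier acts as an anchor between data sets, here is a minimal sketch in Python. It is an assumption-laden illustration rather than a description of any Thomson Reuters system: the `Record` type, the `link_by_identifier` helper, the identifier values and the source names are all hypothetical, and the identifiers are placeholders rather than real PermIDs. Each record carries a `source` field as a simple stand-in for provenance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    entity_id: str    # permanent identifier (placeholder value, not a real PermID)
    source: str       # provenance: which party published this record
    attributes: dict  # whatever the publisher says about the entity


def link_by_identifier(*datasets):
    """Group records from several data sets under their shared identifier."""
    linked = {}
    for dataset in datasets:
        for record in dataset:
            linked.setdefault(record.entity_id, []).append(record)
    return linked


# Two independently published data sets (hypothetical contents).
company_register = [
    Record("id-0001", "example-register", {"name": "Acme Holdings Ltd"}),
    Record("id-0002", "example-register", {"name": "Globex Trading"}),
]
supplier_list = [
    Record("id-0001", "internal-procurement", {"contract_value": 250000}),
]

linked = link_by_identifier(company_register, supplier_list)

# For each entity, show which sources contributed information about it.
for entity_id, records in linked.items():
    print(entity_id, [record.source for record in records])
```

Because both data sets refer to the same entity by the same identifier, no name matching or manual reconciliation is needed to bring them together, and keeping the source alongside each record lets a user judge whether a given piece of information is suitable for the use at hand.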
The elements that make data shareable also underpin the creation of a robust data infrastructure. This infrastructure, like physical infrastructure, forms connections across sector and organisational boundaries and allows processes and capabilities to be built on top of those connections, supporting a wide range of applications.
Business transactions occur on a global scale, so the fight against corruption is also global, and it must be powered by global data that is built to be shareable by default. This is both an ethical and a commercial necessity.
In this environment, it is vital to understand your business’s exposure to risk, whether that be through joint ventures, mergers and acquisitions or through your own customer, supply chain and third-party relationships. Assessing that risk depends on your ability to connect the dots. That means being able to combine internal data with external information about customers and third parties, gather insights and react accordingly. For example, in the recent Panama Papers inquiry, the importance of data infrastructure lies both in enabling journalists with access to the papers to link key pieces of information together and deliver the resulting knowledge to the wider community, and in enabling other interested external stakeholders to incorporate that knowledge into their own activities. Shareable data makes this process easier by enabling access controls, improving discoverability and using common standards so that the pieces of the data puzzle fit together.
The role of corruption extends from the smuggling of weapons to human trafficking and child and slave labour. Our success in the fight against corruption, like that of many other data-driven initiatives, will be determined by how effectively data is shared. Making data shareable is a big step forward in a battle we can’t afford to lose.
Download the white paper Shareable by Default: Creating resilient data ecosystems.
This is our second white paper in collaboration with the ODI. Our previous paper – Creating Value with Identifiers in an Open Data World – presents recommendations for using data identifiers in linked data.