
Extracting value from new sources of data

Dr. Andrew Fletcher  Director, Thomson Reuters Labs™ – London

“Data is the new oil,” goes the oft-used quote. The analogy works because many position data as a vast resource to be tapped: a raw commodity that must be refined to yield value.

But the analogy also works because getting data comes at a price. That cost is easy to forget because most of the focus is on data we already have, which, thanks to the declining cost of storing and manipulating it, is effectively free.

New sources of data come with some very real costs: costs to acquire (mine) the data, aggregate (trade) the data, and refine the data by making it Shareable by Default. But these costs can be well worth paying.

Mining the data

One example of an exciting new data source is the “data exhaust” resulting from many data collection processes, as explained by my colleague Adam Baron. One person’s trash is another person’s treasure; odds are there will be plenty of people happy to mine your data even if you yourself see no value in it.

However, going out and collecting new data costs money, particularly when that data comes from sensors that cost money to buy, install, and maintain. The Internet of Things is an exciting new trend, but it is not free! In the same way oil is only tapped when it is economical, so new data sources are created and mined only when the insights gained outweigh the costs of acquiring them.

Take a company like Fungi Alert, whose technology lets farmers monitor their fields for particular pathogens. Investing money, time and effort in the technology is only worthwhile for a farmer if it successfully identifies pathogens, preventing damage and potential financial loss.

Similarly, satellite monitoring, once the preserve of governments, is now accessible to even small firms and the number of micro-satellites in orbit has rocketed. But spending £20k to launch your own satellite (or swarm of satellites), whilst cheap compared to the millions it once might have cost, is still only a path you will take if you can see a return.

Trading the data

However, once data is aggregated, the cost of acquiring it can be shared among many parties rather than borne in a two-way transaction. Thomson Reuters Data Share, for example, is a mobile application that collects information on crops directly from farmers in exchange for the aggregated insight normally available only to professional agriculture traders. The aggregated information is highly valuable and would never be accessible without all the different parties working together.

Similarly with the Internet of Things, we are just beginning to see platforms emerge that enable you to add your own sources of data and make them searchable. For example, rather than only seeing air pollution from your own sensor station, why not combine that information with others’ to see the whole picture locally, nationally, or internationally? Companies such as Thingful and Open Sensors are emerging as aggregators, either for general use or, increasingly, for commercial application.

Refining; making new sources of data Shareable by Default

Aggregation is not always appropriate; a lot of new data that is collected is proprietary. Many companies will be monitoring their production systems, their freight, or other aspects of their supply chains (such as locations and conditions) and there is little incentive to share that data. In this instance, data is definitely not a commodity, but rather a competitive advantage.

Not sharing your data is one thing, but one of the biggest challenges in creating value from data is the difficulty of ‘seeing’ that data in context. The “refining” work needed to link different data sets together, or to understand provenance, is a very real cost, as explained by another colleague, Tharindi Hapuarachchi.

Even if you never plan to let your data be accessed by anyone else, designing it to be shareable unlocks your own ability to combine it with other sources of data and see it in a broader context. For this reason, in our work with the Open Data Institute, we argue that all data should be built to be Shareable by Default.

To take a more personal example, health and fitness data collected by your wearable device may be something you never want to release. But making it Shareable by Default opens the door to very deep and meaningful personalisation, such as understanding how the aggregated picture of air pollution mentioned above might impact your own health.

So, “data is the new oil,” but mining, aggregating and refining it comes at a cost. That cost is only worth paying if you can view your data in context, by making it Shareable by Default. Otherwise the opportunity, and the value, are thrown away.

Learn more

Creating resilient data ecosystems by making data shareable by default 
